Retrieval Is Eating Search: A Look Inside Code Execution
Notes from the teams shipping live web browsing to real users.
The interesting question is not whether Granola works. It does. The interesting question is what teams do with it once the novelty wears off.
Notion has been quietly running pricing analysis through ChatGPT Canvas for months. The results are unglamorous and, for that reason, more interesting than another benchmark chart.
Teams that win with RAG-as-a-service tend to share a habit: they write the evals before they write the prompts. Everything else follows from that.
Eval harnesses, once an afterthought, are becoming the most important piece of code in many AI projects. Booking.com's team treats theirs the way an SRE team treats a runbook.
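To make "evals before prompts" concrete, here is a minimal sketch of what such a harness might look like in Python. Nothing in it comes from any team named in this piece: the EvalCase type, the substring pass criterion, the stub model, and the two sample cases are all hypothetical stand-ins, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One eval case. The substring check is an assumed, deliberately crude pass criterion."""
    name: str
    prompt: str
    must_contain: str


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    """Send every case's prompt to the model and record pass/fail by case name."""
    results: dict[str, bool] = {}
    for case in cases:
        answer = model(case.prompt)
        # Case-insensitive substring match; a real harness would likely use richer scoring.
        results[case.name] = case.must_contain.lower() in answer.lower()
    return results


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of cases that passed; 0.0 when there are no cases."""
    return sum(results.values()) / len(results) if results else 0.0


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, so the sketch runs offline."""
    canned = {"What currency is the quote in?": "The quote is in EUR."}
    return canned.get(prompt, "I don't know.")


# Hypothetical cases, written before any prompt tuning happens.
CASES = [
    EvalCase("currency", "What currency is the quote in?", "EUR"),
    EvalCase("discount", "What is the annual discount?", "20%"),
]

if __name__ == "__main__":
    results = run_evals(stub_model, CASES)
    print(results, pass_rate(results))
```

The point of the design is that the model is just a callable. Swap the stub for a real client and the same cases become a regression suite that runs on every prompt change.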
Inside Snowflake, the rollout looked less like a moonshot and more like a slow migration. A pilot, a champion, a quiet expansion, a budget line.
What Mistral actually shipped with Claude 4.5 Sonnet is less a single capability than a cluster of small, compounding improvements, the kind that show up only when you put a real workflow on top.
Inside Linear, the rollout followed the same slow-migration arc as Snowflake's: a pilot, a champion, a quiet expansion, a budget line.
None of this guarantees a clean story. Alibaba Qwen could ship a model next month that rearranges the assumptions in this piece. But the direction of travel, for now, is clear enough to plan around.