Caroline Bishop
Feb 17, 2026 18:34
Claude’s new dynamic filtering feature cuts input tokens by 24% while improving search accuracy. Opus 4.6 hits 61.6% on the BrowseComp benchmark.
Anthropic has rolled out a major upgrade to Claude’s web search capabilities, with the AI assistant now writing and executing code on the fly to filter search results before processing them. The advance delivers an average 11% accuracy gain while consuming 24% fewer input tokens, according to the company’s internal benchmarks.
The update, released alongside Claude Opus 4.6 and Sonnet 4.6, addresses a persistent problem in AI-powered web search: context window bloat. Traditional search tools pull entire HTML files into memory, much of it irrelevant noise that degrades response quality and burns through tokens.
How Dynamic Filtering Works
Rather than reasoning over raw HTML dumps, Claude now dynamically generates code to post-process query results. The system keeps relevant data and discards the rest before anything hits the context window. Think of it as the AI building its own custom search scraper in real time.
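To make the idea concrete, here is a minimal sketch of the kind of post-processing code a model might write. It is not Anthropic's implementation: the `TextExtractor` class, the `filter_page` function, and the keyword-matching rule are all hypothetical, chosen only to show relevant text being kept and page noise being dropped before anything reaches the context window.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text chunks, skipping script and style content."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse whitespace so each chunk is a single clean line.
        text = " ".join(data.split())
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def filter_page(html, keywords):
    """Return only the text chunks that mention at least one keyword.

    Hypothetical relevance rule: a case-insensitive substring match.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lowered = [k.lower() for k in keywords]
    return [c for c in parser.chunks if any(k in c.lower() for k in lowered)]


sample_page = """
<html><head><style>body { color: red; }</style></head>
<body>
  <nav>Home | About | Login</nav>
  <p>BrowseComp accuracy rose to 61.6% for Opus 4.6.</p>
  <p>Subscribe to our newsletter!</p>
  <script>trackVisitor("accuracy");</script>
</body></html>
"""

for chunk in filter_page(sample_page, ["accuracy"]):
    print(chunk)
```

In this toy page, the navigation bar, the newsletter prompt, the stylesheet, and the tracking script are all discarded, and only the sentence mentioning the keyword survives. Code generated for a real query would presumably be tailored to that query and page structure rather than relying on a fixed keyword list.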
Anthropic tested the approach on two industry benchmarks. On BrowseComp, which measures an agent’s ability to find deliberately hard-to-find information across multiple websites, Opus 4.6 jumped from 45.3% to 61.6% accuracy. Sonnet 4.6 climbed from 33.3% to 46.6%.
DeepsearchQA, which tests systematic multi-step research with many correct answers, showed similar gains. Opus 4.6’s F1 score rose from 69.8% to 77.3%, while Sonnet 4.6 improved from 52.6% to 59.4%.
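The four before-and-after figures can be checked with a few lines of arithmetic. The sketch below assumes the headline 11% figure is the simple mean of the four percentage-point gains. Anthropic's exact method is not described here, so treat that as an assumption, though the numbers are consistent with it. The function names are illustrative.

```python
# Before/after scores as reported above, in percent.
scores = {
    ("BrowseComp", "Opus 4.6"): (45.3, 61.6),
    ("BrowseComp", "Sonnet 4.6"): (33.3, 46.6),
    ("DeepsearchQA", "Opus 4.6"): (69.8, 77.3),
    ("DeepsearchQA", "Sonnet 4.6"): (52.6, 59.4),
}


def point_gains(results):
    """Percentage-point gain per (benchmark, model), rounded to one decimal."""
    return {key: round(after - before, 1) for key, (before, after) in results.items()}


def mean_gain(results):
    """Arithmetic mean of the percentage-point gains (assumed aggregation)."""
    gains = list(point_gains(results).values())
    return sum(gains) / len(gains)


for (benchmark, model), gain in point_gains(scores).items():
    print(f"{benchmark:<13} {model:<11} +{gain:.1f} points")
print(f"Mean gain: {mean_gain(scores):.1f} points")
```

The individual gains come out to 16.3, 13.3, 7.5, and 6.8 points, for a mean of roughly 11 points.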
Real-World Validation
Quora’s Poe platform, which serves millions of users across 200+ AI models, has already tested the upgrade internally. “The model behaves like an actual researcher, writing Python to parse, filter, and cross-reference results rather than reasoning over raw HTML in context,” said Gareth Jones, the company’s Product and Research Lead. Quora found that Opus 4.6 with dynamic filtering achieved the highest accuracy among the frontier models in its internal evaluations.
Token Economics Get Complicated
Cost implications vary by use case. Price-weighted tokens decreased for Sonnet 4.6 across both benchmarks but actually increased for Opus 4.6, because the more powerful model sometimes writes more complex filtering code. Anthropic recommends developers benchmark against their specific query patterns before deployment.
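A small calculation shows how cost can rise even as input tokens fall. The sketch assumes "price-weighted tokens" means each token is weighted by its per-token price, and that output tokens (where generated filtering code would be billed) cost more than input tokens. The prices and token counts below are placeholders for illustration only, not Anthropic's actual pricing or measurements.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int


def price_weighted_cost(usage, input_price, output_price):
    """Cost in dollars, given prices in dollars per million tokens."""
    total = usage.input_tokens * input_price + usage.output_tokens * output_price
    return total / 1_000_000


# Placeholder prices (dollars per million tokens); NOT real pricing.
INPUT_PRICE = 5.0
OUTPUT_PRICE = 25.0

# Hypothetical run without filtering.
baseline = Usage(input_tokens=100_000, output_tokens=2_000)
# Hypothetical run with filtering: 24% fewer input tokens, but more
# output tokens because the model also writes filtering code.
filtered = Usage(input_tokens=76_000, output_tokens=8_000)

baseline_cost = price_weighted_cost(baseline, INPUT_PRICE, OUTPUT_PRICE)
filtered_cost = price_weighted_cost(filtered, INPUT_PRICE, OUTPUT_PRICE)

print(f"Baseline: ${baseline_cost:.2f}")
print(f"Filtered: ${filtered_cost:.2f}")
```

With these made-up numbers the filtered run costs $0.58 against $0.55 for the baseline, despite using 24% fewer input tokens. Swapping in real prices and measured token counts from representative queries is the kind of benchmarking the recommendation above points to.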
Dynamic filtering ships enabled by default for the new web search and web fetch tools on the Claude API. The company also graduated several related tools to general availability: code execution sandboxes, persistent memory across conversations, programmatic tool calling, and dynamic tool discovery.
For developers building search-heavy applications, such as research assistants, citation verification tools, or competitive intelligence bots, the upgrade could meaningfully cut operational costs while improving output quality. The API documentation is live now on Claude’s developer platform.