Daily Digest — 2026-05-25

Sunday, May 24, 2026 · 2 items · model: deepseek/deepseek-chat

2 items · 2 industry media

🏛️ Research Labs

No new items today.

📜 arXiv Papers

No new items today.

📰 Industry Media (2)

Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4’s 33.5%

MarkTechPost · Asif Razzaq · 2026-05-24

Microsoft Research introduces Webwright, a terminal-native web agent framework that replaces traditional action-by-action browser control with programmatic script generation. The system runs a single-agent loop (Runner, Model Endpoint, Environment) in which the model generates Playwright code and bash commands and stores reusable artifacts in a local workspace. Webwright is evaluated on the Online-Mind2Web and Odysseys benchmarks. With GPT-5.4 it scores 60.1% on Odysseys, up from the base model's 33.5% and a reported 35.1% relative improvement over the previous state of the art, at a reported cost of $2.37 per task compared with $6.09 for Claude Opus 4.7. The roughly 1,000-line open-source framework also shows that smaller models such as Qwen3.5-9B can reach a reported 66.2% accuracy when augmented with tool scripts.

webwright · playwright · terminal-native · self-reflection config · odysseys benchmark
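
To make the loop described in this item concrete, here is a minimal sketch of a runner that asks a model for a script, executes it in a local workspace, and feeds the output back. It is not Webwright's code: the names Step, Environment, run_task, and scripted_model are hypothetical, the model endpoint is a hard-coded stub, and the generated scripts are plain Python run with the current interpreter rather than Playwright code or bash commands.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List, Optional


@dataclass
class Step:
    """One model turn: a script to run, or a final answer (hypothetical schema)."""
    script: Optional[str] = None
    filename: str = "step.py"
    answer: Optional[str] = None


class Environment:
    """Runs generated scripts inside a local workspace and keeps them on disk."""

    def __init__(self, workspace: Path):
        self.workspace = workspace
        self.workspace.mkdir(parents=True, exist_ok=True)

    def run(self, step: Step) -> str:
        if step.script is None:
            raise ValueError("step has no script to run")
        path = self.workspace / step.filename
        path.write_text(step.script)  # the file stays behind as a reusable artifact
        result = subprocess.run(
            [sys.executable, str(path)],
            cwd=str(self.workspace), capture_output=True, text=True, timeout=60,
        )
        return (result.stdout + result.stderr).strip()


def run_task(task: str, model: Callable[[List[str]], Step],
             env: Environment, max_steps: int = 5) -> Optional[str]:
    """Runner: alternate model turns and script executions until an answer appears."""
    history = [f"TASK: {task}"]
    for _ in range(max_steps):
        step = model(history)
        if step.answer is not None:
            return step.answer
        history.append(f"OBSERVATION: {env.run(step)}")
    return None  # step budget exhausted


def scripted_model(history: List[str]) -> Step:
    """Stand-in for a model endpoint: emits one script, then answers with its output."""
    if len(history) == 1:
        return Step(script="print(6 * 7)", filename="compute.py")
    return Step(answer=history[-1].split(": ", 1)[1])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_task("compute 6 * 7", scripted_model, Environment(Path(tmp))))
```

The point of the sketch is the shape of the loop: the model's output is a whole script rather than a single click or keystroke, and each script is written into the workspace where a later step could reuse it.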

NVIDIA AI Releases Gated DeltaNet-2: A Linear Attention Layer That Decouples Erase and Write in the Delta Rule

MarkTechPost · Asif Razzaq · 2026-05-24

NVIDIA AI introduces Gated DeltaNet-2, a linear attention layer that decouples the erase and write operations of the delta rule, addressing the bottleneck of editing compressed memory without disrupting existing associations. The layer uses channel-wise erase and write gates; the evaluated model has 1.3B parameters and was trained on 100B FineWeb-Edu tokens. It reportedly outperforms Mamba-2, Gated DeltaNet, KDA, and Mamba-3 across benchmarks, with a recurrent average of 53.11 on language modeling and commonsense reasoning and significant gains on long-context retrieval tasks.

linear attention · delta rule · channel-wise gates · fineweb-edu · recurrent state
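
As a rough illustration of what decoupling erase and write can mean, the sketch below contrasts it with the classic delta rule, where one scalar controls both how much of the old value is removed and how much of the new value is stored. This is not the formulation from the Gated DeltaNet-2 release, which this digest does not give: the update form, the choice to gate per value channel, and the function names read and decoupled_delta_update are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def read(S: Matrix, k: Vector) -> Vector:
    """Retrieve the value the state currently associates with key k (computes S @ k)."""
    return [sum(s_ij * k_j for s_ij, k_j in zip(row, k)) for row in S]


def decoupled_delta_update(S: Matrix, k: Vector, v: Vector,
                           erase: Vector, write: Vector) -> Matrix:
    """Assumed illustrative update: S + ((write * v) - (erase * (S @ k))) k^T.

    erase[i] and write[i] are separate gates in [0, 1] for value channel i.
    Setting erase[i] == write[i] == beta for every i recovers the classic
    delta rule S + beta * (v - S @ k) k^T.
    """
    old = read(S, k)  # value currently stored under key k
    return [
        [S[i][j] + (write[i] * v[i] - erase[i] * old[i]) * k[j]
         for j in range(len(k))]
        for i in range(len(v))
    ]


if __name__ == "__main__":
    k = [1.0, 0.0]  # unit-norm key
    S = [[0.0, 0.0], [0.0, 0.0]]

    # Store v = [3, 5] under k with both gates fully open.
    S = decoupled_delta_update(S, k, [3.0, 5.0], erase=[1.0, 1.0], write=[1.0, 1.0])

    # Overwrite only channel 0; channel 1 keeps its old association.
    S = decoupled_delta_update(S, k, [7.0, 9.0], erase=[1.0, 0.0], write=[1.0, 0.0])
    print(read(S, k))
```

In this toy form, closing both gates on a channel leaves that channel's stored value untouched, while opening only the write gate adds the new value on top of the old one instead of replacing it.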


Generated automatically at 2026-05-24 20:05 UTC. Summaries and keywords are produced by an LLM and may contain inaccuracies — always consult the original article.