The afternoon briefing.
Microsoft pushes into AI super apps, Anthropic faces China bans over Claude Code, and new research reveals AI benchmarks underestimate agent capabilities.
AI Agents & Capabilities. Microsoft is intensifying the AI super app race, planning to merge its consumer and enterprise Copilot apps by August and introduce new "AutoPilot" agents for background tasks. This strategic move aims to streamline user experience and expand AI's utility across various sectors, positioning Microsoft to compete directly with Anthropic and OpenAI. Concurrently, new research from the UK's AI Security Institute reveals that standard benchmarks systematically underestimate AI agent capabilities, with success rates on software engineering tasks jumping significantly when compute budgets are increased.
AI Security & Policy. Anthropic's Claude Code is facing a complex situation in China, with the company attempting to block access while Chinese firms find workarounds through VPNs and overseas subsidiaries. Adding to the concerns, Alibaba has banned its employees from using Claude Code after the discovery of hidden code that could identify Chinese users. This comes as the broader AI landscape sees a dramatic increase in security vulnerability reports, with AI models themselves contributing to a surge in identified bugs and vulnerabilities.
AI Applications & Industry Shifts. Anthropic is also venturing into new domains, launching Claude Science as an AI workbench designed to integrate fragmented tools and datasets, generate figures, and accelerate scientific discovery and drug development. This expansion highlights AI's growing impact beyond traditional software development. Meanwhile, the integration of AI is leading to significant workforce restructuring, as seen with Starling Bank's decision to cut 130 jobs amid an AI push. Google DeepMind is exploring creative applications, announcing a unique research partnership with independent entertainment company A24 to push boundaries in both fields.
Microsoft enters AI super app race with overhauled Copilot and AutoPilot agents
Microsoft plans to merge its consumer and enterprise Copilot apps into a single platform by August, introducing new AI agents called "AutoPilot" to handle background tasks for an extra fee. This move positions Microsoft to compete with Anthropic and OpenAI in the evolving AI super app market.
Security vulnerability reports explode as AI models hunt for bugs
Epoch AI reports a significant increase in security vulnerability reports, with 21 organizations reporting about 1,500 high-severity and critical CVEs in June 2026, a 3.5x jump from previous records. This surge correlates with the introduction of AI-powered bug-hunting programs.
UK's AI Security Institute finds benchmarks underestimate AI agent capabilities
The UK's AI Security Institute found that standard AI evaluations systematically underestimate agent capabilities by capping compute budgets. Success rates on software engineering tasks increased by 25% when token budgets were increased tenfold, suggesting frontier progress is 60% steeper than previously measured.
Claude Code faces bans and spyware concerns in China
Anthropic is attempting to block Chinese companies from accessing Claude Code, but firms like ByteDance are circumventing restrictions. Meanwhile, Alibaba has banned its employees from using the tool after the discovery of hidden code that could identify Chinese users.
Anthropic wants to develop its own drugs with Claude Science
Anthropic has launched Claude Science, an AI workbench for scientists designed to integrate fragmented tools and datasets, generate figures, and accelerate scientific discovery. The company aims to leverage AI's potential to advance healthcare interventions, including drug development.
Apple turns Safari into something AI agents can control
Apple's WebKit team has shipped Safari Technology Preview 247 with a built-in Model Context Protocol (MCP) server, enabling AI agents to control Safari. This development marks a significant step towards deeper integration of AI capabilities within the browser environment.
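For readers unfamiliar with MCP, here is a minimal sketch of the kind of messages an agent would send to such a server. MCP messages are JSON-RPC 2.0 requests, and `tools/list` and `tools/call` are standard MCP methods. The tool name `open_url` and its arguments are hypothetical: the briefing does not say which tools Safari's server exposes, or how an agent connects to it.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry a unique id per request


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object, the envelope MCP uses."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool. "open_url" and its arguments are hypothetical,
# chosen only to illustrate the shape of a tools/call request.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "open_url", "arguments": {"url": "https://example.com"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool))
```

A real client would first perform MCP's `initialize` handshake and then send these messages over whatever transport the server supports.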
Contrastive Decoding Diffing recovers verbatim finetuning data from LLM logits
Researchers have developed Contrastive Decoding Diffing (CDD), a method to recover verbatim finetuning data from narrowly finetuned LLMs using only grey-box logit access, without needing full weight access. This builds on previous work showing detectable traces in activation differences.
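The briefing does not describe how CDD works, so the following is only a generic contrastive decoding sketch, not the researchers' method. It assumes the idea is to prefer tokens whose log-probability rose most between a base model and its finetuned version, restricted to tokens the finetuned model still finds plausible. The function names, the `alpha` cutoff, and the toy logits are all illustrative assumptions.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities (numerically stable)."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def contrastive_pick(finetuned_logits, base_logits, alpha=0.1):
    """Pick the token whose log-probability rose most after finetuning.

    Only tokens with finetuned probability at least alpha times the
    top token's probability are considered (a plausibility cutoff).
    """
    ft = log_softmax(finetuned_logits)
    base = log_softmax(base_logits)
    cutoff = math.log(alpha) + max(ft)
    candidates = [i for i, lp in enumerate(ft) if lp >= cutoff]
    return max(candidates, key=lambda i: ft[i] - base[i])


# Toy 4-token vocabulary: finetuning raised only token 1's logit.
base_logits = [2.5, 0.5, 0.4, -3.0]
finetuned_logits = [2.5, 2.0, 0.4, -3.0]

greedy = max(range(4), key=lambda i: finetuned_logits[i])
contrastive = contrastive_pick(finetuned_logits, base_logits)
print(greedy, contrastive)
```

In this toy case greedy decoding on the finetuned model still picks token 0, while the contrastive score surfaces token 1, the one finetuning actually changed. Note that the sketch needs only logits from both models, not their weights.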
UK parents warned over posting children's images amid AI sexual abuse fears
The National Crime Agency and Internet Watch Foundation have issued guidance warning UK parents against publicly sharing images of their children online. This recommendation comes amidst a rising threat of AI-generated sexual abuse material.
Starling Bank to cut 130 jobs amid AI push
Starling Bank, a prominent digital bank, is reportedly planning to cut 130 jobs as it increases its focus on AI integration. This move reflects a broader trend of automation impacting the financial sector workforce.
Google DeepMind and A24 announce first-of-its-kind research partnership
Google DeepMind has partnered with A24, the independent entertainment company, for a new research collaboration. This marks a unique intersection of advanced AI research and creative industries.
Amazon updated 2023's Fire HD 10 tablet with 4GB of RAM
Amazon has quietly updated its 2023 Fire HD 10 tablet, with the 32GB version now shipping with 4GB of RAM, up from 3GB. This refresh also includes a small price increase from $139.99 to $154.99.
Valve open-sources Steam Machine e-ink screen for DIY projects
Valve has open-sourced the e-ink screen technology used in its Steam Machine, allowing enthusiasts to create their own custom devices. This move could foster innovation and customization within the gaming hardware community.
Startup targets datacenters with 3D-printed nuclear reactor module
A startup is developing a 3D-printed thorium microreactor designed to provide up to 30 MWe of power for datacenters for up to 30 years. This innovative approach aims to offer a long-term, high-capacity energy solution for demanding computing infrastructure.
First known congressional SpaceX stock buys surface after record IPO
The first known purchases of SpaceX stock by members of Congress have emerged following the company's record IPO. These disclosures highlight the deepening ties between Elon Musk's company and federal contracting.