Anthropic Sues the Pentagon, Meta's AI Flops, xAI Falls Apart

Week in Review | The week Anthropic sued the Pentagon and launched a research institute in the same breath, Meta's next-gen model flopped before it shipped, nine of eleven xAI co-founders were gone, and two rival labs each raised a billion dollars to build world models.

Watch our video: on Youtube or Rumble

The Big Story: Anthropic's Triple Play

Anthropic had the most consequential week of any AI company in 2026 so far — three major moves in four days that together redraw the company's position in the industry.

The lawsuit. On March 9, Anthropic filed two federal lawsuits against the Department of Defense, challenging the "supply chain risk" designation that Defense Secretary Pete Hegseth imposed after the company refused to grant unrestricted military access to Claude. In a 48-page filing in the Northern District of California, Anthropic argued the designation is "unprecedented and unlawful" — retaliation for protected speech, not a legitimate security finding. The supply-chain-risk label, normally reserved for firms tied to foreign adversaries, blocks Anthropic from doing business with any Pentagon contractor. The first court hearing is set for March 24.

The institute. Two days later, Anthropic launched the Anthropic Institute, a dedicated research arm led by co-founder Jack Clark. The institute merges three existing teams — Frontier Red Team, Societal Impacts, and Economic Research — into an interdisciplinary group of ML engineers, economists, and social scientists. Notable hires include Matt Botvinick (formerly Google DeepMind, now leading AI-and-rule-of-law research) and Anton Korinek (University of Virginia, leading economic transformation research). Anthropic is also opening a D.C. policy office this spring.

The partner network. On March 12, Anthropic committed $100 million to launch the Claude Partner Network — free to join, with funds going to training, sales enablement, and co-marketing for organizations deploying Claude in enterprises. The company is scaling its partner-facing team fivefold and introduced the Claude Certified Architect credential for solution architects. More than 25% of the Fortune 500 are already using Promptfoo (now OpenAI's — more on that below), so the partner play is clearly aimed at locking in the enterprise channel before competitors do.

Why it matters for practitioners: The lawsuit sets a precedent that will define how every AI company negotiates government contracts going forward. The institute signals Anthropic is investing in understanding downstream effects of its own technology — including recursive self-improvement, which is directly relevant to our W11 tutorial work. And the partner network means Claude integrations are about to get significantly more support in enterprise environments. If you're building on Claude's API, the certified architect program is worth watching.

Meta's Avocado: $135 Billion and Nothing to Show (Yet)

Meta delayed the launch of its next-generation AI model, codenamed Avocado, pushing it from March to at least May after internal testing revealed it underperforms the competition.

The numbers are embarrassing for a company spending $115–135 billion on AI infrastructure this year. Avocado performs between Google's Gemini 2.5 and Gemini 3.0 — meaning it trails both Gemini 3.0 and Anthropic's Claude on logical reasoning, coding, and writing tasks. That's not frontier; that's last quarter.

The most telling detail: Meta's leadership discussed temporarily licensing Google's Gemini to power certain Meta products while Avocado catches up. The company that open-sourced Llama as a competitive weapon against closed-model providers is now considering renting a rival's model to fill the gap.

Meta stock fell over 3% on the news, settling around $618.

Meta has aggressively poached talent across the industry — reportedly spending billions on researcher salaries — but the Avocado delay suggests that hiring alone doesn't produce frontier models. Training infrastructure, data quality, and institutional knowledge matter at least as much as headcount.

Why it matters for practitioners: If you've been building on Llama models and counting on Meta to keep pace with the frontier, the Avocado delay is a yellow flag. The open-source ecosystem Meta has cultivated remains strong (Llama 4 and the existing Qwen 3.5 series are excellent), but the gap between Meta's open-source releases and the true frontier appears to be widening, not closing. Plan accordingly if you're making architectural bets on Meta's model roadmap.

xAI Implodes: 9 of 11 Co-Founders Gone, Musk Starts Over

The slow-motion collapse at xAI accelerated this week. Nine of the company's eleven original co-founders have now departed, with six leaving since January alone — Toby Pohlen, Jimmy Ba, Tony Wu, Greg Yang, Zihang Dai, and Guodong Zhang. Only Manuel Kroiss and Ross Nordeen remain from the founding team.

Musk's public response was characteristically blunt: "xAI was not built right first time around, so is being rebuilt from the foundations up." He ordered sweeping layoffs focused on the coding division after expressing frustration with Grok's inability to compete with Claude Code and OpenAI Codex.

The rebuild strategy: poach from the company that's winning. xAI hired Andrew Milich and Jason Ginsberg from Cursor, the AI coding startup that hit $2 billion in annualized revenue. Both report directly to Musk, with a mandate to rebuild Grok's coding capabilities from scratch.

This follows Musk's prediction from an all-hands meeting two weeks ago that AI will bypass traditional coding entirely by the end of 2026. The irony is hard to miss: the man betting that coding is dead is spending aggressively to build a coding product.

Why it matters for practitioners: The xAI story is a cautionary tale about organizational stability in AI companies. Losing 82% of your founding team in under three years — including key researchers behind architectures like Tensor Programs (Greg Yang) and training optimization — means institutional knowledge walks out the door. If you're evaluating Grok for production use, the leadership churn is a reliability risk independent of the model's technical capabilities.

World Models: Two Billion-Dollar Bets, Two Visions

The world models space just became a two-horse race between two of the most influential figures in computer vision.

AMI Labs — founded by Turing Award winner Yann LeCun after leaving Meta — raised $1.03 billion at a $3.5 billion pre-money valuation, the largest seed round in European startup history. The round drew heavyweight backers: Bezos Expeditions, Eric Schmidt, Mark Cuban, and Xavier Niel, alongside institutional funds. The leadership team includes former Meta VP Laurent Solly as COO and researcher Saining Xie as chief science officer.

AMI's thesis is that LLMs are fundamentally the wrong architecture for understanding physical reality. LeCun has argued for years that autoregressive text prediction cannot produce genuine world understanding. AMI is building on his Joint Embedding Predictive Architecture (JEPA), targeting industrial, robotic, and healthcare applications where hallucinating physical outcomes has real consequences.

World Labs — founded by Fei-Fei Li, the Stanford professor who created ImageNet — raised $1 billion in February, including a $200 million strategic investment from Autodesk, at a reported $5 billion valuation. World Labs shipped Marble, its generative world model product, in November — already producing persistent 3D environments from text, images, and video.

The contrast is instructive. AMI is research-first: build the right architecture (JEPA), then find applications. World Labs is product-first: ship a world model into creative workflows (Autodesk integration), then deepen the research. Both have a billion dollars, legendary founders, and a conviction that the next frontier of AI isn't language — it's physics.

Why it matters for practitioners: World models are directly relevant to our research focus (W13–W15 rotation). The AMI vs. World Labs split mirrors a fundamental architectural debate: embedding-based prediction (JEPA) vs. generative modeling (diffusion/autoregressive). If you're working in robotics, simulation, or 3D content, both approaches are worth tracking — the right architecture may depend on whether your application needs prediction or generation.

OpenAI Acquires Promptfoo: Agent Security Gets Serious

OpenAI acquired Promptfoo, the open-source AI security platform used by over 350,000 developers and 25% of the Fortune 500, on March 9.

Promptfoo specializes in red-teaming and vulnerability detection for AI systems during development — exactly the kind of tooling that becomes critical as AI agents take autonomous actions in production. The technology will be integrated into OpenAI Frontier, the company's platform for building agentic AI workflows.

The acquisition is relatively small (Promptfoo was valued at $86 million after its Series A), but the signal is significant. As agents gain the ability to execute code, make API calls, and take real-world actions, the attack surface expands dramatically. OpenAI is betting that security tooling needs to be built into the development platform, not bolted on after deployment.

Promptfoo's team confirmed the product will remain open source.

Why it matters for practitioners: If you're using Promptfoo for LLM eval and red-teaming (and many teams are), the tool isn't going away. But the long-term integration into OpenAI Frontier suggests that the most advanced security features may eventually become platform-specific. Consider whether your security testing pipeline should remain vendor-neutral.

Open-Source Video Generation Goes 4K

Two open-source video models dropped this week that redefine what's possible without a cloud API.

LTX 2.3 from Lightricks is a 22-billion-parameter Diffusion Transformer that generates synchronized video and audio in a single pass — native 4K at 50 FPS, up to 20 seconds per clip, with portrait-mode support at 1080×1920. The rebuilt VAE delivers sharper textures and edge detail, a new gated attention text connector improves prompt adherence, and the companion desktop editor now open-sourced on GitHub runs the entire model locally on consumer hardware. This is production-grade video generation running on your laptop.

Helios, a 14B autoregressive diffusion model from Peking University, ByteDance, and Canva, takes a different approach: real-time generation. It produces video at 19.5 FPS on a single H100, supports text-to-video, image-to-video, and video-to-video workflows, and handles clips up to 60 seconds. Released under Apache 2.0.

Also notable: HunyuanVideo WorldPlay from Tencent released its RL post-training code on March 8, enabling community training of interactive world models that run at 24 FPS — a direct bridge between video generation and the world models work we're covering in W13.

Why it matters for practitioners: The architectural split — diffusion transformer (LTX) vs. autoregressive diffusion (Helios) — mirrors the text-generation debate between diffusion and autoregressive approaches. If you're building video-generation pipelines, both are worth benchmarking. The HunyuanVideo RL code is particularly interesting for anyone exploring video-based world models.

By the Numbers

$1.03 billion — AMI Labs' seed round, the largest in European startup history. LeCun's bet that JEPA beats autoregressive prediction for physical world modeling.
$1 billion — World Labs' latest raise, with $200M from Autodesk alone. Fei-Fei Li's generative world model is already shipping product.
9 of 11 — xAI co-founders who have departed. Only Kroiss and Nordeen remain from the founding team.
$115–135 billion — Meta's 2026 AI infrastructure spend, the largest in the industry. The Avocado delay raises the question: what is that money producing?
$100 million — Anthropic's initial commitment to the Claude Partner Network. Enterprise channel war is on.
350,000+ — Developers using Promptfoo for AI security testing before OpenAI acquired it.
22 billion — Parameters in LTX 2.3, generating synchronized 4K video and audio in a single forward pass.
19.5 FPS — Helios' real-time video generation speed on a single H100. Apache 2.0 licensed.

What to Watch Next Week

NVIDIA GTC 2026 keynote (March 16) — Jensen Huang unveils Rubin architecture, NemoClaw agent framework, and the next generation of inference hardware. This will dominate next week's news.
Anthropic v. Pentagon first hearing (March 24) — The court date that could reshape government AI procurement. Watch for any pre-hearing motions or settlement signals.
DeepSeek V4 independent benchmarks — The model has been trickling out, but community reproduction of the claimed results is still pending. If the coding benchmarks hold, the open-source landscape shifts again.
Meta Avocado recovery timeline — May is the new target, but internal pressure to ship could push it earlier. Any leaked benchmarks will be closely watched.
Apple iOS 26.4 and the Gemini-powered Siri — The reimagined Siri is reportedly weeks away. Whether it can compete with Google's on-device agents remains to be seen.

Stay connected:

📧 Subscribe to our newsletter for updates
📺 Watch our YouTube channel for AI news and tutorials
🐦 Follow us on Twitter for quick updates
🎥 Check us on Rumble for video content