**MENLO PARK, CA —** Meta Platforms officially launched its highly anticipated Llama 4 Scout and Llama 4 Maverick open-weight artificial intelligence models on April 5, 2025, marking a significant advancement in the accessible AI landscape. The new models are Meta's first to use a Mixture-of-Experts (MoE) architecture and are natively multimodal, built from the ground up to process both text and image inputs. The release follows months of anticipation and a reported investment of up to $65 billion in AI infrastructure by Meta for 2025, signaling the company's aggressive push to compete with proprietary offerings from industry giants.
Llama 4 Scout, the smaller of the two models, is designed for efficient long-context processing, with an industry-leading 10-million-token context window, the longest of any open-weight model to date. That capacity lets it handle massive documents, entire codebases, and extensive user histories in a single pass, making it well suited to multi-document summarization and complex code reasoning. With 17 billion active parameters drawn from 16 experts and 109 billion total parameters, Scout can be deployed on a single NVIDIA H100 GPU using Int4 quantization. Initial benchmarks indicate Scout outperforms previous-generation Llama models, including the larger Llama 3.3 70B, as well as competitors such as Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1. Recent vLLM updates reportedly brought a 5.5% accuracy increase and a 60% latency reduction for Scout, albeit with an uptick in hallucination rates.
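The active-versus-total parameter split comes from routing: each token passes through only one routed expert (plus a shared expert) rather than the whole network. A toy sketch of that mechanism, with made-up dimensions and a top-1 router, not Meta's actual implementation:

```python
import random

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def moe_forward(token, router, experts, shared):
    """Toy MoE layer: score the token against each expert, route it to
    the top-1 expert, and add a shared expert's output. Only 2 of the
    len(experts) + 1 weight matrices are touched per token -- the idea
    behind Scout keeping 17B parameters active out of 109B total."""
    scores = matvec(router, token)
    best = max(range(len(scores)), key=scores.__getitem__)
    routed = matvec(experts[best], token)
    common = matvec(shared, token)
    return [a + b for a, b in zip(routed, common)], best

random.seed(0)
d, n_experts = 4, 16                      # toy sizes; Scout has 16 experts
rand_mat = lambda r, c: [[random.gauss(0, 1) for _ in range(c)] for _ in range(r)]
router = rand_mat(n_experts, d)           # one routing row per expert
experts = [rand_mat(d, d) for _ in range(n_experts)]
shared = rand_mat(d, d)

out, chosen = moe_forward([1.0, -0.5, 0.2, 0.0], router, experts, shared)
```

Real deployments batch this routing across thousands of tokens, but the per-token sparsity is the same: compute scales with active parameters, while capacity scales with total parameters.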
Scaling up the MoE approach, Llama 4 Maverick keeps the same 17 billion active parameters but distributes computation across 128 experts, for 400 billion total parameters. Positioned as a high-performance general-purpose model, Maverick is pitched as a direct competitor to leading proprietary models such as OpenAI's GPT-4o and Google's Gemini 2.0 Flash; Meta claims it outperforms both across a broad range of benchmarks and cites an Elo rating of 1417 on LMArena, a score achieved by an experimental chat-optimized variant. It supports a 1-million-token context window and is particularly strong at conversational, creative, and complex reasoning tasks. On the DevQualityEval v1.0 coding benchmark, Maverick scored a perfect 100% on code repair and showed significant gains in Ruby test generation. Independent testing, however, has found its generalist performance weaker than advertised.
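The trade-off between the two models is easiest to see as arithmetic: both spend the same compute per token, but Maverick holds far more total capacity behind its router. A quick comparison using the published figures:

```python
# Published parameter counts for the two Llama 4 releases.
models = {
    "Llama 4 Scout":    {"active_b": 17, "total_b": 109, "experts": 16},
    "Llama 4 Maverick": {"active_b": 17, "total_b": 400, "experts": 128},
}

for name, m in models.items():
    frac = m["active_b"] / m["total_b"]   # share of weights used per token
    print(f"{name}: {m['experts']} experts, {frac:.1%} of weights active per token")
```

Scout activates roughly 15.6% of its weights per token, Maverick only about 4.3%, which is why Maverick can match the per-token inference cost of a much smaller dense model while drawing on a 400-billion-parameter pool of experts.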
Both Llama 4 Scout and Maverick are available as open-weight downloads on Hugging Face and llama.com under the Llama 4 Community License Agreement. The release underscores Meta's commitment to democratizing AI technology and providing developers with powerful, American-made open options, a strategy championed by Alexandr Wang, who heads Meta's "superintelligence" team. At the same time, Meta is reportedly adopting a "hybrid model" for its open-source strategy, intending to keep certain larger models or components proprietary for security and competitive reasons, a cautious shift in its approach. The Llama 4 Community License also carries a notable restriction: organizations whose products exceed 700 million monthly active users must obtain a separate commercial license from Meta, and EU-based developers face restrictions tied to regional AI regulations.
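For teams deciding whether the community license covers them, the headline gate reduces to a single threshold check. A simplified reading, not legal advice, and no substitute for the actual license text:

```python
def needs_separate_license(monthly_active_users: int) -> bool:
    """Simplified reading of the Llama 4 Community License's scale
    clause: products or services with more than 700 million monthly
    active users must request a separate commercial license from Meta.
    (Illustrative only -- the real clause has additional conditions,
    e.g. the MAU count is measured as of the model's release date.)"""
    return monthly_active_users > 700_000_000
```

A startup with 10 million MAU falls under the community license; a platform the size of a major social network does not, which in practice fences the largest of Meta's competitors out of free use.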
The developer community has largely welcomed the new models, especially the architectural innovations and the "early fusion" multimodal design, which gives a single backbone unified understanding of text and images. The MoE design delivers strong performance with efficient resource utilization, making inference cheaper and allowing deployment on standard GPU hardware. Although Meta CTO Andrew Bosworth publicly acknowledged some disappointment with aspects of Llama 4's focus and performance after launch, Scout and Maverick remain viable open-weight options, particularly for use cases requiring data sovereignty or high-volume, cost-minimized deployments. The gap between open and closed models continues to narrow, with Llama 4 Maverick performing within one to two points of frontier closed models on coding benchmarks.
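"Early fusion" means image information enters the model as ordinary sequence positions alongside text tokens, rather than being bolted on through a separate late-stage adapter. A toy preprocessor illustrating the idea (the `<image>` placeholder convention and the stand-in embedding function are illustrative assumptions, not Meta's API):

```python
def fuse(text_tokens, patch_embeddings, embed):
    """Toy early-fusion step: build ONE unified sequence of vectors by
    splicing pre-projected image-patch embeddings in where an '<image>'
    placeholder appears among the text tokens. The single transformer
    backbone then attends over text and vision positions jointly."""
    seq = []
    for tok in text_tokens:
        if tok == "<image>":
            seq.extend(patch_embeddings)   # image patches become positions
        else:
            seq.append(embed(tok))         # ordinary text embedding
    return seq

embed = lambda tok: [float(len(tok))]      # stand-in for a real embedding table
patches = [[0.1], [0.2], [0.3]]            # three fake pre-projected patches
seq = fuse(["Describe", "<image>", "please"], patches, embed)
```

Because fusion happens before any transformer layer, every layer of the model sees both modalities, which is what distinguishes this design from late-fusion systems that merge a frozen vision encoder's summary only near the output.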
Looking ahead, Meta has previewed Llama 4 Behemoth, a powerful "teacher model" with roughly 2 trillion total parameters, which remains in training. Meta reports that Behemoth already outperforms GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro on STEM benchmarks, and it served as the knowledge source from which Scout and Maverick were distilled. No public release date for Behemoth has been announced, but more details are anticipated at Meta's LlamaCon event on April 29, 2025. The ongoing evolution of the Llama family, coupled with Meta's strategic investments and a more nuanced open-source posture, suggests a continued push to shape the future of accessible, powerful AI amid fierce competition in a rapidly advancing field.
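Distillation, the mechanism by which a teacher like Behemoth transfers knowledge to Scout and Maverick, trains the student on the teacher's softened output distribution instead of hard labels alone. Meta's exact recipe is not public; this is the textbook form of the objective:

```python
import math

def softmax(logits, temp=1.0):
    """Temperature-scaled softmax; higher temp softens the distribution."""
    exps = [math.exp(x / temp) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temp=2.0):
    """Cross-entropy between the teacher's temperature-softened output
    distribution and the student's. Minimizing this pushes the student
    to mimic the teacher's full distribution over tokens, not just its
    top prediction -- the general mechanism behind teacher models like
    Behemoth supervising smaller ones (illustrative, not Meta's recipe)."""
    p_teacher = softmax(teacher_logits, temp)
    p_student = softmax(student_logits, temp)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))

matched = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

By Gibbs' inequality the loss is minimized exactly when the student reproduces the teacher's distribution, so a student that agrees with its teacher always scores lower than one that disagrees.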
