How Moonshot AI beat GPT-5 and Claude at a fraction of the cost

Chinese AI startup Moonshot has upended expectations in artificial intelligence development after its Kimi K2 Thinking model outperformed OpenAI’s GPT-5 and Anthropic’s Claude Sonnet 4.5 on multiple performance benchmarks, sparking a new debate about whether U.S. AI dominance is being challenged by cost-effective Chinese innovation.

Beijing-based Moonshot AI (valued at $3.3 billion), backed by tech giants Alibaba Group Holding and Tencent Holdings, released its open-source Kimi K2 Thinking model on November 6, achieving what industry insiders are calling a new “DeepSeek moment.” The phrase refers to Hangzhou-based startup DeepSeek’s earlier upending of AI cost assumptions.

🚀 Hello, Kimi K2 Thinking!
Introducing an open source thinking agent model.

🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Performs up to 200-300 consecutive tool calls without human intervention
🔹 Great for reasoning, agentic search, and coding
🔹 256K Context Window

Built… pic.twitter.com/lZCNBIgbV2

— kimi.ai (@Kimi_Moonshot) November 6, 2025

Benchmark results challenge US models

According to the company’s GitHub blog post, Kimi K2 Thinking scored 44.9% on Humanity’s Last Exam, a large language model benchmark consisting of 2,500 questions across a wide range of subjects, beating GPT-5’s 41.7%.

The model also scored 60.2% on the BrowseComp benchmark, which assesses the web browsing proficiency and information-seeking persistence of large language model agents, and led with 56.3% on the Seal-0 benchmark, which is designed to challenge search-augmented models on real-world research queries.

VentureBeat reported that a fully open-weight release meeting or exceeding GPT-5’s scores marks a tipping point in high-end reasoning and coding, where the gap between closed frontier systems and publicly available models has essentially collapsed.

Kimi K2 Thinking is a new major open-weight model. Although it has particular strengths in agentic contexts, it is highly verbose and generates more tokens than any other model when completing Intelligence Index evaluations.

— ArtificialAnlys (@ArtificialAnlys) November 7, 2025

Questions arise about cost efficiency

The model drew attention after CNBC reported that its training cost was only US$4.6 million, although Moonshot AI would not comment on the figure. According to South China Morning Post calculations, Kimi K2 Thinking’s application programming interface was six to 10 times cheaper than those of models from OpenAI and Anthropic.
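To make the price gap concrete, here is a back-of-the-envelope sketch. It assumes the per-token prices quoted in Deedy Das’s post cited in this article ($0.6 per million input tokens, $2.5 per million output tokens). The workload size and the function name are hypothetical, and the rival figures simply apply the six-to-10-times range rather than any published price list.

```python
# Prices per million tokens, as quoted in the Deedy Das post cited in this article.
KIMI_INPUT_USD_PER_M = 0.6
KIMI_OUTPUT_USD_PER_M = 2.5


def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_price: float = KIMI_INPUT_USD_PER_M,
                 output_price: float = KIMI_OUTPUT_USD_PER_M) -> float:
    """Return the API bill in US dollars for a given token volume."""
    input_cost = (input_tokens / 1_000_000) * input_price
    output_cost = (output_tokens / 1_000_000) * output_price
    return input_cost + output_cost


# Hypothetical workload: 10 million input tokens and 2 million output tokens.
kimi_cost = api_cost_usd(10_000_000, 2_000_000)

# Assumption: a rival priced six to 10 times higher, per the reported range.
rival_low = kimi_cost * 6
rival_high = kimi_cost * 10

print(f"Kimi K2 Thinking: ${kimi_cost:.2f}")
print(f"Rival at 6x to 10x: ${rival_low:.2f} to ${rival_high:.2f}")
```

Under these assumptions the same workload costs about $11 on Kimi K2 Thinking and between $66 and $110 on a model priced at the upper end of the range.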

The model uses a mixture-of-experts architecture with a total of 1 trillion parameters, 32 billion of which are activated for each inference. It was trained using INT4 quantization, achieving an approximately twofold increase in generation speed while maintaining state-of-the-art performance.
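The arithmetic behind those figures can be sketched as follows. The parameter counts come from the article. The assumption that every weight is stored at exactly 4 bits (or 16 bits for the comparison) is a simplification for illustration, not a description of Moonshot AI’s actual checkpoint layout, and the function name is hypothetical.

```python
TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion parameters in total (from the article)
ACTIVE_PARAMS = 32_000_000_000     # 32 billion activated per inference (from the article)
BITS_PER_BYTE = 8


def weight_footprint_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes).

    Assumes every parameter is stored at the same bit width, which is a
    simplification made for this sketch.
    """
    return num_params * bits_per_param / BITS_PER_BYTE / 1e9


# Share of the model that does work on any single inference.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Storage needed for all weights at 4-bit versus 16-bit precision.
int4_gb = weight_footprint_gb(TOTAL_PARAMS, 4)
bf16_gb = weight_footprint_gb(TOTAL_PARAMS, 16)

print(f"Active share per inference: {active_fraction:.1%}")
print(f"Weights at 4 bits:  {int4_gb:,.0f} GB")
print(f"Weights at 16 bits: {bf16_gb:,.0f} GB")
```

In this simplified picture only about 3.2% of the parameters are used per inference, and 4-bit storage needs roughly a quarter of the memory that 16-bit storage would.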

Thomas Wolf, co-founder of Hugging Face, commented on X that Kimi K2 Thinking is another case of an open-source model overtaking closed-source ones, asking, “Is this another DeepSeek moment? Should we now expect it every few months?”

Technical capabilities and limitations

Moonshot AI researchers say Kimi K2 Thinking has set “new records across reasoning, coding, and benchmarks assessing agent capabilities.” The model can perform up to 200-300 consecutive tool calls without human intervention and reasons coherently over hundreds of steps to solve complex problems.
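What “consecutive tool calls without human intervention” means in practice can be illustrated with a generic agent loop. This is a minimal sketch, not Moonshot AI’s implementation: the `run_agent` function, the `decide` callback that stands in for the model, and the `echo` tool are all hypothetical, and the 300-call cap simply mirrors the upper figure quoted above.

```python
from typing import Callable, Dict, List, Optional, Tuple

MAX_TOOL_CALLS = 300  # mirrors the upper bound quoted in the article

# A step is ("call", tool_name, argument) or ("final", answer, "").
Step = Tuple[str, str, str]


def run_agent(decide: Callable[[List[str]], Step],
              tools: Dict[str, Callable[[str], str]],
              max_calls: int = MAX_TOOL_CALLS) -> Tuple[Optional[str], int]:
    """Let the model pick tools repeatedly until it answers or the budget runs out."""
    history: List[str] = []
    calls = 0
    while calls < max_calls:
        kind, name, arg = decide(history)
        if kind == "final":
            return name, calls          # here `name` carries the final answer
        result = tools[name](arg)       # run the chosen tool with no human in the loop
        history.append(f"{name}({arg}) -> {result}")
        calls += 1
    return None, calls                  # budget exhausted without a final answer


def stub_decide(history: List[str]) -> Step:
    """Hypothetical stand-in for the model: call a tool three times, then answer."""
    if len(history) < 3:
        return ("call", "echo", f"step {len(history) + 1}")
    return ("final", "done after 3 tool calls", "")


answer, used = run_agent(stub_decide, {"echo": lambda text: text.upper()})
print(answer, used)
```

The cap matters because each tool result is fed back into the next decision, so a model that loses the thread would otherwise loop indefinitely.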

In independent testing by consultancy Artificial Analysis, Kimi K2 Thinking topped the Tau-2 Bench Telecom agent benchmark with 93% accuracy, which the firm said was the highest score it had independently measured.

Nathan Lambert, a researcher at the Allen Institute for AI, acknowledged that Chinese labs have closed much of the gap and performed very well on key benchmarks, but suggested that there is still a lag of about four to six months in raw performance between the best closed and open models.

Market impacts and competitive pressures

Zhang Ruiwang, a Beijing-based information technology system architect, said Chinese companies tend to keep costs down. “The overall performance of Chinese models still lags behind top American models, so they need to compete in the cost-effectiveness area to find a way out.”

Zhang Yi, principal analyst at consulting firm iiMedia, said the cost of training AI models in China has seen a “cliff-like decline” due to innovations in model architecture and training technology and the use of high-quality training data, marking a shift away from the early reliance on vast computing resources.

The model was released under a modified MIT License, which grants full commercial and derivative rights with one restriction: deployers that serve more than 100 million monthly active users or generate more than $20 million in monthly revenue must prominently display “Kimi K2” on their product’s user interface.
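The attribution rule reduces to a simple either-or check. The sketch below encodes only the two thresholds as this article states them. The function name is hypothetical, and the license text itself, not this snippet, governs how the terms are actually defined and measured.

```python
MAU_THRESHOLD = 100_000_000                # monthly active users (from the article)
MONTHLY_REVENUE_THRESHOLD_USD = 20_000_000  # monthly revenue in US dollars (from the article)


def must_display_kimi_k2(monthly_active_users: int,
                         monthly_revenue_usd: float) -> bool:
    """Return True if either threshold described in the article is exceeded."""
    return (monthly_active_users > MAU_THRESHOLD
            or monthly_revenue_usd > MONTHLY_REVENUE_THRESHOLD_USD)


# A small deployment falls under both thresholds.
print(must_display_kimi_k2(2_000_000, 500_000))      # False
# Exceeding either threshold alone is enough to trigger the requirement.
print(must_display_kimi_k2(150_000_000, 500_000))    # True
print(must_display_kimi_k2(2_000_000, 25_000_000))   # True
```

Because the two conditions are joined by “or”, a product with modest revenue but a very large user base would still need to show the “Kimi K2” name.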

Industry reaction and future outlook

“Today is a turning point in AI. China’s open source model is number one. A defining moment in AI,” wrote Deedy Das, a partner at early-stage venture capital firm Menlo Ventures, in a post on X.

🚨 Today is a tipping point for AI. China’s open source model is number one.

Kimi K2 Thinking scored 51% on Humanity’s Last Exam, higher than GPT-5 and all other models. $0.6/M input, $2.5/M output.

Best at writing, and runs at 15 tps on two Mac M3 Ultras.

A breakthrough moment in AI.

Try it… pic.twitter.com/fmxlxpCGbE

— Deedy (@deedydas) November 7, 2025

Nathan Lambert wrote in a Substack article that the success of Chinese open source AI developers like Moonshot AI and DeepSeek shows how they have “made closed labs sweat,” adding, “There are serious pricing pressures and expectations that [US developers] have to deal with.”

This release positions Moonshot AI alongside other Chinese AI companies such as DeepSeek, Qwen, and Baichuan, which are increasingly challenging the narrative of American AI dominance through cost-effective innovation and open source development strategies.

Whether this represents a sustainable competitive advantage or a temporary convergence of capabilities remains to be seen as US and Chinese companies continue to evolve their respective models.


(Photo provided by Moonshot AI)
