Qwen 2.5-Max outperforms the DeepSeek V3 in several benchmarks

Alibaba’s response to Deepseek is the Qwen 2.5-Max, the company’s latest large-scale model of Experts (MOE).

Qwen 2.5-Max boasts fine-tuning through cutting-edge techniques such as pre-deleted 20 trillion tokens and reinforcement learning from monitored fine-tuning (SFT) and human feedback (RLHF).

With the API now available via Alibaba Cloud and models that allow exploration access via Qwen Chat, Chinese technology giants are inviting developers and researchers to see their breakthroughs firsthand.

Out-Performance Peers

Comparing the QWEN 2.5-Max performance with some of the most prominent AI models in various benchmarks, the results are promising.

Evaluations include general ratings such as MMLU-Pro for university-level problem solving, LiveCodebench for coding expertise, live bench for overall functionality, arena hard for assessing models against human preferences. Metrics were included.

According to Alibaba, “Qwen 2.5-Max outperforms the DeepSeek V3 in benchmarks such as Arena-Hard, LiveBench, LiveCodebench and GPQA-Diamond, showing competitive results in other ratings, including the MMLU-Pro.”

(Credit: Alibaba)

Instructional models designed for downstream tasks such as chat and coding are directly competing with major models such as GPT-4O, Claude-3.5-Sonnet, and Deepseek V3. Of these, the Qwen 2.5-Max has managed to outperform its rivals in several key areas.

Comparisons of the base model also provided promising results. Original models such as the GPT-4O and Claude-3.5-Sonnet remained out of reach due to access restrictions, but the Qwen 2.5-Max has a Deepseek V3, Llama-3.1-405B (the largest open weight density model). was evaluated against major public options such as: qwen2.5-72b. Again, the newcomer at Alibaba gave an exceptional performance across the board.

“Our base model shows great advantages across most benchmarks,” Alibaba said.

The Deepseek V3 burst has attracted attention from the AI community as a whole for its large-scale MOE models. At the same time, we are building QWEN2.5-MAX. This is a large MOE LLM trained with curated SFT and RLHF recipes pre-processed with large data. Competitive… pic.twitter.com/ohvl16vfje

– Qwen (@alibaba_qwen) January 28, 2025

Make it accessible to Qwen 2.5-Max

To make the model more accessible to the global community, Alibaba has integrated QWEN 2.5-MAX with the QWen chat platform. Here, users can interact directly with a variety of abilities models. Investigate search capabilities and test your understanding of complex queries.

For developers, the QWen 2.5-Max API is now available through Alibaba Cloud under the model name “QWEN-MAX-2025-01-25”. Interested users can start by registering an Alibaba Cloud account, activating the Model Studio service, and generating an API key.

The API is compatible with Openai’s ecosystem and makes it easy to integrate existing projects and workflows. This compatibility reduces the barriers for people who are keen to use the features of the model to test their applications.

Alibaba has issued a strong intent statement on the Qwen 2.5-Max. The company’s continued commitment to scaling its AI model not only improves performance benchmarks, but also enhances the basic thinking and reasoning capabilities of these systems.

“Scaling data and model sizes not only shows advances in model intelligence, but also reflects an unwavering commitment to pioneering research,” Alibaba said.

Going forward, the team is aiming to push the boundaries of reinforcement learning to promote even more advanced inference skills. This, they say, could allow their models to not only outweigh, but exceed, human intelligence in solving complex problems.

The impact on the industry is profound. As scaling methods improve and Qwen models open new ground, there could be more ripples across the AI-driven fields we’ve seen in recent weeks.

(Photo by Maico Amorim)

See: ChatGpt Gov aims to modernize US government agencies

Want to learn more about AI and big data from industry leaders? Check out the AI & Big Data Expo in Amsterdam, California and London. The comprehensive event will be held in collaboration with other major events, including the Intelligent Automation Conference, Blockx, Digital Transformation Week, and Cyber Security & Cloud Expo.

Check out other upcoming Enterprise Technology events and webinars with TechForge here.

See Full Bio

What's Hot

Microsoft cloud updates support Indonesia’s long-term AI goals

How Picsart’s AI image generator works

Breakthrough in adversarial learning enables real-time AI security

Microsoft cloud updates support Indonesia’s long-term AI goals

Breakthrough in adversarial learning enables real-time AI security

Open Source AI Game Jam Results

ChatGPT group chats can help teams bring AI to their daily planning

Google launches Nano Banana Pro, focused on more reliable AI art generation

AI company Klay Vision signs licensing agreement with major label

Most Popular

ChatGPT group chats can help teams bring AI to their daily planning

Google launches Nano Banana Pro, focused on more reliable AI art generation

AI company Klay Vision signs licensing agreement with major label

Don't Miss

Microsoft cloud updates support Indonesia’s long-term AI goals

How Picsart’s AI image generator works

Breakthrough in adversarial learning enables real-time AI security

Subscribe to Updates

What's Hot

Qwen 2.5-Max outperforms the DeepSeek V3 in several benchmarks

Out-Performance Peers

Make it accessible to Qwen 2.5-Max

Related Posts