DeepSeek Claims Its ‘Reasoning’ Model Beats OpenAI’s o1 On Certain Benchmarks

Chinese AI lab DeepSeek has released an open version of DeepSeek-R1, a reasoning model that the company claims rivals OpenAI’s o1 on several AI benchmarks. DeepSeek-R1 is available on the Hugging Face platform under an MIT license, which permits commercial use without restrictions.

According to DeepSeek, the model outperforms o1 on the AIME, MATH-500, and SWE-bench Verified benchmarks, which cover areas such as solving math word problems and programming tasks. As a reasoning model, R1 effectively checks its own responses, which helps it avoid errors that commonly trip up other AI systems, though this extra reasoning means it typically takes longer to deliver results.

DeepSeek-R1 has 671 billion parameters, significantly more than many other AI models, and a larger parameter count generally goes hand in hand with stronger problem-solving skills. To cater to different use cases, DeepSeek also offers smaller, distilled versions of R1 ranging from 1.5 billion to 70 billion parameters, some of which can run on a laptop. While the full version requires high-end hardware, it is still 90-95% cheaper to use than OpenAI’s o1, making it an attractive option for developers. Hugging Face CEO Clem Delangue shared that R1 has already inspired over 500 derivative models, which have collectively reached 2.5 million downloads, five times the download count of the original R1.
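To give a rough sense of why the distilled versions can fit on a laptop while the full model cannot, the sketch below estimates the memory needed just to hold each model's weights, as parameter count times bytes per parameter. The byte widths (2 bytes for 16-bit, 0.5 bytes for 4-bit) and the function name `weight_memory_gb` are illustrative assumptions, not figures or tooling from DeepSeek, and the estimate ignores activations and other runtime overhead.

```python
# Rough weights-only memory estimate: parameters x bytes per parameter.
# Byte widths below are assumed for illustration, not published by DeepSeek.
BYTES_PER_PARAM = {"16-bit": 2.0, "4-bit": 0.5}

# Parameter counts quoted in the article.
MODELS = {
    "R1 distilled (smallest)": 1.5e9,
    "R1 distilled (largest)": 70e9,
    "R1 (full)": 671e9,
}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        sizes = ", ".join(
            f"{precision}: {weight_memory_gb(params, width):,.1f} GB"
            for precision, width in BYTES_PER_PARAM.items()
        )
        print(f"{name}: {sizes}")
```

Under these assumptions, the 1.5-billion-parameter version needs about 3 GB for its weights at 16-bit precision, while the full 671-billion-parameter model needs roughly 1,342 GB, far beyond what consumer hardware offers.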

Despite its promise, R1 has limitations. As a Chinese-developed model, it is subject to scrutiny by Chinese regulators, who ensure that it adheres to government-mandated values. As a result, it censors politically sensitive topics such as Tiananmen Square or Taiwan’s autonomy. This filtering is in line with other Chinese AI systems, which tend to avoid controversial subjects.

The release of DeepSeek-R1 comes amid growing concern in the U.S. over global competition in AI. The Biden administration is contemplating stricter export rules for AI technologies, particularly targeting Chinese companies. OpenAI has voiced its own concern, with VP of policy Chris Lehane pointing to the rapid progress of Chinese AI labs, including DeepSeek, Alibaba, and Kimi, which are developing models that they claim can rival OpenAI’s offerings.

In a broader context, AI researcher Dean Ball suggests that these Chinese AI labs are establishing themselves as “fast followers,” producing models capable of running locally, without the oversight of external regulatory bodies.
