OpenAI's New Model on Par With a PhD? I Had a Tsinghua PhD Review It: Wake Up

In a surprising move, OpenAI has quietly released its long-anticipated new model, OpenAI o1, without any prior announcement. The release has drawn mixed reactions from users, especially since the company had previously teased the model with seemingly unrelated images, prompting speculation about its features.

OpenAI o1 is being hailed as the most advanced and consistent model OpenAI has developed to date. While the company has not made grand proclamations about its capabilities, the initial results it shared indicate significant improvements over its predecessor, GPT-4o. For instance, the model reportedly shows gains of nearly eightfold on some benchmarks, including international mathematics competition problems and doctoral-level science questions.

In head-to-head tests, the new model outperformed human experts, scoring 78 against the average score of 69.7 for PhD holders on complex scientific questions. This has led to a flurry of online discussions, with netizens dubbing it a “new god” in the realm of artificial intelligence, celebrating its impressive performance with accolades like “incredible” and “the closest thing to human reasoning.”

However, the capabilities of OpenAI o1 come at a cost. The preview version is priced at $15 per million input tokens and $60 per million output tokens, making it a significant investment for users seeking advanced AI interaction. In its current form, users are limited to a modest number of queries each week: 30 on the preview version and 50 on the mini version.
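To make the pricing concrete, here is a minimal sketch of the arithmetic, assuming the quoted prices apply per million tokens. The function name and the request sizes are hypothetical and chosen only for illustration; they do not come from OpenAI.

```python
# Prices quoted in the article for the o1 preview version,
# assumed here to be USD per one million tokens.
INPUT_PRICE_PER_MILLION = 15.0
OUTPUT_PRICE_PER_MILLION = 60.0


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


# Hypothetical request: a 2,000-token question and a 10,000-token answer.
cost = estimate_cost(2_000, 10_000)
print(f"${cost:.2f}")  # prints "$0.63"
```

At these rates, output is four times as expensive as input, so a long answer dominates the bill even when the question is short.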

To test the model’s claims, several experts were enlisted to assess its performance. Dr. Cui, a solid-state physics PhD candidate from Nanjing University, gave a notably high evaluation, scoring the model between 60 and 80 on difficult technical questions. Asked how to overcome white noise in long-distance entangled photon distribution, OpenAI o1 generated ten feasible solutions in about nine seconds.

Yet, when deeper, more intricate questions were posed, concerns arose regarding the model’s limitations. While its answers were deemed generally satisfactory, there were instances where the AI struggled with creativity and practical application, particularly in areas requiring in-depth knowledge like chemistry and biology.

Dr. Xin, who studies biology at Tsinghua University, pointed out a significant flaw: the model fabricated citations in its references, despite its surface-level improvements in comprehension. Dr. K from Peking University echoed this view, asserting that while the AI demonstrated a respectable level of understanding, it fell short in providing innovative insights.

The consensus among the experts appears to be that while OpenAI o1 represents a step forward, it has yet to achieve the creative problem-solving skills and nuanced understanding that human experts possess. OpenAI researchers, including Noam Brown, have indicated that future iterations could involve even more advanced reasoning capabilities, potentially allowing the model to engage in prolonged thought processes for complex tasks.

As the field of artificial intelligence continues to evolve, the introduction of OpenAI o1 may signal a significant leap forward in how machines process and analyze information. However, the journey toward true artificial general intelligence (AGI) remains ongoing, and experts encourage a cautious view of the model's current capabilities.