ChatGPT o1-preview Crushes PhD-Level Physics: Has AI Mastered Advanced Problem-Solving?

Written by Ryan Gibson

OpenAI’s release of the o1-preview model on September 12, 2024, has sparked a fresh wave of speculation and excitement in the AI community. Marketed as having the reasoning capabilities of a PhD student across challenging domains like physics, chemistry, and biology, the model promises to tackle advanced problems with unprecedented accuracy.

But the real question remains: Can it truly solve PhD-level problems? One physicist put this claim to the test using perhaps the most notorious textbook in the field—Jackson’s Classical Electrodynamics.

The Challenge: Can AI Tackle Jackson’s Infamous Problems?

Kyle Kabasares, a seasoned physicist, decided to put the new ChatGPT model through its paces with a set of problems from Jackson’s Classical Electrodynamics—a textbook infamous for its complex, unforgiving problems. “Anyone who has studied physics at an advanced level, either for their master’s or PhD, has heard about this legendary book. The problems are hard, and the material isn’t very explanatory. It kind of assumes you just know everything,” Kabasares remarked. The challenge was clear: Could ChatGPT o1-preview handle these problems as efficiently as a human PhD student?

Kabasares selected three problems from different sections of the book, ranging from relatively straightforward calculations to more complex, multi-step derivations. The goal was not only to see whether the model could arrive at the correct solutions but also to assess its approach to these notoriously difficult problems.

“I wanted to see if it could reason through the steps like a graduate student would,” Kabasares said. “The model is designed to take its time and ‘think’ before answering, which was what intrigued me the most.”

A Close Look at the Process: Tackling the First Problem

The first problem Kabasares selected was from the early chapters of Jackson’s textbook, involving the potential inside a hollow, conducting cylinder. After inputting the problem into ChatGPT o1-preview, Kabasares sat back and watched the AI work through it. “Setting up the problem, mapping solutions, reflecting on symmetry—its thought process was remarkably human-like,” he said. The model calculated Fourier series coefficients and analyzed boundary conditions in real time, making adjustments and backtracking when necessary.
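The article does not reproduce the exact problem statement, but a representative sketch shows how separation of variables and Fourier coefficients typically enter this kind of cylinder problem in Jackson. The radius b and boundary potential V(φ) below are illustrative labels, not values from the specific exercise:

```latex
% Illustrative sketch only -- not necessarily the exact Jackson problem used.
% For a long hollow cylinder of radius b held at a prescribed surface potential
% V(\phi), Laplace's equation inside separates in polar coordinates:
\nabla^2 \Phi = 0, \qquad
\Phi(\rho,\phi) = a_0 + \sum_{n=1}^{\infty} \rho^{\,n}
  \bigl(a_n \cos n\phi + b_n \sin n\phi\bigr), \qquad \rho < b .

% Matching the boundary condition \Phi(b,\phi) = V(\phi) fixes the Fourier
% coefficients -- the "Fourier series coefficients" the model was computing:
a_0 = \frac{1}{2\pi} \int_0^{2\pi} V(\phi)\, d\phi, \qquad
a_n = \frac{1}{\pi b^{\,n}} \int_0^{2\pi} V(\phi)\cos n\phi \, d\phi, \qquad
b_n = \frac{1}{\pi b^{\,n}} \int_0^{2\pi} V(\phi)\sin n\phi \, d\phi .
```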

“There was this moment where it seemed like it got stuck and then backtracked, almost like a grad student who realizes halfway through that they’re missing something fundamental. Then, it reapproached the problem, corrected itself, and reached the right answer.”

The result? ChatGPT o1-preview solved the first problem in just over two minutes. “That’s unheard of,” Kabasares noted. “The average grad student takes about a week and a half to solve a Jackson problem. This thing did it in 122 seconds.”

While Kabasares was impressed by the speed, he was particularly struck by the AI’s ability to break down the problem methodically, identifying key mathematical tools like separation of variables and applying them appropriately. “It wasn’t just spitting out an answer,” he said. “It was thinking.”

Escalating the Difficulty: Can the AI Handle Magnetism?

With the first challenge complete, Kabasares moved on to a more difficult problem from the middle of the book, involving mutual inductance between coaxial loops. This problem, rich in elliptic integrals and complex expressions, was designed to push the AI further. “This was a tougher one,” Kabasares admitted. “The kind of problem that usually makes grad students groan.”

ChatGPT o1-preview, however, handled the problem with surprising efficiency. Within seconds, it began deriving mutual inductance expressions, seamlessly transitioning into elliptic integral calculations. “It didn’t just get the first part right—it nailed it,” Kabasares said, clearly surprised. “It got the whole thing done in 21 seconds.”
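The article does not quote the derivation itself, but the standard closed-form result for two coaxial circular loops (Maxwell’s formula, of the type found in Jackson’s magnetostatics chapter) gives a sense of where the elliptic integrals come in. The radii a, b and separation d below are generic labels, not values from the specific exercise:

```latex
% Generic result for two coaxial circular loops of radii a and b whose planes
% are separated by an axial distance d -- a sketch of the kind of expression
% involved, not the worked solution from the test.
M = \mu_0 \sqrt{ab}\,\left[\left(\frac{2}{k} - k\right) K(k) - \frac{2}{k}\, E(k)\right],
\qquad
k^2 = \frac{4ab}{(a+b)^2 + d^2},

% where K(k) and E(k) are the complete elliptic integrals of the first and
% second kind:
K(k) = \int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - k^2 \sin^2\theta}}, \qquad
E(k) = \int_0^{\pi/2} \sqrt{1 - k^2 \sin^2\theta}\; d\theta .
```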

The AI’s ability to not only solve the problem but to do so with clear reasoning and step-by-step derivations left Kabasares in awe. “Where was this when I was in grad school?” he joked. “The time savings alone are astronomical. What takes humans days, it does in under a minute.”

The Final Challenge: A Two-Part Complex Problem

Kabasares saved the most difficult task for last: a two-part problem deep within Jackson’s textbook, involving the scattering cross-section of electromagnetic waves. “This is where I thought it might hit a wall,” Kabasares explained. “It’s deep within the textbook, and it’s the kind of problem that even seasoned physicists struggle with.”
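The article does not specify which scattering problem was chosen, but the general definition of the differential scattering cross-section, together with the textbook long-wavelength result for a small dielectric sphere, illustrates the kind of quantity involved. The sphere radius a and relative permittivity εᵣ here are illustrative assumptions, not details from the actual exercise:

```latex
% General definition: scattered power per unit solid angle in polarization
% \epsilon, divided by the incident flux in polarization \epsilon_0.
\frac{d\sigma}{d\Omega}(\mathbf{n},\boldsymbol{\epsilon};\mathbf{n}_0,\boldsymbol{\epsilon}_0)
  = \frac{r^2\,\lvert \boldsymbol{\epsilon}^{*}\cdot\mathbf{E}_{\mathrm{sc}}\rvert^{2}}
         {\lvert \boldsymbol{\epsilon}_0^{*}\cdot\mathbf{E}_{\mathrm{inc}}\rvert^{2}} .

% Illustrative long-wavelength example (not necessarily the problem used):
% a small dielectric sphere of radius a and relative permittivity \varepsilon_r
% scatters with the classic k^4 (Rayleigh) dependence:
\frac{d\sigma}{d\Omega} = k^4 a^6
  \left\lvert \frac{\varepsilon_r - 1}{\varepsilon_r + 2} \right\rvert^{2}
  \lvert \boldsymbol{\epsilon}^{*}\cdot\boldsymbol{\epsilon}_0 \rvert^{2} .
```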

As the AI began working through the problem, Kabasares watched closely, noting that the model seemed to engage in a “thoughtful” process. “It was almost eerie,” he said. “The way it said ‘I’m digging in’—it was like watching a grad student realize they’re close to a breakthrough.”

Despite the complexity, ChatGPT o1-preview once again delivered the correct answer, neatly solving both parts of the problem in record time. “It was shocking,” Kabasares said. “It didn’t just get it right—it did it in a way that a human would, breaking down the steps, applying the right mathematical tools, and even backtracking when needed.”

The Implications: What Does This Mean for Higher Education?

After witnessing the AI solve all three Jackson problems in under five minutes—a task that would take most graduate students weeks—Kabasares was left with mixed emotions. “I’m impressed, no question,” he said. “But I’m also wondering what this means for the future of education. If this tool can solve PhD-level problems in seconds, what does that mean for students?”

Kabasares raised concerns about academic integrity, noting that ChatGPT could easily be used by students to bypass the learning process entirely. “It’s the ultimate cheating device,” he said. “Universities are going to have to figure out how to handle this, because it’s not going away.”

Yet, Kabasares also sees immense potential in AI as a learning tool. “If used correctly, this could be an incredible study partner,” he mused. “It’s like having a PhD student available 24/7 to help you work through tough problems. I just wish I had this 20 years ago.”

A Revolution in Problem-Solving?

As Kabasares reflects on his experiment, one thing is clear: ChatGPT o1-preview has set a new benchmark for AI capabilities in academia. “OpenAI wasn’t exaggerating when they said it’s at a PhD level,” he concluded. “This model is capable of reasoning and problem-solving in ways that are almost human—but much faster.”

While the implications for education and research are profound, the real test will be how students, educators, and professionals use this tool moving forward. Kabasares summed it up best: “We’re on the brink of something big. Whether that’s a revolution in learning or a massive shift in how we approach problem-solving, one thing is for sure—this AI is here to stay.”
