Claude Humiliates ChatGPT and Wins Global AI Challenge with 115 Technical and Literary Tests in 2025

Written by Caio Aviz

Published on 08/06/2025 at 02:15

Illustrative image shows the rise of the Claude AI, highlighted for its superiority over ChatGPT in technical and literary tests

Anthropic Assistant Outperforms Rivals in Technical Challenge, Mastering Medicine, Law, and Literature, and Stands Out for Not Presenting Serious Interpretation Errors.

AI Performance Is Evaluated in Rigorous Tests

In June 2025, the Washington Post conducted a thorough assessment of five leading artificial intelligence models.
The goal was to identify which AI would perform best in interpreting and answering 115 questions based on four types of content: fiction, contracts, medical articles, and political speeches.

The tested systems included Claude (Anthropic), ChatGPT (OpenAI), Gemini (Google), Copilot (Microsoft), and Meta AI (Meta).
All faced the same challenge: to demonstrate text comprehension abilities and provide useful, objective, and correct responses.

Literary Test Reveals Claude’s Technical Mastery

The first block of the test involved the novel The Jackal's Mistress, by Chris Bohjalian, an acclaimed writer in the United States.
Claude was the only AI that correctly understood all the elements of the plot, including secondary characters, the central storyline, and the conclusions.

ChatGPT came close but failed to mention two important characters.
Gemini, on the other hand, had the weakest performance: it produced vague and imprecise answers, lacking narrative depth.
The author himself, Chris Bohjalian, considered Claude the most efficient in literary understanding.

Legal Analysis Exposes Gaps in Competitors

In the second segment, the contract analysis was based on real documents, including rental agreements and employment contracts.
Sterling Miller, a corporate attorney and columnist specializing in governance, was responsible for the evaluation.

Claude suggested solid technical adjustments in the contracts, with clear language and coherent legal application.
In contrast, Meta AI and ChatGPT oversimplified the terms and omitted critical sections.
Copilot, although quick, failed to interpret exclusivity clauses.

Medicine Was the Topic with the Highest Average Score

The medical test involved summarizing recent scientific articles, such as a study on long Covid and another on Parkinson’s.
Cardiologist and researcher Eric Topol was in charge of correcting the responses.

Claude again stood out: it presented all the relevant details without omissions.
ChatGPT's performance was mediocre.
Gemini failed to correctly explain the side effects of the treatment described in the Parkinson’s study, receiving the lowest score in this round.

Political Discourse Challenges Context Comprehension

The fourth type of test involved excerpts from speeches by Donald Trump, aiming to verify the AIs’ ability to identify contradictions, ironies, and manipulation in discourse.

Political reporter Cat Zakrzewski from the Washington Post evaluated this segment.
ChatGPT was the most accurate in identifying controversial points in the speeches and citing politicians who refuted Trump's remarks.
Copilot, on the other hand, failed to capture the heated tone and lacked contextualization.

Claude Tops Rankings and Avoids Critical Errors

At the end of the evaluation, the consolidated results identified Claude as the most efficient artificial intelligence, with the highest overall score and the lowest rate of “hallucinations” (invented responses).

Here is the final ranking released on June 6, 2025, by the Washington Post:

Claude – 69.9 points
ChatGPT – 68.4 points
Gemini – 49.7 points
Copilot – 49.0 points
Meta AI – 45.0 points

According to the organizers, no system achieved a perfect score. Still, Claude managed to stand out for its consistency.

Experts Warn of Responsible Use

Despite positive results in various areas, evaluators highlight the risks of indiscriminate AI use.
All tools tested presented partial or factually unsupported responses at some point.

Experts like Sterling Miller and Eric Topol warn that these technologies should be used under human supervision, especially in legal and medical contexts.
Moreover, they emphasize that the tools can complement professional work but should not replace it.

Lessons and Future of Artificial Intelligence

The test results indicate that AI has advanced considerably but still needs significant improvement.
Anthropic's Claude emerges as the most reliable AI of 2025, according to this expert assessment.

With more challenges anticipated in the coming months, the companies developing these models promise updates to improve the accuracy and safety of language systems.

What to Expect from AI in the Coming Years?

The competition among technology giants is far from over.
However, technical advancement demands regulation, ethics, and transparency, points considered fundamental by all specialists involved in the study.

What about you? Do you believe AIs are ready to make complex decisions, or do they still need to evolve further?
