Microsoft’s AI Reaches 85% Diagnostic Accuracy in Clinical Case Tests and Outperforms Human Doctors in Precision and Resource Efficiency
Microsoft has unveiled a new artificial intelligence system that it says could transform the way medical diagnoses are made. Named Microsoft AI Diagnostic Orchestrator (MAI-DxO), the system showed strong results in tests comparing its performance to that of experienced doctors.
According to the company, the tool achieved an accuracy rate roughly four times that of human professionals in diagnosing complex clinical cases.
Performance Above Expectations
According to Microsoft, MAI-DxO was tested on 304 clinical cases published in the New England Journal of Medicine.
These cases were chosen because they contain detailed and comprehensive accounts of the care journeys of real patients.
The result was an accuracy rate of 85.5% for the AI system’s diagnoses, compared to only 20% accuracy achieved by the group of 21 human doctors who participated in the comparison.
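As a quick check on the "four times" claim, the two reported rates can be compared directly. The short sketch below uses only the percentages quoted above; the variable names are illustrative and not taken from Microsoft's materials.

```python
# Accuracy figures as reported in the article.
ai_accuracy = 85.5       # percent, MAI-DxO
doctor_accuracy = 20.0   # percent, group of 21 human doctors

# How many times higher the AI's rate is, and the gap in percentage points.
ratio = ai_accuracy / doctor_accuracy
gap_points = ai_accuracy - doctor_accuracy

print(f"AI accuracy is about {ratio:.1f} times the doctors' rate")
print(f"Difference: {gap_points:.1f} percentage points")
```

The ratio comes out slightly above four, which is consistent with the company's description.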
The superior performance drew attention not only for its precision but also for its cost-effectiveness. The system arrived at correct diagnoses while using fewer resources.
This was possible because the AI was more selective about which tests to request, weighing both the cost of each procedure and its practical usefulness.
Process Similar to That of Doctors
To simulate a situation closer to clinical practice, Microsoft’s team adopted a sequential diagnostic format.
In this model, both the AI and the doctors received information gradually and could request new tests as they deemed necessary.
One example cited was a patient with fever and cough. Based on these initial symptoms, the professional, whether human or artificial, could request blood tests and an X-ray before settling on a diagnosis of pneumonia.
In the end, the conclusion was compared with the diagnosis considered correct by the experts behind the published case.
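The sequential format described above can be pictured as a simple loop: start from the initial symptoms, request tests one at a time, then commit to a diagnosis that is checked against the reference. The following is a minimal sketch under stated assumptions, not Microsoft's implementation. The `Case` class, the `toy_policy` rules, and the test results are all hypothetical and exist only to mirror the fever-and-cough example.

```python
from dataclasses import dataclass


@dataclass
class Case:
    initial_findings: list[str]
    hidden_results: dict[str, str]  # revealed only when a test is requested
    reference_diagnosis: str        # the answer treated as correct


def toy_policy(findings: list[str], results: dict[str, str]) -> tuple[str, str]:
    """Hypothetical rule-based stand-in for a doctor or an AI agent."""
    if "blood_test" not in results:
        return ("request", "blood_test")
    if "chest_xray" not in results:
        return ("request", "chest_xray")
    if results["chest_xray"] == "lobar consolidation":
        return ("diagnose", "pneumonia")
    return ("diagnose", "viral upper respiratory infection")


def run_case(case: Case, policy, max_steps: int = 10) -> tuple[str, list[str], bool]:
    """Reveal information step by step until the policy commits to a diagnosis."""
    results: dict[str, str] = {}
    requested: list[str] = []
    for _ in range(max_steps):
        action, value = policy(case.initial_findings, results)
        if action == "diagnose":
            return value, requested, value == case.reference_diagnosis
        requested.append(value)
        results[value] = case.hidden_results.get(value, "not available")
    return "undetermined", requested, False


# Illustrative case data, invented for this sketch.
case = Case(
    initial_findings=["fever", "cough"],
    hidden_results={
        "blood_test": "elevated white cell count",
        "chest_xray": "lobar consolidation",
    },
    reference_diagnosis="pneumonia",
)

diagnosis, requested, correct = run_case(case, toy_policy)
print(diagnosis, requested, correct)
```

In this toy run the policy orders the blood test and the X-ray before diagnosing, so the list of requested tests doubles as a record of the resources spent on the case.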
Combined Use of AI Models
Another differentiator of MAI-DxO is that it functions as an orchestrator of different artificial intelligence models.
The tool consults existing systems, such as OpenAI’s GPT, Google’s Gemini, Anthropic’s Claude, Meta’s Llama, and xAI’s Grok.
This allows the system to draw on the strengths of each, loosely mirroring the collaborative work among human specialists in hospitals.
By bringing together these various sources, the system can cover a broader range of possibilities and adjust reasoning as new information arises. This combination proved effective during tests with real clinical cases.
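The article does not say how MAI-DxO combines the answers of the underlying models. As one simple illustration of the orchestration idea, the sketch below asks several stand-in models for a diagnosis and keeps the most common answer. Majority voting is an assumption made here for clarity, and the `orchestrate` function and stub models are hypothetical; no real model API is called.

```python
from collections import Counter
from typing import Callable

# A "model" here is any function that maps a case summary to a diagnosis.
Model = Callable[[str], str]


def orchestrate(case_summary: str, models: dict[str, Model]) -> tuple[str, dict[str, str]]:
    """Query every model and return the most common diagnosis plus all answers."""
    answers = {name: model(case_summary) for name, model in models.items()}
    consensus, _count = Counter(answers.values()).most_common(1)[0]
    return consensus, answers


# Stub models with fixed answers, standing in for real language models.
stub_models: dict[str, Model] = {
    "model_a": lambda summary: "pneumonia",
    "model_b": lambda summary: "pneumonia",
    "model_c": lambda summary: "bronchitis",
}

consensus, answers = orchestrate("fever and cough, consolidation on X-ray", stub_models)
print(consensus)
print(answers)
```

A real orchestrator would likely do more than count votes, for example letting models challenge one another's reasoning, but the voting version shows why several sources can cover more possibilities than one.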
Cost Control and Precision
One of the innovations highlighted in the experiment is how the system manages to balance the search for the correct diagnosis with resource limitations.
Instead of requesting every possible test, MAI-DxO was configured to act cautiously, selecting only the procedures that offered the best cost-benefit ratio.
This means that the system took into account the financial cost of tests, the potential discomfort for the patient, and the time required to complete the care. The AI thus demonstrated clinical reasoning focused on efficiency.
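One way to picture this kind of cost-aware selection is a greedy ranking of candidate tests by expected usefulness per unit of cost, subject to a spending limit. The sketch below is an illustration under stated assumptions, not the method Microsoft used. The `DiagnosticTest` class, the costs, the utility scores, and the budget are all invented, and patient discomfort and time are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticTest:
    name: str
    cost: float     # hypothetical price of the procedure
    utility: float  # hypothetical diagnostic value, between 0 and 1


def select_tests(candidates: list[DiagnosticTest], budget: float) -> tuple[list[DiagnosticTest], float]:
    """Greedily pick tests with the best utility-to-cost ratio within the budget."""
    ranked = sorted(candidates, key=lambda t: t.utility / t.cost, reverse=True)
    chosen: list[DiagnosticTest] = []
    spent = 0.0
    for test in ranked:
        if spent + test.cost <= budget:
            chosen.append(test)
            spent += test.cost
    return chosen, spent


# Illustrative numbers only.
candidates = [
    DiagnosticTest("blood_test", cost=50.0, utility=0.4),
    DiagnosticTest("chest_xray", cost=100.0, utility=0.7),
    DiagnosticTest("ct_scan", cost=800.0, utility=0.8),
    DiagnosticTest("mri", cost=1500.0, utility=0.6),
]

chosen, spent = select_tests(candidates, budget=200.0)
print([t.name for t in chosen], spent)
```

With these made-up figures, the inexpensive blood test and X-ray are selected and the costlier scans are skipped, even though the CT scan has the highest utility score on its own.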
Promising Performance but with Limitations
Despite the good results, Microsoft stresses that the technology still has significant limitations. The company acknowledges that the AI's high performance was recorded on more complex cases and that there is still insufficient evidence about how it performs in more common medical situations.
Furthermore, Microsoft states that the role of doctors cannot be reduced to diagnosis alone. Healthcare professionals also need to deal with ambiguity, build relationships with patients, and make human judgments that AI cannot replicate.
Attention to Next Steps
For Microsoft, MAI-DxO represents only the beginning of a new phase. According to the company, the system could help patients take better care of their health and could also provide advanced support for doctors dealing with difficult cases.
Moreover, the official note states that healthcare costs have been rising at an unsustainable rate. With over 50 million searches for medical information conducted every day, the company sees a scenario where people are increasingly seeking digital help.
In this context, AI emerges as a support tool, not as a replacement.
Conclusion Focused on Practical Impact
The experiment also brought to light a point little explored in other research: the relationship between cost and precision.
By applying a methodology that simulates real healthcare spending, Microsoft was able to measure not only the quality of diagnoses but also their economic viability.
Although the company emphasizes that there are still obstacles to the widespread use of AI in medicine, MAI-DxO has already demonstrated that it can go beyond memorizing answers and simulate a real clinical process efficiently and with sound judgment.
For Microsoft, this is just the beginning of a profound transformation in the healthcare sector.
With information from Galileu Magazine.

