5 ways to evaluate AI’s accuracy
After extensive modeling and more than 100,000 simulations, an Artificial Intelligence (AI) system was given the task of predicting the 2018 FIFA World Cup champion. The AI predicted that Spain would be the champion (28.9% probability), followed by Germany (26.3%) and Brazil (21.9%). The real-world outcome saw France as champion, followed by Croatia and Belgium.
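One way to quantify how far off a probabilistic prediction like this was is a top-k hit check plus a Brier-style penalty. The team names and probabilities below come from the figures above; the scoring approach itself is a common evaluation technique, not something the original forecasters necessarily used.

```python
# Scoring a probabilistic prediction against the actual outcome,
# using the 2018 figures quoted above.
predicted = {"Spain": 0.289, "Germany": 0.263, "Brazil": 0.219}
actual_champion = "France"

# Top-k hit: did the real champion appear among the model's top picks?
top_3 = sorted(predicted, key=predicted.get, reverse=True)[:3]
top_3_hit = actual_champion in top_3

# Brier-style penalty for the winner: (p assigned to actual - 1)^2.
# The model assigned France no probability among its top picks,
# so the penalty is the maximum of 1.0.
p_actual = predicted.get(actual_champion, 0.0)
brier_winner = (p_actual - 1.0) ** 2
```

A systematic check like this, run over many past predictions rather than one headline event, is what separates anecdote from an accuracy measurement.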
So, AI doesn’t always work. If we’re going to get the most out of AI technology, we need to find ways to optimize both human and machine actions for the best results.
Research and consultancy firm Deloitte recommended viewing AI not as “thinking machines,” but as cognitive prostheses that can help humans think better.
One way to do this is by establishing accuracy checkpoints on AI outcomes. Below are five of the best ways to expedite this.
1. Clearly define AI and human roles
When AI is used in medical diagnosis, its role is to rip through volumes of medical data at speeds that no human could ever hope to match. The AI then produces a diagnosis and treatment plan. At that point, a human medical practitioner takes over and reviews the AI-generated diagnosis and treatment plan and weighs it against his/her clinical experience. During the process, other medical specialists might also be consulted.
The diagnosis and treatment plan isn’t finalized until all sources of input are evaluated, from the AI as well as from the practitioner and specialists. A medical professional finalizes the diagnosis and treatment plan. This is an example of how AI works with human review, and how clearly defined roles for both AI and humans lead to the best possible outcomes.
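The workflow above can be sketched as a review gate: the AI proposes a plan, but it can only be finalized after every consulted human has approved. The class and field names here are illustrative, not any real clinical system's API.

```python
# Hypothetical sketch of a human-in-the-loop review gate.
from dataclasses import dataclass, field

@dataclass
class DiagnosisPlan:
    ai_diagnosis: str
    ai_treatment: str
    reviews: list = field(default_factory=list)
    finalized_by: str = ""

    def add_review(self, reviewer, approved, notes=""):
        self.reviews.append(
            {"reviewer": reviewer, "approved": approved, "notes": notes}
        )

    def finalize(self, practitioner):
        # The plan is finalized only if at least one human reviewed it
        # and every reviewer approved.
        if self.reviews and all(r["approved"] for r in self.reviews):
            self.finalized_by = practitioner
            return True
        return False

plan = DiagnosisPlan("suspected condition X", "treatment protocol Y")
plan.add_review("Dr. A", approved=True, notes="consistent with imaging")
plan.finalize("Dr. A")
```

The key design choice is that the AI output is just another field on the record; nothing downstream treats the plan as actionable until `finalized_by` is set by a person.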
2. Repeatedly run AI model simulations
Running a series of trials for repeatability is an important element of AI testing. If you can’t achieve repeatability, your AI isn’t ready for prime time.
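A minimal repeatability harness runs the same model on the same input many times and verifies the outputs agree within a tolerance. The `model` function here is a deterministic stand-in for whatever inference call you are testing; the trial count and tolerance are assumptions to tune for your system.

```python
# Hypothetical sketch of a repeatability check over N trials.
import statistics

def model(x):
    # Deterministic placeholder; a real model call goes here.
    return 2.0 * x + 1.0

def repeatability_check(fn, x, trials=100, tolerance=1e-6):
    outputs = [fn(x) for _ in range(trials)]
    spread = max(outputs) - min(outputs)
    # Repeatable means the spread across trials stays within tolerance.
    return spread <= tolerance, statistics.mean(outputs), spread

ok, mean_out, spread = repeatability_check(model, 3.0)
```

If the check fails, the variance itself is diagnostic: it tells you how much of your "accuracy" is actually noise.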
3. Check data quality
Data quality is paramount in any AI exercise. If your data isn’t of high quality, your results won’t be, either.
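In practice, checking data quality means running explicit gates over the data before it reaches the model: missing values, out-of-range values, duplicates. The field names and range below are illustrative assumptions.

```python
# Hypothetical sketch of basic data-quality gates.
def quality_report(rows, required=("id", "value"), value_range=(0, 100)):
    issues = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Gate 1: required fields must be present.
        for f in required:
            if row.get(f) is None:
                issues.append((i, f"missing {f}"))
        # Gate 2: values must fall inside the expected range.
        v = row.get("value")
        if v is not None and not (value_range[0] <= v <= value_range[1]):
            issues.append((i, "value out of range"))
        # Gate 3: ids must be unique.
        rid = row.get("id")
        if rid in seen_ids:
            issues.append((i, "duplicate id"))
        seen_ids.add(rid)
    return issues

rows = [{"id": 1, "value": 42},
        {"id": 1, "value": 250},
        {"id": 2, "value": None}]
issues = quality_report(rows)
```

An empty report is a precondition for trusting any downstream accuracy numbers; a non-empty one tells you exactly which rows to fix or exclude.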
4. Avoid deploying AI as the ultimate decision-maker
Several years ago, I was working with a European company on a disaster recovery plan. New technology was installed for failover, and one of the choices was to fully automate failover and the failover decision. Management decided that the system would be used for auto-failover, but that the final decision to fail over would be left in human hands. This is a savvy approach to take for any AI operations when dealing with a mission-critical function.
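That split of responsibilities can be expressed as a two-phase controller: the automated part detects failure and prepares the failover, but execution is blocked until a human approves. This is a sketch of the pattern, not any real failover product's interface.

```python
# Hypothetical sketch: automated detection, human-approved execution.
class FailoverController:
    def __init__(self):
        self.prepared = False
        self.failed_over = False

    def detect_and_prepare(self, primary_healthy):
        # Automated phase: detect the failure and stage the failover.
        if not primary_healthy:
            self.prepared = True
        return self.prepared

    def execute(self, human_approved):
        # The final decision stays in human hands.
        if self.prepared and human_approved:
            self.failed_over = True
        return self.failed_over

ctl = FailoverController()
ctl.detect_and_prepare(primary_healthy=False)
ctl.execute(human_approved=False)  # nothing happens without approval
ctl.execute(human_approved=True)
```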
5. Always include an override mechanism
Even for non-mission critical operations, it’s a good idea to have a manual override for your AI because software, hardware, and networks can fail.
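One simple way to build this in is to route every AI decision through an override check, so an operator can force a safe fallback at any time regardless of what the model outputs. The class and fallback value below are illustrative.

```python
# Hypothetical sketch of a manual override (kill switch).
class OverridableSystem:
    def __init__(self, fallback="manual mode"):
        self.override = False
        self.fallback = fallback

    def engage_override(self):
        self.override = True

    def decide(self, ai_decision):
        # The operator's override always wins over the AI's output.
        return self.fallback if self.override else ai_decision

system = OverridableSystem()
before = system.decide("route traffic to node B")  # AI decision applies
system.engage_override()
after = system.decide("route traffic to node B")   # fallback applies
```

Because the check sits in the decision path itself, the override works even when the model, its hardware, or the network behind it has failed.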