A team of computer scientists at Stanford University has created a powerful artificial intelligence diagnosis algorithm that can detect dangerous skin lesions as well as a seasoned dermatologist.
Their final product, described in a paper published in the journal Nature, matched the performance of 21 board-certified dermatologists in diagnosing skin lesions.
"We realized it was feasible, not just to do something well, but as well as a human dermatologist," said Sebastian Thrun, an adjunct professor in the Stanford Artificial Intelligence Laboratory, in a press release. "That's when our thinking changed. That's when we said, 'Look, this is not just a class project for students, this is an opportunity to do something great for humanity.'"
For their new deep learning algorithm, the researchers first made a database of nearly 130,000 images of skin lesions representing over 2,000 different diseases. The images were taken from the records of their medical school and different published medical journals.
Using their newly compiled database of skin lesions, the researchers trained an algorithm, developed and provided by Google, to visually diagnose potential cancers. Google's algorithm had already been trained on 1.28 million images from 1,000 object categories. However, the researchers needed to retrain it because it had initially been primed to differentiate cats from dogs.
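For readers curious about what retraining an already trained algorithm can look like, here is a minimal sketch of one common approach: keeping previously learned features fixed and retraining only a new final classification layer. This is not the researchers' code, and the article does not say which parts of Google's algorithm were retrained. The function `pretrained_features`, the toy data points and the training settings are all hypothetical stand-ins chosen for illustration.

```python
import math


def pretrained_features(point):
    """Hypothetical stand-in for the fixed layers of a pretrained network.

    These features are never updated during retraining.
    """
    x, y = point
    return [x + y, x - y, 1.0]  # the last entry acts as a bias term


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, point):
    """Probability that a point belongs to class 1 under the new final layer."""
    features = pretrained_features(point)
    return sigmoid(sum(w * f for w, f in zip(weights, features)))


def retrain_final_layer(points, labels, epochs=200, learning_rate=0.5):
    """Fit a new logistic final layer on top of the fixed features."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for point, label in zip(points, labels):
            features = pretrained_features(point)
            error = score(weights, point) - label
            # Only the final-layer weights change; the features stay fixed.
            weights = [w - learning_rate * error * f
                       for w, f in zip(weights, features)]
    return weights


# Toy data (made up): label 1 when the two coordinates are both large.
toy_points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8)]
toy_labels = [0, 0, 1, 1]
new_weights = retrain_final_layer(toy_points, toy_labels)
```

The appeal of this kind of approach is that the new task can reuse what the algorithm already learned from a much larger image collection, so far fewer labeled examples are needed than when training from scratch.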
To test the accuracy of the newly trained algorithm, the researchers recruited 21 board-certified dermatologists and asked them, for each image shown, whether they would proceed with a biopsy or treatment, or reassure the patient.
The algorithm, for its part, was programmed to assess lesions through three key diagnostic tasks: keratinocyte carcinoma classification, melanoma classification, and melanoma classification when viewed using dermoscopy.
The performance of the dermatologists was evaluated based on their success in correctly diagnosing both cancerous and non-cancerous lesions in over 370 images, while the algorithm's performance was measured through the creation of a sensitivity-specificity curve.
The researchers found that the algorithm matched the performance of the dermatologists in all three tasks, with the area under the sensitivity-specificity curve amounting to at least 91 percent of the total area of the graph.
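To make the sensitivity-specificity measure concrete, the sketch below shows one way such a curve and the area under it can be computed from a set of algorithm scores and known diagnoses. It is an illustration only, not the study's evaluation code. The function names and the example scores and labels are invented, and the sketch assumes each score is a number where higher means "more likely cancerous".

```python
def sensitivity_specificity_curve(scores, labels):
    """Return (specificity, sensitivity) points, one per decision threshold.

    labels: 1 for a cancerous lesion, 0 for a non-cancerous one.
    A lesion is flagged as cancerous when its score is at or above the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Start with a threshold nothing can reach, then lower it step by step.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for threshold in thresholds:
        true_pos = sum(1 for s, y in zip(scores, labels)
                       if s >= threshold and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels)
                       if s < threshold and y == 0)
        points.append((true_neg / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the curve.

    Points arrive ordered from specificity 1 down to specificity 0.
    """
    area = 0.0
    for (spec_a, sens_a), (spec_b, sens_b) in zip(points, points[1:]):
        area += (spec_a - spec_b) * (sens_a + sens_b) / 2.0
    return area


# Made-up example: eight lesions, four of them cancerous.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
example_labels = [1, 1, 0, 1, 0, 1, 0, 0]
example_curve = sensitivity_specificity_curve(example_scores, example_labels)
example_area = area_under_curve(example_curve)
```

An area of 1.0 would mean the scores separate cancerous from non-cancerous lesions perfectly at some threshold, while an area near 0.5 would be no better than chance. On this scale, the 91 percent reported by the researchers corresponds to an area of 0.91.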