A new artificial intelligence (AI) platform developed by Northwestern University researchers can detect COVID-19 in the lungs about 10 times faster and 1 to 6 percentage points more accurately than individual specialized cardiothoracic radiologists, according to a study published today in Radiology.
The researchers trained and tested DeepCOVID-XR, a machine-learning algorithm that analyzes chest X-rays, on 17,002 X-ray images, 5,445 of them with signs of COVID-19, collected from February to April.
When pitted against five experienced cardiothoracic radiology subspecialists, DeepCOVID-XR analyzed a set of 300 randomly selected test images in about 18 minutes, versus the 2.5 to 3.5 hours each radiologist needed. DeepCOVID-XR was 82% accurate, compared with the radiologists' 76% to 81% individually and 81% as a team.
"These are experts who are sub-specialty trained in reading chest imaging, whereas the majority of chest X-rays are read by general radiologists or initially interpreted by non-radiologists, such as the treating clinician," lead author Ramsey Wehbe, MD, said in a Northwestern news release. "A lot of times decisions are made based off that initial interpretation."
Measured against standard reverse-transcription polymerase chain reaction (RT-PCR) COVID-19 testing, the AI platform was 82% accurate in classifying test X-ray images. DeepCOVID-XR was also 71% sensitive, compared with 60% for one radiologist, and 92% specific, compared with 75% for two radiologists.
Sensitivity and specificity are test performance measures. Sensitivity is the proportion of people who truly have a condition whom the test correctly identifies as positive, while specificity is the proportion of people without the condition whom the test correctly identifies as negative.
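For readers who want to see how these measures are calculated, the following is a minimal Python sketch. It is not taken from the DeepCOVID-XR code base; the `ConfusionCounts` container and the counts in `example` are hypothetical and chosen only to make the arithmetic easy to follow. They are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container: classifier calls tallied against a reference test."""
    true_pos: int   # reference positive, flagged positive
    false_neg: int  # reference positive, flagged negative
    true_neg: int   # reference negative, flagged negative
    false_pos: int  # reference negative, flagged positive


def sensitivity(c: ConfusionCounts) -> float:
    """Share of reference-positive cases that were flagged positive."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Share of reference-negative cases that were flagged negative."""
    return c.true_neg / (c.true_neg + c.false_pos)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all cases that were classified correctly."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return (c.true_pos + c.true_neg) / total


# Made-up counts for illustration only (300 images, 100 reference-positive).
example = ConfusionCounts(true_pos=70, false_neg=30, true_neg=180, false_pos=20)

print(f"sensitivity: {sensitivity(example):.0%}")  # 70 of 100 positives caught
print(f"specificity: {specificity(example):.0%}")  # 180 of 200 negatives cleared
print(f"accuracy:    {accuracy(example):.0%}")     # 250 of 300 correct overall
```

With these illustrative counts, a test can post a high specificity while still missing nearly a third of true cases, which is why the study reports sensitivity and specificity alongside overall accuracy.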
Shortening time to diagnosis, isolation
While the AI platform is still in the research stage and not available clinically, the authors said it could one day be used to rapidly screen patients admitted to hospitals for conditions other than coronavirus, enabling rapid COVID-19 testing and isolation, if warranted.
Study coauthor Aggelos Katsaggelos, PhD, said in the release that the team is not trying to replace COVID-19 testing with the AI platform (which can't confirm a diagnosis) but rather to shave hours or even days from time to diagnosis and isolation so the patient does not spread the virus to healthcare workers or other patients. "It would take seconds for our system to screen a patient and determine if that patient needs to be isolated," he said.
However, the system will miss cases because many people with coronavirus have no symptoms, and most typically don't show signs of it on X-rays until later in their illness, according to Wehbe. "In those cases, the A.I. system will not flag the patient as positive," he said. "But neither would a radiologist."
DeepCOVID-XR, Wehbe said, can help distinguish between COVID-19 and pneumonia, heart failure, and other illnesses that display similar hazy patches on X-ray. "Many patients with COVID-19 have characteristic findings on their chest images," he said. "These include 'bilateral consolidations.' The lungs are filled with fluid and inflamed, particularly along the lower lobes and periphery."
Also, X-rays are routine, safe, and inexpensive, whereas radiologists are costly and not always available, according to Katsaggelos. "This could potentially save money and time, especially because timing is so critical when working with COVID-19," he said.
In their conclusion, the authors said they plan to conduct a prospective evaluation that includes patients not under investigation for COVID-19. They also plan to add other clinical data to the platform to boost its performance and adapt it to predict clinical outcomes in patients with confirmed COVID-19.
Northwestern has made DeepCOVID-XR publicly available for other research teams to train it using their data. "By providing the DeepCOVID-XR algorithm code base as an open source resource, we hope investigators around the world will further improve, fine tune, and test the algorithm using clinical images from their own institutions," the authors wrote.