Novel Outbreak Detection System That Combines Whole-Genome Sequencing, Machine Learning Shows Promise

Researchers sought to determine whether a novel system that combines whole-genome sequencing and machine learning is effective for detecting infectious outbreaks and identifying transmission routes of bacterial pathogens.

The Enhanced Detection System for Healthcare-Associated Transmission (EDS-HAT) was found to detect multiple outbreaks that traditional infection prevention (IP) methods failed to identify. These findings were published in Clinical Infectious Diseases.

In this study, researchers assessed bacterial pathogens isolated from clinical specimens collected at a tertiary care hospital in Pittsburgh, Pennsylvania between 2016 and 2018. They compared the effectiveness of the EDS-HAT with that of traditional IP methods in detecting infectious outbreaks and identifying transmission routes. The EDS-HAT combined whole-genome sequencing (WGS) surveillance, used to identify outbreaks of infectious pathogens, with machine learning applied to electronic health records, used to identify potential routes of transmission.
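To make the WGS surveillance step more concrete, the following is a minimal sketch of how isolates might be grouped into genetically related clusters from pairwise genetic distances. It is an illustration under stated assumptions, not the researchers' pipeline: the function name `cluster_isolates`, the single-linkage grouping rule, the example distances, and the threshold of 15 single-nucleotide polymorphisms (SNPs) are all hypothetical and do not come from the article.

```python
def cluster_isolates(snp_distance, threshold):
    """Group isolates by single linkage: two isolates share a cluster if they
    are connected by a chain of pairs whose SNP distance is <= threshold.

    snp_distance maps (isolate_a, isolate_b) pairs to a pairwise SNP count.
    Both the data layout and the linkage rule are assumptions for illustration.
    """
    isolates = sorted({name for pair in snp_distance for name in pair})
    parent = {name: name for name in isolates}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), distance in snp_distance.items():
        if distance <= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in isolates:
        groups.setdefault(find(name), []).append(name)

    # Keep only groups with at least 2 isolates; a single isolate is not a cluster.
    return sorted(group for group in groups.values() if len(group) >= 2)


# Hypothetical pairwise SNP distances between four isolates.
example_distances = {
    ("A", "B"): 3,
    ("B", "C"): 10,
    ("A", "C"): 12,
    ("A", "D"): 45,
    ("B", "D"): 41,
    ("C", "D"): 40,
}

clusters = cluster_isolates(example_distances, threshold=15)
print(clusters)  # [['A', 'B', 'C']]
```

In this toy example, isolate D is more than 15 SNPs from every other isolate, so it is not assigned to any cluster. A real system would need a pathogen-specific threshold and a validated distance calculation.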

Among a total of 3165 clinical isolates included in the analysis, 2752 were unique. The researchers identified 297 isolates in 99 distinct, genetically related clusters, each containing 2 to 14 isolates. Of the 297 isolates identified, 90.6% were sourced from inpatient visits, 9.1% were from the emergency department, and 0.3% were from outpatient visits.

The EDS-HAT detected potential transmission routes for 65.7% of clusters, which contained 74.4% of the related isolates. For the remaining 34 clusters, which comprised 2 to 5 patients each and 76 isolates in total, the EDS-HAT detected no significant transmission routes.
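The reported percentages can be checked against the cluster and isolate counts given in the article. The short calculation below uses only those figures; the variable names are chosen here for illustration.

```python
# Counts reported in the article.
total_clusters = 99
total_isolates = 297
clusters_without_route = 34
isolates_without_route = 76

# Clusters and isolates for which a potential transmission route was detected.
clusters_with_route = total_clusters - clusters_without_route  # 65
isolates_with_route = total_isolates - isolates_without_route  # 221

pct_clusters = round(100 * clusters_with_route / total_clusters, 1)
pct_isolates = round(100 * isolates_with_route / total_isolates, 1)

print(pct_clusters, pct_isolates)  # 65.7 74.4
```

The results match the 65.7% of clusters and 74.4% of isolates stated in the article.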

During the study, the IP department requested WGS for 15 potentially actionable outbreaks of infectious pathogens involving 133 patients, including Clostridioides difficile (n=6), Serratia marcescens (n=3), Acinetobacter baumannii (n=2), Stenotrophomonas maltophilia (n=2), Burkholderia cepacia (n=1), and Klebsiella pneumoniae (n=1).

The researchers estimated that the EDS-HAT could have prevented between 25 and 63 transmissions. They also estimated that 3.1 to 8.0 fewer 30-day hospital readmissions and 1.6 to 3.3 fewer deaths would have occurred if the EDS-HAT had been running in real time.

The cost of WGS was offset by the potential savings from not having to treat the prevented infections, resulting in total cost savings between $192,408 and $692,532. Of note, the EDS-HAT was found to be more cost-effective than traditional IP methods in 88% to 99% of simulations.

This study was limited by its retrospective design, and it remains unclear whether the EDS-HAT would be as effective when used with real-time WGS surveillance.

According to the researchers, these findings “suggest that EDS-HAT represents a potential paradigm shift in how outbreaks are detected in hospitals.” In addition, “if instituted in real time, [EDS-HAT] can [decrease] healthcare-related costs and significantly improve patient safety,” the researchers concluded.

Disclosure: Multiple authors declared affiliations with industry. Please see the original reference for a full list of disclosures.


Sundermann AJ, Chen J, Kumar P, et al. Whole-genome sequencing surveillance and machine learning of the electronic health record for enhanced healthcare outbreak detection. Clin Infect Dis. 2021;ciab946. doi:10.1093/cid/ciab946