Authors (including presenting author) :
WONG RCW (1), LIN HYS (2), WONG LCH (1), YANG KC (3), CHEUNG IYY (1), CHOW VCY (1), LAI CKC (4)
Affiliation :
(1) Department of Microbiology, Prince of Wales Hospital,
(2) Faculty of Medicine, The Chinese University of Hong Kong,
(3) Department of Statistics, University of Oxford,
(4) Department of Microbiology, The Chinese University of Hong Kong
Introduction :
Methicillin-resistant Staphylococcus aureus (MRSA) is categorized as high priority on the 2024 WHO bacterial priority pathogens list and poses a significant burden to the healthcare system. Matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) mass spectrometry can identify an isolate as Staphylococcus aureus, but antimicrobial susceptibility testing (AST) is needed to determine whether it is MRSA or methicillin-susceptible Staphylococcus aureus (MSSA), and AST requires an additional 24 hours of incubation. In total, two days are required to differentiate between MRSA and MSSA.
Few studies have evaluated the usefulness of MALDI-TOF in AST. Using artificial intelligence (AI) to predict MRSA and MSSA would extend MALDI-TOF beyond its routine identification function and would be a value-added application in line with the Hospital Authority (HA) smart hospital strategy. Such an application would reduce the result reporting time from 2 days to 1 day at no additional cost. This is a pioneering study applying machine learning (ML)-based AI technology to the analysis of MALDI-TOF spectra in Hong Kong.
Objectives :
We aim to apply ML to MALDI-TOF mass spectra and to compare three ML models for rapid differentiation between MRSA and MSSA.
Methodology :
MALDI-TOF mass spectra from 24487 retrospective Staphylococcus aureus isolates (13776 MRSA and 10711 MSSA) collected between January 2021 and May 2024 were included. The spectra were randomly split 80:20 into training and validation sets to develop three models: a large-scale neural network (NN) built with the Keras 3 API of TensorFlow, a LightGBM gradient boosting (LGBM) model, and a weight-averaging ensemble of NN and LGBM (Ensemble). Prospective testing was performed on 2975 clinical isolates (1867 MRSA and 1108 MSSA).
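As an illustration only, the random 80:20 split and the weight-averaging step could be sketched as follows. This is a minimal sketch in plain Python, not the study's code: the function names, the fixed seed, and the default weight of 0.5 are assumptions, and the abstract does not state the weights actually used.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle a copy of items and return (training, validation) at 80:20.

    The seed is a hypothetical choice made only for reproducibility.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def weighted_average(p_nn, p_lgbm, w_nn=0.5):
    """Blend two lists of predicted MRSA probabilities.

    w_nn is an assumed weight for the NN model; the LGBM model
    receives the remaining weight (1 - w_nn).
    """
    if len(p_nn) != len(p_lgbm):
        raise ValueError("both models must score the same isolates")
    if not 0.0 <= w_nn <= 1.0:
        raise ValueError("w_nn must lie between 0 and 1")
    return [w_nn * a + (1.0 - w_nn) * b for a, b in zip(p_nn, p_lgbm)]


if __name__ == "__main__":
    train, val = split_80_20(range(10))
    print(len(train), len(val))
    print(weighted_average([0.9, 0.2], [0.7, 0.4]))
```

In practice the ensemble weight would be chosen on the validation set rather than fixed in advance.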
Result & Outcome :
The NN, LGBM, and Ensemble models differentiated between MRSA and MSSA accurately, with F1 scores of 0.9430, 0.9510, and 0.9503, respectively, and areas under the precision-recall curve (AUPRC) of 0.9843, 0.9849, and 0.9866, respectively. Early identification using machine learning with MALDI-TOF allows prompt isolation and appropriate antibiotic use, which can improve patient outcomes and decrease the risk of cross-infection.
Here, we demonstrate a proof-of-concept application: an ensemble ML model of NN and LGBM that achieved the highest AUPRC of the three models and has great potential for application in HA.
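For readers less familiar with the two metrics, the sketch below shows one way they can be computed. It assumes MRSA is coded as the positive class (1) and MSSA as 0, and it estimates AUPRC by average precision, which is one common estimator and not necessarily the one used in this study. The function names are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_precision(y_true, scores):
    """Estimate AUPRC as the mean precision at the rank of each positive.

    Isolates are ranked by descending score; tied scores are not
    treated specially in this simplified version.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            total += tp / rank
    return total / n_pos


if __name__ == "__main__":
    labels = [1, 0, 1, 1, 0]
    print(f1_score(labels, [1, 0, 0, 1, 1]))
    print(average_precision(labels, [0.9, 0.8, 0.7, 0.6, 0.1]))
```

F1 depends on the decision threshold applied to the predicted probabilities, whereas AUPRC summarizes performance across all thresholds, which is why the two metrics can rank models differently.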