Predicting outcomes in older ED patients with influenza in real time using a big data-driven and machine learning approach to the hospital information system

Background Predicting outcomes in older patients with influenza in the emergency department (ED) by machine learning (ML) has never been implemented. Therefore, we conducted this study to clarify the clinical utility of implementing ML. Methods We recruited 5508 older ED patients (≥65 years old) in three hospitals between 2009 and 2018. Patients were randomized into a 70%/30% split for model training and testing. Using 10 clinical variables from their electronic health records, a prediction model using the synthetic minority oversampling technique preprocessing algorithm was constructed to predict five outcomes. Results The best areas under the curves of predicting outcomes were: random forest model for hospitalization (0.840), pneumonia (0.765), and sepsis or septic shock (0.857), XGBoost for intensive care unit admission (0.902), and logistic regression for in-hospital mortality (0.889) in the testing data. The predictive model was further applied in the hospital information system to assist physicians’ decisions in real time. Conclusions ML is a promising way to assist physicians in predicting outcomes in older ED patients with influenza in real time. Evaluations of the effectiveness and impact are needed in the future. Supplementary Information The online version contains supplementary material available at 10.1186/s12877-021-02229-3.

diseases, pneumonia, chronic obstructive pulmonary disease (COPD), and ischemic heart diseases, are the common causes of death [3].
Although the GID score is a potentially good clinical decision rule (CDR) in older adults with influenza, it is limited by the small size of its derivation sample and lacks both automation and real-time feedback to clinicians [5]. Artificial intelligence (AI) is defined as the use of computer techniques, including machine learning (ML) and deep learning (DL), to represent intelligent behavior [6]. In recent years, a great deal of evidence has shown that AI can handle more of the variables that are already available through electronic medical records (EMRs) and may better predict patient outcomes [5]. We searched Google Scholar and PubMed using the keywords "AI," "death," "influenza," "machine learning," "mortality," "older adult," and "outcome," but we did not find any AI application in this field. Therefore, we conducted the present study to clarify the issue and to apply the resulting model in the hospital information system (HIS) to assist decision making in real time.

Study design, setting, and participants
We established a multidisciplinary team for this project that included emergency physicians, information engineers, data scientists, quality managers, and nurse practitioners (Fig. 1). After our literature review, we decided to use the previous study on predicting mortality in older ED patients with influenza as the main reference [4]. We identified all older patients (≥65 years old) with influenza who visited the ED between January 1, 2009, and December 31, 2018, from the EMRs of three hospitals: Chi Mei Medical Center, Chi Mei Hospital, Liouying, and Chi Mei Hospital, Chiali. None of the present study hospitals was among the hospitals used to develop the GID score. Influenza was defined as a diagnosis code of 487 or 488 in the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) or a prescription of Oseltamivir, Peramivir, or Relenza at the index ED visit.
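The inclusion criteria above can be written as a simple filter. The following is a minimal sketch only: the `EDVisit` record layout, its field names, and the assumption that ICD-9-CM codes are stored as strings are hypothetical, since the EMR schema is not described in the text.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class EDVisit:
    """Hypothetical record layout; the actual EMR schema is not given."""
    age: int
    visit_date: date
    icd9_codes: list = field(default_factory=list)
    prescriptions: list = field(default_factory=list)


STUDY_START = date(2009, 1, 1)
STUDY_END = date(2018, 12, 31)
ANTIVIRALS = {"oseltamivir", "peramivir", "relenza"}


def has_influenza_code(codes):
    # ICD-9-CM 487 or 488, with or without a decimal subcode (e.g. "487.1").
    return any(code.strip().startswith(("487", "488")) for code in codes)


def has_antiviral(prescriptions):
    # Case-insensitive match on the three drugs named in the text.
    return any(drug.strip().lower() in ANTIVIRALS for drug in prescriptions)


def is_eligible(visit):
    """Age >= 65, index visit inside the study window, and influenza
    defined by diagnosis code OR antiviral prescription."""
    in_window = STUDY_START <= visit.visit_date <= STUDY_END
    influenza = (has_influenza_code(visit.icd9_codes)
                 or has_antiviral(visit.prescriptions))
    return visit.age >= 65 and in_window and influenza
```

Note that the diagnosis code and the prescription are alternatives: a visit qualifies if either is present.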

Ethical statement
The present study was approved and granted permission to access the raw data by the institutional review board in the Chi Mei Medical Center. Because this study is retrospective and contains de-identified information, informed consent from the participants was waived. The waiver does not affect the rights and welfare of the participants.

Data processing, model comparison, and application in the HIS
First, we extracted, transformed, and validated the data from the HIS into a data mart. Missing and ambiguous data were carefully processed at this step. Second, we randomly split the data into two datasets (70%/30%) and, because the outcome samples were imbalanced, used the synthetic minority oversampling technique (SMOTE) to enlarge the first dataset (70%), which served as the training dataset. The second dataset (30%) was used as the testing dataset without any resampling. Third, based on the optimal modeling results on the testing dataset, we compared accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and the area under the curve (AUC) among the random forest, logistic regression, K-nearest neighbors (KNN), support vector machine (SVM), light gradient boosting machine (LightGBM), multilayer perceptron (MLP, a kind of DL), and Extreme Gradient Boosting (XGBoost). In this step, we conducted a grid search over hyper-parameters for each algorithm to obtain the optimal models (the hyper-parameter ranges for each algorithm are summarized in Supplementary Table 1). Then, we selected the best algorithm to develop the prediction model for each outcome. Fourth, we deployed the model in the AI web service and integrated it with the HIS in the ED. After two months of pilot testing and validation, we launched the prediction application in the HIS to assist physicians in decision making in real time.
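The second and third steps above can be illustrated with a short sketch of a random 70%/30% split and of the threshold-based metrics computed on the testing portion. This is not the study's code: the function names and the fixed seed are assumptions, and only the Python standard library is used.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle a copy of `rows` and split it 70%/30%.

    Any oversampling would be applied to the first part only; the second
    part is left untouched so that it reflects the real class balance.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

Keeping the testing portion free of resampling matters because sensitivity, specificity, and especially the predictive values depend on the true prevalence of each outcome.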

Patient and public involvement
Patients and the public were not involved in this study.

Results
Comparisons of predictive accuracy among the random forest, logistic regression, KNN, SVM, LightGBM, MLP, and XGBoost revealed that the random forest model had a higher AUC for hospitalization, pneumonia, and sepsis or septic shock than the other models in the testing dataset (Table 2 and Supplementary Fig. 1). XGBoost had the best AUC for ICU admission (0.902), and logistic regression had the best AUC for in-hospital mortality (0.889). Table 3 summarizes the best AUC for each outcome in the testing dataset; the corresponding algorithm was adopted to build the prediction model for that outcome. Feature importance according to the random forest, logistic regression, LightGBM, and XGBoost for predicting the five outcomes is also reported (Supplementary Fig. 2).
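For readers unfamiliar with the metric, the sketch below shows one way to compute the AUC and to pick the best-scoring algorithm for a single outcome. It is illustrative only: the helper names are hypothetical and the scores in the usage example are toy numbers, not study data. The AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_model(scores_by_model, y_true):
    """Return (name of the model with the highest AUC, all AUCs)."""
    aucs = {name: auc(y_true, s) for name, s in scores_by_model.items()}
    return max(aucs, key=aucs.get), aucs


# Toy usage with made-up scores for two hypothetical models.
y = [0, 0, 1, 1]
toy_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.1, 0.2, 0.7, 0.9],
}
winner, all_aucs = best_model(toy_scores, y)
```

Running the same selection once per outcome yields one chosen algorithm per outcome, which is how different algorithms can end up serving different predictions.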
We applied the best algorithm for each outcome in the HIS to assist decision making for older ED patients in real time. An AI button was set up in the HIS of the ED (Supplementary Fig. 3). When the clinician presses the AI button, the AI application automatically retrieves the feature variables from the HIS and displays a pop-up screen with the prediction result within 1 s (Supplementary Fig. 4). The result shows a personalized prediction for hospitalization, pneumonia, sepsis or septic shock, ICU admission, and in-hospital mortality. On a five-level Likert scale, the mean rating over 101 uses was 4.6, which indicates that clinicians found the AI prediction useful.

Discussion
The present study revealed that the random forest had the best AUC for predicting hospitalization, pneumonia, and sepsis or septic shock, XGBoost had the best AUC for predicting ICU admission, and logistic regression had the best AUC for predicting in-hospital mortality in older ED patients with influenza. The predictions are very fast, delivered in real time, and actionable, providing prognostic information to assist in decision making, including disposition and outcome explanation. Using AI prediction to assist decision making is an appealing idea [7]. Because of the increased availability of EMRs and advances in computer performance and algorithms, AI prediction based on medical big data has become a promising approach in healthcare [8]. In recent years, the rapid progress of cloud computing and the internet of things (IoT), through healthcare monitors and wearable sensor networks, has also greatly supported the development of real-time AI prediction [8]. Therefore, AI-based tools designed to improve diagnosis, care planning, and outcomes will be incorporated into healthcare services in the near future [9]. Many regulations on AI use in healthcare still need to be developed, including normative standards, evaluation guidelines, and monitoring and reporting systems [9]. The feature variables adopted in this study, including comorbidities and abnormal vital signs and laboratory data, are risk factors for poor outcomes. The more of these risk factors a patient has, the poorer the outcome predicted by the AI.
The random forest is superior to the traditional model (i.e., logistic regression) for developing a CDR to predict hospitalization, pneumonia, sepsis or septic shock, and ICU admission. One possible reason for the lower predictive accuracy of logistic regression is that it lacks external validation [5]. Traditional CDRs, including the GID score, are typically developed by gathering data at one or more hospitals and then using those data both to derive and to validate a model from a chosen set of predictors. The developed CDRs are then used in other hospitals that differ from the original study hospital [10]. A recent study reviewed 127 new prediction models and showed that external independent validation was uncommon [11]. Predictive performance in external validation tends to be worse than in the original study [11]. In contrast to the GID score, which was derived in other hospitals, we used local real-world big data from multiple centers to make predictions about the local population, which improves accuracy over the traditionally derived model. The variables used in the present study are structured data from the local EMR and are not subject to ambiguous clinical definitions or biases of data collection.
The random forest model is an ensemble learning method for classification and regression [12,13]. It combines many binary decision trees, each built from a bootstrapped learning sample, and chooses a subset of variables randomly at each node [12,13]. Each tree in the random forest votes on an input x, and the majority vote of the trees determines the output of the classifier [14]. The random forest can use a large number of trees in the ensemble to handle high-dimensional data [14]. The random forest is a common method for predicting outcomes and selecting predictors in the ED. A study on predicting in-hospital mortality in ED patients with sepsis revealed that the AUC of the random forest was 0.86, superior to the CART (classification and regression tree) model (0.69); logistic regression model (0.76); CURB-65 (Confusion, Urea, Respiratory rate, Blood pressure plus age ≥ 65 years old) (0.73); MEDS (mortality in emergency department sepsis) (0.71); and mREMS (modified rapid emergency medicine score) (0.72) [5]. Another study used the random forest to select the most relevant variables for major adverse cardiac events in ED patients with chest pain [12]. The authors found that predictor selection by the random forest is promising for discovering a few relevant and significant predictors [12].
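The bootstrap-and-vote mechanism described above can be shown in a deliberately simplified sketch. As an assumption made for brevity, each "tree" here is a depth-1 stump, so the random variable subset is drawn once per tree, which coincides with "at each node" only because a stump has a single node. The function names, default parameters, and toy data are hypothetical; a real random forest grows deeper trees.

```python
import random
from collections import Counter


def bootstrap(X, y, rng):
    """Draw len(X) rows with replacement (one bootstrapped sample)."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_stump(X, y, features):
    """Best single-threshold split among `features` (a depth-1 tree)."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in features:
        for thr in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    return best[1:]


def fit_forest(X, y, n_trees=25, n_sub=1, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample and on a
    random subset of n_sub features."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        Xb, yb = bootstrap(X, y, rng)
        feats = rng.sample(range(len(X[0])), n_sub)
        forest.append(fit_stump(Xb, yb, feats))
    return forest


def predict(forest, row):
    """Every tree votes; the majority label is the forest's output."""
    votes = Counter(l if row[f] <= thr else r for f, thr, l, r in forest)
    return votes.most_common(1)[0][0]


# Toy, well-separated data (not study data).
X_toy = [[0, 0], [1, 1], [2, 2], [10, 10], [11, 11], [12, 12]]
y_toy = [0, 0, 0, 1, 1, 1]
toy_forest = fit_forest(X_toy, y_toy)
```

Because each stump sees a different resample and a different variable, individual trees can disagree; the vote averages out their individual errors.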
The SMOTE adopted in the present study is the most common and effective oversampling method for adjusting imbalanced data [15]. SMOTE addresses the problems of both high class skew and high sparsity and works in the "feature space" rather than the "data space" [16]. By taking each minority class sample and its K-nearest neighbors, SMOTE creates synthetic samples that effectively force the decision region of the minority class to become more general [16]. Without duplicating the data, SMOTE increases the data space and amplifies the features of the minority class [16]. SMOTE preprocessing has also been accepted in other health care studies [17,18].
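The interpolation step at the heart of SMOTE can be sketched in a few lines. This is a simplified variant, not the study's implementation: the base sample for each synthetic point is chosen at random rather than by iterating over every minority sample, and the function name, parameters, and toy data are assumptions. It requires Python 3.8 or later for `math.dist`.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` within the minority class only.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy usage: four minority points on the unit square (not study data).
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_synthetic = smote(toy_minority, n_new=10, k=2)
```

Each synthetic point lies on the line segment between two existing minority samples, which is why SMOTE broadens the minority decision region instead of merely duplicating records.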
According to our literature review, the present study has the strength of being the first real-time prediction

Conclusions
We developed the first real-time prediction application in the HIS for predicting outcomes in older ED patients with influenza using a big data-driven and machine learning approach. This real-time prediction is a promising way to assist physicians' decision making and their explanations to patients and families. Further studies are needed to compare the predictive accuracy of this model with physicians' judgment, to evaluate the impact of the application, and to include as many variables as possible before reducing their number with proper variable selection algorithms.