For Doctors in a Hurry
- Researchers investigated whether a single-center surgical risk model could maintain predictive accuracy when applied to diverse multicenter clinical data.
- This retrospective cohort study analyzed 508,097 surgical encounters from 366,875 adult patients across 14 distinct healthcare institutions.
- The models achieved area under the receiver operating characteristic curve (AUROC) values of 0.95 for in-hospital mortality and 0.92 for postoperative acute kidney injury.
- The authors concluded that procedure codes and clinician-specific factors were the most influential variables for predicting postoperative complications.
- These findings suggest that automated risk models can provide generalizable preoperative assessments to help clinicians identify high-risk surgical patients.
Quantifying Perioperative Risk in Complex Surgical Populations
Postoperative complications remain a substantial driver of hospital morbidity and mortality, often occurring despite advancements in surgical technique and perioperative care [1]. While systemic interventions such as standardized safety checklists have successfully reduced global mortality rates, individual risk stratification remains a challenge for clinicians managing patients with multiple comorbidities [2]. The complexity of the perioperative period, including the inherent risks associated with anesthesia handovers and the management of acute physiological shifts, necessitates more precise tools for identifying patients at highest risk for adverse outcomes [3]. Accurate preoperative assessment is critical for the implementation of targeted interventions, such as prehabilitation or optimized fluid management, which can significantly improve functional recovery and reduce hospital length of stay [1, 4]. To address this need, a recent multicenter analysis evaluated an automated framework designed to enhance predictive capabilities by leveraging longitudinal electronic health record data across a diverse network of healthcare institutions.
Large-Scale Validation of the MySurgeryRisk Framework
The MySurgeryRisk framework was originally developed and prospectively validated on a single-center dataset, demonstrating high accuracy in predicting adverse outcomes within a controlled environment. To determine if these results could be replicated across diverse clinical settings, researchers conducted a retrospective, longitudinal, multicenter cohort analysis. The study tested the hypothesis that applying this framework to a large multicenter dataset would enhance generalizability without degrading the predictive performance achieved by the original model. This transition from a single institution to a broad network is a critical step in ensuring that risk stratification tools remain reliable when implemented across different patient populations and varying hospital protocols. The analysis included a substantial dataset of 508,097 encounters from 366,875 adult patients who underwent major inpatient operations. These procedures took place at 14 health care institutions within the OneFlorida+ network, a collaborative research consortium, between 2012 and 2023. By utilizing such a large and geographically diverse sample, the researchers aimed to account for the heterogeneity in surgical care and patient demographics that often limits the utility of single-center models. The formal data analysis focused on the framework's ability to maintain high predictive accuracy, ensuring the tool could correctly distinguish between patients who will and will not experience a complication in real-world practice.
Machine Learning Methodology and Patient Demographics
To develop the predictive models, the researchers utilized eXtreme Gradient Boosting (a machine learning algorithm that builds an ensemble of decision trees to improve prediction accuracy by iteratively correcting errors from previous trees). This computational approach allowed the framework to process complex interactions between variables without requiring manual data entry from clinicians. Instead, the model utilized routinely collected variables from electronic health records across the large multicenter cohort, ensuring that the risk stratification process could be integrated into existing hospital workflows without increasing the administrative burden on surgical teams. The study population was divided into two distinct temporal cohorts to ensure robust performance across different time periods. The models were trained on a development set consisting of 358,216 encounters occurring between 2012 and 2020. Following this training phase, the models were evaluated on a validation set consisting of 149,881 encounters from 2020 to 2023. This longitudinal design allowed the researchers to test the model's stability against shifting clinical practices and the unique challenges posed by the later years of the study period. The demographic profile of the 366,875 total patients reflects a broad adult surgical population, with a mean (SD) age of 59 (18) years. The cohort was nearly evenly split by sex, including 190,799 (52%) women and 176,076 (48%) men. By including such a large and diverse patient base, the study provides clinicians with evidence that the predictive framework remains reliable across different age groups and sexes, which is essential for a tool intended for use in high-volume, heterogeneous surgical environments.
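As a rough illustration of the "iteratively correcting errors from previous trees" idea, the sketch below runs a toy gradient-boosting loop in plain Python. It is not the study's code and not XGBoost itself: it assumes a single hypothetical feature (age), squared-error loss, and one-split trees, and it leaves out the regularization and second-order updates that XGBoost uses. All data values and function names are invented for illustration.

```python
# Toy gradient boosting with one-split trees ("stumps") on one feature.
# Assumption: squared-error loss and made-up data; this is NOT XGBoost
# and NOT the study's model, only the residual-fitting idea behind it.

def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # the largest value cannot split the data
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]

def predict(model, x):
    """Sum the baseline and the scaled contribution of every stump."""
    base, rate, stumps = model
    total = base
    for t, lm, rm in stumps:
        total += rate * (lm if x <= t else rm)
    return total

def boost(xs, ys, n_rounds, rate=0.5):
    """Each round fits a stump to the errors left by the previous rounds."""
    base = sum(ys) / len(ys)  # start from the overall event rate
    stumps = []
    for _ in range(n_rounds):
        current = (base, rate, stumps)
        residuals = [y - predict(current, x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return (base, rate, stumps)

def mse(model, xs, ys):
    """Mean squared error of the model on the given data."""
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Hypothetical toy data: age in years and whether a complication occurred.
ages = [30, 40, 50, 60, 70, 80]
complication = [0, 0, 1, 0, 1, 1]

for rounds in (0, 1, 5, 20):
    model = boost(ages, complication, rounds)
    print(rounds, "rounds -> training MSE", round(mse(model, ages, complication), 4))
```

The training error falls as rounds are added because each new stump is fit to what the earlier ones got wrong. In practice that same property makes overfitting a concern, which is one reason a held-out validation cohort matters.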
High Predictive Accuracy for Critical Postoperative Outcomes
Postoperative complications remain a significant clinical challenge, affecting up to 15% of patients who undergo inpatient surgery. To address this, the researchers developed models to predict the postoperative risk of four critical outcomes: intensive care unit (ICU) admission, postoperative mechanical ventilation, postoperative acute kidney injury (AKI), and in-hospital mortality. Within the study population, the prevalence of these complications varied, with ICU admission occurring in 8% of encounters (n = 42,302) and postoperative mechanical ventilation in 4% (n = 20,435). Acute kidney injury was observed in 7% of encounters (n = 36,027), while in-hospital mortality was the least frequent but most severe outcome at 1% (n = 5,131). The researchers primarily evaluated the predictive performance of these models using the area under the receiver operating characteristic curve (AUROC), a statistical metric where a value of 1.0 represents perfect discrimination and 0.5 represents performance no better than chance. The models demonstrated high discrimination across all four endpoints. Specifically, the AUROC for ICU admission was 0.93 (95% CI, 0.93-0.93), and the AUROC for postoperative mechanical ventilation was 0.94 (95% CI, 0.94-0.94). For the prediction of AKI, the model achieved an AUROC of 0.92 (95% CI, 0.92-0.92), while the AUROC for in-hospital mortality was 0.95 (95% CI, 0.94-0.95). These results indicate that the models' predictive performance was comparable with that of the previously validated single-center MySurgeryRisk models. By maintaining such high levels of discrimination across 14 different health care institutions, the findings suggest that the framework is robust enough for broad clinical application. For the practicing physician, this level of discrimination offers a reliable method for identifying high-risk patients before complications occur, potentially allowing for earlier intervention and more tailored postoperative monitoring.
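To make the AUROC and the reported prevalences more concrete, here is a minimal Python sketch. It computes AUROC in its pairwise form, the probability that a randomly chosen patient with the outcome receives a higher score than one without it, and it recomputes the prevalence percentages from the counts above. The example labels and scores are made up, and the function names are hypothetical. Only the outcome counts and the 508,097-encounter total come from the article, and using encounters as the denominator is an assumption that happens to reproduce the reported percentages.

```python
# Pairwise (Mann-Whitney) view of AUROC plus a prevalence check.
# Assumption: labels and scores below are invented for illustration.

def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Counts reported in the article; the denominator choice is an assumption.
ENCOUNTERS = 508_097
OUTCOME_COUNTS = {
    "ICU admission": 42_302,
    "mechanical ventilation": 20_435,
    "acute kidney injury": 36_027,
    "in-hospital mortality": 5_131,
}

def prevalence_percent(count, total=ENCOUNTERS):
    """Outcome count expressed as a percentage of all encounters."""
    return 100 * count / total

# Made-up risk scores for four hypothetical patients.
labels = [0, 0, 1, 1]
scores = [0.1, 0.8, 0.2, 0.9]
print("toy AUROC:", auroc(labels, scores))

for name, count in OUTCOME_COUNTS.items():
    print(name, round(prevalence_percent(count), 1), "%")
```

The double loop is quadratic in the number of cases, which is fine for a toy example. On a cohort of this size, a rank-based implementation from an established statistics library would be the practical choice.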
Clinical Drivers of Surgical Risk
The predictive utility of the MySurgeryRisk framework stems from its ability to weigh complex variables within the electronic health record to identify the most significant contributors to patient risk. In this multicenter analysis, the researchers found that the primary procedure code and clinician-specific factors were consistently the most influential variables in the risk predictions for all four major outcomes. By identifying the specific surgical procedure as a top-tier predictor, the model reinforces the clinical reality that the inherent complexity and physiological stress of certain operations remain primary determinants of postoperative stability. Furthermore, the influence of clinician-specific factors (data points related to the individual providers involved in a patient's care) suggests that provider-level variation, for example in practice patterns or surgical volume, may be an important component of the risk equation. For the practicing clinician, these findings provide a roadmap for prioritizing resources and optimizing care delivery in a high-volume surgical environment. Because the model identifies which procedures and provider-related factors carry the highest weight, hospitals can more effectively flag high-risk cases for preoperative optimization or enhanced postoperative surveillance. Having a validated, automated tool for real-time risk stratification allows for an objective assessment that complements individual clinical intuition, which may be limited by cognitive bias or incomplete data. By integrating these influential drivers into a seamless digital workflow, clinicians can focus their attention on the patients most likely to experience ICU admission or AKI, with the ultimate aim of reducing a complication rate that reaches up to 15% in inpatient surgery.
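The article does not say how variable influence was measured. One common, model-agnostic way to rank variables is permutation importance: shuffle one variable at a time and see how much performance drops. The Python sketch below illustrates that idea only, as a stand-in and not as the authors' method. The toy model, the two columns (a hypothetical procedure-risk score and an unrelated noise value), and all data are invented.

```python
import random

# Permutation importance on a toy rule-based model.
# Assumption: this is a swapped-in illustration; the article does not
# state which importance method the study used.

def accuracy(model, rows, labels):
    """Share of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(model, rows, labels, column, repeats=20, seed=0):
    """Average drop in accuracy after shuffling one column across rows."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:column] + (v,) + r[column + 1:]
                    for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / repeats

def toy_model(row):
    """Hypothetical rule: flag high risk when the procedure-risk score > 0.5."""
    return 1 if row[0] > 0.5 else 0

# Each row: (hypothetical procedure-risk score, unrelated noise value).
rows = [(0.1, 5), (0.2, 1), (0.3, 4), (0.4, 2),
        (0.6, 3), (0.7, 5), (0.8, 1), (0.9, 2)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

for name, column in (("procedure-risk score", 0), ("noise", 1)):
    print(name, permutation_importance(toy_model, rows, labels, column))
```

Shuffling the column the model relies on degrades accuracy, while shuffling the ignored column leaves it unchanged. This is the sense in which a variable can be called influential for a model's predictions, and it does not by itself imply a causal effect on outcomes.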
References
1. Ambulkar R, Kunte AR, Solanki S, Thakkar V, Deshmukh B, Rana P. Impact of Prehabilitation in Major Gastrointestinal Oncological Surgery: a Systematic Review. Journal of Gastrointestinal Cancer. 2025. doi:10.1007/s12029-025-01196-x
2. Haynes AB, Weiser TG, Berry WR, et al. A Surgical Safety Checklist to Reduce Morbidity and Mortality in a Global Population. New England Journal of Medicine. 2009. doi:10.1056/nejmsa0810119
3. Meersch M, Weiss R, Küllmar M, et al. Effect of Intraoperative Handovers of Anesthesia Care on Mortality, Readmission, or Postoperative Complications Among Adults: The HandiCAP Randomized Clinical Trial. JAMA. 2022. doi:10.1001/jama.2022.9451
4. Galiè N, Humbert M, Vachiéry J, et al. 2015 ESC/ERS Guidelines for the diagnosis and treatment of pulmonary hypertension. European Respiratory Journal. 2015. doi:10.1183/13993003.01032-2015