TY - JOUR AU - Cui, Junqi AU - Li, Weijia AU - Lim, Ngai Enoch Chi AU - Wu, Xiaoqin AU - Lim, Danforn Chi Eung PY - 2026/2/25 TI - Stratified Causal Inference for Intensive Care Unit Risk Prediction: Informatics-Based Modeling of Anesthetic Drug Combinations JO - JMIR Form Res SP - e80294 VL - 10 KW - fentanyl KW - propofol KW - intensive care KW - dose-response analysis KW - counterfactual modeling N2 - Background: Postoperative intensive care unit (ICU) admission affects 15% to 20% of surgical patients and represents a major source of morbidity and health care costs. Current anesthetic dosing relies on empirical guidelines rather than individualized risk assessment. We developed a counterfactual dose-response model to identify optimal fentanyl-propofol combinations. Objective: This study aimed to develop and evaluate a stratified, causal machine learning framework using electronic health record data to identify optimal fentanyl-propofol dose combinations and predict postoperative ICU admission risk, enabling precision anesthesia and individualized clinical decision support. Methods: We analyzed perioperative electronic health records of 67,134 surgical procedures from UC Irvine Medical Center (2017-2022). A hierarchical learning framework was used to estimate causal effects while controlling for confounding variables. A total of 6 dose-sensitive subgroups were identified through stratified analysis. The primary end point was postoperative ICU admission. Results: High-risk combinations (fentanyl >5 mcg/kg with propofol <1 mg/kg) increased the absolute risk of ICU admission by 36% (absolute risk increase; 95% CI 0.351-0.509; P<.001). A total of 6 patient subgroups demonstrated distinct dose-response patterns, with populations considered vulnerable (high glucose, elevated creatinine) showing elevated risk even at standard doses. The optimal dose range for decision-making was determined to be 1.25 to 4.25 mg/kg for propofol and 3.5 to 4.0 mcg/kg for fentanyl.
Conclusions: Fentanyl-propofol combinations exhibit complex, nonlinear dose-response relationships with ICU admission risk. High-dose combinations markedly increase risk through synergistic effects, while specific patient subgroups require enhanced monitoring even at standard doses. These findings support the development of individualized dosing algorithms and risk assessment tools that could inform future decision support tools aimed at reducing postoperative ICU use, although their predictive performance and clinical impact would require external validation. UR - https://formative.jmir.org/2026/1/e80294 UR - http://dx.doi.org/10.2196/80294 ID - info:doi/10.2196/80294 ER - TY - JOUR AU - Causio, Andrea Francesco AU - De Vita, Vittorio AU - Nappi, Andrea AU - Sawaya, Melissa AU - Rocco, Bernardo AU - Foschi, Nazario AU - Maioriello, Giuseppe AU - Russo, Pierluigi PY - 2026/2/19 TI - Survival Prediction in Patients With Bladder Cancer Undergoing Radical Cystectomy Using a Machine Learning Algorithm: Retrospective Single-Center Study JO - JMIR Perioper Med SP - e86666 VL - 9 KW - cystectomy KW - disease-free survival KW - artificial intelligence KW - neoplasm staging KW - retrospective studies KW - urinary bladder neoplasms KW - clinical decision-making KW - machine learning KW - statistical models N2 - Background: Traditional statistical models often fail to capture the complex dynamics influencing survival outcomes in patients with bladder cancer after radical cystectomy, a procedure where approximately 50% of patients develop metastases within 2 years. The integration of artificial intelligence (AI) offers a promising avenue for enhancing prognostic accuracy and personalizing treatment strategies. 
Objective: This study aimed to develop and evaluate a machine learning algorithm for predicting disease-free survival (DFS), overall survival (OS), and the cause of death in patients with bladder cancer undergoing cystectomy, using a comprehensive dataset of clinical and pathological variables. Methods: Retrospective data of 370 patients with bladder cancer who underwent radical cystectomy at Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy, were collected. The dataset comprised 20 input variables, encompassing demographics, tumor characteristics, treatment variables, and inflammatory markers. For specific analyses and models, we used patient subcohorts. The CatBoost algorithm was used for regression tasks (DFS in 346 patients, OS in 347 patients) and a binary classification task (tumor-related death in 312 patients). Model performance was assessed using mean absolute error (MAE) for regression and F1-score for classification, prioritizing a minimum recall of 75% for tumor-related deaths. Five-fold cross-validation and Shapley additive explanations (SHAP) values were used to ensure robustness and interpretability. Results: For DFS prediction, the CatBoost model achieved an MAE of 18.68 months, with clinical tumor stage and pathological tumor classification identified as the most influential predictors. OS prediction yielded an MAE of 17.2 months, which improved to 14.6 months after feature filtering, where tumor classification and the systemic immune-inflammation index (SII) were most impactful. For tumor-related death classification, the model achieved a recall of 78.6% and an F1-score of 0.44 for the positive class (tumor-related deaths), correctly identifying 11 of 14 cases. Bladder tumor position was the most influential feature for cause-of-death prediction. Conclusions: The developed machine learning algorithm demonstrates promising accuracy in predicting survival and the cause of death in patients with bladder cancer after cystectomy. 
The key predictors include clinical and pathological tumor staging, systemic inflammation (SII), and bladder tumor position. These findings highlight the potential of AI in providing clinicians with an objective, data-driven tool to improve personalized prognostic assessment and guide clinical decision-making. UR - https://periop.jmir.org/2026/1/e86666 UR - http://dx.doi.org/10.2196/86666 ID - info:doi/10.2196/86666 ER - TY - JOUR AU - Mevik, Kjersti AU - Woldaregay, Zebene Ashenafi AU - Jonsson, Lindell Eva AU - Tejedor, Miguel AU - Temple-Oberle, Claire PY - 2026/2/17 TI - Application of AI Models for Preventing Surgical Complications: Scoping Review of Clinical Readiness and Barriers to Implementation JO - JMIR AI SP - e75064 VL - 5 KW - surgical complications prediction models KW - machine learning KW - artificial intelligence KW - AI KW - surgical complications KW - predictive modeling KW - risk prediction KW - surgery outcomes KW - perioperative care KW - clinical decision support N2 - Background: The impact of surgical complications is substantial and multifaceted, affecting patients and their families, surgeons, and health care systems. Despite the remarkable progress in artificial intelligence (AI), there remains a notable gap in the prospective implementation of AI models in surgery that use real-time data to support decision-making and enable proactive intervention to reduce the risk of surgical complications. Objective: This scoping review aims to assess and analyze the adoption and use of AI models for preventing surgical complications. Furthermore, this review aims to identify barriers and facilitators for implementation at the bedside. Methods: Following PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines, we conducted a literature search using IEEE Xplore, Scopus, Web of Science, MEDLINE, ProQuest, PubMed, ABI, Embase, Epistemonikos, CINAHL, and Cochrane registries. 
Inclusion criteria were empirical, peer-reviewed studies published in English between January 2013 and January 2025, involving AI models for preventing surgical complications (surgical site infections, heart and lung complications, or stroke) in real-world settings. Exclusions included retrospective algorithm-only validations, nonempirical research (eg, editorials or protocols), and non-English studies. Study characteristics and AI model development details were extracted, along with performance statistics (eg, sensitivity and area under the receiver operating characteristic curve). We then used thematic analysis to synthesize findings related to AI models, prediction outputs, and validation methods. Studies were grouped into three main themes: (1) duration of hypotension, (2) risk for complications, and (3) decision support tool. Results: Of the 275 identified records, 19 were included. The included models frequently demonstrated strong technical accuracy with high sensitivity and area under the receiver operating characteristic curve, particularly among studies evaluating decision support tools. However, only a few models were adopted routinely in clinical practice. Two studies evaluated clinicians' perceptions regarding the use of AI models, reporting predominantly positive assessments of their usefulness. Conclusions: Overall, AI models hold potential to predict and prevent surgical complications, as the validation studies demonstrated high accuracy. However, implementation in routine practice remains limited by usability barriers, workflow misalignment, trust concerns, and financial and ethical constraints. The evidence included in this scoping review was limited by the heterogeneity in study design and the predominance of small-scale feasibility studies, particularly for hypotension prediction. Future research should prioritize prospectively validated models that use other physiologic features and address clinicians'
concerns regarding generalizability and adoption. UR - https://ai.jmir.org/2026/1/e75064 UR - http://dx.doi.org/10.2196/75064 ID - info:doi/10.2196/75064 ER - TY - JOUR AU - Ma, Junwei AU - Tang, Huifeng AU - Zhang, Yunshan AU - Yi, Xuemei AU - Zhong, Tangsheng AU - Li, Xinyun AU - Wang, Gang PY - 2026/2/12 TI - Machine Learning for Predicting Venous Thromboembolism After Joint Arthroplasty: Systematic Review of Clinical Applicability and Model Performance JO - JMIR Med Inform SP - e79886 VL - 14 KW - joint arthroplasty KW - venous thromboembolism KW - machine learning KW - meta-analysis KW - systematic review N2 - Background: There is increasing research on machine learning in predicting venous thromboembolism after joint arthroplasty, but the quality and clinical applicability of these models remain uncertain. Objective: This systematic review aims to evaluate the predictive performance and methodological quality of machine learning models for venous thromboembolism risk after joint replacement surgery. Methods: Web of Science, Embase, Scopus, CNKI, Wanfang, VIP, and PubMed were searched until December 15, 2024. The risk of bias and applicability were evaluated using the PROBAST (Prediction Model Risk of Bias Assessment Tool) checklist. A qualitative comprehensive analysis was conducted to extract and describe the data related to the models' characteristics and performance. Results: This review encompassed 34 prediction models from 9 studies. The most frequently used machine learning models were extreme gradient boosting and logistic regression. The results showed that all studies had significant heterogeneity and high risk of bias. Although some models reported near-perfect area under the curve values (>0.9), they lacked external validation and may have overfitted. The models tested on large external datasets demonstrated more conservative performance. Conclusions: The predictive performance of machine learning models varied greatly.
Although the reported area under the curve values indicated that some models have good discriminative ability, this performance varied greatly and was inconsistent among the included studies. These models have a high risk of bias, and it is necessary to take this into account when they are used in clinical practice. Future studies should adopt a prospective study design, ensure appropriate data handling, and use external validation to improve model robustness and applicability. Trial Registration: PROSPERO CRD42024625842; https://www.crd.york.ac.uk/PROSPERO/view/CRD42024625842 UR - https://medinform.jmir.org/2026/1/e79886 UR - http://dx.doi.org/10.2196/79886 ID - info:doi/10.2196/79886 ER - TY - JOUR AU - Dosis, Alexios AU - Syversen, Berger Aron AU - Kowal, R. Mikolaj AU - Grant, Daniel AU - Tiernan, Jim AU - Wong, David AU - Jayne, G. David PY - 2026/1/27 TI - Exploiting Unsupervised Free-Living Data for Cardiorespiratory Fitness Estimation: Systematic Review and Meta-Analysis JO - JMIR Mhealth Uhealth SP - e69996 VL - 14 KW - wearables KW - cardiorespiratory fitness KW - free-living data KW - machine learning KW - perioperative medicine N2 - Background: Current methods of cardiorespiratory fitness (CRF) assessment may discriminate against frail individuals who are challenged to perform a maximal cardiopulmonary exercise test. CRF estimations from free-living wearable data, captured over extended time periods, may offer a more representative assessment and increase usability in clinical settings. Objective: This study aimed to review current evidence behind this novel concept and evaluate the performance and quality of models developed to estimate CRF from free-living, unsupervised data. 
Methods: Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, we systematically searched 4 databases (MEDLINE, Embase, Scopus, and arXiv) for studies reporting the development of models to estimate CRF from continuous free-living wearable data. Studies conducted entirely under controlled laboratory conditions were excluded. Performance metrics were combined in a meta-correlation analysis using a random-effects model and Fisher Z transformation. Results: Of 1848 papers screened, 18 met the eligibility criteria, with a total of 31,072 participants. The weighted mean age was 46.9 (SD 1.46) years. Multiple computational techniques were used, with 8 studies employing more advanced machine learning models. The meta-correlation analysis revealed a pooled overall estimate of 0.83 (95% CI 0.77-0.88). The I² statistic indicated high heterogeneity at 97%. Risk of bias assessment found most concerns in the data analysis domain, with studies often lacking clarity around the data handling process. Conclusions: A promising preliminary agreement between CRF predictions and measured values was noted. However, no definite conclusions can be drawn for clinical implementation due to high heterogeneity among the included studies and lack of external validation. Nonetheless, continuous data streams appear to be a valuable resource that could lead to a step change in how we measure and monitor CRF.
Trial Registration: PROSPERO CRD42024593878; https://www.crd.york.ac.uk/PROSPERO/view/CRD42024593878 UR - https://mhealth.jmir.org/2026/1/e69996 UR - http://dx.doi.org/10.2196/69996 ID - info:doi/10.2196/69996 ER - TY - JOUR AU - Gaessler, Jan AU - Remschmidt, Bernhard AU - Jopp, Ann-Kathrin AU - Arefnia, Behrouz AU - Franke, Adrian AU - Rieder, Marcus PY - 2026/1/5 TI - Quality of Conventional versus Artificial Intelligence Oral Surgery Consent Forms: Comparative Analysis JO - J Med Internet Res SP - e59851 VL - 28 KW - oral surgical procedures KW - informed consent KW - quality control KW - artificial intelligence KW - oral surgery KW - consent form KW - AI KW - dental health KW - oral surgeon KW - patient care KW - patient autonomy KW - dentistry UR - https://www.jmir.org/2026/1/e59851 UR - http://dx.doi.org/10.2196/59851 ID - info:doi/10.2196/59851 ER - TY - JOUR AU - Escobar-Castillejos, David AU - Barrera-Animas, Y. Ari AU - Noguez, Julieta AU - Magana, J. Alejandra AU - Benes, Bedrich PY - 2025/11/18 TI - Transforming Surgical Training With AI Techniques for Training, Assessment, and Evaluation: Scoping Review JO - J Med Internet Res SP - e58966 VL - 27 KW - artificial intelligence KW - technology-enhanced learning KW - simulation-based training KW - performance assessment KW - medical training KW - surgery KW - higher education KW - educational innovation N2 - Background: Artificial intelligence (AI) has introduced novel opportunities for assessment and evaluation in surgical training, offering potential improvements that could surpass traditional educational methods. Objective: This scoping review examines the integration of AI in surgical training, assessment, and evaluation, aiming to determine how AI technologies can enhance trainees' learning paths and performance by incorporating data-driven insights and predictive analytics.
In addition, this review examines the current state and applications of AI algorithms in this field, identifying potential areas for future research. Methods: Following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines, PubMed, Scopus, and Web of Science were searched for studies published between January 2020 and March 18, 2024. Eligibility criteria included English-language full-text articles that investigated the application of AI in surgical training, assessment, or evaluation; non-English texts, reviews, preprints, and studies not addressing AI in surgical education were excluded. After duplicate removal and screening, 56 studies were included in the analysis. Data were structured by categorizing studies according to surgical procedure, AI technique, and training setup. Results were synthesized narratively and summarized in frequency tables. Results: From 1400 initial records, 56 studies met the inclusion criteria. Most were journal articles (84%, 47/56), with the remainder being conference papers (16%, 9/56). AI was most frequently applied in minimally invasive surgery (27%, 15/56), neurosurgery (20%, 11/56), and laparoscopy (16%, 9/56). Common techniques included machine learning (20%, 11/56), clustering (14%, 8/56), deep learning (11%, 6/56), convolutional neural networks (11%, 6/56), and support vector machines (11%, 6/56). Training setups were dominated by simulation platforms (33%, 19/56) and box trainers (24%, 13/56), followed by surgical video analysis (16%, 9/56) and robotic systems such as the da Vinci platform (13%, 7/56). Across studies, AI-enhanced training environments provided automated skill assessment, personalized feedback, and adaptive learning trajectories, with several reporting improvements in trainees' learning curves and technical proficiency.
However, heterogeneity in study design and outcome measures limited comparability, and algorithmic transparency was often lacking. Conclusions: The application of AI in surgical training demonstrates the potential to enhance skill acquisition and support more efficient, personalized, and adaptive learning pathways. Despite encouraging findings, several limitations exist, including small sample sizes, the lack of standardized evaluation metrics, and insufficient external validation of AI models. Future studies should aim to clarify AI methodologies, improve reproducibility, and develop scalable, simulation-based solutions aligned with global education goals. UR - https://www.jmir.org/2025/1/e58966 UR - http://dx.doi.org/10.2196/58966 UR - http://www.ncbi.nlm.nih.gov/pubmed/41252719 ID - info:doi/10.2196/58966 ER - TY - JOUR AU - Han, Changho AU - Soh, Sarah AU - Park, Je-Wook AU - Pak, Hui-Nam AU - Yoon, Dukyong PY - 2025/11/10 TI - Artificial Intelligence-Based Electrocardiogram Model as a Predictor of Postoperative Atrial Fibrillation Following Cardiac Surgery: Retrospective Cohort Study JO - J Med Internet Res SP - e77164 VL - 27 KW - postoperative atrial fibrillation KW - cardiac surgery KW - electrocardiogram KW - artificial intelligence KW - deep learning N2 - Background: Postoperative atrial fibrillation (AF) after cardiac surgery is common and is associated with substantial clinical and economic repercussions. However, existing strategies for preventing postoperative AF remain suboptimal, limiting proactive management. Advances in artificial intelligence (AI) may improve the prediction of postoperative AF. Studies have shown that deep learning applied to electrocardiograms (ECGs) can detect subtle patterns in non-AF ECGs associated with a history of (or impending) AF (referred to as the AI-ECG-AF model). As a noninvasive test routinely performed throughout the perioperative period, the ECG presents a unique opportunity for additional risk stratification.
Objective: We aimed to determine whether the AI-ECG-AF model can serve as an independent risk factor for postoperative AF after cardiac surgery, compare its predictive performance with existing postoperative AF prediction tools, and assess its additive value. Methods: This single-center retrospective cohort study included 2266 patients (5402 standard 12-lead ECGs) who underwent cardiac surgery at a tertiary hospital in South Korea between December 2018 and December 2023. The AI-ECG-AF model was trained on 4.05 million non-AF standard 12-lead ECGs (1.13 million patients) using a 1D EfficientNet-B0 architecture and achieved an area under the receiver operating characteristic curve (AUROC) of 0.901 (95% CI 0.900-0.902) in its held-out test set. Postoperative AF was defined as AF documented by ECG within 30 days after surgery. Using multivariable logistic regression, we assessed the association between the AI-ECG-AF model score and postoperative AF, adjusting for conventional clinical variables. We also investigated the additive or synergistic predictive value of the AI-ECG-AF model score when combined with an existing postoperative AF tool (the postoperative atrial fibrillation score) or other risk factors, based on the AUROC. Results: After adjusting for other clinical variables, a 10% absolute increase in the AI-ECG-AF model score was associated with a 1.197- to 1.209-fold increase in the odds of developing postoperative AF. The AI-ECG-AF model score significantly enhanced postoperative AF prediction: the AUROC of the existing postoperative atrial fibrillation score was 0.643; adding the AI-ECG-AF model score increased it to 0.680 (P<.001), and combining the AI-ECG-AF model score with other risk factors raised it to 0.710 (P<.001).
Conclusions: The AI-ECG-AF model serves as a novel, robust, and independent risk factor for postoperative AF following cardiac surgery and provides additive or synergistic predictive value when integrated with existing postoperative AF prediction tools or other risk factors. By capturing atrial electrophysiological vulnerability not reflected in conventional clinical scores, the AI-ECG-AF model may function as a noninvasive biomarker for preoperative risk stratification for postoperative AF prediction in cardiac surgery patients, potentially enabling targeted prophylaxis and closer monitoring during the perioperative period. UR - https://www.jmir.org/2025/1/e77164 UR - http://dx.doi.org/10.2196/77164 ID - info:doi/10.2196/77164 ER - TY - JOUR AU - Wang, Runchen AU - Zheng, Jianqi AU - Guo, Wenwei AU - Huang, Haiqi AU - Wang, Qixia AU - Li, Yihong AU - Lin, Manwan AU - Huang, Linchong AU - Zhang, Qing AU - Chen, Kaishen AU - Ye, Zhiming AU - Deng, Hongsheng AU - Jiang, Yu AU - Lin, Yuechun AU - Feng, Yi AU - Huang, Ying AU - Chen, Ying AU - He, Jianxing AU - Liang, Hengrui PY - 2025/9/16 TI - Integrating a Multimodal Digital Device for Continuous Perioperative Monitoring in Patients With Lung Cancer Undergoing Thoracic Surgery: Development and Usability Study JO - JMIR Mhealth Uhealth SP - e69512 VL - 13 KW - lung cancer KW - digital device KW - wearable device KW - patient-reported outcomes KW - multi-modal KW - artificial intelligence KW - AI N2 - Background: Minimally invasive thoracic surgery has improved lung cancer outcomes but requires enhanced postoperative care. Traditionally, the episodic care model has limited timely and multidimensional monitoring of patients. Recent technological advances in multimodal digital devices, including wearable devices and electronic patient-reported outcomes (ePROs), offer a promising solution to these challenges. However, current studies focus on only a few parameters and have seen limited application in thoracic surgery.
Objective: This study aims to propose a self-controlled study to evaluate the feasibility and reliability of multimodal digital devices, including wearables and ePROs, for continuous perioperative monitoring to enhance recovery after thoracic surgery. Methods: We included 288 patients with non-small cell lung cancer from the Guangzhou Medical University cohort, which includes 2757 participants with various lung diseases. Digital data were collected during hospitalization using a commercial smartwatch combined with an ePROs questionnaire, while clinical data were obtained from electronic health records (EHRs). Agreement between the digital device and EHR was evaluated via Bland-Altman analysis. Time-series data were normalized for continuous outlier monitoring, and threshold analysis of ePROs scores was used to explore associations across different modules. Results: Throughout hospitalization, digital devices provided a subjective overview of the patients' recovery trajectories. Results of Bland-Altman analysis demonstrated a high level of agreement between the digital device and the EHR. For body temperature, the analysis revealed a minimal bias of 0.02 °C (95% CI -0.01 °C to 0.05 °C), the agreement for heart rate showed a bias of 0.26 beats per minute (bpm; 95% CI -0.49 bpm to 1.01 bpm), and the bias for oxygen saturation was -0.06% (95% CI -0.27% to 0.15%), indicating close alignment between the 2 measurement methods. Meanwhile, wearable devices demonstrated significant potential in outlier detection compared to the episodic care model, offering accurate and sensitive monitoring of outliers between traditional measurement intervals. Using a thresholding method, we found that wearable metrics were correlated with the severity of ePROs.
Conclusions: These findings highlight the reliability and clinical potential of digital device-based multimodal systems within the enhanced recovery after surgery framework, offering a novel approach for continuous perioperative monitoring. UR - https://mhealth.jmir.org/2025/1/e69512 UR - http://dx.doi.org/10.2196/69512 ID - info:doi/10.2196/69512 ER - TY - JOUR AU - Chandrasekar, Subramaniam Rajagopal AU - Kane, Michael AU - Krishnamurti, Lakshmanan PY - 2025/9/15 TI - Machine-Learning Predictive Tool for the Individualized Prediction of Outcomes of Hematopoietic Cell Transplantation for Sickle Cell Disease: Registry-Based Study JO - JMIR AI SP - e64519 VL - 4 KW - sickle cell disease KW - SCD KW - prediction algorithms KW - hematopoietic stem cell transplantation KW - machine learning KW - ML KW - predictive tool KW - prediction KW - hematopoietic cell transplantation KW - HCT KW - hematopoietic cell KW - registry-based study KW - clinical decision-making KW - prediction model KW - clinical outcomes KW - gene therapy KW - shared decision-making N2 - Background: Disease-modifying therapies ameliorate disease severity of sickle cell disease (SCD), but hematopoietic cell transplantation (HCT) and, more recently, autologous gene therapy are the only treatments that have curative potential for SCD. While registry-based studies provide population-level estimates, they do not address the uncertainty regarding individual outcomes of HCT. Computational machine learning (ML) has the potential to identify generalizable predictive patterns and quantify uncertainty in estimates, thereby improving clinical decision-making. There is no existing ML model for SCD, and ML models for HCT for other diseases focus on single outcomes rather than all relevant outcomes.
Objective: This study aims to address the existing knowledge gap by developing and validating an individualized ML prediction model, SPRIGHT (Sickle Cell Predicting Outcomes of Hematopoietic Cell Transplantation), incorporating multiple relevant pre-HCT features to make predictions of key post-HCT clinical outcomes. Methods: We applied a supervised random forest ML model to clinical parameters in a deidentified Center for International Blood and Marrow Transplant Research (CIBMTR) dataset of 1641 patients who underwent HCT between 1991 and 2021 and were followed for a median of 42.5 (IQR 52.5; range 0.3-312.9) months. We applied forward and reverse feature selection methods to optimize a set of predictive variables. To counter the imbalance bias toward predicting positive outcomes due to the small number of negative outcomes, we constructed a training dataset, taking each outcome as the variable of interest, and performed 2-times repeated 10-fold cross-validation. SPRIGHT is a web-based individualized prediction tool accessible by smartphone, tablet, or personal computer. It incorporates predictive variables of age, age group, Karnofsky or Lansky score, comorbidity index, recipient cytomegalovirus seropositivity, history of acute chest syndrome, need for exchange transfusion, occurrence and frequency of vaso-occlusive crisis (VOC) before HCT, and either a published or custom chemotherapy or radiation conditioning, serotherapy, and graft-versus-host disease prophylaxis. SPRIGHT makes individualized predictions of overall survival (OS), event-free survival, graft failure, acute graft-versus-host disease (AGVHD), chronic graft-versus-host disease (CGVHD), and occurrence of VOC or stroke post-HCT. Results: The model's ability to distinguish between positive and negative classes, that is, discrimination, was evaluated using the area under the curve, accuracy, and balanced accuracy.
Discrimination met or exceeded published predictive benchmarks, with areas under the curve for OS (0.7925), event-free survival (0.7900), graft failure (0.8024), acute graft-versus-host disease (0.6793), chronic graft-versus-host disease (0.7320), and VOC post-HCT (0.8779). SPRIGHT revealed good calibration with a slope of 0.87-0.96, with small intercepts (-0.01 to 0.03), for 4 out of the 5 outcomes. However, OS exhibited nonideal calibration, which may be reflective of the overall high OS in all subgroups. Conclusions: A web-based ML prediction tool incorporating multiple clinically relevant variables predicts key clinical outcomes with a high level of discrimination and calibration and has potential in shared decision-making. UR - https://ai.jmir.org/2025/1/e64519 UR - http://dx.doi.org/10.2196/64519 ID - info:doi/10.2196/64519 ER - TY - JOUR AU - Lex, R. Johnathan AU - Abbas, Aazad AU - Mosseri, Jacob AU - Singh Toor, Jay AU - Simone, Michael AU - Ravi, Bheeshma AU - Whyne, Cari AU - Khalil, B. Elias PY - 2025/9/10 TI - Using Machine Learning to Predict-Then-Optimize Elective Orthopedic Surgery Scheduling to Improve Operating Room Utilization: Retrospective Study JO - JMIR Med Inform SP - e70857 VL - 13 KW - machine learning KW - orthopedic surgery KW - optimization KW - elective surgery KW - scheduling KW - hip and knee arthroplasty N2 - Background: Total knee and hip arthroplasty (TKA and THA) are among the most performed elective procedures. Rising demand and the resource-intensive nature of these procedures have contributed to longer wait times despite significant health care investment. Current scheduling methods often rely on average surgical durations, overlooking patient-specific variability. Objective: To determine the potential for improving elective surgery scheduling for TKA and THA by using a 2-stage approach that incorporates machine learning (ML) prediction of the duration of surgery (DOS) with scheduling optimization.
Methods: In total, 2 ML models (one each for TKA and THA) were trained to predict DOS using patient factors based on 302,490 and 196,942 patients, respectively, from a large international database. In total, 3 optimization formulations based on varying surgeon flexibility were compared: Any (surgeons could operate in any operating room at any time), Split (limitation of 2 surgeons per operating room per day), and multiple subset sum problem (MSSP; limit of 1 surgeon per operating room per day). Two years of daily scheduling simulations were performed for each optimization problem using ML prediction or mean DOS over a range of schedule parameters. Constraints and resources were based on a high-volume arthroplasty hospital in Canada. Results: The TKA and THA prediction models achieved test accuracy (with a 30 min buffer) of 78.1% (mean squared error 0.898) and 75.4% (mean squared error 0.916), respectively. The Any scheduling formulation performed significantly worse than the Split and MSSP formulations with respect to overtime and underutilization (P<.001). The latter 2 formulations performed similarly (P>.05) over most schedule parameters. The ML prediction schedules outperformed those generated using a mean DOS for most scheduling parameters, with overtime reduced on average by 300-500 minutes per week (12-20 min per operating room per day; P<.001). However, the ML prediction schedules produced more operating room underutilization, ranging from 70-192 minutes more (P<.001). Using a 15-minute schedule granularity with a waitlist pool of at least 1 month generated the ML schedule that outperformed the mean schedule 97.1% of the time. Conclusions: Assuming a full waiting list, optimizing an individual surgeon's elective operating room time using an ML-assisted predict-then-optimize scheduling system improves overall operating room efficiency, significantly decreasing overtime.
This has potentially significant implications for health care systems struggling with rising costs and growing operative waitlists. UR - https://medinform.jmir.org/2025/1/e70857 UR - http://dx.doi.org/10.2196/70857 ID - info:doi/10.2196/70857 ER - TY - JOUR AU - Huang, Kecheng AU - Wu, Chujun AU - Pi, Rongpeng AU - Fang, Jieyu PY - 2025/8/22 TI - AI-Driven Integration of Deep Learning With Lung Imaging, Functional Analysis, and Blood Gas Metrics for Perioperative Hypoxemia Prediction JO - JMIR Med Inform SP - e73995 VL - 13 KW - pneumonia KW - perioperative KW - artificial intelligence KW - hypoxemia KW - deep learning UR - https://medinform.jmir.org/2025/1/e73995 UR - http://dx.doi.org/10.2196/73995 UR - http://www.ncbi.nlm.nih.gov/pubmed/40759599 ID - info:doi/10.2196/73995 ER - TY - JOUR AU - Obst, Miriam AU - Arensmeyer, Jan AU - Bonsmann, Henrik AU - Kolbinger, Andreas AU - Kigenyi, Joel AU - Oneka, Francis AU - Owere, Benard AU - Schmidt, Joachim AU - Feodorovici, Philipp AU - Wynands, Jan PY - 2025/8/18 TI - AI-Enhanced 3D Models in Global Virtual Reality Case Conferences for Surgical Care in a Low-Income Country: Exploratory Study JO - JMIR Form Res SP - e69300 VL - 9 KW - 3D scanning KW - artificial intelligence KW - virtual reality KW - extended reality KW - metaverse KW - spatial computing KW - global surgery KW - reconstructive surgery N2 - Background: Approximately 5 billion people worldwide lack adequate access to surgical care, primarily in the Global South. Especially in crisis regions and war zones, telemedical applications may enhance health services. This study explores the feasibility of using artificial intelligence (AI)-enhanced 3D imaging and extended reality (XR) technologies for intercontinental surgical case conferences in a low-resource scenario in Uganda. Our pilot study aims to assess the value of these technologies in addressing the shortage of surgical resources and fostering multilateral knowledge exchange.
Objective: This study intends to determine the feasibility of using new AI-enhanced image modeling technology within an immersive spatial XR scenario to collaboratively and remotely assess reconstructive patient cases in the resource-limited country of Uganda. Methods: Within a surgical camp at Lamu Medical Centre, Uganda, 3D models of patients' conditions were created using a smartphone app. Digital models were generated from photographs taken on-site and processed into 3D formats to be visualized in virtual case conferences. Here, surgeons from Uganda and Germany used virtual reality (VR) headsets to collaboratively discuss case strategies while marking surgical approaches on each digital patient model. Results: The study included 15 patients requiring reconstructive surgery, with a diverse range of conditions. The use of XR technology facilitated detailed visualization and discussion of surgical strategies. The process was time-efficient, requiring under 8 minutes per case for data acquisition and model creation, and resource-efficient, with surgeons reporting sufficient quality of smartphone-derived models. Users reported a valuable experience and precise interaction during VR case processing, underlining the approach's potential to improve surgical planning and patient care in resource-limited settings. Conclusions: The findings indicate that AI-enhanced 3D imaging and immersive virtual communication platforms are valuable tools for integrative surgical case assessments. The cost-effectiveness of the consumer solutions used should be especially beneficial for low-resource environments. While the study demonstrates the feasibility of this approach, further research is needed to explore a broader application and impact of these technologies in global health. The study highlights the potential of XR to enhance training and surgical precision, contributing to better health care outcomes in underserved regions.
UR - https://formative.jmir.org/2025/1/e69300 UR - http://dx.doi.org/10.2196/69300 ID - info:doi/10.2196/69300 ER - TY - JOUR AU - Maruyama, Hiroki AU - Toyama, Yoshitaka AU - Takanami, Kentaro AU - Takase, Kei AU - Kamei, Takashi PY - 2025/7/30 TI - Role of Artificial Intelligence in Surgical Training by Assessing GPT-4 and GPT-4o on the Japan Surgical Board Examination With Text-Only and Image-Accompanied Questions: Performance Evaluation Study JO - JMIR Med Educ SP - e69313 VL - 11 KW - LLM KW - ChatGPT KW - Japan Surgical Board Examination KW - surgical education KW - large language models KW - artificial intelligence KW - Medical Licensing Examination KW - diagnostic imaging N2 - Background: Artificial intelligence and large language models (LLMs), particularly GPT-4 and GPT-4o, have demonstrated high correct-answer rates in medical examinations. GPT-4o has enhanced diagnostic capabilities, advanced image processing, and updated knowledge. Japanese surgeons face critical challenges, including a declining workforce, regional health care disparities, and work-hour-related challenges. Nonetheless, although LLMs could be beneficial in surgical education, no studies have yet assessed GPT-4o's surgical knowledge or its performance in the field of surgery. Objective: This study aims to evaluate the potential of GPT-4 and GPT-4o in surgical education by using them to take the Japan Surgical Board Examination (JSBE), which includes both textual questions and medical images, such as surgical and computed tomography scans, to comprehensively assess their surgical knowledge. Methods: We used 297 multiple-choice questions from the 2021-2023 JSBEs. The questions were in Japanese, and 104 of them included images. First, the GPT-4 and GPT-4o responses to only the textual questions were collected via OpenAI's application programming interface to evaluate their correct-answer rate.
Subsequently, the correct-answer rate of their responses to questions that included images was assessed by inputting both text and images. Results: The overall correct-answer rates of GPT-4o and GPT-4 for the text-only questions were 78% (231/297) and 55% (163/297), respectively, with GPT-4o outperforming GPT-4 by 23 percentage points (P<.01). By contrast, there was no significant improvement in the correct-answer rate for questions that included images compared with the results for the text-only questions. Conclusions: GPT-4o outperformed GPT-4 on the JSBE. However, the results of the LLMs were lower than those of the examinees. Despite the capabilities of LLMs, image recognition remains a challenge for them, and their clinical application requires caution owing to the potential inaccuracy of their results. UR - https://mededu.jmir.org/2025/1/e69313 UR - http://dx.doi.org/10.2196/69313 ID - info:doi/10.2196/69313 ER - TY - JOUR AU - Li, Xin AU - Yang, Wen-yu AU - Zhang, Fan AU - Shan, Rui AU - Mei, Fang AU - Song, Shi-Bing AU - Sun, Bang-Kai AU - Chen, Jing AU - Hu, Run-ze AU - Yang, Yang AU - Yang, Yi-hang AU - Liu, Jing-yao AU - Yuan, Chun-Hui AU - Liu, Zheng PY - 2025/7/11 TI - Size-Specific Predictors for Malignancy Risk in Follicular Thyroid Neoplasms: Machine Learning Analysis JO - JMIR Cancer SP - e73069 VL - 11 KW - follicular thyroid neoplasm KW - tumor size KW - machine learning KW - malignancy KW - follicular thyroid cancer KW - follicular thyroid adenoma KW - random forest KW - XGBoost N2 - Background: Surgeons often face challenges in distinguishing between benign and malignant follicular thyroid neoplasms (FTNs), particularly small tumors, until diagnostic surgery is performed. Objective: This study aimed to identify the size-specific predictors for the malignancy risk of FTNs preoperatively. Methods: A retrospective cohort study was conducted at Peking University Third Hospital in Beijing, China, from 2012 to 2023.
Patients with a postoperative pathological diagnosis of follicular thyroid adenoma (FTA) or follicular thyroid carcinoma (FTC) were included. FTNs were classified into small- and large-sized categories based on the cutoff value of the tumor diameter derived from spline regression, which indicated the turning point of malignancy risk. We identified the 5 most important predictors from 22 variables including demography, sonography, and hormones, using machine learning methods. We also calculated the odds ratios (ORs) with 95% CIs for these predictors in both small- and large-sized FTNs. Results: Altogether, we included 1494 FTNs, comprising 1266 FTAs and 228 FTCs. FTNs with a maximum diameter less than 3.0 cm were grouped as small-sized tumors (n=715), while those with larger diameters were categorized as large-sized tumors (n=779). In the small-sized group, tumors with macrocalcification (OR 2.90, 95% CI 1.50-5.60), those with peripheral calcification (OR 4.50, 95% CI 1.50-13.00), and those in younger patients (OR 1.33, 95% CI 1.05-1.69) showed a higher malignancy risk. In the large-sized group, tumors presenting with a nodule-in-nodule appearance (OR 3.30, 95% CI 1.30-7.90) exhibited a higher malignancy risk. In both groups, lower thyroid-stimulating hormone levels (OR 1.49, 95% CI 1.20-1.85 for small-sized FTNs; OR 1.61, 95% CI 1.37-1.96 for large-sized FTNs) and a larger mean diameter (OR 1.40, 95% CI 1.10-1.70 for small-sized FTNs; OR 1.50, 95% CI 1.20-1.70 for large-sized FTNs) were associated with the malignancy risk of FTNs. Conclusion: This study identified size-specific predictors for malignancy risk in FTNs, highlighting the importance of stratified prediction based on tumor size.
UR - https://cancer.jmir.org/2025/1/e73069 UR - http://dx.doi.org/10.2196/73069 ID - info:doi/10.2196/73069 ER - TY - JOUR AU - Parduzi, Qendresa AU - Wermelinger, Jonathan AU - Koller, Domingo Simon AU - Sariyar, Murat AU - Schneider, Ulf AU - Raabe, Andreas AU - Seidel, Kathleen PY - 2025/3/24 TI - Explainable AI for Intraoperative Motor-Evoked Potential Muscle Classification in Neurosurgery: Bicentric Retrospective Study JO - J Med Internet Res SP - e63937 VL - 27 KW - intraoperative neuromonitoring KW - motor evoked potential KW - artificial intelligence KW - machine learning KW - deep learning KW - random forest KW - convolutional neural network KW - explainability KW - medical informatics KW - personalized medicine KW - neurophysiological KW - monitoring KW - orthopedic KW - motor KW - neurosurgery N2 - Background: Intraoperative neurophysiological monitoring (IONM) guides the surgeon in ensuring motor pathway integrity during high-risk neurosurgical and orthopedic procedures. Although motor-evoked potentials (MEPs) are valuable for predicting motor outcomes, the key features of predictive signals are not well understood, and standardized warning criteria are lacking. Developing a muscle identification prediction model could increase patient safety while allowing the exploration of relevant features for the task. Objective: The aim of this study is to expand the development of machine learning (ML) methods for muscle classification and evaluate them in a bicentric setup. Further, we aim to identify key features of MEP signals that contribute to accurate muscle classification using explainable artificial intelligence (XAI) techniques. 
Methods: This study used ML and deep learning models, specifically random forest (RF) classifiers and convolutional neural networks (CNNs), to classify MEP signals from routine supratentorial neurosurgical procedures at two medical centers according to the identity of four muscles (extensor digitorum, abductor pollicis brevis, tibialis anterior, and abductor hallucis). The algorithms were trained and validated on a total of 36,992 MEPs from 151 surgeries in one center, and they were tested on 24,298 MEPs from 58 surgeries from the other center. Depending on the algorithm, time-series, feature-engineered, and time-frequency representations of the MEP data were used. XAI techniques, specifically Shapley Additive Explanation (SHAP) values and gradient class activation maps (Grad-CAM), were implemented to identify important signal features. Results: High classification accuracy was achieved with the RF classifier, reaching 87.9% accuracy on the validation set and 80% accuracy on the test set. The 1D- and 2D-CNNs demonstrated comparably strong performance. Our XAI findings indicate that frequency components and peak latencies are crucial for accurate MEP classification, providing insights that could inform intraoperative warning criteria. Conclusions: This study demonstrates the effectiveness of ML techniques and the importance of XAI in enhancing trust in and reliability of artificial intelligence-driven IONM applications. Further, it may help to identify new intrinsic features of MEP signals so far overlooked in conventional warning criteria. By reducing the risk of muscle mislabeling and by providing the basis for possible new warning criteria, this study may help to increase patient safety during surgical procedures.
UR - https://www.jmir.org/2025/1/e63937 UR - http://dx.doi.org/10.2196/63937 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/63937 ER - TY - JOUR AU - Oh, Mi-Young AU - Kim, Hee-Soo AU - Jung, Mi Young AU - Lee, Hyung-Chul AU - Lee, Seung-Bo AU - Lee, Mi Seung PY - 2025/3/19 TI - Machine Learning-Based Explainable Automated Nonlinear Computation Scoring System for Health Score and an Application for Prediction of Perioperative Stroke: Retrospective Study JO - J Med Internet Res SP - e58021 VL - 27 KW - machine learning KW - explainability KW - score KW - computation scoring system KW - nonlinear computation KW - application KW - perioperative stroke KW - perioperative KW - stroke KW - efficiency KW - ML-based models KW - patient KW - noncardiac surgery KW - noncardiac KW - surgery KW - effectiveness KW - risk tool KW - risk KW - tool KW - real-world data N2 - Background: Machine learning (ML) has the potential to enhance performance by capturing nonlinear interactions. However, ML-based models have some limitations in terms of interpretability. Objective: This study aimed to develop and validate a more comprehensible and efficient ML-based scoring system using SHapley Additive exPlanations (SHAP) values. Methods: We developed and validated the Explainable Automated nonlinear Computation scoring system for Health (EACH) framework score. We developed a CatBoost-based prediction model, identified key features, and automatically detected the top 5 steepest slope change points based on SHAP plots. Subsequently, we developed a scoring system (EACH) and normalized the score. Finally, the EACH score was used to predict perioperative stroke. We developed the EACH score using data from the Seoul National University Hospital cohort and validated it using data from the Boramae Medical Center, which was geographically and temporally different from the development set.
Results: When applied for perioperative stroke prediction among 38,737 patients undergoing noncardiac surgery, the EACH score achieved an area under the curve (AUC) of 0.829 (95% CI 0.753-0.892). In the external validation, the EACH score demonstrated superior predictive performance with an AUC of 0.784 (95% CI 0.694-0.871) compared with a traditional score (AUC=0.528, 95% CI 0.457-0.619) and another ML-based scoring generator (AUC=0.564, 95% CI 0.516-0.612). Conclusions: The EACH score is a more precise, explainable ML-based risk tool, proven effective in real-world data. The EACH score outperformed the traditional scoring system and other prediction models based on different ML techniques in predicting perioperative stroke. UR - https://www.jmir.org/2025/1/e58021 UR - http://dx.doi.org/10.2196/58021 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/58021 ER - TY - JOUR AU - Dong, Jiale AU - Jin, Zhechuan AU - Li, Chengxiang AU - Yang, Jian AU - Jiang, Yi AU - Li, Zeqian AU - Chen, Cheng AU - Zhang, Bo AU - Ye, Zhaofei AU - Hu, Yang AU - Ma, Jianguo AU - Li, Ping AU - Li, Yulin AU - Wang, Dongjin AU - Ji, Zhili PY - 2025/3/6 TI - Machine Learning Models With Prognostic Implications for Predicting Gastrointestinal Bleeding After Coronary Artery Bypass Grafting and Guiding Personalized Medicine: Multicenter Cohort Study JO - J Med Internet Res SP - e68509 VL - 27 KW - machine learning KW - personalized medicine KW - coronary artery bypass grafting KW - adverse outcome KW - gastrointestinal bleeding N2 - Background: Gastrointestinal bleeding is a serious adverse event of coronary artery bypass grafting and lacks tailored risk assessment tools for personalized prevention. Objective: This study aims to develop and validate predictive models to assess the risk of gastrointestinal bleeding after coronary artery bypass grafting (GIBCG) and to guide personalized prevention.
Methods: Participants were recruited from 4 medical centers, including a prospective cohort and the Medical Information Mart for Intensive Care IV (MIMIC-IV) database. From an initial cohort of 18,938 patients, 16,440 were included in the final analysis after applying the exclusion criteria. Thirty combinations of machine learning algorithms were compared, and the optimal model was selected based on integrated performance metrics, including the area under the receiver operating characteristic curve (AUROC) and the Brier score. This model was then developed into a web-based risk prediction calculator. The Shapley Additive Explanations method was used to provide both global and local explanations for the predictions. Results: The model was developed using data from 3 centers and a prospective cohort (n=13,399) and validated on the Drum Tower cohort (n=2745) and the MIMIC cohort (n=296). The optimal model, based on 15 easily accessible admission features, demonstrated an AUROC of 0.8482 (95% CI 0.8328-0.8618) in the derivation cohort. In external validation, the AUROC was 0.8513 (95% CI 0.8221-0.8782) for the Drum Tower cohort and 0.7811 (95% CI 0.7275-0.8343) for the MIMIC cohort. The analysis indicated that high-risk patients identified by the model had a significantly increased mortality risk (odds ratio 2.98, 95% CI 1.784-4.978; P<.001). For these high-risk populations, preoperative use of proton pump inhibitors was an independent protective factor against the occurrence of GIBCG. By contrast, dual antiplatelet therapy and oral anticoagulants were identified as independent risk factors. However, in low-risk populations, the use of proton pump inhibitors (χ²₁=0.13, P=.72), dual antiplatelet therapy (χ²₁=0.38, P=.54), and oral anticoagulants (χ²₁=0.15, P=.69) was not significantly associated with the occurrence of GIBCG. Conclusions: Our machine learning model accurately identified patients at high risk of GIBCG, who had a poor prognosis.
This approach can aid in early risk stratification and personalized prevention. Trial Registration: Chinese Clinical Registry Center ChiCTR2400086050; http://www.chictr.org.cn/showproj.html?proj=226129 UR - https://www.jmir.org/2025/1/e68509 UR - http://dx.doi.org/10.2196/68509 UR - http://www.ncbi.nlm.nih.gov/pubmed/40053791 ID - info:doi/10.2196/68509 ER - TY - JOUR AU - Huang, Pinjie AU - Yang, Jirong AU - Zhao, Dizhou AU - Ran, Taojia AU - Luo, Yuheng AU - Yang, Dong AU - Zheng, Xueqin AU - Zhou, Shaoli AU - Chen, Chaojin PY - 2025/3/3 TI - Machine Learning-Based Prediction of Early Complications Following Surgery for Intestinal Obstruction: Multicenter Retrospective Study JO - J Med Internet Res SP - e68354 VL - 27 KW - postoperative complications KW - intestinal obstruction KW - machine learning KW - early intervention KW - risk calculator KW - prediction model KW - Shapley additive explanations N2 - Background: Early complications increase in-hospital stay and mortality after intestinal obstruction surgery. It is important to identify the risk of early postoperative complications in patients with intestinal obstruction at a sufficiently early stage, which would allow preemptive individualized enhanced therapy to improve their prognosis. A risk predictive model based on machine learning is helpful for early diagnosis and timely intervention. Objective: This study aimed to construct an online risk calculator for early postoperative complications in patients after intestinal obstruction surgery based on machine learning algorithms. Methods: A total of 396 patients undergoing intestinal obstruction surgery from April 2013 to April 2021 at an independent medical center were enrolled as the training cohort.
Overall, 7 machine learning methods were used to establish prediction models, with their performance appraised via the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, and F1-score. The best model was validated through 2 independent medical centers, a publicly available perioperative dataset (the Informative Surgical Patient dataset for Innovative Research Environment [INSPIRE]), and a mixed cohort consisting of the above 3 datasets, involving 50, 66, 48, and 164 cases, respectively. Shapley Additive Explanations values were computed to identify risk factors. Results: The incidence of postoperative complications in the training cohort was 47.44% (176/371), while the incidences in the 4 external validation cohorts were 34% (17/50), 56.06% (37/66), 52.08% (25/48), and 48.17% (79/164), respectively. Postoperative complications were associated with 8 features: the Physiological and Operative Severity Score for the enUmeration of Mortality and Morbidity (POSSUM) physiological score, the amount of colloid infusion, shock index before anesthesia induction, ASA (American Society of Anesthesiologists) classification, the percentage of neutrophils, shock index at the end of surgery, age, and total protein. The random forest model showed the best overall performance, with an AUROC of 0.788 (95% CI 0.709-0.869), accuracy of 0.756, sensitivity of 0.695, specificity of 0.810, and F1-score of 0.727 in the training cohort. The random forest model also achieved a comparable AUROC of 0.755 (95% CI 0.652-0.839) in validation cohort 1, a greater AUROC of 0.817 (95% CI 0.695-0.913) in validation cohort 2, a similar AUROC of 0.786 (95% CI 0.628-0.902) in validation cohort 3, and a comparable AUROC of 0.720 (95% CI 0.671-0.768) in validation cohort 4. We visualized the random forest model and created a web-based online risk calculator.
Conclusions: We have developed and validated a generalizable random forest model to predict postoperative early complications in patients undergoing intestinal obstruction surgery, enabling clinicians to screen high-risk patients and implement early individualized interventions. An online risk calculator for early postoperative complications was developed to make the random forest model accessible to clinicians around the world. UR - https://www.jmir.org/2025/1/e68354 UR - http://dx.doi.org/10.2196/68354 UR - http://www.ncbi.nlm.nih.gov/pubmed/40053794 ID - info:doi/10.2196/68354 ER - TY - JOUR AU - Xiong, Xiaojuan AU - Fu, Hong AU - Xu, Bo AU - Wei, Wang AU - Zhou, Mi AU - Hu, Peng AU - Ren, Yunqin AU - Mao, Qingxiang PY - 2025/1/22 TI - Ten Machine Learning Models for Predicting Preoperative and Postoperative Coagulopathy in Patients With Trauma: Multicenter Cohort Study JO - J Med Internet Res SP - e66612 VL - 27 KW - traumatic coagulopathy KW - preoperative KW - postoperative KW - machine learning models KW - random forest KW - Medical Information Mart for Intensive Care N2 - Background: Recent research has revealed the potential value of machine learning (ML) models in improving prognostic prediction for patients with trauma. ML can enhance predictions and identify which factors contribute the most to posttraumatic mortality. However, no studies have explored the risk factors, complications, and risk prediction of preoperative and postoperative traumatic coagulopathy (PPTIC) in patients with trauma. Objective: This study aims to help clinicians implement timely and appropriate interventions to reduce the incidence of PPTIC and related complications, thereby lowering in-hospital mortality and disability rates for patients with trauma. Methods: We analyzed data from 13,235 patients with trauma from 4 medical centers, including medical histories, laboratory results, and hospitalization complications. 
We developed 10 ML models in Python (Python Software Foundation) to predict PPTIC based on preoperative indicators. Data from 10,023 Medical Information Mart for Intensive Care patients were divided into training (70%) and test (30%) sets, with 3212 patients from 3 other centers used for external validation. Model performance was assessed with 5-fold cross-validation, bootstrapping, the Brier score, and Shapley additive explanation values. Results: Univariate logistic regression identified PPTIC risk factors as (1) prolonged activated partial thromboplastin time, prothrombin time, and international normalized ratio; (2) decreased levels of hemoglobin, hematocrit, red blood cells, calcium, and sodium; (3) lower admission diastolic blood pressure; (4) elevated alanine aminotransferase and aspartate aminotransferase levels; (5) admission heart rate; and (6) emergency surgery and perioperative transfusion. Multivariate logistic regression revealed that patients with PPTIC faced significantly higher risks of sepsis (1.75-fold), heart failure (1.5-fold), delirium (3.08-fold), abnormal coagulation (3.57-fold), tracheostomy (2.76-fold), mortality (2.19-fold), and urinary tract infection (1.95-fold), along with longer hospital and intensive care unit stays. Random forest was the most effective ML model for predicting PPTIC, achieving an area under the receiver operating characteristic curve of 0.91, an area under the precision-recall curve of 0.89, accuracy of 0.84, sensitivity of 0.80, specificity of 0.88, precision of 0.88, F1-score of 0.84, and a Brier score of 0.13 in external validation.
Conclusions: Key PPTIC risk factors include (1) prolonged activated partial thromboplastin time, prothrombin time, and international normalized ratio; (2) low levels of hemoglobin, hematocrit, red blood cells, calcium, and sodium; (3) low diastolic blood pressure; (4) elevated alanine aminotransferase and aspartate aminotransferase levels; (5) admission heart rate; and (6) the need for emergency surgery and transfusion. PPTIC is associated with severe complications and extended hospital stays. Among the ML models, the random forest model was the most effective predictor. Trial Registration: Chinese Clinical Trial Registry ChiCTR2300078097; https://www.chictr.org.cn/showproj.html?proj=211051 UR - https://www.jmir.org/2025/1/e66612 UR - http://dx.doi.org/10.2196/66612 UR - http://www.ncbi.nlm.nih.gov/pubmed/39841523 ID - info:doi/10.2196/66612 ER - TY - JOUR AU - Ding, Zhendong AU - Zhang, Linan AU - Zhang, Yihan AU - Yang, Jing AU - Luo, Yuheng AU - Ge, Mian AU - Yao, Weifeng AU - Hei, Ziqing AU - Chen, Chaojin PY - 2025/1/15 TI - A Supervised Explainable Machine Learning Model for Perioperative Neurocognitive Disorder in Liver-Transplantation Patients and External Validation on the Medical Information Mart for Intensive Care IV Database: Retrospective Study JO - J Med Internet Res SP - e55046 VL - 27 KW - machine learning KW - risk factors KW - liver transplantation KW - perioperative neurocognitive disorders KW - MIMIC-IV database KW - external validation N2 - Background: Patients undergoing liver transplantation (LT) are at risk of perioperative neurocognitive dysfunction (PND), which significantly affects the patients' prognosis. Objective: This study used machine learning (ML) algorithms with the aim of extracting critical predictors and developing an ML model to predict PND among LT recipients.
Methods: In this retrospective study, data from 958 patients who underwent LT between January 2015 and January 2020 were extracted from the Third Affiliated Hospital of Sun Yat-sen University. Six ML algorithms were used to predict post-LT PND, and model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and F1-scores. The best-performing model was additionally validated using a temporal external dataset including 309 LT cases from February 2020 to August 2022, and an independent external dataset extracted from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database including 325 patients. Results: In the development cohort, 201 out of 751 (33.5%) patients were diagnosed with PND. The logistic regression model achieved the highest AUC (0.799) in the internal validation set, with comparable AUCs in the temporal external (0.826) and MIMIC-IV (0.72) validation sets. The top 3 features contributing to post-LT PND diagnosis were preoperative overt hepatic encephalopathy, platelet level, and the postoperative Sequential Organ Failure Assessment score, as revealed by the Shapley additive explanations method. Conclusions: A real-time logistic regression model-based online predictor of post-LT PND was developed, providing a highly interoperable tool for use across medical institutions to support early risk stratification and decision making for LT recipients.
UR - https://www.jmir.org/2025/1/e55046 UR - http://dx.doi.org/10.2196/55046 UR - http://www.ncbi.nlm.nih.gov/pubmed/39813086 ID - info:doi/10.2196/55046 ER - TY - JOUR AU - Holler, Emma AU - Ludema, Christina AU - Ben Miled, Zina AU - Rosenberg, Molly AU - Kalbaugh, Corey AU - Boustani, Malaz AU - Mohanty, Sanjay PY - 2025/1/9 TI - Development and Validation of a Routine Electronic Health Record-Based Delirium Prediction Model for Surgical Patients Without Dementia: Retrospective Case-Control Study JO - JMIR Perioper Med SP - e59422 VL - 8 KW - delirium KW - machine learning KW - prediction KW - postoperative KW - algorithm KW - electronic health records KW - surgery KW - risk prediction N2 - Background: Postoperative delirium (POD) is a common complication after major surgery and is associated with poor outcomes in older adults. Early identification of patients at high risk of POD can enable targeted prevention efforts. However, existing POD prediction models require inpatient data collected during the hospital stay, which delays predictions and limits scalability. Objective: This study aimed to develop and externally validate a machine learning-based prediction model for POD using routine electronic health record (EHR) data. Methods: We identified all surgical encounters from 2014 to 2021 for patients aged 50 years and older who underwent an operation requiring general anesthesia, with a length of stay of at least 1 day at 3 Indiana hospitals. Patients with preexisting dementia or mild cognitive impairment were excluded. POD was identified using Confusion Assessment Method records and delirium International Classification of Diseases (ICD) codes. Controls without delirium or nurse-documented confusion were matched to cases by age, sex, race, and year of admission. 
We trained logistic regression, random forest, extreme gradient boosting (XGB), and neural network models to predict POD using 143 features derived from routine EHR data available at the time of hospital admission. Separate models were developed for each hospital using surveillance periods of 3 months, 6 months, and 1 year before admission. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). Each model was internally validated using holdout data and externally validated using data from the other 2 hospitals. Calibration was assessed using calibration curves. Results: The study cohort included 7167 delirium cases and 7167 matched controls. XGB outperformed all other classifiers. AUROCs were highest for XGB models trained on 12 months of preadmission data. The best-performing XGB model achieved a mean AUROC of 0.79 (SD 0.01) on the holdout set, which decreased to 0.69-0.74 (SD 0.02) when externally validated on data from other hospitals. Conclusions: Our routine EHR-based POD prediction models demonstrated good predictive ability using a limited set of preadmission and surgical variables, though their generalizability was limited. The proposed models could be used as a scalable, automated screening tool to identify patients at high risk of POD at the time of hospital admission. 
UR - https://periop.jmir.org/2025/1/e59422 UR - http://dx.doi.org/10.2196/59422 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/59422 ER - TY - JOUR AU - Tang, Ran AU - Qi, Shi-qin PY - 2024/11/18 TI - The Vast Potential of ChatGPT in Pediatric Surgery JO - J Med Internet Res SP - e66453 VL - 26 KW - ChatGPT KW - pediatric KW - surgery KW - artificial intelligence KW - AI KW - diagnosis KW - surgeon UR - https://www.jmir.org/2024/1/e66453 UR - http://dx.doi.org/10.2196/66453 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/66453 ER - TY - JOUR AU - Liu, Jiayu AU - Liang, Xiuting AU - Fang, Dandong AU - Zheng, Jiqi AU - Yin, Chengliang AU - Xie, Hui AU - Li, Yanteng AU - Sun, Xiaochun AU - Tong, Yue AU - Che, Hebin AU - Hu, Ping AU - Yang, Fan AU - Wang, Bingxian AU - Chen, Yuanyuan AU - Cheng, Gang AU - Zhang, Jianning PY - 2024/9/10 TI - The Diagnostic Ability of GPT-3.5 and GPT-4.0 in Surgery: Comparative Analysis JO - J Med Internet Res SP - e54985 VL - 26 KW - ChatGPT KW - accuracy rates KW - artificial intelligence KW - diagnosis KW - surgeon N2 - Background: ChatGPT (OpenAI) has shown great potential in clinical diagnosis and could become an excellent auxiliary tool in clinical practice. This study investigates and evaluates ChatGPT in diagnostic capabilities by comparing the performance of GPT-3.5 and GPT-4.0 across model iterations. Objective: This study aims to evaluate the precise diagnostic ability of GPT-3.5 and GPT-4.0 for colon cancer and its potential as an auxiliary diagnostic tool for surgeons and compare the diagnostic accuracy rates between GPT-3.5 and GPT-4.0. We precisely assess the accuracy of primary and secondary diagnoses and analyze the causes of misdiagnoses in GPT-3.5 and GPT-4.0 according to 7 categories: patient histories, symptoms, physical signs, laboratory examinations, imaging examinations, pathological examinations, and intraoperative findings.
Methods: We retrieved 316 case reports for intestinal cancer from the Chinese Medical Association Publishing House database, of which 286 cases were deemed valid after data cleansing. The cases were translated from Mandarin to English and then input into GPT-3.5 and GPT-4.0 using a simple, direct prompt to elicit primary and secondary diagnoses. We conducted a comparative study to evaluate the diagnostic accuracy of GPT-4.0 and GPT-3.5. Three senior surgeons from the General Surgery Department, specializing in Colorectal Surgery, assessed the diagnostic information at the Chinese PLA (People's Liberation Army) General Hospital. The accuracy of primary and secondary diagnoses was scored based on predefined criteria. Additionally, we analyzed and compared the causes of misdiagnoses in both models according to 7 categories: patient histories, symptoms, physical signs, laboratory examinations, imaging examinations, pathological examinations, and intraoperative findings. Results: Out of 286 cases, GPT-4.0 and GPT-3.5 both demonstrated high diagnostic accuracy for primary diagnoses, but the accuracy rates of GPT-4.0 were significantly higher than GPT-3.5 (mean 0.972, SD 0.137 vs mean 0.855, SD 0.335; t285=5.753; P<.001). For secondary diagnoses, the accuracy rates of GPT-4.0 were also significantly higher than GPT-3.5 (mean 0.908, SD 0.159 vs mean 0.617, SD 0.349; t285=-7.727; P<.001). GPT-3.5 showed limitations in processing patient history, symptom presentation, laboratory tests, and imaging data. While GPT-4.0 improved upon GPT-3.5, it still has limitations in identifying symptoms and laboratory test data. For both primary and secondary diagnoses, there was no significant difference in accuracy related to age, gender, or system group between GPT-4.0 and GPT-3.5. Conclusions: This study demonstrates that ChatGPT, particularly GPT-4.0, possesses significant diagnostic potential, with GPT-4.0 exhibiting higher accuracy than GPT-3.5.
However, GPT-4.0 still has limitations, particularly in recognizing patient symptoms and laboratory data, indicating a need for more research in real-world clinical settings to enhance its diagnostic capabilities. UR - https://www.jmir.org/2024/1/e54985 UR - http://dx.doi.org/10.2196/54985 UR - http://www.ncbi.nlm.nih.gov/pubmed/39255016 ID - info:doi/10.2196/54985 ER - TY - JOUR AU - Tsai, Feng-Fang AU - Chang, Yung-Chun AU - Chiu, Yu-Wen AU - Sheu, Bor-Ching AU - Hsu, Min-Huei AU - Yeh, Huei-Ming PY - 2024/8/21 TI - Machine Learning Model for Anesthetic Risk Stratification for Gynecologic and Obstetric Patients: Cross-Sectional Study Outlining a Novel Approach for Early Detection JO - JMIR Form Res SP - e54097 VL - 8 KW - gradient boosting machine KW - comorbidity KW - gynecological and obstetric procedure KW - ASA classification KW - American Society of Anesthesiologists KW - preoperative evaluation KW - machine learning KW - machine learning model KW - gynecology KW - obstetrics KW - early detection KW - artificial intelligence KW - physiological KW - gestational KW - anesthetic risk KW - clinical laboratory data KW - laboratory data KW - risk KW - risk classification N2 - Background: Preoperative evaluation is important, and this study explored the application of machine learning methods for anesthetic risk classification and the evaluation of the contributions of various factors. To minimize the effects of confounding variables during model training, we used a homogenous group with similar physiological states and ages undergoing similar pelvic organ-related procedures not involving malignancies. Objective: Data on women of reproductive age (age 20-50 years) who underwent gestational or gynecological surgery between January 1, 2017, and December 31, 2021, were obtained from the National Taiwan University Hospital Integrated Medical Database. Methods: We first performed an exploratory analysis and selected key features.
We then performed data preprocessing to acquire relevant features related to preoperative examination. To further enhance predictive performance, we used the log-likelihood ratio algorithm to generate comorbidity patterns. Finally, we input the processed features into the light gradient boosting machine (LightGBM) model for training and subsequent prediction. Results: A total of 10,892 patients were included. Within this data set, 9893 patients were classified as having low anesthetic risk (American Society of Anesthesiologists physical status score of 1-2), and 999 patients were classified as having high anesthetic risk (American Society of Anesthesiologists physical status score of >2). The area under the receiver operating characteristic curve of the proposed model was 0.6831. Conclusions: By combining comorbidity information and clinical laboratory data, our methodology based on the LightGBM model provides more accurate predictions for anesthetic risk classification. Trial Registration: Research Ethics Committee of the National Taiwan University Hospital 202204010RINB; https://www.ntuh.gov.tw/RECO/Index.action UR - https://formative.jmir.org/2024/1/e54097 UR - http://dx.doi.org/10.2196/54097 UR - http://www.ncbi.nlm.nih.gov/pubmed/38991090 ID - info:doi/10.2196/54097 ER - TY - JOUR AU - Wong, Chia-En AU - Chen, Pei-Wen AU - Hsu, Heng-Jui AU - Cheng, Shao-Yang AU - Fan, Chen-Che AU - Chen, Yen-Chang AU - Chiu, Yi-Pei AU - Lee, Jung-Shun AU - Liang, Sheng-Fu PY - 2024/7/4 TI - Collaborative Human-Computer Vision Operative Video Analysis Algorithm for Analyzing Surgical Fluency and Surgical Interruptions in Endonasal Endoscopic Pituitary Surgery: Cohort Study JO - J Med Internet Res SP - e56127 VL - 26 KW - algorithm KW - computer vision KW - endonasal endoscopic approach KW - pituitary KW - transsphenoidal surgery N2 - Background: The endonasal endoscopic approach (EEA) is effective for pituitary adenoma resection.
However, manual review of operative videos is time-consuming. The application of a computer vision (CV) algorithm could potentially reduce the time required for operative video review and facilitate the training of surgeons to overcome the learning curve of EEA. Objective: This study aimed to evaluate the performance of a CV-based video analysis system, based on OpenCV algorithm, to detect surgical interruptions and analyze surgical fluency in EEA. The accuracy of the CV-based video analysis was investigated, and the time required for operative video review using CV-based analysis was compared to that of manual review. Methods: The dominant color of each frame in the EEA video was determined using OpenCV. We developed an algorithm to identify events of surgical interruption if the alterations in the dominant color pixels reached certain thresholds. The thresholds were determined by training the current algorithm using EEA videos. The accuracy of the CV analysis was determined by manual review, and the time spent was reported. Results: A total of 46 EEA operative videos were analyzed, with 93.6%, 95.1%, and 93.3% accuracies in the training, test 1, and test 2 data sets, respectively. Compared with manual review, CV-based analysis reduced the time required for operative video review by 86% (manual review: 166.8 and CV analysis: 22.6 minutes; P<.001). The application of a human-computer collaborative strategy increased the overall accuracy to 98.5%, with a 74% reduction in the review time (manual review: 166.8 and human-CV collaboration: 43.4 minutes; P<.001). Analysis of the different surgical phases showed that the sellar phase had the lowest frequency (nasal phase: 14.9, sphenoidal phase: 15.9, and sellar phase: 4.9 interruptions/10 minutes; P<.001) and duration (nasal phase: 67.4, sphenoidal phase: 77.9, and sellar phase: 31.1 seconds/10 minutes; P<.001) of surgical interruptions. 
A comparison of the early and late EEA videos showed that increased surgical experience was associated with a decreased number (early: 4.9 and late: 2.9 interruptions/10 minutes; P=.03) and duration (early: 41.1 and late: 19.8 seconds/10 minutes; P=.02) of surgical interruptions during the sellar phase. Conclusions: CV-based analysis had a 93% to 98% accuracy in detecting the number, frequency, and duration of surgical interruptions occurring during EEA. Moreover, CV-based analysis reduced the time required to analyze the surgical fluency in EEA videos compared to manual review. The application of CV can facilitate the training of surgeons to overcome the learning curve of endoscopic skull base surgery. Trial Registration: ClinicalTrials.gov NCT06156020; https://clinicaltrials.gov/study/NCT06156020 UR - https://www.jmir.org/2024/1/e56127 UR - http://dx.doi.org/10.2196/56127 UR - http://www.ncbi.nlm.nih.gov/pubmed/38963694 ID - info:doi/10.2196/56127 ER - TY - JOUR AU - El-Gabalawy, Renée AU - Sommer, L. Jordana AU - Hebbard, Pamela AU - Reynolds, Kristin AU - Logan, S. Gabrielle AU - Smith, D. Michael S. AU - Mutter, C. Thomas AU - Mutch, Alan W. AU - Mota, Natalie AU - Proulx, Catherine AU - Gagnon Shaigetz, Vincent AU - Maples-Keller, L. Jessica AU - Arora, C. Rakesh AU - Perrin, David AU - Benedictson, Jada AU - Jacobsohn, Eric PY - 2024/5/14 TI - An Immersive Virtual Reality Intervention for Preoperative Anxiety and Distress Among Adults Undergoing Oncological Surgery: Protocol for a 3-Phase Development and Feasibility Trial JO - JMIR Res Protoc SP - e55692 VL - 13 KW - virtual reality KW - preoperative anxiety and distress KW - perioperative mental health KW - breast cancer KW - oncological surgery N2 - Background: Preoperative state anxiety (PSA) is distress and anxiety directly associated with perioperative events. 
PSA is associated with negative postoperative outcomes such as longer hospital length of stay, increased pain and opioid use, and higher rates of rehospitalization. Psychological prehabilitation, such as education, exposure to hospital environments, and relaxation strategies, has been shown to mitigate PSA; however, there are limited skilled personnel to deliver such interventions in clinical practice. Immersive virtual reality (VR) has the potential for greater accessibility and enhanced integration into an immersive and interactive experience. VR is rarely used in the preoperative setting, but similar forms of stress inoculation training involving exposure to stressful events have improved psychological preparation in contexts such as military deployment. Objective: This study seeks to develop and investigate a targeted PSA intervention in patients undergoing oncological surgery using a single preoperative VR exposure. The primary objectives are to (1) develop a novel VR program for patients undergoing oncological surgery with general anesthesia; (2) assess the feasibility, including acceptability, of a single exposure to this intervention; (3) assess the feasibility, including acceptability, of outcome measures of PSA; and (4) use these results to refine the VR content and outcome measures for a larger trial. A secondary objective is to preliminarily assess the clinical utility of the intervention for PSA. Methods: This study comprises 3 phases. Phase 1 (completed) involved the development of a VR prototype targeting PSA, using multidisciplinary iterative input. Phase 2 (data collection completed) involves examining the feasibility aspects of the VR intervention. This randomized feasibility trial involves assessing the novel VR preoperative intervention compared to a VR control (ie, nature trek) condition and a treatment-as-usual group among patients undergoing breast cancer surgery. 
Phase 3 will involve refining the prototype based on feasibility findings and input from people with lived experience for a future clinical trial, using focus groups with participants from phase 2. Results: This study was funded in March 2019. Phase 1 was completed in April 2020. Phase 2 data collection was completed in January 2024 and data analysis is ongoing. Focus groups were completed in February 2024. Both the feasibility study and focus groups will contribute to further refinement of the initial VR prototype (phase 3), with the final simulation to be completed by mid-2024. Conclusions: The findings from this work will contribute to the limited body of research examining feasible and broadly accessible interventions for PSA. Knowledge gained from this research will contribute to the final development of a novel VR intervention to be tested in a large population of patients with cancer before surgery in a randomized clinical trial. Trial Registration: ClinicalTrials.gov NCT04544618; https://www.clinicaltrials.gov/study/NCT04544618 International Registered Report Identifier (IRRID): DERR1-10.2196/55692 UR - https://www.researchprotocols.org/2024/1/e55692 UR - http://dx.doi.org/10.2196/55692 UR - http://www.ncbi.nlm.nih.gov/pubmed/38743939 ID - info:doi/10.2196/55692 ER - TY - JOUR AU - Mittal, Ajay AU - Wakim, Jonathan AU - Huq, Suhaiba AU - Wynn, Tung PY - 2024/5/9 TI - Effectiveness of Virtual Reality in Reducing Perceived Pain and Anxiety Among Patients Within a Hospital System: Protocol for a Mixed Methods Study JO - JMIR Res Protoc SP - e52649 VL - 13 KW - virtual reality KW - digital health KW - feasibility KW - acceptability KW - pain KW - anxiety KW - hospital KW - hospitalization KW - in-patient KW - observational study KW - pharmacologic pain management KW - pain management KW - topical anesthetic creams KW - topical cream N2 - Background: Within hospital systems, diverse subsets of patients are subject to minimally invasive procedures that provide 
therapeutic relief and necessary health data that are often perceived as anxiogenic or painful. These feelings are particularly relevant to patients experiencing procedures where they are conscious and not sedated or placed under general anesthesia that renders them incapacitated. Pharmacologic pain management and topical anesthetic creams are used to manage these feelings; however, distraction-based methods can provide nonpharmacologic means to modify the painful experience and discomfort often associated with these procedures. Recent studies support distraction as a useful method for reducing anxiety and pain and as a result, improving patient experience. Virtual reality (VR) is an emerging technology that provides an immersive user experience and can operate through a distraction-based method to reduce the negative or painful experience often related to procedures where the patient is conscious. Given the possible short-term and long-term outcomes of poorly managed pain and enduring among patients, health care professionals are challenged to improve patient well-being during medically essential procedures. Objective: The purpose of this pilot project is to assess the efficacy of using VR as a distraction-based intervention for anxiety or pain management compared to other nonpharmacologic interventions in a variety of hospital settings, specifically in patients undergoing lumbar puncture procedures and bone marrow biopsies at the oncology ward, patients receiving nerve block for a broken bone at an anesthesia or surgical center, patients undergoing a cleaning at a dental clinic, patients conscious during an ablation procedure at a cardiology clinic, and patients awake during a kidney biopsy at a nephrology clinic. This will provide the framework for additional studies in other health care settings. 
Methods: In a single visit, patients eligible for the study will complete brief preprocedural and postprocedural questionnaires about their perceived fear, anxiety, and pain levels. During the procedure, research assistants will place a VR headset on the patient and the patient will undergo a VR experience to distract from any pain felt from the procedure. Participants' vitals, including blood pressure, heart rate, and rate of respiration, will also be recorded before, during, and after the procedure. Results: The study is already underway, and results support a decrease in perceived pain by 1.00 and a decrease in perceived anxiety by 0.3 compared to the control group (on a 10-point Likert scale). Among the VR intervention group, the average rating for comfort was 4.35 out of 5. Conclusions: This study will provide greater insight into how patients' perception of anxiety and pain could potentially be altered. Furthermore, metrics related to the operational efficiency of providing a VR intervention compared to a control will provide insight into the feasibility and integration of such technologies in routine practice.
International Registered Report Identifier (IRRID): DERR1-10.2196/52649 UR - https://www.researchprotocols.org/2024/1/e52649 UR - http://dx.doi.org/10.2196/52649 UR - http://www.ncbi.nlm.nih.gov/pubmed/38722681 ID - info:doi/10.2196/52649 ER - TY - JOUR AU - Osmanodja, Bilgin AU - Sassi, Zeineb AU - Eickmann, Sascha AU - Hansen, Maria Carla AU - Roller, Roland AU - Burchardt, Aljoscha AU - Samhammer, David AU - Dabrock, Peter AU - Möller, Sebastian AU - Budde, Klemens AU - Herrmann, Anne PY - 2024/4/1 TI - Investigating the Impact of AI on Shared Decision-Making in Post-Kidney Transplant Care (PRIMA-AI): Protocol for a Randomized Controlled Trial JO - JMIR Res Protoc SP - e54857 VL - 13 KW - shared decision-making KW - SDM KW - kidney transplantation KW - artificial intelligence KW - AI KW - decision-support system KW - DSS KW - qualitative research N2 - Background: Patients after kidney transplantation eventually face the risk of graft loss with the concomitant need for dialysis or retransplantation. Choosing the right kidney replacement therapy after graft loss is an important preference-sensitive decision for kidney transplant recipients. However, the rate of conversations about treatment options after kidney graft loss has been shown to be as low as 13% in previous studies. It is unknown whether the implementation of artificial intelligence (AI)-based risk prediction models can increase the number of conversations about treatment options after graft loss and how this might influence the associated shared decision-making (SDM). Objective: This study aims to explore the impact of AI-based risk prediction for the risk of graft loss on the frequency of conversations about the treatment options after graft loss, as well as the associated SDM process. Methods: This is a 2-year, prospective, randomized, 2-armed, parallel-group, single-center trial in a German kidney transplant center.
All patients will receive the same routine post–kidney transplant care that usually includes follow-up visits every 3 months at the kidney transplant center. For patients in the intervention arm, physicians will be assisted by a validated and previously published AI-based risk prediction system that estimates the risk for graft loss in the next year, starting from 3 months after randomization until 24 months after randomization. The study population will consist of 122 kidney transplant recipients >12 months after transplantation, who are at least 18 years of age, are able to communicate in German, and have an estimated glomerular filtration rate <30 mL/min/1.73 m2. Patients with multi-organ transplantation, or who are not able to communicate in German, as well as underage patients, cannot participate. For the primary end point, the proportion of patients who have had a conversation about their treatment options after graft loss is compared at 12 months after randomization. Additionally, 2 different assessment tools for SDM, the CollaboRATE mean score and the Control Preference Scale, are compared between the 2 groups at 12 months and 24 months after randomization. Furthermore, recordings of patient-physician conversations, as well as semistructured interviews with patients, support persons, and physicians, are performed to support the quantitative results. Results: The enrollment for the study is ongoing. The first results are expected to be submitted for publication in 2025. Conclusions: This is the first study to examine the influence of AI-based risk prediction on physician-patient interaction in the context of kidney transplantation.
We use a mixed methods approach by combining a randomized design with a simple quantitative end point (frequency of conversations), different quantitative measurements for SDM, and several qualitative research methods (eg, records of physician-patient conversations and semistructured interviews) to examine the implementation of AI-based risk prediction in the clinic. Trial Registration: ClinicalTrials.gov NCT06056518; https://clinicaltrials.gov/study/NCT06056518 International Registered Report Identifier (IRRID): PRR1-10.2196/54857 UR - https://www.researchprotocols.org/2024/1/e54857 UR - http://dx.doi.org/10.2196/54857 UR - http://www.ncbi.nlm.nih.gov/pubmed/38557315 ID - info:doi/10.2196/54857 ER - TY - JOUR AU - Rohatgi, Nidhi PY - 2023/11/21 TI - JMIR Perioperative Medicine: A Global Journal for Publishing Interdisciplinary Innovations, Research, and Perspectives JO - JMIR Perioper Med SP - e54344 VL - 6 KW - JMIR Perioperative Medicine KW - innovation KW - technology KW - digital health KW - research KW - interdisciplinary KW - perioperative medicine UR - https://periop.jmir.org/2023/1/e54344 UR - http://dx.doi.org/10.2196/54344 UR - http://www.ncbi.nlm.nih.gov/pubmed/37988142 ID - info:doi/10.2196/54344 ER - TY - JOUR AU - Nakanishi, Kozo AU - Goto, Hidenori PY - 2023/11/14 TI - A New Index for the Quantitative Evaluation of Surgical Invasiveness Based on Perioperative Patients' Behavior Patterns: Machine Learning Approach Using Triaxial Acceleration JO - JMIR Perioper Med SP - e50188 VL - 6 KW - surgery KW - invasiveness KW - triaxial acceleration KW - machine learning KW - human activity recognition KW - patient-oriented outcome KW - video-assisted thoracoscopic surgery KW - VATS KW - postoperative recovery KW - perioperative management KW - artificial intelligence KW - AI KW - mobile phone N2 - Background: The minimally invasive nature of thoracoscopic surgery is well recognized; however, the absence of a reliable evaluation method remains challenging.
We hypothesized that the postoperative recovery speed is closely linked to surgical invasiveness, where recovery signifies the patient's behavior transition back to their preoperative state during the perioperative period. Objective: This study aims to determine whether machine learning using triaxial acceleration data can effectively capture perioperative behavior changes and establish a quantitative index for quantifying variations in surgical invasiveness. Methods: We trained 7 distinct machine learning models using a publicly available human acceleration data set as supervised data. The 3 top-performing models were selected to predict patient actions, as determined by the Matthews correlation coefficient scores. Two patients who underwent different levels of invasive thoracoscopic surgery were selected as participants. Acceleration data were collected via chest sensors for 8 hours during the preoperative and postoperative hospitalization days. These data were categorized into 4 actions (walking, standing, sitting, and lying down) using the selected models. The actions predicted by the model with intermediate results were adopted as the actions of the participants. The daily appearance probability was calculated for each action. The 2 differences between 2 appearance probabilities (sitting vs standing and lying down vs walking) were calculated using 2 coordinates on the x- and y-axes. A 2D vector composed of coordinate values was defined as the index of behavior pattern (iBP) for the day. All daily iBPs were graphed, and the enclosed area and distance between points were calculated and compared between participants to assess the relationship between changes in the indices and invasiveness. Results: Patients 1 and 2 underwent lung lobectomy and incisional tumor biopsy, respectively. The selected predictive model was a light-gradient boosting model (mean Matthews correlation coefficient 0.98, SD 0.0027; accuracy: 0.98).
The acceleration data yielded 548,466 points for patient 1 and 466,407 points for patient 2. The iBPs of patient 1 were [(0.32, 0.19), (-0.098, 0.46), (-0.15, 0.13), (-0.049, 0.22)] and those of patient 2 were [(0.55, 0.30), (0.77, 0.21), (0.60, 0.25), (0.61, 0.31)]. The enclosed areas were 0.077 and 0.0036 for patients 1 and 2, respectively. Notably, the distances for patient 1 were greater than those for patient 2 ({0.44, 0.46, 0.37, 0.26} vs {0.23, 0.0065, 0.059}; P=.03 [Mann-Whitney U test]). Conclusions: The selected machine learning model effectively predicted the actions of the surgical patients with high accuracy. The temporal distribution of action times revealed changes in behavior patterns during the perioperative phase. The proposed index may facilitate the recognition and visualization of perioperative changes in patients and differences in surgical invasiveness. UR - https://periop.jmir.org/2023/1/e50188 UR - http://dx.doi.org/10.2196/50188 UR - http://www.ncbi.nlm.nih.gov/pubmed/37962919 ID - info:doi/10.2196/50188 ER - TY - JOUR AU - Matsumoto, Koutarou AU - Nohara, Yasunobu AU - Sakaguchi, Mikako AU - Takayama, Yohei AU - Fukushige, Syota AU - Soejima, Hidehisa AU - Nakashima, Naoki AU - Kamouchi, Masahiro PY - 2023/10/26 TI - Temporal Generalizability of Machine Learning Models for Predicting Postoperative Delirium Using Electronic Health Record Data: Model Development and Validation Study JO - JMIR Perioper Med SP - e50895 VL - 6 KW - postoperative delirium KW - prediction model KW - machine learning KW - temporal generalizability KW - electronic health record data N2 - Background: Although machine learning models demonstrate significant potential in predicting postoperative delirium, the advantages of their implementation in real-world settings remain unclear and require a comparison with conventional models in practical applications.
Objective: The objective of this study was to validate the temporal generalizability of decision tree ensemble and sparse linear regression models for predicting delirium after surgery compared with that of the traditional logistic regression model. Methods: The health record data of patients hospitalized at an advanced emergency and critical care medical center in Kumamoto, Japan, were collected electronically. We developed a decision tree ensemble model using extreme gradient boosting (XGBoost) and a sparse linear regression model using least absolute shrinkage and selection operator (LASSO) regression. To evaluate the predictive performance of the model, we used the area under the receiver operating characteristic curve (AUROC) and the Matthews correlation coefficient (MCC) to measure discrimination and the slope and intercept of the regression between predicted and observed probabilities to measure calibration. The Brier score was evaluated as an overall performance metric. We included 11,863 consecutive patients who underwent surgery with general anesthesia between December 2017 and February 2022. The patients were divided into a derivation cohort before the COVID-19 pandemic and a validation cohort during the COVID-19 pandemic. Postoperative delirium was diagnosed according to the confusion assessment method. Results: A total of 6497 patients (mean age 68.5, SD 14.4 years, women n=2627, 40.4%) were included in the derivation cohort, and 5366 patients (mean age 67.8, SD 14.6 years, women n=2105, 39.2%) were included in the validation cohort. Regarding discrimination, the XGBoost model (AUROC 0.87-0.90 and MCC 0.34-0.44) did not significantly outperform the LASSO model (AUROC 0.86-0.89 and MCC 0.34-0.41).
The logistic regression model (AUROC 0.84-0.88, MCC 0.33-0.40, slope 1.01-1.19, intercept -0.16 to 0.06, and Brier score 0.06-0.07), with 8 predictors (age, intensive care unit, neurosurgery, emergency admission, anesthesia time, BMI, blood loss during surgery, and use of an ambulance) achieved good predictive performance. Conclusions: The XGBoost model did not significantly outperform the LASSO model in predicting postoperative delirium. Furthermore, a parsimonious logistic model with a few important predictors achieved comparable performance to machine learning models in predicting postoperative delirium. UR - https://periop.jmir.org/2023/1/e50895 UR - http://dx.doi.org/10.2196/50895 UR - http://www.ncbi.nlm.nih.gov/pubmed/37883164 ID - info:doi/10.2196/50895 ER - TY - JOUR AU - Bottani, Eleonora AU - Bellini, Valentina AU - Mordonini, Monica AU - Pellegrino, Mattia AU - Lombardo, Gianfranco AU - Franchi, Beatrice AU - Craca, Michelangelo AU - Bignami, Elena PY - 2023/7/5 TI - Internet of Things and New Technologies for Tracking Perioperative Patients With an Innovative Model for Operating Room Scheduling: Protocol for a Development and Feasibility Study JO - JMIR Res Protoc SP - e45477 VL - 12 KW - internet of things KW - artificial intelligence KW - machine learning KW - perioperative organization KW - operating rooms N2 - Background: Management of operating rooms is a critical point in health care organizations because surgical departments represent a significant cost in hospital budgets. Therefore, it is increasingly important that there is effective planning of elective, emergency, and day surgery and optimization of both the human and physical resources available, always maintaining a high level of care and health treatment. This would lead to a reduction in patient waiting lists and better performance not only of surgical departments but also of the entire hospital.
Objective: This study aims to automatically collect data from a real surgical scenario to develop an integrated technological-organizational model that optimizes operating block resources. Methods: Each patient is tracked and located in real time by wearing a bracelet sensor with a unique identifier. Exploiting the indoor location, the software architecture is able to collect the time spent for every step inside the surgical block. This method does not in any way affect the level of assistance that the patient receives and always protects their privacy; in fact, after expressing informed consent, each patient will be associated with an anonymous identification number. Results: The preliminary results are promising, making the study feasible and functional. Times automatically recorded are much more precise than those collected by humans and reported in the organization's information system. In addition, machine learning can exploit the historical data collection to predict the surgery time required for each patient according to the patient's specific profile. Simulation can also be applied to reproduce the system's functioning, evaluate current performance, and identify strategies to improve the efficiency of the operating block. Conclusions: This functional approach improves short- and long-term surgical planning, facilitating interaction between the various professionals involved in the operating block, optimizing the management of available resources, and guaranteeing a high level of patient care in an increasingly efficient health care system. Trial Registration: ClinicalTrials.gov NCT05106621; https://clinicaltrials.gov/ct2/show/NCT05106621 International Registered Report Identifier (IRRID): DERR1-10.2196/45477 UR - https://www.researchprotocols.org/2023/1/e45477 UR - http://dx.doi.org/10.2196/45477 UR - http://www.ncbi.nlm.nih.gov/pubmed/37405821 ID - info:doi/10.2196/45477 ER - TY - JOUR AU - Zoodsma, S.
Ruben AU - Bosch, Rian AU - Alderliesten, Thomas AU - Bollen, W. Casper AU - Kappen, H. Teus AU - Koomen, Erik AU - Siebes, Arno AU - Nijman, Joppe PY - 2023/5/16 TI - Continuous Data-Driven Monitoring in Critical Congenital Heart Disease: Clinical Deterioration Model Development JO - JMIR Cardio SP - e45190 VL - 7 KW - artificial intelligence KW - aberration detection KW - clinical deterioration KW - classification model KW - paediatric intensive care KW - pediatric intensive care KW - congenital heart disease KW - cardiac monitoring KW - machine learning KW - peri-operative KW - perioperative KW - surgery N2 - Background: Critical congenital heart disease (cCHD), which requires cardiac intervention in the first year of life for survival, occurs globally in 2-3 of every 1000 live births. In the critical perioperative period, intensive multimodal monitoring at a pediatric intensive care unit (PICU) is warranted, as their organs, especially the brain, may be severely injured due to hemodynamic and respiratory events. These 24/7 clinical data streams yield large quantities of high-frequency data, which are challenging in terms of interpretation due to the varying and dynamic physiology innate to cCHD. Through advanced data science algorithms, these dynamic data can be condensed into comprehensible information, reducing the cognitive load on the medical team and providing data-driven monitoring support through automated detection of clinical deterioration, which may facilitate timely intervention. Objective: This study aimed to develop a clinical deterioration detection algorithm for PICU patients with cCHD. Methods: Retrospectively, synchronous per-second data of cerebral regional oxygen saturation (rSO2) and 4 vital parameters (respiratory rate, heart rate, oxygen saturation, and invasive mean blood pressure) in neonates with cCHD admitted to the University Medical Center Utrecht, the Netherlands, between 2002 and 2018 were extracted. 
Patients were stratified based on mean oxygen saturation during admission to account for physiological differences between acyanotic and cyanotic cCHD. Each subset was used to train our algorithm in classifying data as either stable, unstable, or sensor dysfunction. The algorithm was designed to detect combinations of parameters abnormal to the stratified subpopulation and significant deviations from the patient's unique baseline, which were further analyzed to distinguish clinical improvement from deterioration. Novel data were used for testing, visualized in detail, and internally validated by pediatric intensivists. Results: A retrospective query yielded 4600 hours and 209 hours of per-second data in 78 and 10 neonates for, respectively, training and testing purposes. During testing, stable episodes occurred 153 times, of which 134 (88%) were correctly detected. Unstable episodes were correctly noted in 46 of 57 (81%) observed episodes. Twelve expert-confirmed unstable episodes were missed in testing. Time-percentual accuracy was 93% and 77% for, respectively, stable and unstable episodes. A total of 138 sensorial dysfunctions were detected, of which 130 (94%) were correct. Conclusions: In this proof-of-concept study, a clinical deterioration detection algorithm was developed and retrospectively evaluated to classify clinical stability and instability, achieving reasonable performance considering the heterogeneous population of neonates with cCHD. Combined analysis of baseline (ie, patient-specific) deviations and simultaneous parameter-shifting (ie, population-specific) proofs would be promising with respect to enhancing applicability to heterogeneous critically ill pediatric populations. After prospective validation, the current (and comparable) models may, in the future, be used in the automated detection of clinical deterioration and eventually provide data-driven monitoring support to the medical team, allowing for timely intervention. 
UR - https://cardio.jmir.org/2023/1/e45190 UR - http://dx.doi.org/10.2196/45190 UR - http://www.ncbi.nlm.nih.gov/pubmed/37191988 ID - info:doi/10.2196/45190 ER - TY - JOUR AU - Sun, Peng AU - Zhao, Yao AU - Men, Jie AU - Ma, Zhe-Ru AU - Jiang, Hao-Zhuo AU - Liu, Cheng-Yan AU - Feng, Wei PY - 2023/3/10 TI - Application of Virtual and Augmented Reality Technology in Hip Surgery: Systematic Review JO - J Med Internet Res SP - e37599 VL - 25 KW - virtual reality KW - augmented reality KW - hip KW - pelvis KW - arthroplasty KW - mobile phone N2 - Background: Virtual and augmented reality (VAR) represents a combination of current state-of-the-art computer and imaging technologies and has the potential to be a revolutionary technology in many surgical fields. An increasing number of investigators have developed and applied VAR in hip-related surgery with the aim of using this technology to reduce hip surgery-related complications, improve surgical success rates, and reduce surgical risks. These technologies are beginning to be widely used in hip-related preoperative operation simulation and training, intraoperative navigation tools in the operating room, and postoperative rehabilitation. Objective: With the aim of reviewing the current status of virtual reality (VR) and augmented reality (AR) in hip-related surgery and summarizing its benefits, we discussed and briefly described the applicability, advantages, limitations, and future perspectives of VR and AR techniques in hip-related surgery, such as preoperative operation simulation and training; explored the possible future applications of AR in the operating room; and discussed the bright prospects of VR and AR technologies in postoperative rehabilitation after hip surgery. Methods: We searched the PubMed and Web of Science databases using the following key search terms: ("virtual reality" OR "augmented reality") AND ("pelvis" OR "hip"). 
The literature on basic and clinical research related to the aforementioned key search terms, that is, studies evaluating the key factors, challenges, or problems of using VAR technology in hip-related surgery, was collected. Results: A total of 40 studies and reports were included and classified into the following categories: total hip arthroplasty, hip resurfacing, femoral neck fracture, pelvic fracture, acetabular fracture, tumor, arthroscopy, and postoperative rehabilitation. Quality assessment could be performed in 30 studies. Among the clinical studies, there were 16 case series with an average score of 89 out of 100 points (89%) and 1 case report that scored 81 (SD 10.11) out of 100 points (81%) according to the Joanna Briggs Institute Critical Appraisal Checklist. Two cadaveric studies scored 85 of 100 points (85%) and 92 of 100 points (92%) according to the Quality Appraisal for Cadaveric Studies scale. Conclusions: VR and AR technologies hold great promise for hip-related surgeries, especially for preoperative operation simulation and training, feasibility applications in the operating room, and postoperative rehabilitation, and have the potential to assist orthopedic surgeons in operating more accurately and safely. More comparative studies are necessary, including studies focusing on clinical outcomes and cost-effectiveness. 
UR - https://www.jmir.org/2023/1/e37599 UR - http://dx.doi.org/10.2196/37599 UR - http://www.ncbi.nlm.nih.gov/pubmed/36651587 ID - info:doi/10.2196/37599 ER - TY - JOUR AU - Gabriel, Allanigue Rodney AU - Simpson, Sierra AU - Zhong, William AU - Burton, Nicole Brittany AU - Mehdipour, Soraya AU - Said, Tadros Engy PY - 2023/2/8 TI - A Neural Network Model Using Pain Score Patterns to Predict the Need for Outpatient Opioid Refills Following Ambulatory Surgery: Algorithm Development and Validation JO - JMIR Perioper Med SP - e40455 VL - 6 KW - opioids KW - ambulatory surgery KW - machine learning KW - surgery KW - outpatient KW - pain medication KW - pain KW - pain management KW - patient needs KW - predict KW - algorithms KW - clinical decision support KW - pain care N2 - Background: Expansion of clinical guidance tools is crucial to identify patients at risk of requiring an opioid refill after outpatient surgery. Objective: The objective of this study was to develop machine learning algorithms incorporating pain and opioid features to predict the need for outpatient opioid refills following ambulatory surgery. Methods: Neural networks, regression, random forest, and a support vector machine were used to evaluate the data set. For each model, oversampling and undersampling techniques were implemented to balance the data set. Hyperparameter tuning based on k-fold cross-validation was performed, and feature importance was ranked based on a Shapley Additive Explanations (SHAP) explainer model. To assess performance, we calculated the average area under the receiver operating characteristics curve (AUC), F1-score, sensitivity, and specificity for each model. Results: There were 1333 patients, of whom 144 (10.8%) refilled their opioid prescription within 2 weeks after outpatient surgery. The average AUC calculated from k-fold cross-validation was 0.71 for the neural network model. When the model was validated on the test set, the AUC was 0.75. 
The features with the highest impact on model output were performance of a regional nerve block, postanesthesia care unit maximum pain score, postanesthesia care unit median pain score, active smoking history, and total perioperative opioid consumption. Conclusions: Applying machine learning algorithms allows providers to better predict outcomes that require specialized health care resources such as transitional pain clinics. This model can serve as clinical decision support for early identification of at-risk patients who may benefit from transitional pain clinic care perioperatively in ambulatory surgery. UR - https://periop.jmir.org/2023/1/e40455 UR - http://dx.doi.org/10.2196/40455 UR - http://www.ncbi.nlm.nih.gov/pubmed/36753316 ID - info:doi/10.2196/40455 ER - TY - JOUR AU - Mlodzinski, Eric AU - Wardi, Gabriel AU - Viglione, Clare AU - Nemati, Shamim AU - Crotty Alexander, Laura AU - Malhotra, Atul PY - 2023/1/27 TI - Assessing Barriers to Implementation of Machine Learning and Artificial Intelligence-Based Tools in Critical Care: Web-Based Survey Study JO - JMIR Perioper Med SP - e41056 VL - 6 KW - surveys and questionnaires KW - machine learning KW - artificial intelligence KW - critical care KW - respiratory insufficiency KW - survey KW - Qualtrics KW - questionnaire KW - perception KW - trust KW - perspective KW - attitude KW - intubation KW - predict KW - barrier KW - adoption KW - implementation N2 - Background: Although there is considerable interest in machine learning (ML) and artificial intelligence (AI) in critical care, the implementation of effective algorithms into practice has been limited. Objective: We sought to understand physician perspectives of a novel intubation prediction tool. Further, we sought to understand health care provider and nonprovider perspectives on the use of ML in health care. 
We aim to use the data gathered to elucidate implementation barriers and determinants of this intubation prediction tool, as well as ML/AI-based algorithms in critical care and health care in general. Methods: We developed 2 anonymous surveys in Qualtrics, 1 single-center survey distributed to 99 critical care physicians via email, and 1 social media survey distributed via Facebook and Twitter with branching logic to tailor questions for providers and nonproviders. The surveys included a mixture of categorical, Likert scale, and free-text items. Likert scale means with SD were reported from 1 to 5. We used Student t tests to examine the differences between groups. In addition, Likert scale responses were converted into 3 categories, and percentage values were reported in order to demonstrate the distribution of responses. Qualitative free-text responses were reviewed by a member of the study team to determine validity, and content analysis was performed to determine common themes in responses. Results: Out of 99 critical care physicians, 47 (48%) completed the single-center survey. Perceived knowledge of ML was low with a mean Likert score of 2.4 out of 5 (SD 0.96), with 7.5% of respondents rating their knowledge as a 4 or 5. The willingness to use the ML-based algorithm was 3.32 out of 5 (SD 0.95), with 75% of respondents answering 3 out of 5. The social media survey had 770 total responses with 605 (79%) providers and 165 (21%) nonproviders. We found no difference in providers' perceived knowledge based on level of experience in either survey. We found that nonproviders had significantly less perceived knowledge of ML (mean 3.04 out of 5, SD 1.53 vs mean 3.43, SD 0.941; P<.001) and comfort with ML (mean 3.28 out of 5, SD 1.02 vs mean 3.53, SD 0.935; P=.004) than providers. Free-text responses revealed multiple shared concerns, including accuracy/reliability, data bias, patient safety, and privacy/security risks. 
Conclusions: These data suggest that providers and nonproviders have positive perceptions of ML-based tools, and that a tool to predict the need for intubation would be of interest to critical care providers. There were many shared concerns about ML/AI in health care elucidated by the surveys. These results provide a baseline evaluation of implementation barriers and determinants of ML/AI-based tools that will be important in their optimal implementation and adoption in the critical care setting and health care in general. UR - https://periop.jmir.org/2023/1/e41056 UR - http://dx.doi.org/10.2196/41056 UR - http://www.ncbi.nlm.nih.gov/pubmed/36705960 ID - info:doi/10.2196/41056 ER - TY - JOUR AU - Gabriel, Allanigue Rodney AU - Harjai, Bhavya AU - Simpson, Sierra AU - Du, Liu Austin AU - Tully, Logan Jeffrey AU - George, Olivier AU - Waterman, Ruth PY - 2023/1/26 TI - An Ensemble Learning Approach to Improving Prediction of Case Duration for Spine Surgery: Algorithm Development and Validation JO - JMIR Perioper Med SP - e39650 VL - 6 KW - ensemble learning KW - machine learning KW - spine surgery KW - case duration KW - prediction accuracy KW - operating room efficiency KW - learning KW - surgery KW - spine KW - operating room KW - case KW - model KW - patient KW - surgeon KW - linear regression KW - accuracy KW - estimation KW - time N2 - Background: Estimating surgical case duration accurately is an important operating room efficiency metric. Current predictive techniques in spine surgery include less sophisticated approaches such as classical multivariable statistical models. Machine learning approaches have been used to predict outcomes such as length of stay and time returning to normal work, but have not been focused on case duration. Objective: The primary objective of this 4-year, single-academic-center, retrospective study was to use an ensemble learning approach that may improve the accuracy of scheduled case duration for spine surgery. 
The primary outcome measure was case duration. Methods: We compared machine learning models using surgical and patient features to our institutional method, which used historic averages and surgeon adjustments as needed. We implemented multivariable linear regression, random forest, bagging, and XGBoost (Extreme Gradient Boosting) and calculated the average R2, root-mean-square error (RMSE), explained variance, and mean absolute error (MAE) using k-fold cross-validation. We then used the SHAP (Shapley Additive Explanations) explainer model to determine feature importance. Results: A total of 3189 patients who underwent spine surgery were included. The institution's current method of predicting case times had a very poor coefficient of determination with actual times (R2=0.213). On k-fold cross-validation, the linear regression model had an explained variance score of 0.345, an R2 of 0.34, an RMSE of 162.84 minutes, and an MAE of 127.22 minutes. Among all models, the XGBoost regressor performed the best with an explained variance score of 0.778, an R2 of 0.770, an RMSE of 92.95 minutes, and an MAE of 44.31 minutes. Based on SHAP analysis of the XGBoost regression, body mass index, spinal fusions, surgical procedure, and number of spine levels involved were the features with the most impact on the model. Conclusions: Using ensemble learning-based predictive models, specifically XGBoost regression, can improve the accuracy of the estimation of spine surgery times. 
UR - https://periop.jmir.org/2023/1/e39650 UR - http://dx.doi.org/10.2196/39650 UR - http://www.ncbi.nlm.nih.gov/pubmed/36701181 ID - info:doi/10.2196/39650 ER - TY - JOUR AU - Jozsa, Felix AU - Baker, Rose AU - Kelly, Peter AU - Ahmed, Muneer AU - Douek, Michael PY - 2022/11/15 TI - The Use of Machine Learning to Reduce Overtreatment of the Axilla in Breast Cancer: Retrospective Cohort Study JO - JMIR Perioper Med SP - e34600 VL - 5 IS - 1 KW - breast cancer KW - preoperative screening KW - machine learning KW - artificial intelligence KW - artificial neural network KW - breast KW - cancer KW - axillary node KW - metastasis KW - metastatic KW - preoperative KW - axillary clearance KW - metastases KW - oncology N2 - Background: Patients with early breast cancer undergoing primary surgery, who have low axillary nodal burden, can safely forego axillary node clearance (ANC). However, routine use of axillary ultrasound (AUS) leads to 43% of patients in this group having ANC unnecessarily, following a positive AUS. The intersection of machine learning with medicine can provide innovative ways to understand specific risks within large patient data sets, but this has not yet been trialed in the arena of axillary node management in breast cancer. Objective: The objective of this study was to assess if machine learning techniques could be used to improve preoperative identification of patients with low and high axillary metastatic burden. Methods: A single-center retrospective analysis was performed on patients with breast cancer who had a preoperative AUS, and the specificity and sensitivity of AUS were calculated. Standard statistical methods and machine learning methods, including artificial neural network, naive Bayes, support vector machine, and random forest, were applied to the data to see if they could improve the accuracy of preoperative AUS to better discern high and low axillary burden. 
Results: The study included 459 patients; 142 (31%) had a positive AUS; among this group, 88 (62%) had 2 or fewer macrometastatic nodes at ANC. Logistic regression outperformed AUS (specificity 0.950 vs 0.809). Of all the methods, the artificial neural network had the highest accuracy (0.919). Interestingly, AUS had the highest sensitivity of all methods (0.777), underlining its utility in this setting. Conclusions: We demonstrated that machine learning improves identification of the important subgroup of patients with no palpable axillary disease, positive ultrasound, and more than 2 metastatically involved nodes. A negative ultrasound in patients with no palpable lymphadenopathy is highly indicative of low axillary burden, and it is unclear whether sentinel node biopsy adds value in this situation. Further studies with larger patient numbers focusing on specific breast cancer subgroups are required to refine these techniques in this setting. UR - https://periop.jmir.org/2022/1/e34600 UR - http://dx.doi.org/10.2196/34600 UR - http://www.ncbi.nlm.nih.gov/pubmed/36378516 ID - info:doi/10.2196/34600 ER - TY - JOUR AU - Bardia, Amit AU - Deshpande, Ranjit AU - Michel, George AU - Yanez, David AU - Dai, Feng AU - Pace, L. Nathan AU - Schuster, Kevin AU - Mathis, R. Michael AU - Kheterpal, Sachin AU - Schonberger, B. 
Robert PY - 2022/10/5 TI - Demonstration and Performance Evaluation of Two Novel Algorithms for Removing Artifacts From Automated Intraoperative Temperature Data Sets: Multicenter, Observational, Retrospective Study JO - JMIR Perioper Med SP - e37174 VL - 5 IS - 1 KW - temperature KW - intraoperative KW - artifacts KW - algorithms KW - perioperative KW - surgery KW - temperature probe KW - artifact reduction KW - data acquisition KW - accuracy N2 - Background: The automated acquisition of intraoperative patient temperature data via temperature probes leads to the possibility of producing a number of artifacts related to probe positioning that may impact these probes' utility for observational research. Objective: We sought to compare the performance of two de novo algorithms for filtering such artifacts. Methods: In this observational retrospective study, the intraoperative temperature data of adults who received general anesthesia for noncardiac surgery were extracted from the Multicenter Perioperative Outcomes Group registry. Two algorithms were developed and then compared to the reference standard: anesthesiologists' manual artifact detection process. Algorithm 1 (a slope-based algorithm) was based on the linear curve fit of 3 adjacent temperature data points. Algorithm 2 (an interval-based algorithm) assessed for time gaps between contiguous temperature recordings. Sensitivity and specificity values for artifact detection were calculated for each algorithm, as were mean temperatures and areas under the curve for hypothermia (temperatures below 36 °C) for each patient, after artifact removal via each methodology. Results: A total of 27,683 temperature readings from 200 anesthetic records were analyzed. The overall agreement among the anesthesiologists was 92.1%. 
Both algorithms had high specificity but moderate sensitivity (specificity: 99.02% for algorithm 1 vs 99.54% for algorithm 2; sensitivity: 49.13% for algorithm 1 vs 37.72% for algorithm 2; F-score: 0.65 for algorithm 1 vs 0.55 for algorithm 2). The areas under the curve for time × hypothermic temperature and the mean temperatures recorded for each case after artifact removal were similar between the algorithms and the anesthesiologists. Conclusions: The tested algorithms provide an automated way to filter intraoperative temperature artifacts that closely approximates manual sorting by anesthesiologists. Our study provides evidence demonstrating the efficacy of highly generalizable artifact reduction algorithms that can be readily used by observational studies that rely on automated intraoperative data acquisition. UR - https://periop.jmir.org/2022/1/e37174 UR - http://dx.doi.org/10.2196/37174 UR - http://www.ncbi.nlm.nih.gov/pubmed/36197702 ID - info:doi/10.2196/37174 ER - TY - JOUR AU - McLeod, Graeme AU - Kennedy, Iain AU - Simpson, Eilidh AU - Joss, Judith AU - Goldmann, Katriona PY - 2022/3/30 TI - Pilot Project for a Web-Based Dynamic Nomogram to Predict Survival 1 Year After Hip Fracture Surgery: Retrospective Observational Study JO - Interact J Med Res SP - e34096 VL - 11 IS - 1 KW - hip fracture KW - survival KW - prediction KW - nomogram KW - web KW - surgery KW - postoperative KW - machine learning KW - model KW - mortality KW - hip KW - fracture N2 - Background: Hip fracture is associated with high mortality. Identification of individual risk informs anesthetic and surgical decision-making and can reduce the risk of death. However, interpreting mathematical models and applying them in clinical practice can be difficult. There is a need to simplify risk indices for clinicians and laypeople alike. Objective: Our primary objective was to develop a web-based nomogram for prediction of survival up to 365 days after hip fracture surgery. 
Methods: We collected data from 329 patients. Our variables included sex; age; BMI; white cell count; levels of lactate, creatinine, hemoglobin, and C-reactive protein; physical status according to the American Society of Anesthesiologists Physical Status Classification System; socioeconomic status; duration of surgery; total time in the operating room; side of surgery; and procedure urgency. Thereafter, we internally calibrated and validated a Cox proportional hazards model of survival 365 days after hip fracture surgery; logistic regression models of survival 30, 120, and 365 days after surgery; and a binomial model. To present the models on a laptop, tablet, or mobile phone in a user-friendly way, we built an app using Shiny (RStudio). The app showed a drop-down box for model selection and horizontal sliders for data entry, model summaries, and prediction and survival plots. A slider represented patient follow-up over 365 days. Results: Of the 329 patients, 24 (7.3%) died within 30 days of surgery, 65 (19.8%) within 120 days, and 94 (28.6%) within 365 days. In all models, the independent predictors of mortality were age, BMI, creatinine level, and lactate level. The logistic model also incorporated white cell count as a predictor. The Cox proportional hazards model showed that mortality differed as follows: age 80 vs 60 years had a hazard ratio (HR) of 0.6 (95% CI 0.3-1.1), a plasma lactate level of 2 vs 1 mmol/L had an HR of 2.4 (95% CI 1.5-3.9), and a plasma creatinine level of 60 vs 90 μmol/L had an HR of 2.3 (95% CI 1.3-3.9). Conclusions: In conclusion, we provide an easy-to-read web-based nomogram that predicts survival up to 365 days after hip fracture. The Cox proportional hazards model and logistic models showed good discrimination, with concordance index values of 0.732 and 0.781, respectively. 
UR - https://www.i-jmr.org/2022/1/e34096 UR - http://dx.doi.org/10.2196/34096 UR - http://www.ncbi.nlm.nih.gov/pubmed/35238320 ID - info:doi/10.2196/34096 ER - TY - JOUR AU - Shin, Jeong Seo AU - Park, Jungchan AU - Lee, Seung-Hwa AU - Yang, Kwangmo AU - Park, Woong Rae PY - 2021/10/14 TI - Predictability of Mortality in Patients With Myocardial Injury After Noncardiac Surgery Based on Perioperative Factors via Machine Learning: Retrospective Study JO - JMIR Med Inform SP - e32771 VL - 9 IS - 10 KW - myocardial injury after noncardiac surgery KW - high-sensitivity cardiac troponin KW - machine learning KW - extreme gradient boosting N2 - Background: Myocardial injury after noncardiac surgery (MINS) is associated with increased postoperative mortality, but the relevant perioperative factors that contribute to the mortality of patients with MINS have not been fully evaluated. Objective: To establish a comprehensive body of knowledge relating to patients with MINS, we researched the best performing predictive model based on machine learning algorithms. Methods: Using clinical data from 7629 patients with MINS from the clinical data warehouse, we evaluated 8 machine learning algorithms for accuracy, precision, recall, F1 score, area under the receiver operating characteristic (AUROC) curve, and area under the precision-recall curve to investigate the best model for predicting mortality. Feature importance and Shapley Additive Explanations values were analyzed to explain the role of each clinical factor in patients with MINS. Results: Extreme gradient boosting outperformed the other models. The model showed an AUROC of 0.923 (95% CI 0.916-0.930). The AUROC of the model did not decrease in the test data set (0.894, 95% CI 0.86-0.922; P=.06). Antiplatelet drugs prescription, elevated C-reactive protein level, and beta blocker prescription were associated with reduced 30-day mortality. 
Conclusions: Predicting the mortality of patients with MINS was shown to be feasible using machine learning. By analyzing the impact of predictors, markers that should be cautiously monitored by clinicians may be identified. UR - https://medinform.jmir.org/2021/10/e32771 UR - http://dx.doi.org/10.2196/32771 UR - http://www.ncbi.nlm.nih.gov/pubmed/34647900 ID - info:doi/10.2196/32771 ER - TY - JOUR AU - Conway, Aaron AU - Jungquist, R. Carla AU - Chang, Kristina AU - Kamboj, Navpreet AU - Sutherland, Joanna AU - Mafeld, Sebastian AU - Parotto, Matteo PY - 2021/10/5 TI - Predicting Prolonged Apnea During Nurse-Administered Procedural Sedation: Machine Learning Study JO - JMIR Perioper Med SP - e29200 VL - 4 IS - 2 KW - procedural sedation and analgesia KW - conscious sedation KW - nursing KW - informatics KW - patient safety KW - machine learning KW - capnography KW - anesthesia KW - anaesthesia KW - medical informatics KW - sleep apnea KW - apnea KW - apnoea KW - sedation N2 - Background: Capnography is commonly used for nurse-administered procedural sedation. Distinguishing between capnography waveform abnormalities that signal the need for clinical intervention for an event and those that do not indicate the need for intervention is essential for the successful implementation of this technology into practice. It is possible that capnography alarm management may be improved by using machine learning to create a "smart alarm" that can alert clinicians to apneic events that are predicted to be prolonged. Objective: To determine the accuracy of machine learning models for predicting at the 15-second time point if apnea will be prolonged (ie, apnea that persists for >30 seconds). Methods: A secondary analysis of an observational study was conducted. 
We selected several candidate models to evaluate, including a random forest model, generalized linear model (logistic regression), least absolute shrinkage and selection operator regression, ridge regression, and the XGBoost model. Out-of-sample accuracy of the models was calculated using 10-fold cross-validation. The net benefit decision analytic measure was used to assist with deciding whether using the models in practice would lead to better outcomes on average than using the current default capnography alarm management strategies. The default strategies are the aggressive approach, in which an alarm is triggered after brief periods of apnea (typically 15 seconds), and the conservative approach, in which an alarm is triggered for only prolonged periods of apnea (typically >30 seconds). Results: A total of 384 apneic events longer than 15 seconds were observed in 61 of the 102 patients (59.8%) who participated in the observational study. Nearly half of the apneic events (180/384, 46.9%) were prolonged. The random forest model performed the best in terms of discrimination (area under the receiver operating characteristic curve 0.66) and calibration. The net benefit associated with the random forest model exceeded that associated with the aggressive strategy but was lower than that associated with the conservative strategy. Conclusions: Decision curve analysis indicated that using a random forest model would lead to a better outcome for capnography alarm management than using an aggressive strategy in which alarms are triggered after 15 seconds of apnea. The model would not be superior to the conservative strategy in which alarms are only triggered after 30 seconds. 
UR - https://periop.jmir.org/2021/2/e29200 UR - http://dx.doi.org/10.2196/29200 UR - http://www.ncbi.nlm.nih.gov/pubmed/34609322 ID - info:doi/10.2196/29200 ER - TY - JOUR AU - Choe, Sooho AU - Park, Eunjeong AU - Shin, Wooseok AU - Koo, Bonah AU - Shin, Dongjin AU - Jung, Chulwoo AU - Lee, Hyungchul AU - Kim, Jeongmin PY - 2021/9/30 TI - Short-Term Event Prediction in the Operating Room (STEP-OP) of Five-Minute Intraoperative Hypotension Using Hybrid Deep Learning: Retrospective Observational Study and Model Development JO - JMIR Med Inform SP - e31311 VL - 9 IS - 9 KW - arterial pressure KW - artificial intelligence KW - biosignals KW - deep learning KW - hypotension KW - machine learning N2 - Background: Intraoperative hypotension has an adverse impact on postoperative outcomes. However, it is difficult to predict and treat intraoperative hypotension in advance according to individual clinical parameters. Objective: The aim of this study was to develop a prediction model to forecast 5-minute intraoperative hypotension based on the weighted average ensemble of individual neural networks, utilizing the biosignals recorded during noncardiac surgery. Methods: In this retrospective observational study, arterial waveforms were recorded during noncardiac operations performed between August 2016 and December 2019, at Seoul National University Hospital, Seoul, South Korea. We analyzed the arterial waveforms from the big data in the VitalDB repository of electronic health records. We defined 2s hypotension as the moving average of arterial pressure under 65 mmHg for 2 seconds, and intraoperative hypotensive events were defined when the 2s hypotension lasted for at least 60 seconds. We developed an artificial intelligence-enabled process, named short-term event prediction in the operating room (STEP-OP), for predicting short-term intraoperative hypotension. Results: The study was performed on 18,813 subjects undergoing noncardiac surgeries. 
Deep-learning algorithms (convolutional neural network [CNN] and recurrent neural network [RNN]) using raw waveforms as input achieved greater area under the precision-recall curve (AUPRC) scores (0.698, 95% CI 0.690-0.705 and 0.706, 95% CI 0.698-0.715, respectively) than the logistic regression algorithm (0.673, 95% CI 0.665-0.682). STEP-OP performed best, with a greater AUPRC (0.716, 95% CI 0.708-0.723) than either the RNN or CNN algorithm. Conclusions: We developed STEP-OP as a weighted average of deep-learning models. STEP-OP predicts intraoperative hypotension more accurately than the CNN, RNN, and logistic regression models. Trial Registration: ClinicalTrials.gov NCT02914444; https://clinicaltrials.gov/ct2/show/NCT02914444. UR - https://medinform.jmir.org/2021/9/e31311 UR - http://dx.doi.org/10.2196/31311 UR - http://www.ncbi.nlm.nih.gov/pubmed/34591024 ID - info:doi/10.2196/31311 ER - TY - JOUR AU - Naqvi, Ali Syed Asil AU - Tennankore, Karthik AU - Vinson, Amanda AU - Roy, C. Patrice AU - Abidi, Raza Syed Sibte PY - 2021/8/27 TI - Predicting Kidney Graft Survival Using Machine Learning Methods: Prediction Model Development and Feature Significance Analysis Study JO - J Med Internet Res SP - e26843 VL - 23 IS - 8 KW - kidney transplantation KW - machine learning KW - predictive modeling KW - survival prediction KW - dimensionality reduction KW - feature sensitivity analysis N2 - Background: Kidney transplantation is the optimal treatment for patients with end-stage renal disease. Short- and long-term kidney graft survival is influenced by a number of donor and recipient factors. Predicting the success of kidney transplantation is important for optimizing kidney allocation. Objective: The aim of this study was to predict the risk of kidney graft failure across three temporal cohorts (within 1 year, within 5 years, and after 5 years following a transplant) based on donor and recipient characteristics.
We analyzed a large data set comprising over 50,000 kidney transplants covering an approximate 20-year period. Methods: We applied machine learning-based classification algorithms to develop prediction models for the risk of graft failure for three different temporal cohorts. Deep learning-based autoencoders were applied for data dimensionality reduction, which improved the prediction performance. The influence of features on graft survival for each cohort was studied using a new nonoverlapping patient stratification approach. Results: Our models predicted graft survival with area under the curve scores of 82% within 1 year, 69% within 5 years, and 81% within 17 years. The feature importance analysis elucidated the varying influence of clinical features on graft survival across the three different temporal cohorts. Conclusions: In this study, we applied machine learning to develop risk prediction models for graft failure that demonstrated a high level of prediction performance. Given that these models performed better than existing risk prediction tools reported in the literature, future studies will focus on how best to incorporate these prediction models into clinical care algorithms to optimize the long-term health of kidney recipients.
UR - https://www.jmir.org/2021/8/e26843 UR - http://dx.doi.org/10.2196/26843 UR - http://www.ncbi.nlm.nih.gov/pubmed/34448704 ID - info:doi/10.2196/26843 ER - TY - JOUR AU - Cao, Yang AU - Näslund, Ingmar AU - Näslund, Erik AU - Ottosson, Johan AU - Montgomery, Scott AU - Stenberg, Erik PY - 2021/8/19 TI - Using a Convolutional Neural Network to Predict Remission of Diabetes After Gastric Bypass Surgery: Machine Learning Study From the Scandinavian Obesity Surgery Register JO - JMIR Med Inform SP - e25612 VL - 9 IS - 8 KW - forecasting KW - clinical decision rules KW - remission induction KW - type 2 diabetes mellitus KW - gastric bypass KW - morbid obesity N2 - Background: Prediction of diabetes remission is an important topic in the evaluation of patients with type 2 diabetes (T2D) before bariatric surgery. Several high-quality predictive indices are available, but artificial intelligence algorithms offer the potential for higher predictive capability. Objective: This study aimed to construct and validate an artificial intelligence prediction model for diabetes remission after Roux-en-Y gastric bypass surgery. Methods: Patients who underwent surgery from 2007 to 2017 were included in the study, with collection of individual data from the Scandinavian Obesity Surgery Registry (SOReg), the Swedish National Patients Register, the Swedish Prescribed Drugs Register, and Statistics Sweden. A 7-layer convolutional neural network (CNN) model was developed using 80% (6446/8057) of patients randomly selected from SOReg and the remaining 20% (1611/8057) for external testing. The predictive capabilities of the CNN model and currently used scores (DiaRem, Ad-DiaRem, DiaBetter, and individualized metabolic surgery) were compared. Results: In total, 8057 patients with T2D were included in the study. At 2 years after surgery, 77.09% achieved pharmacological remission (n=6211), while 63.07% (4004/6348) achieved complete remission.
The CNN model showed high accuracy in predicting cessation of antidiabetic drugs and complete remission of T2D after gastric bypass surgery. The area under the receiver operating characteristic curve (AUC) for the CNN model for pharmacological remission was 0.85 (95% CI 0.83-0.86) during validation and 0.83 for the final test, which was 9%-12% better than the traditional predictive indices. The AUC for complete remission was 0.83 (95% CI 0.81-0.85) during validation and 0.82 for the final test, which was 9%-11% better than the traditional predictive indices. Conclusions: The CNN method had better predictive capability compared to traditional indices for diabetes remission. However, further validation is needed in other countries to evaluate its external generalizability. UR - https://medinform.jmir.org/2021/8/e25612 UR - http://dx.doi.org/10.2196/25612 UR - http://www.ncbi.nlm.nih.gov/pubmed/34420921 ID - info:doi/10.2196/25612 ER - TY - JOUR AU - de Pennington, Nick AU - Mole, Guy AU - Lim, Ernest AU - Milne-Ives, Madison AU - Normando, Eduardo AU - Xue, Kanmin AU - Meinert, Edward PY - 2021/7/28 TI - Safety and Acceptability of a Natural Language Artificial Intelligence Assistant to Deliver Clinical Follow-up to Cataract Surgery Patients: Proposal JO - JMIR Res Protoc SP - e27227 VL - 10 IS - 7 KW - artificial intelligence KW - natural language processing KW - telemedicine KW - cataract KW - aftercare KW - speech recognition software KW - medical informatics KW - health services KW - health communication KW - delivery of health care KW - patient acceptance of health care KW - mental health KW - cell phone KW - internet KW - conversational agent KW - chatbot KW - expert systems KW - dialogue system KW - relational agent N2 - Background: Due to an aging population, the demand for many services is exceeding the capacity of the clinical workforce.
As a result, staff are facing a crisis of burnout from being pressured to deliver high-volume workloads, driving increasing costs for providers. Artificial intelligence (AI), in the form of conversational agents, presents a possible opportunity to enable efficiency in the delivery of care. Objective: This study aims to evaluate the effectiveness, usability, and acceptability of the Dora agent, Ufonia's autonomous voice conversational agent: an AI-enabled autonomous telemedicine call for the detection of postoperative cataract surgery patients who require further assessment. The objectives of this study are to establish Dora's efficacy in comparison with an expert clinician, determine baseline sensitivity and specificity for the detection of true complications, evaluate patient acceptability, collect evidence for cost-effectiveness, and capture data to support further development and evaluation. Methods: Using an implementation science construct, the interdisciplinary study will be a mixed methods phase 1 pilot establishing interobserver reliability of the system, usability, and acceptability. This will be done using the following scales and frameworks: the system usability scale; the assessment of Health Information Technology Interventions in Evidence-Based Medicine Evaluation Framework; the telehealth usability questionnaire; and the Non-Adoption, Abandonment, and Challenges to the Scale-up, Spread and Suitability framework. Results: The evaluation is expected to show that conversational technology can be used to conduct an accurate assessment and that it is acceptable to different populations with different backgrounds. In addition, the results will demonstrate how successfully the system can be delivered in organizations with different clinical pathways and how it can be integrated with their existing platforms.
Conclusions: The project's key contributions will be evidence of the effectiveness of AI voice conversational agents and their associated usability and acceptability. International Registered Report Identifier (IRRID): PRR1-10.2196/27227 UR - https://www.researchprotocols.org/2021/7/e27227 UR - http://dx.doi.org/10.2196/27227 UR - http://www.ncbi.nlm.nih.gov/pubmed/34319248 ID - info:doi/10.2196/27227 ER - TY - JOUR AU - Joo, Hyeon AU - Burns, Michael AU - Kalidaikurichi Lakshmanan, Saradha Sai AU - Hu, Yaokun AU - Vydiswaran, Vinod V. G. PY - 2021/5/26 TI - Neural Machine Translation-Based Automated Current Procedural Terminology Classification System Using Procedure Text: Development and Validation Study JO - JMIR Form Res SP - e22461 VL - 5 IS - 5 KW - CPT classification KW - natural language processing KW - machine learning KW - neural machine translation N2 - Background: Administrative costs for billing and insurance-related activities in the United States are substantial. One critical cause of the high overhead of administrative costs is medical billing errors. With advanced deep learning techniques, developing advanced models to predict hospital and professional billing codes has become feasible. These models can be used for administrative cost reduction and billing process improvements. Objective: In this study, we aim to develop an automated anesthesiology current procedural terminology (CPT) prediction system that translates manually entered surgical procedure text into standard forms using neural machine translation (NMT) techniques. The standard forms are calculated using similarity scores to predict the most appropriate CPT codes. This system aims to enhance medical billing coding accuracy and thereby reduce administrative costs; we also compare its performance with that of previously developed machine learning algorithms.
Methods: We collected and analyzed all operative procedures performed at Michigan Medicine between January 2017 and June 2019 (2.5 years). The first 2 years of data were used to train and validate the existing models and to compare their results with those of the NMT-based model. Data from 2019 (6-month follow-up period) were then used to measure the accuracy of the CPT code prediction. Three experimental settings were designed with different data types to evaluate the models. Experiment 1 used the surgical procedure text entered manually in the electronic health record. Experiment 2 used preprocessed procedure text. Experiment 3 used the preprocessed combination of procedure text and preoperative diagnoses. The NMT-based model was compared with the support vector machine (SVM) and long short-term memory (LSTM) models. Results: The NMT model yielded the highest top-1 accuracy in experiments 1 and 2, at 81.64% and 81.71%, respectively, compared with the SVM model (81.19% and 81.27%, respectively) and the LSTM model (80.96% and 81.07%, respectively). The SVM model yielded the highest top-1 accuracy of 84.30% in experiment 3, followed by the LSTM model (83.70%) and the NMT model (82.80%). In experiment 3, the addition of preoperative diagnoses increased top-1 accuracy by 3.7%, 3.2%, and 1.3% for the SVM, LSTM, and NMT models, respectively, over experiment 2. For top-3 accuracy, the SVM, LSTM, and NMT models achieved 95.64%, 95.72%, and 95.60%, respectively, in experiment 1; 95.75%, 95.67%, and 95.69% in experiment 2; and 95.88%, 95.93%, and 95.06% in experiment 3. Conclusions: This study demonstrates the feasibility of creating an automated anesthesiology CPT classification system based on NMT techniques using surgical procedure text and preoperative diagnosis. Our results show that the performance of the NMT-based CPT prediction system is equivalent to that of the SVM and LSTM prediction models.
Importantly, we found that including preoperative diagnoses improved accuracy compared with using the procedure text alone. UR - https://formative.jmir.org/2021/5/e22461 UR - http://dx.doi.org/10.2196/22461 UR - http://www.ncbi.nlm.nih.gov/pubmed/34037526 ID - info:doi/10.2196/22461 ER - TY - JOUR AU - Chen, Zhipeng AU - Zeng, D. Daniel AU - Seltzer, N. Ryan G. AU - Hamilton, D. Blake PY - 2021/5/11 TI - Automated Generation of Personalized Shock Wave Lithotripsy Protocols: Treatment Planning Using Deep Learning JO - JMIR Med Inform SP - e24721 VL - 9 IS - 5 KW - nephrolithiasis KW - extracorporeal shock wave therapy KW - lithotripsy KW - treatment planning KW - deep learning KW - artificial intelligence N2 - Background: Though shock wave lithotripsy (SWL) has developed into one of the most common treatment approaches for nephrolithiasis in recent decades, its treatment planning is often a trial-and-error process based on physicians' subjective judgement. Physicians' inexperience with this modality can lead to low-quality treatment and unnecessary risks to patients. Objective: To improve the quality and consistency of shock wave lithotripsy treatment, we aimed to develop a deep learning model for generating the next treatment step from previous steps and preoperative patient characteristics and to produce personalized SWL treatment plans in a step-by-step protocol based on the deep learning model. Methods: We developed a deep learning model to generate the optimal power level, shock rate, and number of shocks in the next step, given previous treatment steps encoded by long short-term memory neural networks and preoperative patient characteristics. We constructed a next-step data set (N=8583) from top practices of renal SWL treatments recorded in the International Stone Registry.
Then, we trained the deep learning model and baseline models (linear regression, logistic regression, random forest, and support vector machine) with 90% of the samples and validated them with the remaining samples. Results: The deep learning models for generating the next treatment steps outperformed the baseline models (accuracy = 98.8%, F1 = 98.0% for power levels; accuracy = 98.1%, F1 = 96.0% for shock rates; root mean squared error = 207, mean absolute error = 121 for numbers of shocks). Hypothesis testing showed no significant difference between steps generated by our model and the top practices (P=.480 for power levels; P=.782 for shock rates; P=.727 for numbers of shocks). Conclusions: The high performance of our deep learning approach demonstrates treatment planning capability on par with that of top physicians. To the best of our knowledge, our framework is the first effort to implement automated planning of SWL treatment via deep learning. It is a promising technique for assisting treatment planning and physician training at low cost. UR - https://medinform.jmir.org/2021/5/e24721 UR - http://dx.doi.org/10.2196/24721 UR - http://www.ncbi.nlm.nih.gov/pubmed/33973862 ID - info:doi/10.2196/24721 ER -