TY - JOUR AU - Liu, Huasheng AU - Shang, Guangqian AU - Shan, Qianqian PY - 2025/10/3 TI - Deep Learning Algorithms in the Diagnosis of Basal Cell Carcinoma Using Dermatoscopy: Systematic Review and Meta-Analysis JO - J Med Internet Res SP - e73541 VL - 27 KW - deep learning algorithms KW - dermatoscopy KW - basal cell carcinoma KW - meta-analysis KW - artificial intelligence KW - AI N2 - Background: In recent years, deep learning algorithms based on dermatoscopy have shown great potential in diagnosing basal cell carcinoma (BCC). However, the diagnostic performance of deep learning algorithms remains controversial. Objective: This meta-analysis evaluates the diagnostic performance of deep learning algorithms based on dermatoscopy in detecting BCC. Methods: An extensive search in PubMed, Embase, and Web of Science databases was conducted to locate pertinent studies published until November 4, 2024. This meta-analysis included articles that reported the diagnostic performance of deep learning algorithms based on dermatoscopy for detecting BCC. The quality and risk of bias in the included studies were assessed using the modified Quality Assessment of Diagnostic Accuracy Studies 2 tool. A bivariate random-effects model was used to calculate the pooled sensitivity and specificity, both with 95% CIs. Results: Of the 1941 studies identified, 15 (0.77%) were included (internal validation sets of 32,069 patients or images; external validation sets of 200 patients or images). For dermatoscopy-based deep learning algorithms, the pooled sensitivity, specificity, and area under the curve (AUC) were 0.96 (95% CI 0.93-0.98), 0.98 (95% CI 0.96-0.99), and 0.99 (95% CI 0.98-1.00). For dermatologists' diagnoses, the sensitivity, specificity, and AUC were 0.75 (95% CI 0.66-0.82), 0.97 (95% CI 0.95-0.98), and 0.96 (95% CI 0.94-0.98). The results showed that dermatoscopy-based deep learning algorithms had a higher AUC than dermatologists' 
performance when using internal validation datasets (z=2.63; P=.008). Conclusions: This meta-analysis suggests that deep learning algorithms based on dermatoscopy exhibit strong diagnostic performance for detecting BCC. However, the retrospective design of many included studies and variations in reference standards may restrict the generalizability of these findings. The models evaluated in the included studies generally showed improved performance over that of dermatologists in classifying dermatoscopic images of BCC using internal validation datasets, highlighting their potential to support future diagnoses. However, performance on internal validation datasets does not necessarily translate well to external validation datasets. Additional external validation of these results is necessary to enhance the application of deep learning in dermatological diagnostics. Trial Registration: PROSPERO International Prospective Register of Systematic Reviews CRD42025633947; https://www.crd.york.ac.uk/PROSPERO/view/CRD42025633947 UR - https://www.jmir.org/2025/1/e73541 UR - http://dx.doi.org/10.2196/73541 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/73541 ER - TY - JOUR AU - McRae, Charlotte AU - Zhang, Dan Ting AU - Seeley, Donoghue Leslie AU - Anderson, Michael AU - Turner, Laci AU - Graham, V. Lauren PY - 2025/9/16 TI - Patient Perceptions of Artificial Intelligence and Telemedicine in Dermatology: Narrative Review JO - JMIR Dermatol SP - e75454 VL - 8 KW - digital health KW - technology KW - patient-centered care KW - health care innovation KW - trust KW - convergence KW - artificial intelligence KW - teledermatology N2 - Background: Artificial intelligence (AI) and telemedicine have significant potential to transform dermatology care delivery, but patient perspectives on these technologies have not been systematically compared. 
Objective: This study aimed to examine patient perspectives on AI and telemedicine in dermatology to inform implementation strategies as these technologies increasingly converge in clinical practice. Methods: A comprehensive literature search was conducted using PubMed, Scopus, and Embase databases between August 2024 and October 2024. We identified 48 papers addressing patient perspectives on AI and telemedicine in dermatology, with none directly comparing patients' views of both technologies. Results: Several distinct themes emerged regarding patient perspectives on these technologies: willingness to use, perceived benefits and risks, barriers to implementation, and conditions necessary for successful integration. Findings revealed that patients express hesitancy toward AI-based diagnoses that lack dermatologist involvement, while preferences for teledermatology varied by reason for appointment, age, and previous technology exposure. Patients' motivations for implementing AI are connected to its potential for quicker diagnoses and improved triage efficiency. At the same time, telemedicine addresses logistical challenges such as reduced travel time and improved appointment availability. Both technologies were perceived to improve accessibility and diagnostic efficiency, though patients expressed concerns about AI's limited communication abilities and teledermatology's inability to perform physical examinations. Primary adoption barriers for these modalities included technological limitations and trust concerns, with patients emphasizing the need for dermatologist oversight, transparency, and adequate educational resources for successful integration. Conclusions: The complementary strengths of AI and teledermatology suggest they could mitigate each other's limitations when integrated: AI potentially enhancing teledermatology's diagnostic accuracy, while teledermatology addresses AI's lack of human connection. 
By thoroughly examining these perspectives, this review may serve as a guide for the patient-centered integration of technology in the future landscape of accessible dermatologic care. UR - https://derma.jmir.org/2025/1/e75454 UR - http://dx.doi.org/10.2196/75454 ID - info:doi/10.2196/75454 ER - TY - JOUR AU - Ghazanfar, Noshela Misbah AU - Al-Mousawi, Ali AU - Riemer, Christian AU - Björnsson, Þór Benóný AU - Boissard, Charlotte AU - Lee, Ivy AU - Ali, Zarqa AU - Thomsen, Francis Simon PY - 2025/7/16 TI - Effectiveness of a Machine Learning-Enabled Skincare Recommendation for Mild-to-Moderate Acne Vulgaris: 8-Week Evaluator-Blinded Randomized Controlled Trial JO - JMIR Dermatol SP - e60883 VL - 8 KW - machine learning KW - personalised skincare KW - acne vulgaris KW - dermatology KW - skincare N2 - Background: Acne vulgaris (AV) is one of the most common skin disorders, with a peak incidence in adolescence and early adulthood. Topical treatments are usually used for mild to moderate AV; however, a lack of adherence to topical treatment is seen in patients due to various reasons. Therefore, personalized skincare recommendations may be beneficial for treating mild-to-moderate AV. Objective: This study aimed to evaluate the effectiveness of a novel machine learning approach in predicting the optimal treatment for mild-to-moderate AV based on self-assessment and objective measures. Methods: A randomized, evaluator-blinded, parallel-group study was conducted on 100 patients recruited from an internet-based database and randomized in a 1:1 ratio (groups A and B) based on their consent form submission. Groups A and B received customized product recommendations using a Bayesian machine learning model and self-selected treatments, respectively. The patients submitted self-assessed disease scores and photographs after the 8-week treatment. 
The primary and secondary outcomes were photograph evaluation by two board-certified dermatologists using the Investigator Global Assessment (IGA) scores and quality of life (QoL) measured using the Dermatology Life Quality Index (DLQI), respectively. Results: Overall, 99 patients were screened, and 68 patients (mean age: 27 years, SD 4.56 years) were randomized into groups A (customized) and B (self-selected). IGA scores significantly improved after treatment in group A but not in group B (mean difference in IGA score; group A=0.32, P=.04 vs group B=0.09, P=.54). The DLQI significantly improved in group A from 7.75 at baseline to 3.5 (P<.001) after treatment but reduced in group B from 7.53 to 5.3 (P>.05). IGA scores and the DLQI were significantly correlated in group A, but not in group B. A total of 3 patients reported adverse reactions in group B, but none in group A. Conclusions: Using a machine learning model for personalized skincare recommendations significantly reduced symptoms and improved severity and overall QoL of patients with mild-to-moderate AV, supporting the potential of machine learning-based personalized treatment options in dermatology. 
UR - https://derma.jmir.org/2025/1/e60883 UR - http://dx.doi.org/10.2196/60883 ID - info:doi/10.2196/60883 ER - TY - JOUR AU - Brehmer, Alexander AU - Seibold, Constantin AU - Egger, Jan AU - Majjouti, Khalid AU - Tapp-Herrenbrück, Michaela AU - Pinnekamp, Hannah AU - Priester, Vanessa AU - Aleithe, Michael AU - Fischer, Uli AU - Hosters, Bernadette AU - Kleesiek, Jens PY - 2025/5/1 TI - Fine-Grained Classification of Pressure Ulcers and Incontinence-Associated Dermatitis Using Multimodal Deep Learning: Algorithm Development and Validation Study JO - JMIR AI SP - e67356 VL - 4 KW - computer vision KW - image classification KW - wound classification KW - deep learning KW - pressure ulcer KW - incontinence-associated dermatitis KW - multimodal data KW - synthetic image generation N2 - Background: Pressure ulcers (PUs) and incontinence-associated dermatitis (IAD) are prevalent conditions in clinical settings, posing significant challenges due to their similar presentations but differing treatment needs. Accurate differentiation between PUs and IAD is essential for appropriate patient care, yet it remains a burden for nursing staff and wound care experts. Objective: This study aims to develop and introduce a robust multimodal deep learning framework for the classification of PUs and IAD, along with the fine-grained categorization of their respective wound severities, to enhance diagnostic accuracy and support clinical decision-making. Methods: We collected and annotated a dataset of 1555 wound images, achieving consensus among 4 wound experts. Our framework integrates wound images with categorical patient data to improve classification performance. We evaluated 4 models (2 convolutional neural networks and 2 transformer-based architectures), each with approximately 25 million parameters. 
Various data preprocessing strategies, augmentation techniques, training methods (including multimodal data integration, synthetic data generation, and sampling), and postprocessing approaches (including ensembling and test-time augmentation) were systematically tested to optimize model performance. Results: The transformer-based TinyViT model achieved the highest performance in binary classification of PU and IAD, with an F1-score (harmonic mean of precision and recall) of 93.23%, outperforming wound care experts and nursing staff on the test dataset. In fine-grained classification of wound categories, the TinyViT model also performed best for PU categories with an F1-score of 75.43%, while ConvNeXtV2 showed superior performance in IAD category classification with an F1-score of 53.20%. Incorporating multimodal data improved performance in binary classification but had less impact on fine-grained categorization. Augmentation strategies and training techniques significantly influenced model performance, with ensembling enhancing accuracy across all tasks. Conclusions: Our multimodal deep learning framework effectively differentiates between PUs and IAD, achieving high accuracy and outperforming human wound care experts. By integrating wound images with categorical patient data, the model enhances diagnostic precision, offering a valuable decision-support tool for health care professionals. This advancement has the potential to reduce diagnostic uncertainty, optimize treatment pathways, and alleviate the burden on medical staff, leading to faster interventions and improved patient outcomes. The framework's strong performance suggests practical applications in clinical settings, such as integration into hospital electronic health record systems or mobile applications for bedside diagnostics. 
Future work should focus on validating real-world implementation, expanding dataset diversity, and refining fine-grained classification capabilities to further enhance clinical utility. UR - https://ai.jmir.org/2025/1/e67356 UR - http://dx.doi.org/10.2196/67356 ID - info:doi/10.2196/67356 ER - TY - JOUR AU - Jones, Tudor Owain AU - Calanzani, Natalia AU - Scott, E. Suzanne AU - Matin, N. Rubeta AU - Emery, Jon AU - Walter, M. Fiona PY - 2025/1/28 TI - User and Developer Views on Using AI Technologies to Facilitate the Early Detection of Skin Cancers in Primary Care Settings: Qualitative Semistructured Interview Study JO - JMIR Cancer SP - e60653 VL - 11 KW - artificial intelligence KW - AI KW - machine learning KW - ML KW - primary care KW - skin cancer KW - melanoma KW - qualitative research KW - mobile phone N2 - Background: Skin cancers, including melanoma and keratinocyte cancers, are among the most common cancers worldwide, and their incidence is rising in most populations. Earlier detection of skin cancer leads to better outcomes for patients. Artificial intelligence (AI) technologies have been applied to skin cancer diagnosis, but many technologies lack clinical evidence and/or the appropriate regulatory approvals. There are few qualitative studies examining the views of relevant stakeholders or evidence about the implementation and positioning of AI technologies in the skin cancer diagnostic pathway. Objective: This study aimed to understand the views of several stakeholder groups on the use of AI technologies to facilitate the early diagnosis of skin cancer, including patients, members of the public, general practitioners, primary care nurse practitioners, dermatologists, and AI researchers. Methods: This was a qualitative, semistructured interview study with 29 stakeholders. Participants were purposively sampled based on age, sex, and geographical location. We conducted the interviews via Zoom between September 2022 and May 2023. 
Transcribed recordings were analyzed using thematic framework analysis. The Nonadoption, Abandonment, and Challenges to Scale-Up, Spread, and Sustainability framework was used to guide the analysis to help understand the complexity of implementing diagnostic technologies in clinical settings. Results: Major themes were "the position of AI in the skin cancer diagnostic pathway" and "the aim of the AI technology"; cross-cutting themes included trust, usability and acceptability, generalizability, evaluation and regulation, implementation, and long-term use. There was no clear consensus on where AI should be placed along the skin cancer diagnostic pathway, but most participants saw the technology in the hands of either patients or primary care practitioners. Participants were concerned about the quality of the data used to develop and test AI technologies and the impact this could have on their accuracy in clinical use with patients from a range of demographics and the risk of missing skin cancers. Ease of use and not increasing the workload of already strained health care services were important considerations for participants. Health care professionals and AI researchers reported a lack of established methods of evaluating and regulating AI technologies. Conclusions: This study is one of the first to examine the views of a wide range of stakeholders on the use of AI technologies to facilitate early diagnosis of skin cancer. The optimal approach and position in the diagnostic pathway for these technologies have not yet been determined. AI technologies need to be developed and implemented carefully and thoughtfully, with attention paid to the quality and representativeness of the data used for development, to achieve their potential. 
UR - https://cancer.jmir.org/2025/1/e60653 UR - http://dx.doi.org/10.2196/60653 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/60653 ER - TY - JOUR AU - Willem, Theresa AU - Wollek, Alessandro AU - Cheslerean-Boghiu, Theodor AU - Kenney, Martha AU - Buyx, Alena PY - 2025/1/28 TI - The Social Construction of Categorical Data: Mixed Methods Approach to Assessing Data Features in Publicly Available Datasets JO - JMIR Med Inform SP - e59452 VL - 13 KW - machine learning KW - categorical data KW - social context dependency KW - mixed methods KW - dermatology KW - dataset analysis N2 - Background: In data-sparse areas such as health care, computer scientists aim to leverage as much available information as possible to increase the accuracy of their machine learning models' outputs. As a standard, categorical data, such as patients' gender, socioeconomic status, or skin color, are used to train models in fusion with other data types, such as medical images and text-based medical information. However, the effects of including categorical data features for model training in such data-scarce areas are underexamined, particularly regarding models intended to serve individuals equitably in a diverse population. Objective: This study aimed to explore categorical data's effects on machine learning model outputs, rooted the effects in the data collection and dataset publication processes, and proposed a mixed methods approach to examining datasets' data categories before using them for machine learning training. Methods: Against the theoretical background of the social construction of categories, we suggest a mixed methods approach to assess categorical data's utility for machine learning model training. As an example, we applied our approach to a Brazilian dermatological dataset (Dermatological and Surgical Assistance Program at the Federal University of Espírito Santo [PAD-UFES] 20). 
We first present an exploratory, quantitative study that assesses the effects when including or excluding each of the unique categorical data features of the PAD-UFES 20 dataset for training a transformer-based model using a data fusion algorithm. We then pair our quantitative analysis with a qualitative examination of the data categories based on interviews with the dataset authors. Results: Our quantitative study suggests scattered effects of including categorical data for machine learning model training across predictive classes. Our qualitative analysis gives insights into how the categorical data were collected and why they were published, explaining some of the quantitative effects that we observed. Our findings highlight the social constructedness of categorical data in publicly available datasets, meaning that the data in a category heavily depend on both how these categories are defined by the dataset creators and the sociomedical context in which the data are collected. This reveals relevant limitations of using publicly available datasets in contexts different from those of the collection of their data. Conclusions: We caution against using data features of publicly available datasets without reflection on the social construction and context dependency of their categorical data features, particularly in data-sparse areas. We conclude that social scientific, context-dependent analysis of available data features using both quantitative and qualitative methods is helpful in judging the utility of categorical data for the population for which a model is intended. 
UR - https://medinform.jmir.org/2025/1/e59452 UR - http://dx.doi.org/10.2196/59452 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/59452 ER - TY - JOUR AU - Wang, Wei AU - Chen, Xiang AU - Xu, Licong AU - Huang, Kai AU - Zhao, Shuang AU - Wang, Yong PY - 2024/12/27 TI - Artificial Intelligence-Aided Diagnosis System for the Detection and Classification of Private-Part Skin Diseases: Decision Analytical Modeling Study JO - J Med Internet Res SP - e52914 VL - 26 KW - artificial intelligence-aided diagnosis KW - private parts KW - skin disease KW - knowledge graph KW - dermatology KW - classification KW - artificial intelligence KW - AI KW - diagnosis N2 - Background: Private-part skin diseases (PPSDs) can cause a patient's stigma, which may hinder the early diagnosis of these diseases. Artificial intelligence (AI) is an effective tool to improve the early diagnosis of PPSDs, especially in preventing the deterioration of skin tumors in private parts such as Paget disease. However, to our knowledge, there is currently no research on using AI to identify PPSDs due to the complex backgrounds of the lesion areas and the challenges in data collection. Objective: This study aimed to develop and evaluate an AI-aided diagnosis system for the detection and classification of PPSDs: aiding patients in self-screening and supporting dermatologists' diagnostic enhancement. Methods: In this decision analytical modeling study, a 2-stage AI-aided diagnosis system was developed to classify PPSDs. In the first stage, a multitask detection network was trained to automatically detect and classify skin lesions (type, color, and shape). In the second stage, we proposed a knowledge graph based on dermatology expertise and constructed a decision network to classify seven PPSDs (condyloma acuminatum, Paget disease, eczema, pearly penile papules, genital herpes, syphilis, and Bowen disease). A reader study with 13 dermatologists of different experience levels was conducted. 
Dermatologists were asked to classify the testing cohort under reading room conditions, first without and then with system support. This AI-aided diagnostic study used the data of 635 patients from two institutes between July 2019 and April 2022. The data of Institute 1 contained 2701 skin lesion samples from 520 patients, which were used for the training of the multitask detection network in the first stage. In addition, the data of Institute 2 consisted of 115 clinical images and the corresponding medical records, which were used for the test of the whole 2-stage AI-aided diagnosis system. Results: On the test data of Institute 2, the proposed system achieved the average precision, recall, and F1-score of 0.81, 0.86, and 0.83, respectively, better than existing advanced algorithms. For the reader performance test, our system improved the average F1-score of the junior, intermediate, and senior dermatologists by 16%, 7%, and 4%, respectively. Conclusions: In this study, we constructed the first skin-lesion-based dataset and developed the first AI-aided diagnosis system for PPSDs. This system provides the final diagnosis result by simulating the diagnostic process of dermatologists. Compared with existing advanced algorithms, this system is more accurate in identifying PPSDs. Overall, our system can not only help patients achieve self-screening and alleviate their stigma but also assist dermatologists in diagnosing PPSDs. 
UR - https://www.jmir.org/2024/1/e52914 UR - http://dx.doi.org/10.2196/52914 UR - http://www.ncbi.nlm.nih.gov/pubmed/39729353 ID - info:doi/10.2196/52914 ER - TY - JOUR AU - Parekh, Pranav AU - Oyeleke, Richard AU - Vishwanath, Tejas PY - 2024/12/18 TI - The Depth Estimation and Visualization of Dermatological Lesions: Development and Usability Study JO - JMIR Dermatol SP - e59839 VL - 7 KW - machine learning KW - ML KW - computer vision KW - neural networks KW - explainable AI KW - XAI KW - computer graphics KW - red spot analysis KW - mixed reality KW - MR KW - artificial intelligence KW - visualization N2 - Background: Thus far, considerable research has been focused on classifying a lesion as benign or malignant. However, there is a requirement for quick depth estimation of a lesion for the accurate clinical staging of the lesion. The lesion could be malignant and quickly grow beneath the skin. While biopsy slides provide clear information on lesion depth, it is an emerging domain to find quick and noninvasive methods to estimate depth, particularly based on 2D images. Objective: This study proposes a novel methodology for the depth estimation and visualization of skin lesions. Current diagnostic methods are approximate in determining how much a lesion may have proliferated within the skin. Using color gradients and depth maps, this method will give us a definite estimate and visualization procedure for lesions and other skin issues. We aim to generate 3D holograms of the lesion depth such that dermatologists can better diagnose melanoma. Methods: We started by performing classification using a convolutional neural network (CNN), followed by using explainable artificial intelligence to localize the image features responsible for the CNN output. We used the gradient class activation map approach to perform localization of the lesion from the rest of the image. We applied computer graphics for depth estimation and developing the 3D structure of the lesion. 
We used the depth from defocus method for depth estimation from single images and Gabor filters for volumetric representation of the depth map. Our novel method, called red spot analysis, measures the degree of infection based on how a conical hologram is constructed. We collaborated with a dermatologist to analyze the 3D hologram output and received feedback on how this method can be introduced to clinical implementation. Results: The neural model plus the explainable artificial intelligence algorithm achieved an accuracy of 86% in classifying the lesions correctly as benign or malignant. For the entire pipeline, we mapped the benign and malignant cases to their conical representations. We received exceedingly positive feedback while pitching this idea at the King Edward Memorial Institute in India. Dermatologists considered this a potentially useful tool in the depth estimation of lesions. We received a number of ideas for evaluating the technique before it can be introduced to the clinical scene. Conclusions: When we map the CNN outputs (benign or malignant) to the corresponding hologram, we observe that a malignant lesion has a higher concentration of red spots (infection) in the upper and deeper portions of the skin, and that the malignant cases have deeper conical sections when compared with the benign cases. This proves that the qualitative results map with the initial classification performed by the neural model. The positive feedback provided by the dermatologist suggests that the qualitative conclusion of the method is sufficient. 
UR - https://derma.jmir.org/2024/1/e59839 UR - http://dx.doi.org/10.2196/59839 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/59839 ER - TY - JOUR AU - Liu, Xu AU - Duan, Chaoli AU - Kim, Min-kyu AU - Zhang, Lu AU - Jee, Eunjin AU - Maharjan, Beenu AU - Huang, Yuwei AU - Du, Dan AU - Jiang, Xian PY - 2024/8/6 TI - Claude 3 Opus and ChatGPT With GPT-4 in Dermoscopic Image Analysis for Melanoma Diagnosis: Comparative Performance Analysis JO - JMIR Med Inform SP - e59273 VL - 12 KW - artificial intelligence KW - AI KW - large language model KW - LLM KW - Claude KW - ChatGPT KW - dermatologist N2 - Background: Recent advancements in artificial intelligence (AI) and large language models (LLMs) have shown potential in medical fields, including dermatology. With the introduction of image analysis capabilities in LLMs, their application in dermatological diagnostics has garnered significant interest. These capabilities are enabled by the integration of computer vision techniques into the underlying architecture of LLMs. Objective: This study aimed to compare the diagnostic performance of Claude 3 Opus and ChatGPT with GPT-4 in analyzing dermoscopic images for melanoma detection, providing insights into their strengths and limitations. Methods: We randomly selected 100 histopathology-confirmed dermoscopic images (50 malignant, 50 benign) from the International Skin Imaging Collaboration (ISIC) archive using a computer-generated randomization process. The ISIC archive was chosen due to its comprehensive and well-annotated collection of dermoscopic images, ensuring a diverse and representative sample. Images were included if they were dermoscopic images of melanocytic lesions with histopathologically confirmed diagnoses. Each model was given the same prompt, instructing it to provide the top 3 differential diagnoses for each image, ranked by likelihood. 
Primary diagnosis accuracy, accuracy of the top 3 differential diagnoses, and malignancy discrimination ability were assessed. The McNemar test was chosen to compare the diagnostic performance of the 2 models, as it is suitable for analyzing paired nominal data. Results: In the primary diagnosis, Claude 3 Opus achieved 54.9% sensitivity (95% CI 44.08%-65.37%), 57.14% specificity (95% CI 46.31%-67.46%), and 56% accuracy (95% CI 46.22%-65.42%), while ChatGPT demonstrated 56.86% sensitivity (95% CI 45.99%-67.21%), 38.78% specificity (95% CI 28.77%-49.59%), and 48% accuracy (95% CI 38.37%-57.75%). The McNemar test showed no significant difference between the 2 models (P=.17). For the top 3 differential diagnoses, Claude 3 Opus and ChatGPT included the correct diagnosis in 76% (95% CI 66.33%-83.77%) and 78% (95% CI 68.46%-85.45%) of cases, respectively. The McNemar test showed no significant difference (P=.56). In malignancy discrimination, Claude 3 Opus outperformed ChatGPT with 47.06% sensitivity, 81.63% specificity, and 64% accuracy, compared to 45.1%, 42.86%, and 44%, respectively. The McNemar test showed a significant difference (P<.001). Claude 3 Opus had an odds ratio of 3.951 (95% CI 1.685-9.263) in discriminating malignancy, while ChatGPT-4 had an odds ratio of 0.616 (95% CI 0.297-1.278). Conclusions: Our study highlights the potential of LLMs in assisting dermatologists but also reveals their limitations. Both models made errors in diagnosing melanoma and benign lesions. These findings underscore the need for developing robust, transparent, and clinically validated AI models through collaborative efforts between AI researchers, dermatologists, and other health care professionals. While AI can provide valuable insights, it cannot yet replace the expertise of trained clinicians. 
UR - https://medinform.jmir.org/2024/1/e59273 UR - http://dx.doi.org/10.2196/59273 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/59273 ER - TY - JOUR AU - Gassner, Mathias AU - Barranco Garcia, Javier AU - Tanadini-Lang, Stephanie AU - Bertoldo, Fabio AU - Fröhlich, Fabienne AU - Guckenberger, Matthias AU - Haueis, Silvia AU - Pelzer, Christin AU - Reyes, Mauricio AU - Schmithausen, Patrick AU - Simic, Dario AU - Staeger, Ramon AU - Verardi, Fabio AU - Andratschke, Nicolaus AU - Adelmann, Andreas AU - Braun, P. Ralph PY - 2023/8/24 TI - Saliency-Enhanced Content-Based Image Retrieval for Diagnosis Support in Dermatology Consultation: Reader Study JO - JMIR Dermatol SP - e42129 VL - 6 KW - dermatology KW - deep learning KW - melanoma KW - saliency maps KW - image retrieval KW - dermoscopy KW - skin cancer KW - diagnosis KW - algorithms KW - convolutional neural network KW - dermoscopic images N2 - Background: Previous research studies have demonstrated that medical content image retrieval can play an important role by assisting dermatologists in skin lesion diagnosis. However, current state-of-the-art approaches have not been adopted in routine consultation, partly due to the lack of interpretability limiting trust by clinical users. Objective: This study developed a new image retrieval architecture for polarized or dermoscopic imaging guided by interpretable saliency maps. This approach provides better feature extraction, leading to better quantitative retrieval performance as well as providing interpretability for an eventual real-world implementation. Methods: Content-based image retrieval (CBIR) algorithms rely on the comparison of image features embedded by a convolutional neural network (CNN) against a labeled data set. Saliency maps are computer vision-interpretable methods that highlight the most relevant regions for the prediction made by a neural network. 
By introducing a fine-tuning stage that includes saliency maps to guide feature extraction, the accuracy of image retrieval is optimized. We refer to this approach as saliency-enhanced CBIR (SE-CBIR). A reader study was designed at the University Hospital Zurich Dermatology Clinic to evaluate SE-CBIR's retrieval accuracy as well as the impact of the participant's confidence on the diagnosis. Results: SE-CBIR improved the retrieval accuracy by 7% (77% vs 84%) when doing single-lesion retrieval against traditional CBIR. The reader study showed an overall increase in classification accuracy of 22% (62% vs 84%) when the participant is provided with SE-CBIR retrieved images. In addition, the overall confidence in the lesion's diagnosis increased by 24%. Finally, the use of SE-CBIR as a support tool helped the participants reduce the number of nonmelanoma lesions previously diagnosed as melanoma (overdiagnosis) by 53%. Conclusions: SE-CBIR presents better retrieval accuracy compared to traditional CBIR CNN-based approaches. Furthermore, we have shown how these support tools can help dermatologists and residents improve diagnosis accuracy and confidence. Additionally, by introducing interpretable methods, we should expect increased acceptance and use of these tools in routine consultation.
UR - https://derma.jmir.org/2023/1/e42129 UR - http://dx.doi.org/10.2196/42129 UR - http://www.ncbi.nlm.nih.gov/pubmed/37616039 ID - info:doi/10.2196/42129 ER - TY - JOUR AU - Zhang, Xinyuan AU - Xie, Ziqian AU - Xiang, Yang AU - Baig, Imran AU - Kozman, Mena AU - Stender, Carly AU - Giancardo, Luca AU - Tao, Cui PY - 2022/12/12 TI - Issues in Melanoma Detection: Semisupervised Deep Learning Algorithm Development via a Combination of Human and Artificial Intelligence JO - JMIR Dermatol SP - e39113 VL - 5 IS - 4 KW - deep learning KW - dermoscopic images KW - semisupervised learning KW - 3-point checklist KW - skin lesion KW - dermatology KW - algorithm KW - melanoma classification KW - melanoma KW - automatic diagnosis KW - skin disease N2 - Background: Automatic skin lesion recognition has been shown to be effective in increasing access to reliable dermatology evaluation; however, most existing algorithms rely solely on images. Many diagnostic rules, including the 3-point checklist, encode human knowledge and reflect the diagnostic process of human experts, yet they are not considered by artificial intelligence algorithms. Objective: In this paper, we aimed to develop a semisupervised model that can not only integrate the dermoscopic features and scoring rule from the 3-point checklist but also automate the feature-annotation process. Methods: We first trained the semisupervised model on a small, annotated data set with disease and dermoscopic feature labels and tried to improve the classification accuracy by integrating the 3-point checklist using a ranking loss function. We then used a large, unlabeled data set with only disease labels to learn from the trained algorithm to automatically classify skin lesions and features. Results: After adding the 3-point checklist to our model, its performance for melanoma classification improved from a mean of 0.8867 (SD 0.0191) to 0.8943 (SD 0.0115) under 5-fold cross-validation.
The trained semisupervised model can automatically detect 3 dermoscopic features from the 3-point checklist, with best performances of 0.80 (area under the curve [AUC] 0.8380), 0.89 (AUC 0.9036), and 0.76 (AUC 0.8444), in some cases outperforming human annotators. Conclusions: Our proposed semisupervised learning framework can help with the automatic diagnosis of skin disease based on its ability to detect dermoscopic features and automate the label-annotation process. The framework can also help combine semantic knowledge with a computer algorithm to arrive at a more accurate and more interpretable diagnostic result, which can be applied to broader use cases. UR - https://derma.jmir.org/2022/4/e39113 UR - http://dx.doi.org/10.2196/39113 UR - http://www.ncbi.nlm.nih.gov/pubmed/37632881 ID - info:doi/10.2196/39113 ER - TY - JOUR AU - Rezk, Eman AU - Eltorki, Mohamed AU - El-Dakhakhni, Wael PY - 2022/3/8 TI - Leveraging Artificial Intelligence to Improve the Diversity of Dermatological Skin Color Pathology: Protocol for an Algorithm Development and Validation Study JO - JMIR Res Protoc SP - e34896 VL - 11 IS - 3 KW - artificial intelligence KW - skin cancer KW - skin tone diversity KW - people of color KW - image blending KW - deep learning KW - classification KW - early diagnosis N2 - Background: The paucity of dark skin images in dermatological textbooks and atlases is a reflection of racial injustice in medicine. The underrepresentation of dark skin images makes diagnosing skin pathology in people of color challenging. For conditions such as skin cancer, in which early diagnosis makes a difference between life and death, people of color have worse prognoses and lower survival rates than people with lighter skin tones as a result of delayed or incorrect diagnoses. 
Recent advances in artificial intelligence, such as deep learning, offer a potential solution that can be achieved by diversifying the mostly light-skin image repositories through generating images for darker skin tones, thus facilitating the development of inclusive cancer early diagnosis systems that are trained and tested on diverse images that truly represent human skin tones. Objective: We aim to develop and evaluate an artificial intelligence–based skin cancer early detection system for all skin tones using clinical images. Methods: This study consists of four phases: (1) publicly available skin image repositories will be analyzed to quantify the underrepresentation of darker skin tones, (2) images will be generated for the underrepresented skin tones, (3) generated images will be extensively evaluated for realism and disease presentation with quantitative image quality assessment as well as qualitative human expert and nonexpert ratings, and (4) the images will be utilized with available light-skin images to develop a robust skin cancer early detection model. Results: This study started in September 2020. The first phase of quantifying the underrepresentation of darker skin tones was completed in March 2021. The second phase of generating the images is in progress and will be completed by March 2022. The third phase is expected to be completed by May 2022, and the final phase is expected to be completed by September 2022. Conclusions: This work is the first step toward expanding skin tone diversity in existing image databases to address the current gap in the underrepresentation of darker skin tones. Once validated, the image bank will be a valuable resource that can potentially be utilized in physician education and in research applications. Furthermore, generated images are expected to improve the generalizability of skin cancer detection.
When completed, the model will assist family physicians and general practitioners in evaluating skin lesion severity and in efficient triaging for referral to expert dermatologists. In addition, the model can assist dermatologists in diagnosing skin lesions. International Registered Report Identifier (IRRID): DERR1-10.2196/34896 UR - https://www.researchprotocols.org/2022/3/e34896 UR - http://dx.doi.org/10.2196/34896 UR - http://www.ncbi.nlm.nih.gov/pubmed/34983017 ID - info:doi/10.2196/34896 ER - TY - JOUR AU - Chang, Wei Che AU - Lai, Feipei AU - Christian, Mesakh AU - Chen, Chun Yu AU - Hsu, Ching AU - Chen, Shen Yo AU - Chang, Hao Dun AU - Roan, Luen Tyng AU - Yu, Che Yen PY - 2021/12/2 TI - Deep Learning–Assisted Burn Wound Diagnosis: Diagnostic Model Development Study JO - JMIR Med Inform SP - e22798 VL - 9 IS - 12 KW - deep learning KW - semantic segmentation KW - instance segmentation KW - burn wounds KW - percentage total body surface area N2 - Background: Accurate assessment of the percentage total body surface area (%TBSA) of burn wounds is crucial in the management of burn patients. The resuscitation fluid and nutritional needs of burn patients, their need for intensive care, and probability of mortality are all directly related to %TBSA. It is difficult to estimate a burn area of irregular shape by inspection. Many articles have reported discrepancies in estimating %TBSA by different doctors. Objective: We propose a method, based on deep learning, for burn wound detection, segmentation, and calculation of %TBSA on a pixel-to-pixel basis. Methods: A 2-step procedure was used to convert burn wound diagnosis into %TBSA. In the first step, images of burn wounds were collected from medical records and labeled by burn surgeons, and the data set was then input into 2 deep learning architectures, U-Net and Mask R-CNN, each configured with 2 different backbones, to segment the burn wounds.
In the second step, we collected and labeled images of hands to create another data set, which was also input into U-Net and Mask R-CNN to segment the hands. The %TBSA of burn wounds was then calculated by comparing the pixels of mask areas on images of the burn wound and hand of the same patient according to the rule of hand, which states that one's hand accounts for 0.8% of TBSA. Results: A total of 2591 images of burn wounds were collected and labeled to form the burn wound data set. The data set was randomly split into training, validation, and testing sets in a ratio of 8:1:1. Four hundred images of volar hands were collected and labeled to form the hand data set, which was also split into 3 sets using the same method. For the images of burn wounds, Mask R-CNN with ResNet101 had the best segmentation result with a Dice coefficient (DC) of 0.9496, while U-Net with ResNet101 had a DC of 0.8545. For the hand images, U-Net and Mask R-CNN had similar performance with DC values of 0.9920 and 0.9910, respectively. Lastly, we conducted a test diagnosis in a burn patient. Mask R-CNN with ResNet101 had on average less deviation (0.115% TBSA) from the ground truth than burn surgeons. Conclusions: This is one of the first studies to diagnose all depths of burn wounds and convert the segmentation results into %TBSA using different deep learning models. We aimed to assist medical staff in estimating burn size more accurately, thereby helping to provide precise care to burn victims.
UR - https://medinform.jmir.org/2021/12/e22798 UR - http://dx.doi.org/10.2196/22798 UR - http://www.ncbi.nlm.nih.gov/pubmed/34860674 ID - info:doi/10.2196/22798 ER - TY - JOUR AU - Takiddin, Abdulrahman AU - Schneider, Jens AU - Yang, Yin AU - Abd-Alrazaq, Alaa AU - Househ, Mowafa PY - 2021/11/24 TI - Artificial Intelligence for Skin Cancer Detection: Scoping Review JO - J Med Internet Res SP - e22934 VL - 23 IS - 11 KW - artificial intelligence KW - skin cancer KW - skin lesion KW - machine learning KW - deep neural networks N2 - Background: Skin cancer is the most common cancer type affecting humans. Traditional skin cancer diagnosis methods are costly, require a professional physician, and take time. Hence, to aid in diagnosing skin cancer, artificial intelligence (AI) tools are being used, including shallow and deep machine learning–based methodologies that are trained to detect and classify skin cancer using computer algorithms and deep neural networks. Objective: The aim of this study was to identify and group the different types of AI-based technologies used to detect and classify skin cancer. The study also examined the reliability of the selected papers by studying the correlation between the data set size and the number of diagnostic classes with the performance metrics used to evaluate the models. Methods: We conducted a systematic search for papers using Institute of Electrical and Electronics Engineers (IEEE) Xplore, Association for Computing Machinery Digital Library (ACM DL), and Ovid MEDLINE databases following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines. The studies included in this scoping review had to fulfill several selection criteria: being specifically about skin cancer, detecting or classifying skin cancer, and using AI technologies. Study selection and data extraction were independently conducted by two reviewers.
Extracted data were narratively synthesized, where studies were grouped based on the diagnostic AI techniques and their evaluation metrics. Results: We retrieved 906 papers from the 3 databases, of which 53 were eligible for this review. Shallow AI-based techniques were used in 14 studies, and deep AI-based techniques were used in 39 studies. The studies used up to 11 evaluation metrics to assess the proposed models, where 39 studies used accuracy as the primary evaluation metric. Overall, studies that used smaller data sets reported higher accuracy. Conclusions: This paper examined multiple AI-based skin cancer detection models. However, a direct comparison between methods was hindered by the varied use of different evaluation metrics and image types. Performance scores were affected by factors such as data set size, number of diagnostic classes, and techniques. Hence, the reliability of shallow and deep models with higher accuracy scores was questionable since they were trained and tested on relatively small data sets of a few diagnostic classes. UR - https://www.jmir.org/2021/11/e22934 UR - http://dx.doi.org/10.2196/22934 UR - http://www.ncbi.nlm.nih.gov/pubmed/34821566 ID - info:doi/10.2196/22934 ER - TY - JOUR AU - Aggarwal, Pushkar PY - 2021/10/12 TI - Performance of Artificial Intelligence Imaging Models in Detecting Dermatological Manifestations in Higher Fitzpatrick Skin Color Classifications JO - JMIR Dermatol SP - e31697 VL - 4 IS - 2 KW - deep learning KW - melanoma KW - basal cell carcinoma KW - skin of color KW - image recognition KW - dermatology KW - disease KW - convolutional neural network KW - specificity KW - prediction KW - artificial intelligence KW - skin color KW - skin tone N2 - Background: The performance of deep-learning image recognition models is below par when applied to images with Fitzpatrick classification skin types 4 and 5. 
Objective: The objective of this research was to assess whether image recognition models perform differently when differentiating between dermatological diseases in individuals with darker skin color (Fitzpatrick skin types 4 and 5) than when differentiating between the same dermatological diseases in Caucasians (Fitzpatrick skin types 1, 2, and 3) when both models are trained on the same number of images. Methods: Two image recognition models were trained, validated, and tested. The goal of each model was to differentiate between melanoma and basal cell carcinoma. Open-source images of melanoma and basal cell carcinoma were acquired from the Hellenic Dermatological Atlas, the Dermatology Atlas, the Interactive Dermatology Atlas, and DermNet NZ. Results: The image recognition models trained and validated on images with light skin color had higher sensitivity, specificity, positive predictive value, negative predictive value, and F1 score than the image recognition models trained and validated on images of skin of color for differentiation between melanoma and basal cell carcinoma. Conclusions: For artificial intelligence models to perform equally well across skin tones, a higher number of images of dermatological diseases in individuals with darker skin color would need to be gathered than images of the same diseases in individuals with light skin color.
UR - https://derma.jmir.org/2021/2/e31697 UR - http://dx.doi.org/10.2196/31697 UR - http://www.ncbi.nlm.nih.gov/pubmed/37632853 ID - info:doi/10.2196/31697 ER - TY - JOUR AU - Huang, Kai AU - Jiang, Zixi AU - Li, Yixin AU - Wu, Zhe AU - Wu, Xian AU - Zhu, Wu AU - Chen, Mingliang AU - Zhang, Yu AU - Zuo, Ke AU - Li, Yi AU - Yu, Nianzhou AU - Liu, Siliang AU - Huang, Xing AU - Su, Juan AU - Yin, Mingzhu AU - Qian, Buyue AU - Wang, Xianggui AU - Chen, Xiang AU - Zhao, Shuang PY - 2021/9/21 TI - The Classification of Six Common Skin Diseases Based on Xiangya-Derm: Development of a Chinese Database for Artificial Intelligence JO - J Med Internet Res SP - e26025 VL - 23 IS - 9 KW - artificial intelligence KW - skin disease KW - convolutional neural network KW - medical image processing KW - automatic auxiliary diagnoses KW - dermatology KW - skin KW - classification KW - China N2 - Background: Skin and subcutaneous disease is the fourth-leading cause of the nonfatal disease burden worldwide and constitutes one of the most common burdens in primary care. However, there is a severe lack of dermatologists, particularly in rural Chinese areas. Furthermore, although artificial intelligence (AI) tools can assist in diagnosing skin disorders from images, the database for the Chinese population is limited. Objective: This study aims to establish a database for AI based on the Chinese population and presents an initial study on six common skin diseases. Methods: Each image was captured with either a digital camera or a smartphone, verified by at least three experienced dermatologists and corresponding pathology information, and finally added to the Xiangya-Derm database. Based on this database, we conducted AI-assisted classification research on six common skin diseases and then proposed a network called Xy-SkinNet. Xy-SkinNet applies a two-step strategy to identify skin diseases. First, given an input image, we segmented the regions of the skin lesion. 
Second, we introduced an information fusion block to combine the output of all segmented regions. We compared the performance with 31 dermatologists of varied experience. Results: Xiangya-Derm, a new database that consists of over 150,000 clinical images of 571 different skin diseases in the Chinese population, is the largest and most diverse dermatological data set of the Chinese population. The AI-based six-category classification achieved a top 3 accuracy of 84.77%, which exceeded the average accuracy of dermatologists (78.15%). Conclusions: Xiangya-Derm, the largest database for the Chinese population, was created. The classification of six common skin conditions was conducted based on Xiangya-Derm to lay a foundation for product research. UR - https://www.jmir.org/2021/9/e26025 UR - http://dx.doi.org/10.2196/26025 UR - http://www.ncbi.nlm.nih.gov/pubmed/34546174 ID - info:doi/10.2196/26025 ER - TY - JOUR AU - Eapen, Raj Bell AU - Archer, Norm AU - Sartipi, Kamran PY - 2020/4/20 TI - LesionMap: A Method and Tool for the Semantic Annotation of Dermatological Lesions for Documentation and Machine Learning JO - JMIR Dermatol SP - e18149 VL - 3 IS - 1 KW - LesionMap KW - LesionMapper KW - digital imaging KW - machine learning KW - dermatology UR - http://derma.jmir.org/2020/1/e18149/ UR - http://dx.doi.org/10.2196/18149 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/18149 ER -