Machine Learning for Parkinson's Detection
B. Data preprocessing
Figure 2. Performance metrics bar charts with mRMR feature selection.

In this study, as shown in Table II, we present the confusion matrices of the different machine learning models using Minimum Redundancy Maximum Relevance (mRMR) feature selection. The Random Forest Classifier not only achieved the highest accuracy but also showed a reasonable balance between false positives and false negatives. The Support Vector Machine (SVM), on the other hand, performed remarkably well at identifying true cases of the disease, as indicated by its recall. However, each model has its own set of advantages and disadvantages. For instance, we found that the K-Nearest Neighbors (KNN) and Gaussian Naive ...

Performance metrics with PCA feature selection:

Models                      Accuracy   Recall   Precision   F1-score
Decision Tree Classifier    75.65%     81.19%   88.36%      83.70%
Random Forest Classifier    82.89%     98.29%   82.73%      89.84%
Support Vector Machine      89.47%     99.15%   88.54%      93.54%
K-Nearest Neighbors         93.42%     98.29%   93.50%      95.83%
Gaussian NB                 80.92%     91.45%   84.92%      88.06%

Figure 3. Performance metrics bar charts with PCA feature selection.

Confusion matrix with PCA feature selection:

Models                      True Negative   False Positive   True Positive   False Negative
Logistic Regression         21              14               114             3
Decision Tree Classifier    20              15               95              22
Random Forest Classifier    11              24               115             2
Support Vector Machine      20              15               116             1
K-Nearest Neighbors         27              8                115             2
Gaussian NB                 16              19               107             10

Table V offers an extensive analysis of the performance metrics for the various predictive models using SelectKBest for feature selection. This comparison shows that the SVM and KNN models both achieved a remarkable accuracy of 90.13%. The SVM also demonstrated excellent recall, at 99.15%. Prior analyses with the mRMR and PCA feature selection methods also indicated a robust recall capability in the SVM. This shows that the SVM is a reliable model for the identification of Parkinson’s disease.

Performance metrics with SelectKBest feature selection (Table V):

Models                      Accuracy   Recall   Precision   F1-score
Logistic Regression         87.50%     95.72%   88.88%      92.18%
Decision Tree Classifier    83.55%     88.88%   89.65%      89.27%

Figure 4. Performance metrics bar charts with SelectKBest feature selection.

Table VI provides a detailed overview of the confusion matrices for the different machine learning models using SelectKBest feature selection. This analysis shows that the SVM achieved the highest number of true positives (116 out of 152) and the fewest false negatives (1 out of 152). Additionally, the SVM showed remarkable recall when combined with mRMR and PCA, again with only 1 false negative out of 152. These results indicate that the SVM is an excellent model for classifying Parkinson’s disease.

The KNN model demonstrated the best performance in terms of true negatives (28 out of 152) and the lowest number of false positives (7 out of 152). Furthermore, the KNN achieved a notable number of true negatives with PCA feature selection (27 out of 152).

Conversely, the Decision Tree Classifier and Gaussian Naive Bayes (NB) demonstrated the lowest performance across all feature selection methods. For instance, Gaussian NB produced 16 false negatives out of 152 with SelectKBest, while the Decision Tree Classifier produced 22 false negatives out of 152 with PCA and 23 with mRMR, suggesting that both are unreliable for diagnosing Parkinson’s disease.
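The performance metrics quoted in this section follow directly from the four confusion-matrix counts. The sketch below is a minimal illustration in Python, assuming the usual definitions of accuracy, recall, precision and F1-score and treating Parkinson’s patients as the positive class. The function name `metrics_from_counts` is hypothetical and is not taken from the study's code. The example uses the KNN counts reported for PCA feature selection (27 true negatives, 8 false positives, 115 true positives, 2 false negatives).

```python
def metrics_from_counts(tn, fp, tp, fn):
    """Return accuracy, recall, precision and F1 as fractions.

    Assumes the positive class is "Parkinson's disease" and that
    no denominator is zero (true for the counts used here).
    """
    total = tn + fp + tp + fn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn)        # share of actual patients that were detected
    precision = tp / (tp + fp)     # share of positive predictions that were correct
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# KNN with PCA feature selection, counts copied from the confusion matrix.
knn_pca = metrics_from_counts(tn=27, fp=8, tp=115, fn=2)
for name, value in knn_pca.items():
    print(f"{name}: {value:.2%}")
```

The printed values agree with the 93.42% accuracy, 98.29% recall, 93.50% precision and 95.83% F1-score reported for KNN with PCA.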
TABLE VI. CONFUSION MATRIX OF DIFFERENT MODELS WITH SELECTKBEST FEATURE SELECTION.

Models                      True Negative   False Positive   True Positive   False Negative
Logistic Regression         21              14               112             5
Decision Tree Classifier    23              12               104             13
Random Forest Classifier    24              11               112             5
Support Vector Machine      21              14               116             1
K-Nearest Neighbors         28              7                109             8
Gaussian NB                 22              13               101             16

IV. CONCLUSION AND FUTURE WORK
In this study, we compared and analyzed machine learning (ML) models using a speech impairment dataset for distinguishing Parkinson’s disease patients from healthy controls. Additionally, we explored the potential of each algorithm with three feature selection methods, namely Minimum Redundancy Maximum Relevance (mRMR), Principal Component Analysis (PCA), and SelectKBest. Among them, PCA performed best with K-Nearest Neighbors (KNN), yielding the highest accuracy of 93.42%, a precision of 93.50%, and an F1-score of 95.83%. Similarly, KNN performed well with SelectKBest, achieving an accuracy of 90.13% and a precision of 93.96%. With mRMR, the Random Forest Classifier achieved the highest accuracy, 89.47%, with a precision of 91.23% and an F1-score of 93.27%. Additionally, the Support Vector Machine achieved an excellent recall of 99.15% with all feature selection methods, indicating its effectiveness in identifying nearly all affected individuals. Our research showed that KNN is the most stable machine learning algorithm for our dataset, showing remarkable results with both PCA and SelectKBest. In future research, we plan to compare deep learning and ensemble methods.
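One way to make the stability claim concrete is to look at each model's worst-case accuracy across feature selection methods. The sketch below is illustrative only. It assumes that "stable" can be read as "highest minimum accuracy", which is an interpretation rather than a definition given in the study, and it covers only PCA and SelectKBest because the mRMR confusion counts are not listed in this section. The names `COUNTS`, `accuracy` and `worst_case_accuracy` are hypothetical.

```python
# Confusion-matrix counts as (TN, FP, TP, FN), copied from the PCA and
# SelectKBest confusion matrices reported in this paper.
COUNTS = {
    "PCA": {
        "Logistic Regression": (21, 14, 114, 3),
        "Decision Tree Classifier": (20, 15, 95, 22),
        "Random Forest Classifier": (11, 24, 115, 2),
        "Support Vector Machine": (20, 15, 116, 1),
        "K-Nearest Neighbors": (27, 8, 115, 2),
        "Gaussian NB": (16, 19, 107, 10),
    },
    "SelectKBest": {
        "Logistic Regression": (21, 14, 112, 5),
        "Decision Tree Classifier": (23, 12, 104, 13),
        "Random Forest Classifier": (24, 11, 112, 5),
        "Support Vector Machine": (21, 14, 116, 1),
        "K-Nearest Neighbors": (28, 7, 109, 8),
        "Gaussian NB": (22, 13, 101, 16),
    },
}


def accuracy(tn, fp, tp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tn + fp + tp + fn)


def worst_case_accuracy(counts):
    """Map each model to its lowest accuracy over all feature selection methods."""
    models = next(iter(counts.values())).keys()
    return {
        model: min(accuracy(*counts[method][model]) for method in counts)
        for model in models
    }


worst = worst_case_accuracy(COUNTS)
most_stable = max(worst, key=worst.get)

# Print models from highest to lowest worst-case accuracy.
for model, acc in sorted(worst.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {acc:.2%}")
print("Highest worst-case accuracy:", most_stable)
```

Under this reading, KNN comes out on top with a worst-case accuracy of 90.13% (its SelectKBest result), which is consistent with the conclusion that KNN is the most stable model on this dataset.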