2024 IEEE 4th International Conference on Smart Information Systems and Technologies (SIST)

15-17 May, 2024, Astana, Kazakhstan

A Comparative Study of Supervised Machine Learning and Feature Selection Based on Speech Impairments for Parkinson’s Disease Classification

Nursultan Zhantileuov
School of Information Technology and Engineering
Kazakh-British Technical University
Almaty, Kazakhstan
[Link]@[Link]

Serik Ospanov
National Center for Space Research and Technology
KazCosmos
Almaty, Kazakhstan
ospanoff1956@[Link]

Abstract—Parkinson’s disease is an incurable neurodegenerative disease, the second most common after Alzheimer’s, impacting the nervous system. The initial symptoms at early stages are tremors of the limbs, slowness of movement, muscle stiffness, sleep problems, and difficulty with walking. Currently, it is impossible to completely cure Parkinson’s disease, and determining the disease during its early stages poses significant complexity for decision support systems. However, the utilization of machine learning (ML) and artificial intelligence (AI) in medicine has had a significant impact on decision support systems. Machine learning can create a model based on a dataset of PD patients, which helps to distinguish Parkinson’s patients from healthy individuals. In this study, we assess and contrast traditional machine learning algorithms to demonstrate each algorithm’s potential. The classification techniques involved are Support Vector Machine (SVM), Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbors (KNN), and Naive Bayes (NB). We also explore various feature selection methods to decrease the size of the dataset. The three feature selection methods involved are Principal Component Analysis (PCA), Minimum Redundancy Maximum Relevance (mRMR), and SelectKBest.

Keywords—Parkinson’s disease, speech impairment, machine learning, feature selection, dysphonia, detection

I. INTRODUCTION

Parkinson’s disease is an incurable neurodegenerative brain disorder that impacts the nervous system of the subject. It primarily occurs due to a lack of sufficient synchronized dopamine levels, which substantially affects people’s ability to move in space, their stress resistance, motivation, sleep, and learning [1]. Currently, researchers estimate that more than 10 million people in the world suffer from Parkinson’s disease [2].

The initial determination of Parkinson’s disease is complex, since symptoms do not appear in the first 1-2 years of the disease. In the subsequent years the main symptoms start to emerge, including changes in handwriting, hand tremors, vocal changes, difficulty with movement in space, and postural instability. To diagnose Parkinson’s disease at an early stage, doctors use images of the brain taken by magnetic resonance imaging (MRI). The brain (Figure 1) contains the substantia nigra, which produces the necessary hormone dopamine. After impairment of the nervous system, the substantia nigra begins to shrink, and this is used for the detection of Parkinson’s disease [3]. One of the most essential methods for diagnosing Parkinson’s disease is speech analysis: researchers have shown that common speech impairments are observed in 90% of patients with Parkinson’s disease. In this study, we utilize a dataset consisting of speech features from individuals with PD to compare different machine learning algorithms.

Figure 1. Image depicting the structure of the brain, including the substantia nigra that produces dopamine hormones.

In recent years, with the rapid growth of machine learning (ML) and artificial intelligence (AI) techniques, medical decision support systems have benefited greatly in distinguishing patients with Parkinson’s disease (PD) from normal individuals. For instance, Bhattacharya and Bhatia [4] adopted Support Vector Machines (SVM) on a voice recording dataset and achieved an accuracy of 65.2174%. However, they did not implement feature selection, which is an essential part of dataset preprocessing; feature selection helps extract significant features from the dataset [5]. Goyal, Khandnor, and Aseri [6] compared and analyzed feature selection techniques such as Principal Component Analysis (PCA), Minimum Redundancy Maximum Relevance (mRMR), and Genetic Algorithms (GA), which had a significant impact on the classification metrics. Two different datasets were used in their study: the first was published by the UCI machine learning repository in 2018 [7], the second in 2014 [8]. In the first dataset, the best accuracy of 91.40% was achieved using XGBoost with GA feature selection. Meanwhile, in the second dataset, the best accuracy of 91.25% was achieved using NB with mRMR feature selection.

One of the manifestations of Parkinson’s disease (PD) can be observed in the hands as tremors, which do not allow patients to write correctly. These symptoms help researchers determine Parkinson’s disease through handwriting, as in [9]. The authors used two different feature selection methods, the Mann-Whitney test and the Relief algorithm, and employed a Support Vector Machine (SVM) for classification. Their results show that SVM with the RBF kernel achieves more than 80% accuracy. Tong, He, and Peng [10] employ a Convolutional Neural Network (CNN) that relies on a wearable inertial sensor for hand tremor detection, achieving an outstanding accuracy of 97.32%. In [11], researchers utilize non-linear decision tree algorithms for predicting Parkinson’s disease based on gait analysis, achieving an accuracy of 85.31%. Many researchers thus assert that feature selection plays a crucial role in distinguishing Parkinson’s disease. For instance, Miao, Lou, and Wu [12] employ Principal Component Analysis (PCA), which significantly enhances accuracy from 58% to an impressive 96%. This emphasizes the significance of comparative feature selection in disease classification. In our research, we focus on analyzing feature selection and comparing classical machine learning algorithms in order to showcase the potential of each algorithm on speech-voice datasets. The major contributions of this paper include:

1) Compare and analyze traditional machine learning algorithms to showcase the potential of each algorithm on a non-linear speech voice dataset for diagnosing Parkinson’s disease.

2) Explore various feature selection methods used in machine learning.

II. MATERIALS AND METHODS

A. Dataset description

In this study, we used a dataset released by the UCI machine learning repository in 2018 [7]. It represents the most up-to-date and holistic data available for analyzing speech signals. The dataset contains information from 188 patients (107 males and 81 females) and 64 healthy controls (23 males and 41 females), collected at the Department of Neurology of the Cerrahpaşa Faculty of Medicine, Istanbul University. It consists of 754 features and 756 instances. Because the dataset includes many features, feature selection is required to decrease its size.

B. Data preprocessing

Normalization is one of the main techniques in data preprocessing, used to transform current data into new data. In this study, we implemented one of the most effective normalization methods, known as MinMaxScaler. This method rescales the values of the data into the range [0, 1]. The formula for MinMaxScaler is represented by equation (1):

    X' = (Xi − min(X)) / (max(X) − min(X))    (1)

where Xi represents an individual instance of the feature X, min(X) represents the minimum value of X, max(X) represents the maximum value of X, and X' represents the scaled value.

C. Feature selection

In recent years, with the explosive growth of dataset features, feature selection has become a prominent part of data preprocessing. It helps to reduce the size of the dataset and improve the performance of the system [13]. There are three types of feature selection: supervised, unsupervised, and semi-supervised. In our study, we use Minimum Redundancy Maximum Relevance (mRMR), Principal Component Analysis (PCA), and SelectKBest.

1) Minimum Redundancy Maximum Relevance (mRMR)

Minimum Redundancy Maximum Relevance (mRMR) is an important feature selection method that determines features with maximum relevance to the target and minimum redundancy among themselves. Significant attributes are those that exhibit the highest mutual information with, and thus a robust correlation to, the target class. Mutual information is calculated using equation (2):

    I(X; Y) = Σ_{x∈X} Σ_{y∈Y} p(x, y) log( p(x, y) / (p(x) p(y)) )    (2)
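As a concrete illustration, the mutual information of equation (2) can be computed directly from empirical joint and marginal frequencies. The sketch below is not the paper's implementation; it is a minimal stand-alone version for discrete features (in practice a library routine such as scikit-learn's mutual_info_classif would be used):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) = sum over observed (x, y) of p(x,y) * log(p(x,y) / (p(x)p(y))), in nats."""
    n = len(xs)
    p_xy = Counter(zip(xs, ys))  # joint counts
    p_x = Counter(xs)            # marginal counts for X
    p_y = Counter(ys)            # marginal counts for Y
    mi = 0.0
    for (x, y), c in p_xy.items():
        pxy = c / n
        mi += pxy * math.log(pxy / ((p_x[x] / n) * (p_y[y] / n)))
    return mi

# A feature identical to the labels carries maximal information about them;
# for a balanced binary label this equals log(2) ≈ 0.6931 nats.
print(mutual_information([0, 1, 0, 1, 0, 1], [0, 1, 0, 1, 0, 1]))
# An independent feature carries none.
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

mRMR then greedily picks the feature with the highest such relevance to the class label, penalized by its average mutual information with the features already selected.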
2) Principal Component Analysis (PCA)

Principal Component Analysis (PCA) is a powerful dimensionality reduction technique used to decrease the dimensionality of the feature space. By eliminating less important features, PCA aims to minimize information loss. It achieves this by transforming the original features into new orthogonal features called principal components [14]. These principal components capture the highest variance in the data. PCA can be regarded as a feature extraction method, since it generates new features. As the number of variables decreases, their correlations decrease, reducing the likelihood of overfitting in models. PCA is favored over other feature selection techniques due to its ability to reduce the dataset’s dimensionality to just two principal components. This reduction significantly improves computational efficiency. Moreover, PCA offers a favorable balance between computation time, resource utilization, and system accuracy.

3) SelectKBest

SelectKBest is a popular feature selection technique commonly used in research papers. It is designed to reduce the dimensionality of the feature space by selecting the most relevant features for a given task [15]. The ’K’ in SelectKBest refers to the number of features to be selected. Implementation of SelectKBest helps us reduce the size of the dataset and improve the accuracy of the model. In our research, we observed that SelectKBest does not perform well with nonlinear datasets; in comparison to mRMR and PCA, it exhibited suboptimal performance. Therefore, it is important to consider the nature of the dataset and explore alternative feature selection techniques for optimal results.

D. Machine learning

A branch of artificial intelligence (AI), machine learning is oriented toward building algorithms and models that allow computers to learn and make predictions or decisions without being explicitly programmed. To enable machines to learn from and analyze massive volumes of data, spot patterns, and make data-driven predictions or judgments, statistical techniques and computational algorithms are used [16]. There are four types of machine learning: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning [17].

Supervised learning involves training a model using annotated data, where the input data is accompanied by corresponding output labels. The model gains knowledge from these labeled instances and subsequently becomes capable of making predictions or decisions when presented with novel, unobserved data; it acquires the ability to establish a connection between input features and the desired output by leveraging the provided labels. In contrast, unsupervised learning entails training a model using unlabeled data, where the input data lacks corresponding output labels. The primary goal is to uncover patterns, structures, or correlations within the data; without explicit guidance or supervision, the model learns to identify similarities, clusters, or latent features present in the data. Supervised learning is frequently utilized for classification (assigning input data to predetermined classes) and regression (predicting a continuous value). Unsupervised learning, on the other hand, is well suited for tasks including clustering (grouping together similar data points) and dimensionality reduction (cutting the number of input characteristics while keeping relevant information).

Semi-supervised learning (SSL) is a variant of supervised learning that combines a small amount of labeled data with a large amount of unlabeled data to train a predictive model. SSL is widely used for anomaly detection.

Reinforcement learning (RL) is a powerful machine learning technique that has been applied to Parkinson’s disease [18]. Understanding reinforcement learning requires familiarity with two key components: the agent, which takes actions, and the environment, which surrounds the agent and interacts with it [19]. The role of the environment is to guide the agent’s actions through rewards or penalties; these serve as feedback to the agent on whether an action was favourable or adverse. The essential goal of the agent is to maximize favourable actions in order to receive rewards as feedback. This process continues until the agent discovers the most effective strategy for accumulating rewards within the environment. This is the key idea of reinforcement learning.

III. RESULTS

There are various metrics used to assess the performance of a machine learning model. We have implemented precision, recall, accuracy, F1-score and the confusion matrix (true positives, true negatives, false positives, and false negatives) to measure the performance of the detection task.

• True positive: The model predicts the person as sick, and they are actually sick.
• True negative: The model predicts the person as not sick, and they are actually not sick.
• False positive: The model predicts the person as sick, but they are actually not sick.
• False negative: The model predicts the person as not sick, but they are actually sick.

In this study, our aim is to conduct a comparative analysis of machine learning (ML) algorithms, showcasing the potential of classification algorithms such as Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), Logistic Regression (LR), and K-Nearest Neighbors (KNN). We evaluate these algorithms with different feature selection methods, namely mRMR, PCA, and SelectKBest. The formulas for the metrics are as follows:

    Recall = True Positives / (True Positives + False Negatives)    (3)

    Precision = True Positives / (True Positives + False Positives)    (4)

    F1 score = 2 × (Precision × Recall) / (Precision + Recall)    (5)

    Accuracy = (TP + TN) / (TP + TN + FP + FN)    (6)
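Equations (3)-(6) map directly onto confusion-matrix counts. The helper below is a small illustration of ours (not the original evaluation code); it reproduces the SVM-with-mRMR results from the confusion-matrix counts reported later in Table II (TN=17, FP=18, TP=116, FN=1):

```python
def classification_metrics(tp, tn, fp, fn):
    """Recall, precision, F1-score and accuracy per equations (3)-(6)."""
    recall = tp / (tp + fn)                              # eq. (3)
    precision = tp / (tp + fp)                           # eq. (4)
    f1 = 2 * precision * recall / (precision + recall)   # eq. (5)
    accuracy = (tp + tn) / (tp + tn + fp + fn)           # eq. (6)
    return recall, precision, f1, accuracy

# SVM with mRMR feature selection (Table II): TN=17, FP=18, TP=116, FN=1
r, p, f1, acc = classification_metrics(tp=116, tn=17, fp=18, fn=1)
print(f"recall={r:.2%}  precision={p:.2%}  F1={f1:.2%}  accuracy={acc:.2%}")
# recall=99.15%  precision=86.57%  F1=92.43%  accuracy=87.50%
```

These values match the SVM row of Table I up to rounding (the table lists the precision as 86.56%).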
In this comparative study, four different metrics evaluate the performance of the machine learning algorithms combined with Minimum Redundancy Maximum Relevance (mRMR) feature selection, as shown in Table I and Figure 2; the corresponding confusion matrices are presented in Table II. Table I indicates that the Random Forest Classifier achieved the highest accuracy at 89.47%, with a good precision of 91.73% and F1-score of 93.27%. However, the Support Vector Machine (SVM) is the best at finding almost all sick people, with a recall of 99.15%, a critical metric in medical diagnosis due to the importance of detecting as many positive cases as possible [20]. Overall, the Random Forest Classifier displayed stable results across the board with mRMR feature selection, consistently achieving good performance.

TABLE I. PERFORMANCE METRICS OF DIFFERENT MODELS WITH MRMR FEATURE SELECTION

Models | Accuracy | Recall | Precision | F1-score
Logistic Regression | 88.81% | 97.44% | 89.06% | 93.06%
Decision Tree Classifier | 76.97% | 80.34% | 88.67% | 84.30%
Random Forest Classifier | 89.47% | 94.87% | 91.73% | 93.27%
Support Vector Machine | 87.50% | 99.15% | 86.56% | 92.43%
K-Nearest Neighbors | 88.15% | 94.01% | 90.90% | 92.43%
Gaussian NB | 82.89% | 87.17% | 90.26% | 88.69%

Figure 2. Performance metrics bar charts with mRMR feature selection.

As shown in Table II, we present the confusion matrices of the different machine learning models using mRMR feature selection. We noticed that the Random Forest Classifier not only achieved the highest accuracy but also demonstrated a perceptible balance between false positives and false negatives. On the other hand, the SVM demonstrated remarkable performance in identifying true cases of the disease, as indicated by its recall rate. However, each model presents its own set of advantages and disadvantages. For instance, we found that K-Nearest Neighbors (KNN) and Gaussian Naive Bayes (Gaussian NB) were excellent at minimizing false positives, which is very useful in cases where it is important to avoid incorrectly diagnosing a condition. However, Gaussian NB demonstrated the lowest precision (19 out of 152) with SelectKBest feature selection. This shows the critical importance of choosing the correct model for medical diagnostics.

TABLE II. CONFUSION MATRIX OF DIFFERENT MODELS WITH MRMR FEATURE SELECTION

Models | True Negative | False Positive | True Positive | False Negative
Logistic Regression | 21 | 14 | 114 | 3
Decision Tree Classifier | 23 | 12 | 94 | 23
Random Forest Classifier | 25 | 10 | 111 | 6
Support Vector Machine | 17 | 18 | 116 | 1
K-Nearest Neighbors | 24 | 11 | 110 | 7
Gaussian NB | 24 | 11 | 102 | 15

In Table III and Figure 3, we present the performance of the machine learning models using Principal Component Analysis (PCA) for classifying Parkinson’s disease. Logistic Regression shows the same performance with PCA as with mRMR. Additionally, the SVM demonstrated the same recall rate of 99.15% with PCA as with mRMR and SelectKBest, a significant metric in the medical area. We also observed that KNN achieved the best accuracy of 93.42%, precision of 93.50% and F1-score of 95.83% using PCA. These results show that PCA, when combined with KNN, provides a reliable approach for identifying Parkinson’s disease.

TABLE III. PERFORMANCE METRICS OF DIFFERENT MODELS WITH PCA FEATURE SELECTION

Models | Accuracy | Recall | Precision | F1-score
Logistic Regression | 88.81% | 97.44% | 89.06% | 93.06%
Decision Tree Classifier | 75.65% | 81.19% | 88.36% | 83.70%
Random Forest Classifier | 82.89% | 98.29% | 82.73% | 89.84%
Support Vector Machine | 89.47% | 99.15% | 88.54% | 93.54%
K-Nearest Neighbors | 93.42% | 98.29% | 93.50% | 95.83%
Gaussian NB | 80.92% | 91.45% | 84.92% | 88.06%
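The evaluation pipeline compared across these tables (MinMax normalization, a feature reduction step, then a classifier) can be sketched with scikit-learn. This is an illustrative sketch, not the paper's actual code: synthetic data stands in for the UCI speech dataset, and the train/test split, component count, and k are assumptions rather than the paper's settings.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the 756-instance, 754-feature speech dataset,
# with a roughly 188:64 class imbalance (PD patients vs. healthy controls).
X, y = make_classification(n_samples=756, n_features=100, n_informative=20,
                           weights=[0.25, 0.75], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# MinMaxScaler (eq. 1) -> PCA down to two components -> KNN classifier.
model = make_pipeline(MinMaxScaler(),
                      PCA(n_components=2),
                      KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, y_pred))
print("recall:  ", recall_score(y_test, y_pred))
```

Swapping the PCA step for SelectKBest (or an mRMR implementation) and the KNN step for any of the other classifiers reproduces the rest of the comparison grid.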
Figure 3. Performance metrics bar charts with PCA feature selection.

In Table IV, we present the confusion matrices of the machine learning models using PCA feature selection. In this study, which focuses on the early stages of Parkinson's disease, the false negative rate is identified as the most critical metric. The Decision Tree Classifier shows a poor false negative count (22 out of 152), whereas the Support Vector Machine (SVM) achieved an excellent false negative count (1 out of 152). This indicates that the SVM is a reliable model for classifying Parkinson’s disease at an early stage. The KNN model showed stable performance for distinguishing Parkinson’s disease, with 27 true negatives, 115 true positives, and only 2 false negatives out of 152.

TABLE IV. CONFUSION MATRIX OF DIFFERENT MODELS WITH PCA FEATURE SELECTION

Models | True Negative | False Positive | True Positive | False Negative
Logistic Regression | 21 | 14 | 114 | 3
Decision Tree Classifier | 20 | 15 | 95 | 22
Random Forest Classifier | 11 | 24 | 115 | 2
Support Vector Machine | 20 | 15 | 116 | 1
K-Nearest Neighbors | 27 | 8 | 115 | 2
Gaussian NB | 16 | 19 | 107 | 10

Table V offers an extensive analysis of the performance metrics for the predictive models utilizing SelectKBest feature selection. This comparison shows that the SVM and KNN models achieved a remarkable accuracy of 90.13%. The SVM also demonstrated excellent recall at 99.15%. Prior analyses with the mRMR and PCA feature selection methods likewise indicated robust recall for the SVM, showing that it is a reliable model for the identification of Parkinson’s disease.

TABLE V. PERFORMANCE METRICS OF DIFFERENT MODELS WITH SELECTKBEST FEATURE SELECTION

Models | Accuracy | Recall | Precision | F1-score
Logistic Regression | 87.50% | 95.72% | 88.88% | 92.18%
Decision Tree Classifier | 83.55% | 88.88% | 89.65% | 89.27%
Random Forest Classifier | 89.47% | 95.72% | 91.05% | 93.33%
Support Vector Machine | 90.13% | 99.15% | 89.23% | 93.92%
K-Nearest Neighbors | 90.13% | 93.16% | 93.96% | 93.56%
Gaussian NB | 80.92% | 86.32% | 88.59% | 87.44%

Figure 4. Performance metrics bar charts with SelectKBest feature selection.

Table VI provides a detailed overview of the confusion matrices for the models using SelectKBest feature selection. This analysis demonstrates that the SVM achieved the highest true positive count (116 out of 152) with only 1 false negative out of 152. Additionally, the SVM showed similarly remarkable recall when combined with mRMR and PCA, again with only 1 false negative out of 152. These results indicate that the SVM is an excellent model for classifying Parkinson’s disease. The KNN model demonstrated the highest true negative count (28 out of 152) and the lowest false positive count (7 out of 152); it also achieved a notable true negative count with PCA feature selection (27 out of 152). Conversely, the Decision Tree Classifier and Gaussian Naive Bayes (NB) demonstrated the lowest performance across all feature selection methods. For instance, Gaussian NB produced 16 false negatives out of 152 with SelectKBest, while the Decision Tree Classifier reached 22 false negatives out of 152 with PCA and 23 with mRMR, suggesting that both are less reliable for diagnosing Parkinson’s disease.
TABLE VI. CONFUSION MATRIX OF DIFFERENT MODELS WITH SELECTKBEST FEATURE SELECTION

Models | True Negative | False Positive | True Positive | False Negative
Logistic Regression | 21 | 14 | 112 | 5
Decision Tree Classifier | 23 | 12 | 104 | 13
Random Forest Classifier | 24 | 11 | 112 | 5
Support Vector Machine | 21 | 14 | 116 | 1
K-Nearest Neighbors | 28 | 7 | 109 | 8
Gaussian NB | 22 | 13 | 101 | 16

IV. CONCLUSION AND FUTURE WORK

In this study, we compared and analyzed machine learning (ML) models on a speech impairment dataset for distinguishing Parkinson’s disease patients from healthy controls. Additionally, we explored the potential of each algorithm with the feature selection methods Minimum Redundancy Maximum Relevance (mRMR), Principal Component Analysis (PCA), and SelectKBest. Among them, PCA showed good performance with K-Nearest Neighbors (KNN), yielding the highest accuracy of 93.42%, precision of 93.50% and F1-score of 95.83%. Similarly, KNN achieved remarkable performance with SelectKBest, with an accuracy of 90.13% and precision of 93.96%. The Random Forest Classifier demonstrated the highest accuracy of 89.47%, precision of 91.73% and F1-score of 93.27% with mRMR. Additionally, the Support Vector Machine achieved an excellent recall of 99.15% with all feature selection methods, indicating its effectiveness in identifying nearly all affected individuals. Our research showed that KNN is the most stable machine learning algorithm for our dataset, showing remarkable results with both PCA and SelectKBest. Furthermore, we plan to compare deep learning and ensemble methods in our future research.

REFERENCES

[1] R. Balestrino and A. H. V. Schapira, “Parkinson disease,” European Journal of Neurology, vol. 27, no. 1, pp. 27–42, 2020. [Link]
[2] C. R. Pereira, D. R. Pereira, S. A. Weber, C. Hook, V. H. C. de Albuquerque, and J. P. Papa, “A survey on computer-assisted Parkinson’s disease diagnosis,” pp. 48–63, Apr. 2019. [Online]. Available: [Link]
[3] P. Feraco, C. Gagliardo, G. La Tona, E. Bruno, C. D’angelo, M. Marrale, A. Del Poggio, M. C. Malaguti, L. Geraci, R. Baschi, B. Petralia, M. Midiri, and R. Monastero, “Imaging of substantia nigra in Parkinson’s disease: A narrative review,” Brain Sciences, vol. 11, no. 6, p. 769, Jun. 2021. [Online]. Available: [Link]
[4] I. Bhattacharya and M. P. S. Bhatia, “SVM classification to distinguish Parkinson disease patients,” Sep. 2010. [Online]. Available: [Link]
[5] J. Cai, J. Luo, S. Wang, and S. Yang, “Feature selection in machine learning: A new perspective,” pp. 70–79, Jul. 2018. [Online]. Available: [Link]
[6] J. Goyal, P. Khandnor, and T. C. Aseri, “A comparative analysis of machine learning classifiers for dysphonia-based classification of Parkinson’s disease,” pp. 69–83, Oct. 2020. [Online]. Available: [Link]
[7] C. O. Sakar, G. Serbes, A. Gunduz, H. C. Tunc, H. Nizam, B. E. Sakar, M. Tutuncu, T. Aydin, M. E. Isenkul, and H. Apaydin, “A comparative analysis of speech signal processing algorithms for Parkinson’s disease classification and the use of the tunable Q-factor wavelet transform,” pp. 255–263, Jan. 2019. [Online]. Available: [Link]
[8] B. E. Sakar, M. E. Isenkul, C. O. Sakar, A. Sertbas, F. Gurgen, S. Delil, H. Apaydin, and O. Kursun, “Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings,” pp. 828–834, Jul. 2013. [Online]. Available: [Link]
[9] I. Aouraghe, A. Ammour, G. Khaissidi, M. Mrabti, G. Aboulem, and F. Belahsen, “Automatic analysis of Arabic online handwriting of patients with Parkinson’s disease,” Mar. 2019. [Online]. Available: [Link]
[10] L. Tong, J. He, and L. Peng, “CNN-based PD hand tremor detection using inertial sensors,” pp. 1–4, Jul. 2021. [Online]. Available: [Link]
[11] S. Aich, K. Choi, J. Park, and H.-C. Kim, “Prediction of Parkinson disease using nonlinear classifiers with decision tree using gait dynamics,” in Proceedings of the 2017 4th International Conference on Biomedical and Bioinformatics Engineering, ser. ICBBE ’17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 52–57. [Online]. Available: [Link]
[12] Y. Miao, X. Lou, and H. Wu, “The diagnosis of Parkinson’s disease based on gait, speech analysis and machine learning techniques,” in Proceedings of the 2021 International Conference on Bioinformatics and Intelligent Computing, ser. BIC 2021. New York, NY, USA: Association for Computing Machinery, 2021, pp. 358–371. [Online]. Available: [Link]
[13] Y. Cai, T. Huang, L. Hu, X. Shi, L. Xie, and Y. Li, “Prediction of lysine ubiquitination with mRMR feature selection and analysis,” pp. 1387–1395, Jan. 2011. [Online]. Available: [Link]
[14] H. Abdi and L. J. Williams, “Principal component analysis,” pp. 433–459, Jul. 2010. [Online]. Available: [Link]
[15] M. Ayyanar, S. Jeganathan, S. Parthasarathy, V. Jayaraman, and A. R. Lakshminarayanan, “Predicting the Cardiac Diseases using SelectKBest Method Equipped Light Gradient Boosting Machine,” 2022 6th International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 2022, pp. 117–122, doi: 10.1109/ICOEI53556.2022.9777224.
[16] T. Jiang, J. L. Gradus, and A. J. Rosellini, “Supervised Machine Learning: A Brief Primer,” Behavior Therapy, vol. 51, no. 5, pp. 675–687, 2020. [Link]
[17] M. G. Pecht and M. Kang, “Machine Learning: Fundamentals,” in Prognostics and Health Management of Electronics: Fundamentals, Machine Learning, and the Internet of Things, IEEE, 2019, pp. 85–109, doi: 10.1002/9781119515326.ch4.
[18] M. Lu, X. Wei, Y. Che, J. Wang, and K. A. Loparo, “Application of Reinforcement Learning to Deep Brain Stimulation in a Computational Model of Parkinson’s Disease,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 28, no. 1, pp. 339–349, Jan. 2020, doi: 10.1109/TNSRE.2019.2952637.
[19] J. Jia and W. Wang, “Review of reinforcement learning research,” 2020 35th Youth Academic Annual Conference of Chinese Association of Automation (YAC), Zhanjiang, China, 2020, pp. 186–191, doi: 10.1109/YAC51587.2020.9337653.
[20] S. A. Hicks, I. Strümke, V. Thambawita, et al., “On evaluation metrics for medical applications of artificial intelligence,” Scientific Reports, vol. 12, 5979, 2022. [Link]
