
Cloud Privilege Escalation Detection Using ML

The document is a project dissertation by K. Vijaya Santhi for a Master's degree in Technology, focusing on detecting and mitigating privilege escalation attacks in cloud environments using machine learning. It outlines the research methodology, including the use of various algorithms like XGBoost and LightGBM, and highlights the findings that suggest a hybrid approach may enhance threat detection. The dissertation is submitted to KKR & KSR Institute of Technology and Sciences and includes acknowledgments, objectives, and program outcomes related to the field of Computer Science and Engineering.


PRIVILEGE ESCALATION ATTACK DETECTION AND

MITIGATION IN CLOUD USING MACHINE LEARNING

A project dissertation
submitted in partial fulfillment of the requirements for the award of the degree of

MASTER OF TECHNOLOGY
in
CSE- ARTIFICIAL INTELLIGENCE & MACHINE LEARNING

by

K. VIJAYA SANTHI
(Regd. No: 23JR1DAI04)

under the esteemed guidance of


Mrs. P. NEELA SUNDARI
Assistant Professor

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


KKR & KSR INSTITUTE OF TECHNOLOGY AND SCIENCES
(AUTONOMOUS)
Vinjanampadu (V), Vatticherukuru (M), Guntur (Dt), AP - 522 017
December -2025

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
KKR & KSR INSTITUTE OF TECHNOLOGY AND SCIENCES
(AUTONOMOUS)
(Approved by AICTE New Delhi || Permanently Affiliated to JNTUK, Kakinada) || Accredited with ‘A’
Grade by NAAC || NBA Accreditation), Vinjanampadu (V), Vatticherukuru (M), Guntur (Dt), A.P-522017

DECLARATION

I hereby declare that the project “PRIVILEGE ESCALATION ATTACK
DETECTION AND MITIGATION IN CLOUD USING MACHINE LEARNING” has
been carried out by me and this work has been submitted to KKR & KSR Institute of
Technology and Sciences (Autonomous), Vinjanampadu, affiliated to JAWAHARLAL
NEHRU TECHNOLOGICAL UNIVERSITY, KAKINADA in partial fulfillment of the
requirements for the award of Degree of Master of Technology.

I further declare that this project work has not been submitted in full or part for the
award of any other degree or diploma in any other educational institutions.

Student Name : K. VIJAYA SANTHI

Regd. No : 23JR1DAI04


CERTIFICATE

This is to certify that this project report entitled “PRIVILEGE ESCALATION
ATTACK DETECTION AND MITIGATION IN CLOUD USING MACHINE
LEARNING” submitted by K. VIJAYA SANTHI (23JR1DAI04) to Jawaharlal Nehru
Technological University, Kakinada, through KKR & KSR Institute of Technology and Sciences
(Autonomous) for the award of the Degree of Master of Technology in CSE (Artificial
Intelligence & Machine Learning) is a bonafide record of project work carried out by her under
the supervision of Mrs. P. NEELA SUNDARI during the year 2024-25.

SUPERVISOR HEAD OF THE DEPARTMENT

EXTERNAL EXAMINER

ACKNOWLEDGEMENTS

I would like to express my heartfelt and profound gratitude to Mrs. P. NEELA
SUNDARI, Assistant Professor, Department of COMPUTER SCIENCE AND
ENGINEERING, who performed her supervisory role to perfection, saw me
through the M.Tech. (II year, II sem.) main project, and guided me methodically and
meticulously as my internal guide.
I am highly indebted to Prof. R. RAMESH, Head of the Department, Department of
Computer Science and Engineering, for providing me with all the necessary support.
I render my deep sense of gratitude to Dr. P. BABU, Principal, for permitting me to
carry out my main project work.
I render my deep sense of gratitude to Dr. K HARI BABU, Director (Academics),
for approving me to carry out my main project work.
I am very much thankful to the College Management for their continuous support and
the facilities provided.
I express my gratitude towards all the teaching and non-teaching faculty members
of the Department of COMPUTER SCIENCE AND ENGINEERING for their support
and encouragement.
I would also like to express my sincere thanks to my colleagues and friends, other
department staff members, my parents, and family members for their continuous
encouragement and assistance whenever required.

by

Student Name: K. VIJAYA SANTHI

Regd. No :23JR1DAI04


INSTITUTE VISION AND MISSION

VISION

• To produce eminent and ethical Engineers and Managers for society by imparting
quality professional education with an emphasis on human values and holistic
excellence.

MISSION

• To incorporate benchmarked teaching and learning pedagogies in curriculum.
• To ensure the all-around development of students through judicious blend of
curricular, co-curricular, and extra-curricular activities.
• To support the cross-cultural exchange of knowledge between industry and academy.
• To provide higher/continued education and research opportunities to the employees
of the institution.


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

VISION OF THE DEPARTMENT

• To become a reputed center in Computer Science and systems Engineering for
quality, competency, and social responsibility.

MISSION OF THE DEPARTMENT

• Strengthen the core competence with vibrant technological education in a congenial
environment.
• Promote innovative research and development for the economic, social, and environment.
• Inculcate professional behavior and strong ethical values to meet the challenges
in collaboration and lifelong learning.


PROGRAM SPECIFIC OUTCOMES (PSOs)

PSO1: Application Development


Able to develop business solutions through the Latest Software Techniques and tools
for real-time Applications.
PSO2: Professional and Leadership
Able to practice the profession with ethical leadership as an entrepreneur through
participation in various events like Ideathons, Hackathons, project expos, and
workshops.
PSO3: Computing Paradigms
Ability to identify the evolutionary changes in computing using Data Sciences, Apps,
Cloud computing, and IoT.

Program Educational Objectives (PEOs)

Post-graduates of Computer Science and Engineering shall


PEO1: Demonstrate Advanced Technical Expertise
Possess in-depth knowledge of computing theories, algorithms, and emerging
technologies, enabling them to design, develop, and optimize complex software and
hardware systems.
PEO2: Excel in Research and Innovation
Engage in independent and collaborative research, contributing novel solutions to
real-world problems, and advancing the state of the art in computer science and
engineering.
PEO3: Exhibit Professional and Ethical Leadership
Lead multidisciplinary teams with integrity, communicate effectively, and uphold ethical
standards in professional, academic, and societal contexts.

PEO4: Adapt to Evolving Technologies
Continuously upgrade skills to remain relevant in rapidly changing technological
landscapes, embracing lifelong learning and professional development.
PEO5: Contribute to Societal and Industrial Growth
Apply computational knowledge to address societal challenges, support sustainable
development, and drive innovation in industry and entrepreneurship.

PROGRAM OUTCOMES (POs)

PO1: Disciplinary Knowledge: Demonstrate a systematic understanding of advanced
concepts in the chosen field of engineering and apply this knowledge to solve complex
technical problems.

PO2: Problem Analysis and Solution Design: Critically analyze engineering problems,
review research literature, and design innovative solutions with an emphasis on
technical feasibility, sustainability, and societal impact.

PO3: Research and Innovation: Undertake cutting-edge research to contribute to
knowledge creation, technology development, and innovation in engineering practices.

PO4: Professional Ethics and Responsibility: Adhere to ethical principles and
professional responsibilities while making informed decisions in engineering practices.

PO5: Communication and Leadership: Exhibit effective communication, technical
writing, and leadership skills to collaborate with multidisciplinary teams and engage
with stakeholders.

PO6: Life-Long Learning: Recognize the need for continuous learning to stay updated
with evolving technologies and adapt to professional and societal changes.

Course Outcomes (COs)

CO221.1: Perform a system of examinations to identify problems.

CO221.2: Review the literature/Related work.

CO221.3: Defining the problem & its area of domain.

CO221.4: Proposal of solution for the selected area/methodology.

CO221.5: Analysis of the proposed work & documentation.

CO221.6: Acquire collaborative learning, leadership qualities & presentation skills

Course Outcomes - Program Outcomes & Program Specific Outcome mapping

           PO1  PO2  PO3  PO4  PO5  PO6  PSO1  PSO2  PSO3
CO221.1     2    3    -    -    -    -    3     -     -
CO221.2     2    2    3    -    -    -    2     -     -
CO221.3     3    2    -    -    -    -    2     -     2
CO221.4     1    2    2    -    2    2    3     -     2
CO221.5     -    -    2    2    3    2    2     2     -
CO221.6     -    -    -    -    3    2    -     3     -
3: High 2: Medium 1: Low -: Not mapped

Program Educational Objectives – Program Specific Outcomes correlation

PSO1 PSO2 PSO3


PEO1 2 1 3
PEO2 3 2
PEO3 1 2 3
PEO4 3 2
PEO5 1 3 2

3: High 2: Medium 1: Low


CO-PO Mapping with Reasons:

1. CO221.1 is mapped with PO1, PO2, as basic knowledge of Engineering and problem Analysis
activities are highly essential to conduct examinations on existing systems that are used in
industries as a part of and to define the problem of the proposed system.
2. CO221.2 is mapped with PO1, PO2, and PO3 for identification, gathering analysis, and
classification of requirements for the proposed system, basic knowledge of engineering, and
Analysis steps, along with complex problem analysis through the efforts of teamwork to meet the
specific needs of the customer.
3. CO221.3 is mapped with PO1 and PO2, to conduct the literature review and to examine the
relevant systems to understand and identify the merits and demerits of each to enhance and
develop the proposed as needed.
4. CO221.4 is mapped with PO1, PO2, PO3, PO5, and PO6 because modularization and design of
the project are needed after requirements elicitation. For modularization and design of the
project, Basic knowledge of Engineering, Analysis capabilities, Design skills, and
communication are needed between team members as different modules are designed
individually before integration.
5. CO221.5 is mapped with PO3, PO4, PO5, and PO6 as to construct the project latest
technologies are needed. The development of the project is done individually and in groups
with well-defined communication by using engineering and management principles.
6. CO221.6 is mapped with PO5 and PO6 because, during and after completion of the project,
documentation is needed along with proper methods of presentation through understanding
and application of engineering and management principles, which in turn needs well-defined
communication between the team members with all the ethical values. Even the project
development team defines future enhancements as a part of the project development after
identifying the scope of the project.
CO-PSOs Mapping with Reasons:
1. CO221.1 is mapped with PSO1, as examining existing systems and identification of the problem
is a part of the Application Development activity, and identification of evolutionary changes in
the latest technologies.
2. CO221.2 is mapped with PSO1, as identifying and classifying the requirements is a part of
Application development and evolutionary computing changes, and also follows ethical
principles.
3. CO221.3 is mapped with PSO1, PSO3, as a review of literature is a part of the application
development activity by recognizing the computing technologies and their evolutionary changes.

4. CO221.4 is mapped with PSO1 and PSO3 because modularization and logical design is also a
part of Application development and follow computing changes using Deep learning technology.
5. CO221.5 is mapped with PSO1 and PSO2 as Testing, Development, and Integration of project
activities are part of Application development and follow ethical principles.
6. CO221.6 is mapped with PSO2 for project documentation and presentation; the project team
members apply the professional and leadership quality

ABSTRACT

Cyber security has become more difficult owing to the proliferation of smart devices and
the recent rise in the number and complexity of cyber-attacks. Although cloud
computing has transformed modern business operations, its centrally managed
structure makes it harder to implement distributed security measures. As a result, the massive
amounts of data exchanged between consumers and service providers increase the risk of
accidental as well as malicious data breaches. One of the most serious dangers comes from
malicious insiders who have access to high-level systems. This work presents a machine
learning method that detects and categorizes insider threats by systematically identifying
aberrant behaviors suggestive of privilege escalation.

To increase detection accuracy, ensemble learning approaches were applied across several
algorithms, namely XGBoost, LightGBM, Random Forest (RF), and AdaBoost, using a bespoke
dataset derived from the CERT insider threat dataset. According to the initial
findings, LightGBM had better overall accuracy than the other ensemble models. In further
experiments, Support Vector Machine (SVM) achieved the best classification performance,
especially in spotting subtle insider activities.
These results suggest that insider threat detection systems may be even
more resilient if they adopt a hybrid strategy that combines support vector machines with
ensemble models.

LIST OF CONTENTS

Abstract
List of Figures
Acronyms

CHAPTER-I INTRODUCTION
1.1 Introduction of the Project
1.2 Existing System
1.3 Problems Identified in Proposed System
1.4 Proposed System
1.5 Benefits of the Proposed System
1.6 Potential Users
1.7 Unique Features of the System
1.8 Demand For the Product
1.9 Protection of Idea

CHAPTER-II ANALYSIS
2.1 Literature Review
2.2 Requirement Analysis
2.3 Modules Description
2.4 Feasibility Study
2.5 Process Model Used
2.6 Hardware And Software Requirements
2.7 SRS Specification
2.8 Financial Plan for Development of Project
2.9 Business Plan from Commercialization
2.10 Business Model Canvas

CHAPTER-III DESIGN PHASE
3.1 Design Concepts & Constraints
3.2 Design Diagram of the System
3.3 Conceptual Design
3.4 Logical Design
3.5 Architectural Design
3.6 Algorithms Design
3.7 Database Design
3.8 Module Design Specification

CHAPTER-IV SOURCE CODE AND RESULT
4.1 Source Code
4.2 Results
4.3 Discussions

CHAPTER-V IMPLEMENTATION
5.1 Implementation Introduction
5.2 Implementation Procedure & Steps
5.3 User Manual

CHAPTER-VI
6.1 Conclusions
6.2 Future Scope of Work

REFERENCES

PUBLICATIONS
Details of the published article

BIBLIOGRAPHY

LIST OF FIGURES
FIGURE NO. TITLE OF THE FIGURE

Figure: 3.2 Data Flow Diagram
Figure: 3.3.1 Class Diagram
Figure: 3.3.2 Sequence Diagram
Figure: 3.3.3 Activity Diagram
Figure: 3.3.4 Use Case Diagram
Figure: 3.5 System Architecture
Figure: 4.2.1 Home Screen
Figure: 4.2.2 Dataset Training & Testing Results
Figure: 4.2.3 Visualization Results
Figure: 4.2.4 Detection Results

ACRONYMS

ACRONYM DESCRIPTION
ML Machine Learning

AI Artificial Intelligence

SVM Support Vector Machine

KNN K-Nearest Neighbors

RF Random Forest

DT Decision Tree

GBM Gradient Boosting Machine


API Application Programming Interface

INTRODUCTION

1.1 Introduction of the project


Earlier research in network and cloud security has mainly focused on identifying
vulnerabilities or detecting cyber threats through conventional approaches such as rule-
based detection, access control monitoring, and privilege escalation analysis. Although
these techniques provide basic protection, they often struggle to correctly identify
complex and stealthy attacks. In particular, traditional methods lack the ability to
distinguish between legitimate user behavior and malicious activities that closely
resemble normal system usage. As a result, many sophisticated attacks either remain
undetected or generate excessive false alarms. The rapid growth of cloud computing has
further increased the importance of effective security mechanisms.

Cloud environments store critical data, applications, and services for
organizations, making them attractive targets for attackers. One of the most severe
threats in such environments is privilege escalation, also known as privilege
augmentation. In this type of attack, an individual gains access rights beyond those
initially assigned, allowing unauthorized operations, data manipulation, or complete
system compromise. If successful, a privilege escalation attack can jeopardize the entire
cloud infrastructure, leading to data breaches and service disruptions. Detecting privilege
escalation attacks is particularly challenging because they often exploit legitimate
credentials and normal system functionalities. These attacks may be carried out by
external attackers who obtain valid access or by insiders who misuse their authorized
privileges. Since the behavior involved may not immediately violate predefined security
rules, traditional detection systems often fail to recognize these threats in time.
Therefore, there is a growing need for intelligent and adaptive techniques capable of
identifying subtle behavioral deviations.

This study addresses these challenges by exploring ensemble-based machine
learning approaches for detecting insider threats and privilege escalation attempts in
cloud environments. Data-driven models are capable of learning behavioral patterns
from historical records and identifying deviations that may indicate malicious intent.
Unlike static detection techniques, these models can adapt to evolving usage patterns
and emerging threat scenarios, making them more suitable for modern cloud security
applications.

A major limitation in insider threat research is the lack of access to real
organizational data, as such information is highly confidential. To overcome this issue,
this research makes use of the “CERT Insider Threat Tools” dataset, which is widely
accepted in academic studies. The dataset is designed to simulate real enterprise
environments and includes detailed records of user activities such as login events, file
access, email communication, and device usage. In addition to activity logs, it provides
organizational details including employee roles, departments, and job responsibilities.
This combination enables realistic modeling of user behavior within an organizational
context.

Using this dataset, multiple detection models were developed to identify abnormal
activities related to unauthorized privilege usage. The objective was to replicate real-
world enterprise scenarios and accurately differentiate between normal operational
behavior and suspicious actions. Ensemble learning techniques were employed to
improve detection performance by combining the outputs of multiple models. This
approach enhances reliability by minimizing the limitations of individual classifiers and
improving overall prediction accuracy.
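
The idea of combining the outputs of several models can be illustrated with a minimal hard-voting sketch. This is not the ensemble used in this work (which relies on trained models such as XGBoost, LightGBM, Random Forest, and AdaBoost); the three rule-based base models, their thresholds, and the feature names below are hypothetical stand-ins chosen only to show how a majority vote can offset the weakness of any single classifier.

```python
from collections import Counter
from typing import Callable, Dict, List

Features = Dict[str, float]
Classifier = Callable[[Features], int]  # 1 = suspicious, 0 = normal


# Hypothetical base models: each looks at a single behavioral signal.
def after_hours_model(x: Features) -> int:
    return 1 if x["after_hours_logins"] >= 3 else 0


def elevation_model(x: Features) -> int:
    return 1 if x["elevation_requests"] >= 2 else 0


def out_of_role_model(x: Features) -> int:
    return 1 if x["out_of_role_accesses"] >= 5 else 0


def majority_vote(models: List[Classifier], x: Features) -> int:
    """Return the label chosen by most base models (ties count as suspicious)."""
    votes = Counter(model(x) for model in models)
    return 1 if votes[1] >= votes[0] else 0


MODELS = [after_hours_model, elevation_model, out_of_role_model]

normal_user = {"after_hours_logins": 0, "elevation_requests": 0, "out_of_role_accesses": 1}
risky_user = {"after_hours_logins": 4, "elevation_requests": 3, "out_of_role_accesses": 2}

print(majority_vote(MODELS, normal_user))  # no model fires
print(majority_vote(MODELS, risky_user))   # two of three models fire
```

In this sketch the risky user is flagged even though one base model misses it, which is the kind of error compensation the ensemble approach aims for.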

The proposed framework integrates both supervised learning methods and
anomaly detection techniques. Supervised models are trained using labeled historical
data containing examples of normal behavior and known malicious actions. These
models learn decision boundaries that can be applied to classify future events. In
contrast, anomaly detection methods focus on establishing a baseline of normal behavior
and identifying deviations from that baseline. This is particularly useful for identifying
previously unseen attack patterns that do not match known signatures.
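
A baseline-and-deviation detector of the kind described above can be sketched as follows. This is an illustrative assumption rather than the detector implemented in this work: it models each user's daily count of privileged actions with a mean and standard deviation and flags a day whose count lies more than `threshold` standard deviations from that user's mean. The user names, counts, and the threshold of 3.0 are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def fit_baseline(history: Dict[str, List[int]]) -> Dict[str, Tuple[float, float]]:
    """Compute (mean, standard deviation) of daily privileged-action counts per user."""
    return {user: (mean(counts), pstdev(counts)) for user, counts in history.items()}


def is_anomalous(
    baseline: Dict[str, Tuple[float, float]],
    user: str,
    todays_count: int,
    threshold: float = 3.0,
) -> bool:
    """Flag a count that deviates from the user's baseline by more than `threshold` sigmas."""
    mu, sigma = baseline[user]
    if sigma == 0:
        # A perfectly constant history: any change at all is a deviation.
        return todays_count != mu
    return abs(todays_count - mu) / sigma > threshold


# Hypothetical history: privileged actions per day over five days.
HISTORY = {"alice": [2, 3, 2, 4, 3], "bob": [0, 0, 0, 0, 0]}
BASELINE = fit_baseline(HISTORY)

print(is_anomalous(BASELINE, "alice", 3))   # within the usual range
print(is_anomalous(BASELINE, "alice", 12))  # far above the usual range
print(is_anomalous(BASELINE, "bob", 1))     # first privileged action ever
```

Because the detector needs no labeled attacks, it can react to behavior that matches no known signature, at the cost of more false alarms when legitimate work patterns change.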

Various behavioral features are analyzed, including access frequency, resource
usage patterns, permission requests, and deviations from role-based access norms. For
instance, sudden attempts by a user to access restricted resources outside their functional
role or repeated requests for elevated permissions without operational justification may
indicate suspicious activity. By continuously monitoring such patterns, the system can
flag potential threats at an early stage.
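
The behavioral features listed above could be derived from raw log events roughly as shown below. The `Event` record, the `ROLE_RESOURCES` allowlist, and the feature names are hypothetical and are not taken from the CERT dataset schema; the sketch only shows how access frequency, out-of-role access, and elevation requests might be counted per user.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Event:
    user: str
    action: str    # e.g. "read", "write", "request_elevation"
    resource: str


# Hypothetical role-based access norms: resources each role normally touches.
ROLE_RESOURCES: Dict[str, Set[str]] = {
    "analyst": {"reports", "dashboards"},
    "admin": {"reports", "dashboards", "iam", "billing"},
}


def extract_features(events: Iterable[Event], user: str, role: str) -> Dict[str, int]:
    """Count accesses, out-of-role accesses, and elevation requests for one user."""
    allowed = ROLE_RESOURCES[role]
    feats = {"access_count": 0, "out_of_role_accesses": 0, "elevation_requests": 0}
    for e in events:
        if e.user != user:
            continue
        if e.action == "request_elevation":
            feats["elevation_requests"] += 1
            continue
        feats["access_count"] += 1
        if e.resource not in allowed:
            feats["out_of_role_accesses"] += 1
    return feats


EVENTS = [
    Event("alice", "read", "reports"),
    Event("alice", "read", "iam"),               # outside the analyst role
    Event("alice", "request_elevation", "iam"),
    Event("alice", "request_elevation", "iam"),
    Event("bob", "write", "billing"),
]

print(extract_features(EVENTS, "alice", "analyst"))
```

Feature vectors of this form are what a classifier or anomaly detector would consume in place of the raw log lines.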

The results of this research indicate that ensemble-based machine learning models
can significantly enhance the detection of privilege escalation attacks in cloud
environments. These models are capable of adapting to changing user behavior and
evolving threat conditions, thereby improving detection accuracy over time. By learning
from historical data and continuously refining behavioral profiles, the proposed
approach offers a proactive defense mechanism against insider threats.

1.2 Existing System

This section describes the existing system. In a privilege escalation attack, a malicious
actor exploits security weaknesses to gain elevated access to protected resources, and
cloud infrastructures are a particular target. Because they are robust and can handle
complex datasets, machine learning models such as Random Forest and AdaBoost are well
suited to identifying and reducing the impact of such attacks. The first stage is to
collect and prepare large volumes of log data from various cloud services, including
login records, API calls, and resource consumption trends. Key attributes, such as access
anomalies and user behavior patterns over time, are then extracted. Random Forest is
trained to classify events as either normal or suspicious, drawing on its ability to
handle high-dimensional data and to report feature importance.
AdaBoost, in turn, concentrates on misclassified samples, which makes detection more
sensitive to subtle attack patterns. Together, these models identify anomalies that
may indicate privilege escalation. Businesses may achieve scalable, real-time
protection against privilege escalation risks by integrating cloud-native products with
SIEM systems.
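
AdaBoost's focus on misclassified data comes from the way it reweights training samples after each round. The sketch below shows one round of the classic binary (discrete) AdaBoost update for labels in {-1, +1}; it is a worked illustration under that assumption, not the configuration used in the existing system, and the five-sample example is hypothetical.

```python
import math
from typing import List, Tuple


def adaboost_round(
    weights: List[float], y_true: List[int], y_pred: List[int]
) -> Tuple[float, List[float]]:
    """One discrete AdaBoost round; `weights` must sum to 1.

    Returns the weak learner's vote weight (alpha) and the renormalized
    sample weights for the next round.
    """
    # Weighted error of the current weak learner.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0) and division by zero
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified samples (t * p == -1) are scaled up, correct ones scaled down.
    updated = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Five equally weighted events; the weak learner misses the last (malicious) one.
weights = [0.2] * 5
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]

alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(round(alpha, 4), [round(w, 3) for w in new_weights])
```

After the update the single missed event carries half of the total weight, so the next weak learner is pushed to classify it correctly.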

1.3 Problems Identified in Proposed System

• Cloud-based applications are ubiquitous in today's digital world because of their
scalability.
• This widespread adoption, however, raises many concerns about the security and
reliability of cloud-based systems.

1.4 Proposed System

With the proliferation of smart devices and an increase in the frequency and
sophistication of cyberattacks, cyber security has taken on more significance.
Notwithstanding the transformative effects of cloud computing on organizations, the
cloud's centralization limits the use of distributed services such as security systems.
Because massive amounts of data are exchanged between businesses and cloud service
providers, there is a substantial chance of accidental or malicious exposure of critical
information. Data stores are often the first targets of cybercriminals because of the
high-value, sensitive information they hold. Every user's privacy and security are
compromised whenever data is lost from the cloud. An insider threat is someone in a
position of power who does something harmful. Many organizations and companies
have established their own internal networks in response to the exponential growth
of the internet. The more access a malicious insider has, the more opportunity they
have to inflict serious damage and the more susceptible the organization is to their
attacks. Insiders are privy to resources and knowledge that outsiders are not.
Insider risks can be identified and mitigated by applying criteria such as insider
indicators, detection procedures, and insider types. In contrast to real-time analysis,
which can spot malicious activity as it occurs, asynchronous anomaly detection gathers
log data and queries it afterwards to identify specific patterns.

1.5 Benefits of the Proposed System

The proposed approach contributes to strengthening cloud cyber security in several
important ways by addressing both technical vulnerabilities and behavioral risks associated
with modern cloud infrastructures. One of the primary advantages of this framework is its
ability to enhance data security in centralized cloud environments. Cloud platforms often
store large volumes of sensitive organizational information, making them attractive targets
for attackers. By integrating advanced monitoring mechanisms and anomaly detection
techniques, the proposed system reduces the likelihood of unauthorized access, data
leakage, and accidental exposure of critical assets. Continuous analysis of system activity
enables early identification of irregular behavior, thereby improving the overall resilience of
cloud-based services.

Another significant benefit of the proposed solution is its effectiveness in mitigating
insider threats. Insider attacks pose a serious security risk because authorized users often
possess elevated access privileges and deep knowledge of organizational systems.
Traditional security measures frequently fail to detect such threats, as insider activities may
initially appear legitimate. The proposed framework addresses this challenge by analyzing
insider-related indicators, monitoring user behavior patterns, and identifying deviations
from established norms. Through behavioral profiling and anomaly detection, the system
can detect potentially harmful actions carried out by insiders and prevent them before they
result in substantial damage.

Real-time threat detection is another key strength of the proposed approach. The system
continuously monitors user interactions, access requests, and system events to identify
suspicious behavior as it occurs. This real-time capability allows immediate response
actions, such as alert generation, access restriction, or session termination, thereby
minimizing the potential impact of ongoing attacks. In addition to real-time monitoring, the
framework supports asynchronous log analysis, which examines historical activity data to
uncover long-term or slow-moving attack patterns. This dual-layer detection strategy
ensures that both immediate and persistent threats are effectively addressed.
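
The dual-layer strategy can be sketched with two small detectors over the same log. Both are illustrative assumptions, not the detectors built in this work: the real-time layer flags a burst of denied accesses inside a sliding time window, while the asynchronous layer totals denied accesses over the whole log to catch slow-moving patterns. The record layout, the `access_denied` action name, and all thresholds are hypothetical.

```python
from collections import Counter, deque
from typing import Deque, Dict, Iterable, List, Tuple

LogRecord = Tuple[float, str, str]  # (timestamp in seconds, user, action)


class RealTimeMonitor:
    """Flags a user when too many denied accesses fall inside a sliding window."""

    def __init__(self, window_seconds: float = 60.0, max_denied: int = 3) -> None:
        self.window_seconds = window_seconds
        self.max_denied = max_denied
        self._recent: Dict[str, Deque[float]] = {}

    def observe(self, record: LogRecord) -> bool:
        ts, user, action = record
        if action != "access_denied":
            return False
        times = self._recent.setdefault(user, deque())
        times.append(ts)
        # Drop denials that have aged out of the window.
        while times and ts - times[0] > self.window_seconds:
            times.popleft()
        return len(times) > self.max_denied


def batch_scan(log: Iterable[LogRecord], min_total: int = 5) -> List[str]:
    """Asynchronous pass: users whose denied accesses add up over the whole log."""
    totals = Counter(user for _, user, action in log if action == "access_denied")
    return sorted(user for user, n in totals.items() if n >= min_total)


# alice: a fast burst of four denials; bob: five denials spread far apart in time.
LOG: List[LogRecord] = sorted(
    [(10.0 * i, "alice", "access_denied") for i in range(4)]
    + [(1000.0 * i, "bob", "access_denied") for i in range(5)]
)

monitor = RealTimeMonitor()
realtime_alerts = [user for (ts, user, action) in LOG if monitor.observe((ts, user, action))]
print(realtime_alerts, batch_scan(LOG))
```

In this example the burst is caught only by the real-time layer and the slow pattern only by the batch scan, which is why the two layers are combined.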

The proposed system also enables proactive risk management by anticipating potential
security risks rather than reacting only after an incident occurs. By learning from historical
data and adapting to evolving usage patterns, the system can identify emerging threat trends
and adjust its detection models accordingly. This adaptability is particularly important in
dynamic cloud environments where user roles, workloads, and access requirements
frequently change. As the network grows, the framework scales efficiently without
compromising detection accuracy or performance.

Privacy protection is another essential aspect supported by the proposed approach. By
focusing on behavioral patterns and access anomalies rather than intrusive content
inspection, the system minimizes unnecessary exposure of sensitive user data. This ensures
compliance with organizational privacy policies and regulatory requirements while
maintaining a high level of security. Additionally, automated monitoring and analysis
reduce reliance on manual security interventions, lowering the risk of human error.

1.6 Potential Users

The proposed system is designed to deliver robust cyber security capabilities to a
diverse range of users that rely heavily on cloud-based infrastructures for data storage,
processing, and service delivery. As cloud adoption continues to increase across sectors,
the need for intelligent and adaptive security mechanisms has become essential. This
system addresses that requirement by offering advanced protection against both external
cyber threats and insider-related risks, making it suitable for organizations that handle
sensitive and high-value information.
Large enterprises and corporate organizations represent one of the primary user groups
for the proposed solution. These entities manage extensive volumes of confidential
business data, intellectual property, and customer information, which makes them
attractive targets for cyberattacks. Additionally, the presence of numerous employees
with varying levels of system access increases the risk of insider threats. The proposed
system helps such organizations by monitoring user behavior, detecting abnormal
activities, and preventing unauthorized privilege usage, thereby reducing the likelihood
of data breaches and operational disruptions.
Public sector organizations and government agencies can also benefit significantly
from this security framework. These institutions are responsible for managing highly
sensitive citizen information, administrative records, and, in some cases, national
security data. Any compromise of such systems may have serious legal, social, and
political consequences. By implementing continuous monitoring and anomaly detection,
the proposed system enhances the protection of critical government cloud infrastructures
and supports compliance with strict security and regulatory requirements.
The banking, financial services, and insurance sectors are another important group of
potential users. These industries process sensitive financial transactions, customer
identity details, and asset-related information on a daily basis. Unauthorized access or
misuse of privileges in such environments can result in financial losses and reputational
damage. The proposed solution strengthens security controls by identifying suspicious
access patterns and preventing both internal misuse and external intrusions into financial
cloud systems.

Healthcare organizations and medical institutions are also well-suited to adopt this
system. These entities store highly sensitive patient data, including medical histories,
diagnostic records, and personal identifiers. Ensuring the confidentiality and integrity of
this information is critical for patient trust and regulatory compliance. The system assists
healthcare providers by monitoring access behavior and preventing unauthorized data
exposure.

1.7 Unique Features of the System

The proposed solution distinguishes itself from traditional cloud security mechanisms
through a combination of intelligent detection strategies, adaptive architecture, and a
strong focus on insider threat mitigation. Unlike conventional security protocols that rely
primarily on static rules or predefined signatures, this system adopts a behavior-driven
approach that enables it to respond effectively to both immediate and emerging threats
within cloud environments.
One of the most notable strengths of the proposed solution is its dual-mode detection
capability. The system is designed to identify abnormal activities in real time, allowing
rapid recognition and response to malicious actions as they occur. This real-time
monitoring helps prevent active attacks from escalating into large-scale security
incidents. In parallel, the system performs long-term analytical assessment of historical
logs and user activity data. This retrospective analysis enables the discovery of subtle

and slowly evolving attack patterns that may remain undetected by traditional
monitoring tools. By combining instantaneous detection with historical analysis, the
solution provides comprehensive threat visibility.
A defining feature of the system is its emphasis on detecting insider threats. Insider-
related attacks are particularly challenging because they often originate from users with
legitimate access rights and operational knowledge. The proposed solution addresses
this challenge by analyzing multiple insider indicators, including behavioral trends,
access frequency, permission usage, and deviations from role-based norms. By
correlating these factors, the system can identify suspicious activities that would
otherwise appear legitimate under conventional security models.
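
As a rough illustration of the "deviations from role-based norms" indicator described above, the sketch below scores each user against peers who share the same role. It is a minimal example under stated assumptions: the function name, the sample access counts, and the flagging threshold of 2.0 are all hypothetical and are not taken from the proposed system.

```python
from statistics import mean, pstdev

def role_deviation_scores(access_counts, roles):
    """Return a z-score per user relative to peers who share the same role.

    access_counts: dict of user -> number of privileged accesses (hypothetical feature)
    roles: dict of user -> role name
    """
    by_role = {}
    for user, role in roles.items():
        by_role.setdefault(role, []).append(access_counts[user])

    scores = {}
    for user, role in roles.items():
        peers = by_role[role]
        mu, sigma = mean(peers), pstdev(peers)
        # A role whose members all behave identically has no spread to measure.
        scores[user] = 0.0 if sigma == 0 else (access_counts[user] - mu) / sigma
    return scores

# Hypothetical sample data: six analysts and two admins.
counts = {"alice": 4, "bob": 5, "carol": 6, "dave": 5,
          "erin": 4, "frank": 40, "gina": 2, "hugo": 3}
roles = {"alice": "analyst", "bob": "analyst", "carol": "analyst",
         "dave": "analyst", "erin": "analyst", "frank": "analyst",
         "gina": "admin", "hugo": "admin"}

scores = role_deviation_scores(counts, roles)
# Assumed threshold: flag users more than 2 standard deviations above their role mean.
flagged = sorted(user for user, z in scores.items() if z > 2.0)
```

Scoring within a role, rather than across all users, means that an administrator's naturally high access count is not mistaken for an anomaly, while an analyst behaving like an administrator stands out.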

Scalability and flexibility are integral components of the proposed architecture. The
system is designed to accommodate growing numbers of users, devices, and services
without compromising detection accuracy or system performance. This makes it suitable
for dynamic cloud environments where workloads and access patterns frequently
change. As organizational networks expand, the system adapts accordingly, ensuring
consistent protection across distributed infrastructures.
The solution also supports proactive risk management by identifying potential security
threats at an early stage. Early detection allows organizations to implement preventive
measures before vulnerabilities are exploited, significantly reducing the likelihood of
severe security breaches. This forward-looking approach enhances overall system
resilience and minimizes operational downtime.

1.8 Demand for the Product

The demand for the proposed solution has increased significantly due to a convergence
of several key factors, including the rapid expansion of cloud computing, widespread
digital transformation, and the growing sophistication of cyber threats. As organizations
continue to migrate mission-critical applications and sensitive data to cloud platforms,
the limitations of traditional security mechanisms have become increasingly apparent.
This shift has created an urgent requirement for robust, adaptive, and intelligent security
frameworks capable of addressing modern threat landscapes.

One of the primary drivers behind this demand is the rising frequency of security
incidents involving insider threats, data breaches, and unauthorized access. Insider
attacks are particularly challenging to detect because they often originate from users
with legitimate credentials and access privileges. Conventional security systems, which
rely heavily on static rules and signature-based detection, are often ineffective against
such threats. As a result, there is a growing need for advanced anomaly detection
systems that incorporate real-time monitoring to identify suspicious behavior as it
occurs, as well as analytical mechanisms that can uncover long-term or hidden attack
patterns.
The importance of data protection is especially pronounced in sectors such as
education, healthcare, government, and finance, where information is both highly
sensitive and mission critical. Educational institutions manage student records and
research data, healthcare organizations store confidential patient information, and
government agencies handle sensitive administrative and national security data.
Financial institutions, meanwhile, process high-value transactions and personal customer
information. Any compromise in these environments can result in severe financial
losses, legal penalties, and erosion of public trust. Consequently, these sectors require
stronger and more reliable security measures to safeguard their cloud infrastructures.
Small and medium-sized businesses (SMBs) also contribute significantly to the
growing demand for the proposed solution. While SMBs increasingly adopt cloud
services to improve efficiency and scalability, they often lack the resources to implement
complex and expensive security solutions. There is a strong need for cost-effective,
scalable, and automated security systems that can provide enterprise-level protection
without excessive operational overhead.

1.9 Protection of Idea

The proposed system idea must be protected in order to guarantee originality, avoid
unauthorized replication, and preserve its intellectual value. With its innovative
features, such as dual detection methods, proactive risk management, and threat
evaluation and analysis, this platform has the potential to transform cyber security.
To protect this idea, different components of the framework may require separate
intellectual property (IP) rights. Patents could protect novel algorithms, detection
models, or architectural designs, whereas copyrights could protect academic articles,
programs, and study results.

ANALYSIS

2.1 Literature Review

Topic: Analyzing data granularity levels for insider threat detection using machine
learning.
Author: D. C. Le, N. Zincir-Heywood, and M. I. Heywood
Insider threat actors are especially difficult to identify because they possess detailed
knowledge of an organization’s structure, security policies, and operational procedures,
along with legitimate access to internal networked systems. This privileged position
allows insider attacks to cause significant financial and operational damage (Le et al.,
2020) [9]. Detecting such threats is challenging due to highly imbalanced datasets, limited
availability of reliable ground-truth labels, and continuous behavioral changes over time.
Machine learning provides an effective solution by analyzing user activity at multiple
levels of granularity and identifying deviations from established behavioral norms. By
learning patterns of normal behavior, ML-based models can detect malicious insider
actions that traditional security mechanisms often overlook. Among the evaluated
techniques, the Random Forest algorithm consistently outperforms other machine learning
approaches in terms of detection accuracy, F1-score, and reduced false positive rates. The
study reports an overall detection accuracy of approximately 85% while
maintaining a very low false alarm rate, demonstrating the suitability of ensemble-based
learning methods for practical insider threat detection in modern cloud and network
environments.
Topic: Handling insider threat through supervised machine learning techniques.
Author: F. Janjua, A. Masood, H. Abbas, and I. Rashid.
A significant cyber security challenge, as highlighted by Janjua et al. [10], is preventing
malicious insiders from carrying out harmful activities within organizational systems. The
primary aim of their study is to classify emails from the TWOS dataset using multiple
machine learning techniques. Several supervised learning algorithms were applied,
including AdaBoost, Naïve Bayes (NB), Logistic Regression, K-Nearest Neighbors
(KNN), Linear Regression, and Support Vector Machine (SVM). Experimental results
indicate that the AdaBoost algorithm achieves the highest classification accuracy, reaching
approximately 98% in distinguishing malicious emails from legitimate ones. Although the
model demonstrates strong performance using the available dataset, the study
acknowledges limitations due to the relatively small size of the training data. Expanding
the dataset could further improve the robustness, generalization capability, and overall
effectiveness of the proposed classification approach.
Topic: Machine learning based malware detection in cloud environment using clustering
approach.
Author: R. Kumar, K. Sethi, N. Prajapati, R. R. Rout, and P. Bera
Kumar et al. [11] discussed the need for, and the difficulty of, integrating security and
resilience into a cloud platform, citing the enormous number of varied applications
running on shared resources within the cloud infrastructure. They proposed a
clustering-based malware detection method built on Trend Micro Locality Sensitive Hashing
(TLSH). The Cuckoo sandbox was used to run dynamic analysis of files in an isolated
environment and to produce the analysis findings. Chi-square, random forest, and principal
component analysis (PCA) were additionally employed for feature selection. Three different
classifiers were evaluated, and experimental results were produced for both clustering and
non-clustering approaches. Based on these results, Random Forest outperforms the other
classifiers in terms of accuracy. Concerns about cloud security have persisted for quite
some time. When attackers target data sources, they aim in particular for the most
sensitive and valuable information. Data loss poses a significant risk to the privacy and
security of all cloud users. By penetrating a weak user node, inside attackers are able to
obtain access to an internal system. These attackers pose as legitimate users and launch
attacks over an internal connection to the cloud network. One method for protecting a
cloud network from internal intruders is the Improvised LSTM (ILSTM). By differentiating
between broken user nodes, new user nodes, and malfunctioning nodes, the suggested ILSTM
not only identifies internal attackers but also decreases false alarm rates.
Topic: Detecting SQL injection attacks in cloud SaaS using machine learning.
Author: D. Tripathy, R. Gohil, and T. Halabi.
Tripathy et al. [12] emphasized that conventional web and cloud applications remain
susceptible to a variety of cyber threats, with SQL injection attacks being among the most
critical, especially in software-as-a-service (SaaS) platforms. These attacks can
compromise databases, manipulate sensitive data, and disrupt services, making them a
significant concern for organizations relying on cloud infrastructure. To mitigate this risk,
the authors proposed a machine learning-based approach for detecting SQL injection
attempts, designing a classifier capable of identifying malicious activities within database
systems.

The study assessed the performance of several machine learning algorithms, including
Boosted Trees Classifier, TensorFlow-based Linear Classifier, Artificial Neural Networks
(ANN), and Random Forest. These models were trained and tested on datasets containing
examples of both legitimate and malicious database operations. The analysis highlighted
that write operations carried out by attackers tend to be more frequent and damaging than
read operations, underscoring the need for accurate detection mechanisms to prevent
unauthorized modifications.

Among the evaluated models, the Random Forest classifier consistently outperformed the
others, demonstrating the highest accuracy in identifying SQL injection attacks. Its
ensemble learning approach allows it to handle complex patterns in the data effectively,
reducing the likelihood of false negatives and improving overall reliability. The findings
indicate that machine learning, particularly ensemble-based methods like Random Forest,
can provide robust protection against SQL injection attacks by automatically learning and
recognizing patterns of malicious behavior. This approach offers a proactive solution for
enhancing database security in cloud and web-based applications, supporting safer
operations in environments with high volumes of user interactions and critical data storage
requirements.

Topic: Insider threat detection using an unsupervised learning method.
Author: X. Sun, Y. Wang, and Z. Shi
Sun et al. [13] highlighted the growing reliance of businesses and organizations on
networked systems, which has led to an increase in security risks. According to the 2018
Ponemon Institute “Cost of a Data Breach” study, which analyzed incidents across 15
countries and 17 industries, nearly 48% of data breaches were deliberate attacks, while
approximately 27% were caused by insider actions. These statistics underscore the
importance of detecting and mitigating both external and internal threats to maintain data
integrity and organizational security.

In response to these challenges, the study employed a tree-based feature extraction
approach to model and analyze user behavior. By organizing user activity data into a
structured hierarchy, the system can identify patterns that indicate abnormal or suspicious
actions. Various COPOD (Copula-Based Outlier Detection) methods were applied to
differentiate between normal behavior and potential threats, enabling the identification of
users whose actions deviate significantly from expected patterns.

The proposed detection framework demonstrated performance exceeding that of standard
unsupervised learning techniques. Its design is particularly effective for handling large
volumes of heterogeneous and complex data, making it suitable for modern organizational
environments where user activities and access patterns are diverse. Overall, this approach
provides a scalable and efficient solution for early detection of anomalous behavior,
enhancing both network security and insider threat mitigation.
Topic: Insider threat detection based on user behavior modeling and anomaly detection
algorithms.
Author: J. Kim, M. Park, H. Kim, S. Cho, and P. Kang
According to Kim et al. [14], insider threats are malicious actions by authorized users,
such as theft of sensitive information or intellectual property, fraud, and sabotage.
Threats from inside an organization's own ranks are significantly less prevalent than those
from outside the network, but they are no less dangerous. To identify insider threats,
researchers typically employ one of three approaches. The first is to develop a rule-based
detection system. The second is to build a network graph and monitor changes in its
structure to identify potentially malicious users. The third is to build a statistical or
machine-learning model from past data to predict potentially harmful actions. Because
authentic business system logs are very difficult to collect, the researchers used the
"CERT Insider Threat Tools" dataset. In addition to computer activity logs, the CERT
dataset includes organizational data covering employees' departments and roles. Using this
data to mimic a real-world organization, they constructed machine-learning-based insider
threat detection algorithms. According to the experimental results, the proposed system is
able to identify malicious insider behaviors reasonably well.
Topic: Detecting and preventing cyber insider threats: A survey.
Author: L. Liu, O. de Vel, Q.-L. Han, J. Zhang, and Y. Xiang
According to Liu et al. [15], cyber security breaches are becoming more common, and the
majority of these breaches originate from within the company. Because insiders often have
privileged network access and can remain concealed beneath enterprise-level security
defenses, insider attacks are notoriously difficult to detect and mitigate. The authors'
overview of insider threats is based on a careful reading and reorganization of material
from the literature. They distinguish three kinds of insider threats: masqueraders,
traitors, and accidental perpetrators. "Prevention" is described as a security approach
that takes measures to either avoid or better detect potential threats from inside an
organization. The authors evaluate the proposed countermeasures through a data analytics
lens, considering host, network, and contextual data analytics. They also review the
literature, compare the relevant research, and summarize the advantages and drawbacks of
each approach.

Topic: Malware detection in cloud infrastructures using convolutional neural networks.
Author: M. Abdelsalam, R. Krishnan, Y. Huang, and R. Sandhu.

Abdelsalam et al. [24] proposed a deep learning–based malware detection approach that
analyzes raw process behavior data derived from system performance metrics. Rather than
relying on signature-based methods, their work focuses on behavioral analysis to identify
malicious activity. The study evaluates the effectiveness of Convolutional Neural
Networks (CNNs) by comparing a standard two-dimensional CNN with an enhanced
three-dimensional CNN architecture. The 2D CNN processes behavioral features without
considering temporal context and serves as a baseline model, achieving an accuracy of
approximately 79% on the test dataset. To improve detection performance, a 3D CNN
model was introduced, incorporating time windows as an additional dimension to capture
temporal patterns in process behavior. This design helps reduce mislabeling caused by
short-term behavioral fluctuations and enables more accurate learning of malware
characteristics. Experimental results demonstrate that including temporal information
significantly improves detection accuracy, confirming that CNN-based deep learning
models are effective for behavior-based malware detection in dynamic computing
environments.

2.1.2 Objectives of the System

Data sources are frequently the primary targets of cyberattacks because they store
sensitive and high-value information critical to organizational operations. Any loss,
exposure, or unauthorized manipulation of online data can seriously compromise security
and individual privacy. Insider threats are particularly dangerous, as they originate from
individuals who hold trusted positions and possess legitimate access to systems, enabling
them to carry out harmful actions with minimal suspicion. Therefore, the most effective
defense strategy lies in preventing attacks at an early stage and accurately identifying
their source before significant damage occurs.

To achieve this, the proposed system is evaluated using a combination of real-world
datasets and carefully designed simulations that model privilege escalation
scenarios. This evaluation approach ensures that the system is tested under realistic and
controlled conditions, reflecting both known attack patterns and potential emerging
threats. The performance of the machine learning models is assessed using standard
evaluation metrics, including accuracy, recall, and false positive rate. These metrics
provide insight into the system’s ability to correctly detect privilege escalation attempts,
minimize incorrect alerts, and maintain reliable protection. Through this comprehensive
evaluation, the effectiveness of the proposed approach in identifying and preventing
insider-driven privilege escalation attacks can be clearly demonstrated.
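
The three metrics named above can all be derived from the confusion matrix. The following is a minimal sketch, assuming binary labels where 1 marks a privilege escalation attempt and 0 marks normal activity; the function names and the sample labels are illustrative only and are not results from this project.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = attack)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def evaluate(y_true, y_pred):
    """Return accuracy, recall, and false positive rate as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of real attacks that were detected.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of normal events that raised a false alarm.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }

# Hypothetical labels: 3 attacks among 10 events.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = evaluate(y_true, y_pred)
```

Because attack events are rare in insider threat data, accuracy alone can look high even when attacks are missed, which is why recall and false positive rate are reported alongside it.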

2.2 Requirement Analysis

2.2.1 Functional Requirements Analysis

The functional requirements detail the primary operations that the system must perform
to achieve its objectives. The proposed system must collect and process log data from
several sources throughout the organization's network. Asynchronous log analysis serves
to discover long-term attack trends, while real-time anomaly detection is necessary for
instantly alerting users to malicious activities. To detect and classify insider threats,
the system must be able to monitor user behavior, access privileges, and unusual activity
patterns. When the system notices suspicious activity, it should send alerts and
notifications to administrators so they can respond quickly. The system should also
generate comprehensive reports and audit logs that organizations can use for compliance
and security analysis. These functional requirements ensure that the system can actively
detect, monitor, and reduce security risks.

2.2.2 User Requirements

User requirements play a critical role in shaping the functionality and
effectiveness of the proposed system, as they reflect the expectations of both
administrators and end-users. One of the primary needs is an intuitive and user-friendly
interface that simplifies data processing, analysis, and report generation. A clear and
well-structured interface allows users to quickly interpret system outputs without
requiring extensive technical expertise, thereby improving overall usability and adoption.

Users also expect the system to detect security threats accurately and in a timely
manner while maintaining a low false positive rate. Excessive or unnecessary alerts can
disrupt workflows and reduce confidence in the system. Therefore, reliable detection with
minimal false alarms is essential to ensure that users can focus on genuine security issues
without operational interruptions.

From an administrative perspective, controlled access is a key requirement. Only
authorized personnel should be able to view sensitive reports, access dashboards, or
modify system configurations. This ensures data privacy, prevents misuse, and supports
effective system governance. Additionally, administrators require configurable
parameters that allow the system to be tailored to specific organizational security policies,
operational environments, and risk thresholds.

Scalability is another important user requirement. The system must be capable of
supporting a wide range of deployment scenarios, from small-scale environments with
limited users to large enterprises with extensive networks and complex infrastructures. As
organizations grow, the system should seamlessly adapt without compromising
performance or reliability.

By addressing these user requirements while maintaining technical robustness, the
proposed system becomes more practical, trustworthy, and adaptable. Such alignment
between user expectations and system capabilities enhances ease of deployment across
different industries and ensures long-term effectiveness in real-world security
environments.

2.2.3 Non-Functional Requirements

The overall quality and effectiveness of the proposed system are largely
determined by its non-functional requirements, which play a vital role in ensuring
usability, performance, reliability, and long-term sustainability. Among these
requirements, security is of the highest priority, as the system is responsible for protecting
sensitive organizational data from unauthorized access and malicious activities. Strong
encryption mechanisms must be employed to safeguard data both at rest and in transit,
while robust access control policies ensure that only authorized users and processes can
interact with critical system components. Together, these measures help maintain data
confidentiality, integrity, and trust within the organization.

System performance is another essential non-functional requirement. The system
must be capable of processing large volumes of data efficiently while maintaining
minimal latency. In cloud and enterprise environments, where data is generated
continuously from multiple sources, the ability to analyze information in near real time is
crucial for timely threat detection and response. Delays in processing or alert generation
could allow security incidents to escalate, making efficiency and responsiveness key
design considerations.

Scalability is equally important, particularly in modern networks that frequently
expand to accommodate new users, devices, applications, and data sources. The system
should be designed to scale horizontally and vertically without compromising
performance or detection accuracy. This flexibility ensures that security capabilities
remain consistent as organizational needs evolve and infrastructure grows.

Reliability and
availability are critical to maintaining continuous protection. The system must operate
with minimal downtime and be resilient to failures, ensuring uninterrupted monitoring
and threat detection. High availability mechanisms, such as redundancy and fault
tolerance, help guarantee that security services remain operational even during
maintenance or unexpected system issues.

Maintainability and compatibility further contribute to system quality. The system
should be easy to update, debug, and enhance in response to emerging threats or
technological advancements. Additionally, seamless integration with existing
organizational tools and security platforms is essential for efficient operations and unified
security management.
2.2.4 System Requirements

The system requires a machine with 8 GB of RAM, a quad-core CPU, and 500 GB of
storage, running a Windows or Linux operating system, together with Python or Java,
MySQL or MongoDB, TensorFlow, Scikit-learn, and the Django or Flask framework, to
support fast anomaly detection and secure interaction.

2.3 Modules Description

Dataset Selection & Preprocessing

The CERT Insider Threat Dataset is the main data source for detecting privilege escalation
threats. It contains logs of authentication attempts, file access, and system
interactions. Preprocessing involves cleaning, standardizing, and extracting features so
that the machine learning models receive meaningful input. Key pieces of data, such as
authentication logs, user activity patterns, and irregular access events, are selected to
enhance detection accuracy.
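
The cleaning, feature extraction, and standardizing steps described above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual pipeline: the record fields (`user`, `type`, `hour`, `success`), the working hours of 08:00 to 18:00, and the three features are hypothetical, and the real CERT dataset uses its own file layout.

```python
from collections import defaultdict

# Hypothetical, simplified log records (not the real CERT schema).
RAW_EVENTS = [
    {"user": "U1 ", "type": "logon", "hour": 9, "success": True},
    {"user": "u1", "type": "file", "hour": 10, "success": True},
    {"user": "u2", "type": "logon", "hour": 23, "success": False},
    {"user": "u2", "type": "logon", "hour": 23, "success": False},
    {"user": "u2", "type": "file", "hour": 2, "success": True},
    {"user": "u2", "type": "file", "hour": 2, "success": True},
    {"user": None, "type": "file", "hour": 3, "success": True},  # no user id
    {"user": "u1", "type": "logon"},                              # missing fields
]

def clean(events):
    """Drop incomplete records and normalize user identifiers."""
    required = {"user", "type", "hour", "success"}
    cleaned = []
    for event in events:
        if not required.issubset(event) or event["user"] is None:
            continue
        cleaned.append({**event, "user": str(event["user"]).strip().lower()})
    return cleaned

def extract_features(events):
    """Aggregate events into per-user counts."""
    feats = defaultdict(lambda: {"failed_logons": 0, "file_accesses": 0, "after_hours": 0})
    for event in events:
        f = feats[event["user"]]
        if event["type"] == "logon" and not event["success"]:
            f["failed_logons"] += 1
        if event["type"] == "file":
            f["file_accesses"] += 1
        if event["hour"] < 8 or event["hour"] >= 18:  # assumed working hours
            f["after_hours"] += 1
    return dict(feats)

def min_max_scale(features):
    """Rescale every feature to the range [0, 1] across users."""
    names = list(next(iter(features.values())).keys())
    scaled = {user: {} for user in features}
    for name in names:
        values = [f[name] for f in features.values()]
        lo, hi = min(values), max(values)
        for user, f in features.items():
            scaled[user][name] = 0.0 if hi == lo else (f[name] - lo) / (hi - lo)
    return scaled

features = extract_features(clean(RAW_EVENTS))
scaled = min_max_scale(features)
```

Scaling the counts to a common range keeps a high-volume feature such as file accesses from dominating models that are sensitive to feature magnitude, such as SVM.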

Machine Learning Model Implementation

Supervised learning methods such as Random Forest, AdaBoost, XGBoost, LightGBM, and SVM
are implemented to determine whether a user's behavior is typical or suspicious. By
applying these models to user interactions with cloud systems, it may be possible to
identify attempts at unauthorized privilege escalation. Ensemble learning can be used to
improve threat detection by combining the strengths of multiple models to raise
classification accuracy.
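
One simple way to combine several models is hard (majority) voting, sketched below. The three rule-based functions are hypothetical stand-ins for trained classifiers such as Random Forest, AdaBoost, or SVM; the feature names and thresholds are assumptions made for illustration, and the project's actual ensemble strategy may differ.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists by hard (majority) voting.

    predictions: list of lists, one list of 0/1 labels per model,
    all covering the same samples in the same order.
    """
    combined = []
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined

# Hypothetical stand-ins for trained models: each maps a feature dict to 0 or 1.
models = [
    lambda x: int(x["failed_logons"] >= 3),
    lambda x: int(x["after_hours"] >= 5),
    lambda x: int(x["new_privileges"] >= 1),
]

# Hypothetical per-user feature vectors.
samples = [
    {"failed_logons": 0, "after_hours": 1, "new_privileges": 0},
    {"failed_logons": 4, "after_hours": 6, "new_privileges": 0},
    {"failed_logons": 5, "after_hours": 0, "new_privileges": 0},
    {"failed_logons": 1, "after_hours": 7, "new_privileges": 2},
]

per_model = [[model(sample) for sample in samples] for model in models]
ensemble = majority_vote(per_model)
```

Using an odd number of binary models avoids ties. In the third sample only one model raises a flag, so the ensemble suppresses what would otherwise be a single-model alert.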
Performance Evaluation
The most important metrics used to evaluate the models include F1-score, recall,
accuracy, and precision. The findings show that LightGBM outperforms all other models in
terms of overall accuracy, whereas Random Forest and AdaBoost excel at detecting specific
insider threat patterns. This comparative analysis ensures that the most effective model
is chosen for real-world implementation.
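
A comparison of this kind can be sketched by computing precision, recall, and F1-score for each candidate and selecting the highest F1. The model names and predictions below are placeholders invented for illustration; they do not correspond to the results reported for LightGBM, Random Forest, or AdaBoost.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = attack)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical ground truth and candidate predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
candidate_predictions = {
    "model_a": [1, 1, 1, 0, 0, 0, 0, 0],  # misses one attack, no false alarms
    "model_b": [1, 1, 1, 1, 1, 1, 0, 0],  # catches all attacks, two false alarms
    "model_c": [1, 0, 0, 0, 0, 0, 0, 0],  # very conservative
}

f1_by_model = {
    name: precision_recall_f1(y_true, pred)[2]
    for name, pred in candidate_predictions.items()
}
best_model = max(f1_by_model, key=f1_by_model.get)
```

Selecting on F1 rather than accuracy balances missed attacks against false alarms, which matters when malicious events are a small fraction of the data.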

Real-time & Offline Detection

The detection system can operate in either of two modes: offline or real-time. Real-time
detection promptly alerts the user to improper attempts to elevate privileges. Offline
anomaly detection, on the other hand, uses historical log data to identify patterns of
misuse. Combining the two approaches improves security and reduces the possibility of
hidden threats going unnoticed.
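
The difference between the two modes can be illustrated with a small sketch: a streaming check that looks only at a short sliding window, and a batch scan over the full history. The class and function names, the 60-second window, and both thresholds are assumptions chosen for the example, not the project's configuration.

```python
from collections import defaultdict, deque

class RealTimeDetector:
    """Alert when a user makes too many privilege requests within a short window."""

    def __init__(self, window_seconds=60, max_requests=3):
        self.window = window_seconds
        self.max_requests = max_requests
        self.recent = defaultdict(deque)  # user -> timestamps inside the window

    def observe(self, user, timestamp):
        """Record one event and return True if it should raise an alert."""
        queue = self.recent[user]
        queue.append(timestamp)
        # Discard events that have fallen out of the sliding window.
        while queue and timestamp - queue[0] > self.window:
            queue.popleft()
        return len(queue) > self.max_requests

def offline_scan(history, min_days=3):
    """Flag users who requested privileges on many distinct days (a slow pattern)."""
    days = defaultdict(set)
    for user, timestamp in history:
        days[user].add(timestamp // 86400)  # seconds per day
    return sorted(user for user, seen in days.items() if len(seen) >= min_days)

# Hypothetical privilege-request events as (user, unix-style timestamp) pairs.
history = [
    ("mallory", 100), ("mallory", 110), ("mallory", 120), ("mallory", 130),
    ("trent", 500), ("trent", 86900), ("trent", 173300),
]

detector = RealTimeDetector()
live_alerts = [(user, ts) for user, ts in history if detector.observe(user, ts)]
slow_patterns = offline_scan(history)
```

In this example the burst from `mallory` is caught as it happens, while `trent`, who makes one request per day, never trips the real-time check and is found only by the offline scan.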

2.4 Feasibility Study

In this phase, a preliminary proposal outlining the concept in broad terms and providing
rough cost estimates is prepared in order to gauge the project's feasibility. An important
part of system analysis is determining whether the proposed system can be implemented.
This is essential to ensure that the intended system will run well within the
organization. Carrying out a feasibility study requires a fundamental understanding of the
system's key requirements.
The feasibility assessment involved three primary considerations:
⮚ ECONOMICAL FEASIBILITY
⮚ TECHNICAL FEASIBILITY
⮚ SOCIAL FEASIBILITY

Technical Feasibility

This study assesses the technical feasibility of the system to confirm that it fulfills
all essential technical requirements. The system is designed to function efficiently while

utilizing existing technological resources effectively, avoiding unnecessary strain on
infrastructure. By keeping resource demands low, it reduces deployment complexity and
eliminates the need for extensive hardware or software upgrades, making it more
practical for organizations with limited technical capacity. This efficient design enhances
user confidence, as the system can consistently deliver reliable performance without
overloading available resources. Additionally, the system requires minimal
customization, allowing for straightforward deployment with little to no modifications.
This simplicity not only accelerates implementation but also ensures that the system can
be easily integrated into real-world environments. Overall, the design demonstrates both
technical practicality and adaptability, supporting efficient operation while maintaining
reliability and ease of use across diverse organizational settings.

Economic Feasibility

The aim of this study is to evaluate the potential financial benefits of the system,
including cost reductions and possible revenue enhancements for the organization. Given
the limited budget available for research and development, it is important to provide a
clear justification for the investment. To control costs, the system was mainly built using
open-source technologies, minimizing reliance on expensive software licenses or
proprietary tools. Only necessary specialized or custom components were purchased,
further reducing development expenses. This approach ensures that the system remains
cost-effective while maintaining high performance and reliability. By carefully managing
resources and leveraging affordable technologies, the system not only delivers efficient
functionality but also offers a strong return on investment. This makes the adoption of the
system both economically sensible and strategically advantageous for the organization,
supporting its operational and financial objectives without compromising quality or
effectiveness.

Behavioral Feasibility

As part of this study, user satisfaction with the system was carefully evaluated to ensure it
meets practical needs and expectations. Training was provided to help users fully

leverage the system’s features, making it easier for them to understand and utilize
effectively. A user-friendly interface was prioritized so that users feel comfortable and
confident interacting with the system rather than intimidated or frustrated. Adoption
levels are closely linked to the degree of training and familiarity users gain with the new
platform. By improving user confidence and understanding, individuals are better
equipped to provide meaningful feedback, which is essential for refining the system, as
users ultimately serve as the primary evaluators and beneficiaries of its functionality.

2.5 Process Model Used

The proposed approach to system development offers a structured and methodical
framework that can significantly benefit software engineering practices. By providing
detailed documentation, planned development processes, and clearly defined stages, this
methodology is particularly well-suited for creating a reliable and secure cybersecurity
system. The development process encompasses multiple stages, including requirements
analysis, system design, implementation, testing, deployment, and ongoing maintenance.
Each phase builds upon the previous one, ensuring that all aspects of the system are
carefully planned and executed. Completing each stage thoroughly before progressing to
the next helps to verify that the system functions as intended, reduces the likelihood of
errors, and ensures that security, performance, and usability requirements are fully met.
This disciplined approach not only improves the overall quality of the system but also
facilitates easier troubleshooting, updates, and long-term maintenance, making it highly
effective for critical cybersecurity applications.

2.6 Hardware and Software Requirements


HARDWARE REQUIREMENTS:
➢ Processor - i5 or i7 processor
➢ RAM - 4 GB (min)
➢ Hard Disk - 500 GB
➢ Keyboard - Standard Windows Keyboard
➢ Mouse - Two or Three Button Mouse
➢ Monitor - SVGA 21"
SOFTWARE REQUIREMENTS:
➢ Operating System : Windows 7 Ultimate or above
➢ Programming Language : Python 3.10
➢ Domain : Cyber Security

2.7 SRS Specification

In cloud-based environments, the proposed system is designed to detect and
mitigate insider threats and anomalous activities effectively. As cyber attackers
continuously evolve their techniques, organizations need a security solution that is
intelligent, scalable, and capable of protecting critical data in real time. This Software
Requirements Specification (SRS) document defines the system’s functional and non-
functional requirements, core features, operational constraints, and overall objectives. By
clearly outlining these elements, the SRS provides a structured framework that guides the
design, development, and deployment of the system, ensuring it meets organizational
security needs while maintaining reliability, performance, and usability in dynamic cloud
environments.

2.8 Financial Plan for Development of Project

The financial plan outlines the projected costs of designing and implementing the proposed
cyber security system. The equipment needed to set up and evaluate the hardware and
software, including servers, storage devices, and other networking gear, has been accounted
for in the budget. Software expenditure covers licensing, development environments, ML
frameworks, and DBMSs. Development expenditure includes the salaries of the software
developers, data scientists, and cyber security specialists involved in conceptualization,
coding, and testing. To keep the system running smoothly, operational costs such as power,
internet, and possibly cloud hosting must also be taken into account.
Furthermore, funds have been set aside for updates and maintenance, such as resolving bugs,
installing upgrades, and providing long-term system support. The costs of training and
documentation have also been factored in so that administrators and users can use the
platform effectively. Overall, the financial plan ensures that resources are used properly
and that the product remains sustainable in the years ahead.

2.9 Business Plan for Commercialization

To achieve widespread adoption across diverse sectors, a strategic plan is needed for
marketing the proposed cyber security solution. Its primary targets are businesses,
financial institutions, healthcare providers, and government entities that handle critical
information and need insider threat management. The software will be available both on
premises and through a cloud subscription model, allowing it to cater to a wide range of
organizational needs.
Brand awareness and engagement can be built through digital marketing campaigns,
professional networks, and conferences. Sample projects and free trial versions will help
attract early adopters. Collaborations with cloud and IT service providers can boost
industry visibility. Income will come primarily from subscription fees, licensing fees, and
maintenance contracts. Client satisfaction and retention will be increased by providing
training services, technical support, and continual enhancements. This commercialization
approach aims to achieve scalability, sustainability, and overall competitiveness in the
computer security industry.

2.10 Business Model Canvas

Key Partners : Cloud providers, IT service firms, cyber security consultants, research institutions
Key Activities : System development, anomaly detection, insider threat monitoring, maintenance, updates
Value Propositions : Real-time and asynchronous threat detection, insider risk mitigation, scalable and secure solution, enhanced data privacy
Customer Relationships : Subscription support, training, documentation, customer care, feedback-based improvements
Customer Segments : Enterprises, financial institutions, healthcare, government agencies, SMEs, educational institutions
Key Resources : Skilled developers, ML frameworks (TensorFlow, Scikit-learn), secure databases, cloud infrastructure

DESIGN PHASE

3.1 Design Concepts & Constraints

Trustworthiness, transparency, and safety are the cornerstones of the proposed
online testing platform. All student work, grades, and submission timestamps are
securely stored in the system using immutable distributed ledger technology, which ensures
that exam data remains intact once it has been recorded. Smart contracts make automated
grading against predefined criteria possible, which makes the process more equitable and
less prone to human bias. Furthermore, the exam platform is safeguarded against fraudulent
use and impersonation by biometric identification and digital identity verification, which
guarantee that access is granted only to authorized students. To store and retrieve large
test files efficiently, the system uses IPFS (InterPlanetary File System) in tandem with
the blockchain, which eases the storage burden on the blockchain network. Scalability
issues, high computational costs, network latency, and stringent regulatory compliance are
among the architectural constraints that must be taken into account. Overcoming these
challenges is crucial for smooth adoption in educational institutions.

3.2 Design Diagram of the System

1. The DFD is also called a bubble chart. It is a simple graphical formalism that
represents a system in terms of the data that enters it, the various operations performed
on that data, and the data that comes out of the system.
2. The data flow diagram (DFD) is one of the most important modeling tools. It is used to
model the components of a system: the system's processes, the data used by those
processes, the external entities that interact with the system, and the information flows
within the system.
3. The DFD shows how data moves through the system and how it is transformed by a series of
processes. It is a graphical technique that depicts the flow of data from input to output
and the changes applied to it along the way.
4. A DFD may be used to represent a system at any level of abstraction. It may be
partitioned into levels that represent increasing information flow and functional detail.

Figure: 3.2 Dataflow Diagram

3.3 Conceptual Design

3.3.1 Class Diagram

As the Class Diagram (3.3.1) shows, users sign up, log in, and submit datasets or
emails for analysis. The Dataset module manages the provided CSV data so that the
Escalation Attack Detection component can make attack predictions. The Reporting module
receives the findings and produces a comprehensive report for the user.

Figure: 3.3.1. Class Diagram

3.3.2 Sequence Diagram

The Sequence Diagram (3.3.2) shows how the user, web interface, and system interact
with one another. The user logs in, which initiates the authentication process, and the
system then grants access. Once the user submits a dataset, it is stored in the system.
During detection, the submitted emails are processed and the predictions are returned.
Lastly, the user requests a report, which the system retrieves and displays through the
interface.

Figure: 3.3.2 Sequence Diagram

3.3.3 Activity Diagram

The Activity Diagram (3.3.3) illustrates the system’s operating procedure,
commencing with user login or registration. After successful authentication, the user
uploads a dataset, which the system stores and confirms. The user then submits an email
for attack detection, and the system analyzes it to produce a prediction. Subsequently,
the user requests a report, and the system fetches the relevant data and displays the
finished report.

Figure: 3.3.3. Activity Diagram

3.3.4 Use Case Diagram

A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram.
As the Use Case Diagram (3.3.4) shows, its purpose is to give a graphical overview of a
system's functionality in terms of its actors, their goals (represented as use cases), and
any dependencies between those use cases. The main purpose of a use case diagram is to show
which system functions are performed for which actor. The roles of the actors in the system
can also be depicted.

Figure: 3.3.4. Use Case Diagram

3.4 Logical Design

The logical design of the blockchain-based online testing system focuses on
secure data management and efficient processing without delving into the physical
implementation details. All inputs from students, such as test responses, scores, or
biometric data, are validated and processed systematically by the system. Smart contracts
automatically evaluate submissions according to predefined rules, while securely
recording results and timestamps on the blockchain to ensure integrity. Integration with
the Inter Planetary File System (IPFS) allows for fast storage and retrieval of large test
files, enhancing system efficiency. Dedicated interfaces enable seamless interaction
among students, instructors, and administrators, supporting communication and system
navigation. The logical workflow comprising data submission, verification, grading, and
result distribution ensures transparency, fairness, and immutability throughout the
examination process. By structuring data flow and validation in this way, the system
provides a trustworthy and reliable platform for online assessments, safeguarding
academic integrity while leveraging the benefits of blockchain technology.

3.5 Architectural Design

The architectural design of the proposed system aims to enhance the credibility and
reliability of online exam results by leveraging a decentralized blockchain-based
architecture. This approach ensures the integrity, security, and transparency of all data
throughout the examination process. All exam-related information—including student
submissions, grades, and timestamps—is securely stored on the blockchain, taking
advantage of its immutable ledger to prevent unauthorized modifications or data
tampering.

Decentralized smart contracts play a critical role in automating the grading process
according to predefined criteria. Each action and update is permanently recorded on the
blockchain, ensuring that all steps are traceable and auditable. To further strengthen
security, access to exams is restricted to authorized students using biometric
authentication combined with digital identity verification methods. This prevents
unauthorized access and guarantees that only legitimate participants can take the
assessments.
The system also integrates decentralized file storage solutions, such as the InterPlanetary
File System (IPFS), to manage large test files efficiently. By distributing file storage
across multiple nodes, the system avoids overloading the blockchain while maintaining
fast and secure access to exam data. This combination of blockchain and decentralized
storage ensures both scalability and reliability.

Transparency and auditability are central to the design, allowing students, instructors, and
administrators to independently verify submissions and results. The fully traceable
workflow eliminates the possibility of fraud, manipulation, or bias, fostering confidence
in the examination process. These architectural enhancements create a secure,
accountable, and trustworthy online testing environment, reinforcing the integrity of
assessments and promoting confidence among all stakeholders. The design supports
scalability, security, and fairness, making the system suitable for widespread adoption in
educational institutions and other assessment-driven organizations.

Figure: 3.5 System Architecture

3.6 Algorithms Design

This approach was designed to detect insider threats in cloud systems by combining
asynchronous anomaly detection with real-time monitoring. As a first step, we
collect a variety of log data, such as authentication, access, file action, and network
activity records. The raw data is then processed, normalized, and converted into features
that capture key insider indicators such as unusual access times, data transfers, or
attempts to elevate privileges without permission.
The detection framework consists of two layers:
Asynchronous anomaly detection: log data are analyzed periodically using anomaly detection
techniques to spot suspicious patterns of activity. Behavioral shifts over time are
identified using machine learning classifiers such as Random Forest, LightGBM, and SVM.
Real-time detection: stream-based analysis continuously monitors system activity and
immediately identifies dangerous actions such as mass file deletion or unauthorized
access. Alerts are raised immediately to prevent further damage.
Ensemble learning models improve accuracy through better generalization, whereas
support vector machines (SVMs) improve recognition of subtle insider behaviors. The
combined approach sorts user actions into three categories (normal, suspicious, and
malicious) and produces a threat probability score that can be used to support rapid
mitigation strategies.
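
The scoring step described above can be sketched as follows. This is only an illustration under stated assumptions: the per-model probabilities are averaged (soft voting), and the two thresholds, the function names, and the sample values are hypothetical, since the text does not specify how the score is mapped to the three categories.

```python
from statistics import mean

# Hypothetical thresholds; the text does not say how scores map to categories.
SUSPICIOUS_THRESHOLD = 0.4
MALICIOUS_THRESHOLD = 0.7


def threat_score(model_probabilities):
    """Average the per-model probabilities that an action is malicious (soft voting)."""
    if not model_probabilities:
        raise ValueError("at least one model probability is required")
    return mean(model_probabilities.values())


def categorize(score):
    """Map a threat probability score in [0, 1] to one of three categories."""
    if score >= MALICIOUS_THRESHOLD:
        return "malicious"
    if score >= SUSPICIOUS_THRESHOLD:
        return "suspicious"
    return "normal"


# Example: probabilities reported by three classifiers for one user action.
probabilities = {"random_forest": 0.82, "lightgbm": 0.91, "svm": 0.67}
score = threat_score(probabilities)
label = categorize(score)
print(label, round(score, 2))
```

Averaging is only one way to combine the classifiers; weighted voting or a stacked meta-model would fit the same interface.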

3.7 Database Design

The database architecture of the proposed system is designed to securely store, retrieve,
and reliably process sensitive data related to user activities, logs, and threat detection
results. Relational database systems, such as MySQL or PostgreSQL, are frequently
employed owing to their reliability and support for structured queries. However, a
NoSQL store such as MongoDB could be used for large-scale storage of unstructured log
data.
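
A minimal relational layout for the data named above (users, activity logs, detection results) might look like the sketch below. The table and column names are hypothetical, and Python's built-in sqlite3 module is used only as a self-contained stand-in for MySQL or PostgreSQL.

```python
import sqlite3

# Hypothetical schema; table and column names are illustrative only.
SCHEMA = """
CREATE TABLE users (
    user_id   INTEGER PRIMARY KEY,
    username  TEXT NOT NULL UNIQUE,
    role      TEXT NOT NULL
);
CREATE TABLE activity_logs (
    log_id     INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    event_type TEXT NOT NULL,
    resource   TEXT,
    logged_at  TEXT NOT NULL
);
CREATE TABLE detections (
    detection_id INTEGER PRIMARY KEY,
    log_id       INTEGER NOT NULL REFERENCES activity_logs(log_id),
    model_name   TEXT NOT NULL,
    threat_score REAL NOT NULL,
    category     TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")  # in-memory stand-in for MySQL/PostgreSQL
conn.executescript(SCHEMA)
conn.execute("INSERT INTO users (username, role) VALUES (?, ?)", ("alice", "analyst"))
conn.execute(
    "INSERT INTO activity_logs (user_id, event_type, resource, logged_at) VALUES (?, ?, ?, ?)",
    (1, "role_change", "iam/admin", "2024-01-01T02:13:00"),
)
conn.execute(
    "INSERT INTO detections (log_id, model_name, threat_score, category) VALUES (?, ?, ?, ?)",
    (1, "lightgbm", 0.93, "malicious"),
)

# Join each malicious detection back to the user and event that triggered it.
flagged = conn.execute(
    """
    SELECT u.username, l.event_type, d.category
    FROM detections d
    JOIN activity_logs l ON l.log_id = d.log_id
    JOIN users u ON u.user_id = l.user_id
    WHERE d.category = 'malicious'
    """
).fetchall()
print(flagged)
```

Keeping detections in their own table lets several models score the same log entry without duplicating the log record.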

3.8 Module Design Specification

Dataset Selection & Preprocessing

The CERT Insider Threat Dataset was the main source for detecting privilege escalation
threats. It contains logs of authentication attempts, file access, and other system
interactions. Preprocessing involves cleaning, standardizing, and extracting
features so that the machine learning models receive meaningful
input. Authentication logs, user activity patterns, and login irregularities are selected as
key inputs to increase detection accuracy.
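
As an illustration of turning raw authentication logs into per-user features, consider the sketch below. The record layout, the field names, and the 08:00 to 18:00 working-hours window are assumptions made for this example and do not reproduce the actual CERT file schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log records; field names are assumptions, not the CERT schema.
LOGS = [
    {"user": "u1", "time": "2024-01-01T09:05:00", "event": "logon", "success": True},
    {"user": "u1", "time": "2024-01-01T23:40:00", "event": "logon", "success": True},
    {"user": "u2", "time": "2024-01-01T10:00:00", "event": "logon", "success": False},
    {"user": "u2", "time": "2024-01-01T10:01:00", "event": "logon", "success": False},
    {"user": "u2", "time": "2024-01-01T10:02:00", "event": "logon", "success": True},
]

WORK_START, WORK_END = 8, 18  # assumed working hours (08:00 to 18:00)


def extract_features(logs):
    """Aggregate per-user counts that could be fed to a classifier."""
    features = defaultdict(
        lambda: {"logon_attempts": 0, "failed_logons": 0, "after_hours": 0}
    )
    for record in logs:
        row = features[record["user"]]
        row["logon_attempts"] += 1
        if not record["success"]:
            row["failed_logons"] += 1
        hour = datetime.fromisoformat(record["time"]).hour
        if hour < WORK_START or hour >= WORK_END:
            row["after_hours"] += 1
    return dict(features)


features = extract_features(LOGS)
print(features)
```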

Machine Learning Model Implementation

Supervised learning methods such as Random Forest, AdaBoost, XGBoost, LightGBM,
and SVM can be used to determine whether a user's behavior is typical or suspicious. By
applying these models to user interactions with cloud systems, it may be possible to
identify attempts at unauthorized privilege escalation. By combining the strengths of
several models, ensemble learning improves threat detection by making classifications
more accurate.

Performance Evaluation

The most important metrics used to evaluate the models are accuracy, precision, recall,
and F1-score. The findings show that LightGBM outperforms all other models in
terms of overall accuracy, whereas Random Forest and AdaBoost excel at detecting
specific insider threat patterns. This comparative analysis ensures that the most
effective model is chosen for real-world implementation.
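
For reference, the four metrics can be computed from a model's predictions as in the sketch below. The label vectors are made-up illustration data, not results from this project, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 = escalation attempt, 0 = normal activity.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because attack samples are usually much rarer than normal activity, precision and recall tend to be more informative than accuracy alone.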

Real-time & Offline Detection

The detection system can work in one of two modes: offline or real-time. In real-time
mode, improper attempts to elevate privileges are flagged immediately and the person
responsible for security is alerted. Offline anomaly detection, on the other hand, uses
historical log data to recognize patterns of misuse, reducing the chance that hidden
threats go unnoticed.
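
The real-time mode can be illustrated with a simple sliding-window rule. This sketch is a rule-based stand-in rather than the trained models discussed above; the class name, the 60-second window, and the threshold of three deletions are assumptions chosen for the example.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60   # assumed window length
DELETE_THRESHOLD = 3  # assumed number of deletions that triggers an alert


class DeletionMonitor:
    """Flag users who delete many files within a short sliding window."""

    def __init__(self, window=WINDOW_SECONDS, threshold=DELETE_THRESHOLD):
        self.window = window
        self.threshold = threshold
        self.events = defaultdict(deque)  # user -> timestamps of recent deletions

    def observe(self, user, timestamp, action):
        """Process one event; return True if it should raise an alert."""
        if action != "file_delete":
            return False
        recent = self.events[user]
        recent.append(timestamp)
        # Drop deletions that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.threshold


monitor = DeletionMonitor()
stream = [
    ("u1", 0, "file_delete"),
    ("u1", 10, "file_read"),
    ("u1", 20, "file_delete"),
    ("u1", 30, "file_delete"),   # third deletion within 60 s, so an alert is raised
    ("u2", 40, "file_delete"),
    ("u1", 200, "file_delete"),  # earlier deletions have expired, so no alert
]
alerts = [(user, ts) for user, ts, action in stream if monitor.observe(user, ts, action)]
print(alerts)
```

The offline mode would run the same kind of check, or a trained classifier, over stored logs in batches instead of event by event.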

CODING & OUTPUT SCREENS

4.1 SOURCE CODE

SAMPLE CODING

from tkinter import *
import tkinter
from tkinter import filedialog
from tkinter import simpledialog
from tkinter import ttk
from tkinter.filedialog import askopenfilename
import os
import cv2
import soundfile
import librosa
import numpy as np
from keras.utils import to_categorical
from keras.layers import MaxPooling2D
from keras.layers import (Dense, Dropout, Activation, Flatten, GlobalAveragePooling2D,
                          BatchNormalization, AveragePooling2D, Input, Conv2D,
                          UpSampling2D)
from keras.layers import Convolution2D
from keras.models import Sequential, load_model, Model
import pickle
from sklearn.model_selection import train_test_split
from keras.callbacks import ModelCheckpoint
import keras
from sklearn.metrics import accuracy_score
from numpy import dot
from numpy.linalg import norm

# designing main screen
main = tkinter.Tk()
main.title("Face Images Synthesis from Speech Using Conditional Generative Adversarial Network")
main.geometry("1000x650")

global filename, model, X, Y, names

def extract_feature(file_name, mfcc, chroma, mel):
    # read the audio file and build one feature vector from the requested descriptors
    with soundfile.SoundFile(file_name) as sound_file:
        X = sound_file.read(dtype=np.float32)
        sample_rate = sound_file.samplerate
        if chroma:
            stft = np.abs(librosa.stft(X))
        result = np.array([])
        if mfcc:
            mfccs = np.mean(librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=40).T, axis=0)
            result = np.hstack((result, mfccs))
        if chroma:
            chroma = np.mean(librosa.feature.chroma_stft(S=stft, sr=sample_rate).T, axis=0)
            result = np.hstack((result, chroma))
        if mel:
            mel = np.mean(librosa.feature.melspectrogram(y=X, sr=sample_rate).T, axis=0)
            result = np.hstack((result, mel))
    return result

def loadDataset():
    global filename, dataset
    filename = filedialog.askdirectory(initialdir=".")
    text.delete('1.0', END)
    text.insert(END, filename+" loaded\n\n")

def cleanDataset():
    global X, Y, names, filename
    text.delete('1.0', END)
    X = []
    Y = []
    names = []
    # walk the selected dataset folder and pair each speech file with its face image
    for root, dirs, directory in os.walk(filename):
        for j in range(len(directory)):
            mfcc = extract_feature(root+"/"+directory[j], mfcc=True, chroma=True, mel=True)
            name = directory[j].replace(".wav", ".jpg")
            img = cv2.imread("Dataset/Faces/"+name)
            img = cv2.resize(img, (200, 200))
            mfcc = np.reshape(mfcc, (10, 6, 3))
            mfcc = cv2.resize(mfcc, (200, 200))
            # each sample is added twice, as in the original listing
            X.append(mfcc)
            Y.append(img)
            X.append(mfcc)
            Y.append(img)
            names.append(directory[j])
            names.append(directory[j])
            print(str(mfcc.shape)+" "+str(img.shape))
    X = np.asarray(X)
    Y = np.asarray(Y)
    names = np.asarray(names)
    np.save('model/[Link]', X)
    np.save('model/[Link]', Y)
    np.save("model/[Link]", names)
    text.insert(END, "Face & Speech features extraction completed from dataset\n")
    text.insert(END, "Extracted Speech Features = "+str(X[0])+"\n\n")

def trainModel():
    text.delete('1.0', END)
    global X, Y, names, model
    # split dataset into train and test
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
    input_img = Input(shape=(200, 200, 3))
    # encoder: three convolution + pooling stages
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    generator = MaxPooling2D((2, 2), padding='same')(x)
    # decoder: three convolution + upsampling stages
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(generator)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    discriminator = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
    model = Model(input_img, discriminator)
    # Compile the model
    model.compile(optimizer='adam', loss='binary_crossentropy')
    if os.path.exists("model/gan_weights.hdf5") == False:
        model_check_point = ModelCheckpoint(filepath='model/gan_weights.hdf5', verbose=1,
                                            save_best_only=True)
        hist = model.fit(X_train, y_train, batch_size=64, epochs=20,
                         validation_data=(X_test, y_test),
                         callbacks=[model_check_point], verbose=1)
        f = open('model/gan_history.pckl', 'wb')
        pickle.dump(hist.history, f)
        f.close()
    else:
        model.load_weights("model/gan_weights.hdf5")
    text.insert(END, "CGAN Model Training Completed on Face & Speech Features")

def getScore(mfcc):
    print(mfcc.shape)
    predict = None
    accuracy = 0
    person = None
    # find the stored sample with the highest cosine similarity to the input features
    for i in range(len(X)):
        acc = dot(X[i].ravel(), mfcc)/(norm(X[i].ravel())*norm(mfcc))
        if acc > accuracy:
            if acc < 1:
                accuracy = acc
                predict = Y[i]
                person = names[i]
    person = person.split(".")[0]
    return accuracy, predict, person

def generateFaces():
    text.delete('1.0', END)
    global X, Y, names, model
    filename = filedialog.askopenfilename(initialdir="testSpeech")
    mfcc = extract_feature(filename, mfcc=True, chroma=True, mel=True)
    mfcc = np.reshape(mfcc, (10, 6, 3))
    mfcc = cv2.resize(mfcc, (200, 200))
    test = []
    test.append(mfcc)
    test = np.asarray(test)
    predict = model.predict(test)
    accuracy, predict, person = getScore(mfcc.ravel())
    predict = cv2.resize(predict, (300, 300))
    cv2.putText(predict, 'Accuracy : '+str(accuracy), (10, 25),
                cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 0, 0), 2)
    cv2.putText(predict, 'Person ID : '+str(person), (10, 55),
                cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 0, 0), 2)
    cv2.imshow("Generated Face", predict)
    cv2.waitKey(0)

def close():
    main.destroy()

font = ('times', 16, 'bold')
title = Label(main, text='Face Images Synthesis from Speech Using Conditional Generative Adversarial Network', justify=LEFT)
title.config(bg='lavender blush', fg='DarkOrchid1')
title.config(font=font)
title.config(height=3, width=120)
title.place(x=100,y=5)
title.pack()

font1 = ('times', 13, 'bold')
uploadButton = Button(main, text="Upload VoxCeleb Dataset", command=loadDataset)
uploadButton.place(x=10,y=100)
uploadButton.config(font=font1)
processButton = Button(main, text="Normalized & Shuffle Dataset", command=cleanDataset)
processButton.place(x=310,y=100)
processButton.config(font=font1)
cganButton = Button(main, text="Train Propose CGAN Model", command=trainModel)
cganButton.place(x=570,y=100)
cganButton.config(font=font1)
predictButton = Button(main, text="Generate Faces from Speech", command=generateFaces)
predictButton.place(x=10,y=150)
predictButton.config(font=font1)
exitButton = Button(main, text="Exit", command=close)
exitButton.place(x=330,y=150)
exitButton.config(font=font1)

# output text area with a scrollbar
font1 = ('times', 12, 'bold')
text = Text(main, height=22, width=140)
scroll = Scrollbar(text)
text.configure(yscrollcommand=scroll.set)
text.place(x=10,y=200)
text.config(font=font1)
main.config(bg='light coral')
main.mainloop()

4.2 Results

Figure: 4.2.1 Home Screen

The "Privilege Escalation Attack Detection and Mitigation in Cloud Using Machine
Learning" system provides a secure and intuitive user interface, beginning with a protected
login screen. The interface features a cybersecurity-themed design, incorporating cloud
visuals and clear registration options. Users are presented with straightforward fields for
entering their username and password, ensuring easy and secure access to the cloud-based
detection platform. This design prioritizes simplicity and usability, allowing users to
quickly log in while maintaining strong security measures. By combining visual clarity
with functional security elements, the interface offers a reliable entry point for interacting
with the system’s machine learning–based threat detection and mitigation features.

Figure: 4.2.2 Dataset Training & Testing Results

The table presents the results of both training and testing for multiple
machine learning models. Among them, Support Vector Machines (SVM) achieved the
highest accuracy at 98.01%, outperforming the other models. Random Forest and Decision
Tree classifiers also demonstrated strong performance, each attaining accuracies above
95%. K-Nearest Neighbors (KNN) and Gradient Boosting models provided consistent and
reliable results, with both achieving accuracies exceeding 93%. These findings indicate that
while SVM delivers the most precise predictions for the given dataset, ensemble and tree-
based methods like Random Forest and Decision Tree also offer robust performance.
Overall, all evaluated models show strong potential for detecting and mitigating privilege
escalation attacks in cloud environments.

Figure: 4.2.3 Visualization Results

The bar chart compares the performance of various machine learning models for attack
detection. Support Vector Machines (SVM) achieved the highest accuracy at 98.02%,
outperforming all other models. Decision Tree and Random Forest classifiers followed
closely, with accuracies ranging between 95% and 96%. Gradient Boosting and K-Nearest
Neighbors (KNN) also demonstrated strong and consistent performance, each exceeding
93% accuracy. Overall, the chart highlights that while SVM delivers the most precise
detection, tree-based and ensemble methods like Random Forest, Decision Tree, and
Gradient Boosting provide reliable alternatives, making them suitable for identifying and
mitigating privilege escalation attacks in cloud environments.
Figure: 4.2.4 Detection Results

The table presents the results of privilege escalation attack detection across various email
categories. Legitimate emails, such as academic announcements and regular language
usage, are correctly identified as “Escalation Attack Not Found,” demonstrating accurate
differentiation of safe content. In contrast, suspicious messages, including those generated
by bulk email software, are correctly flagged as “Escalation Attack Found.” These results
indicate that the system effectively distinguishes between benign and potentially harmful
emails, ensuring reliable detection of privilege escalation attempts while minimizing false
positives for normal correspondence.

4.3 Discussions

The developed machine-learning-based system for detecting privilege escalation attacks in
cloud settings proved reliable and performed well in the evaluation. It provides a secure
and user-friendly login platform that ensures controlled use of cloud services. Among the
models tested experimentally, SVM outperforms the others in recognizing escalation-related
risks, with an accuracy of around 98%. The other models were Random Forest, Decision Tree,
Gradient Boosting, and the K-Neighbors classifier. All of them consistently achieve
accuracy scores above 93%, which indicates strong performance.

The bar chart supports this comparison by showing visually that SVM outperformed
Decision Tree and Random Forest. The results from the email-based attack prediction module
demonstrate the model's practical usefulness by successfully differentiating between
harmful and normal communications. Academic and informative emails are correctly tagged
as safe, whereas suspicious material such as bulk email software advertising is flagged as
an attack. The system's ability to learn patterns linked to privilege escalation attempts
while keeping false positives low is a sign of its effectiveness.

IMPLEMENTATION

5.1 Implementation Introduction

The implementation phase is the crucial process in which the proposed design is
turned into a working system. This phase of application development includes tasks such
as programming and overall module setup. It also covers the machine learning
algorithms that identify insider threats and the blockchain protocols that provide
transparency and security. The database was created to make it easy
to store, retrieve, and manage sensitive information. The use of smart contracts and
biometric identification helps to increase reliability and simplify operations. Rigorous
testing is used to assure the system's accuracy, efficiency, and scalability. By connecting
theoretical design with practical application, implementation ultimately delivers a secure,
transparent, and efficient system ready for real-time use.

5.2 Implementation Procedure & Steps

1. Data Collection & Preprocessing Module

This component collects log data from cloud environments, such as API calls, file access
logs, user authentication records, and system interactions. To prepare the data for
machine learning analysis, it is preprocessed by filling in
missing values, standardizing features, and identifying important features such as
trends in user behavior and access anomalies.
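
Two of the preprocessing steps named above, filling in missing values and standardizing a feature, can be sketched as follows. Mean imputation and z-score scaling are assumed here because the text does not name specific techniques, and the sample column is made up.

```python
from statistics import mean, pstdev


def impute_and_standardize(column):
    """Replace None with the column mean, then rescale to zero mean and unit variance."""
    observed = [value for value in column if value is not None]
    if not observed:
        raise ValueError("column has no observed values")
    fill = mean(observed)
    filled = [fill if value is None else value for value in column]
    spread = pstdev(filled)
    if spread == 0:
        return [0.0 for _ in filled]  # a constant column carries no signal
    # Mean imputation leaves the column mean unchanged, so `fill` is also the mean here.
    return [(value - fill) / spread for value in filled]


# Made-up feature column: logins per day for five users, one value missing.
logins_per_day = [4, None, 6, 2, 8]
standardized = impute_and_standardize(logins_per_day)
print(standardized)
```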

2. Feature Extraction & Selection Module

Key features are extracted, including login frequency, resource usage patterns,
and role-based entitlements. To reduce model complexity and improve
detection accuracy, feature selection techniques may be used to identify the
features most indicative of privilege escalation attacks.

3. Machine Learning Model Training Module

This module trains several ML models, such as Random Forest, AdaBoost, XGBoost, and
LightGBM, to distinguish typical from unusual user actions. The models learn from prior
attack data and adapt, detecting anomalous patterns that suggest attempts at privilege
escalation.

4. Real-Time & Offline Detection Module

This component may be used in two ways: first, in real-time, by constantly watching user
actions and sending out warnings when something out of the ordinary is detected; second,
offline, by looking through logs to find patterns of privilege escalation over time. Both
methods improve the efficacy of security in the long run.

[Link] & Reporting Module

This module provides an interactive dashboard showing security logs, analytical reports,
and real-time threat warnings. Security administrators can use it to obtain accurate
insights, track vulnerabilities, and generate reports for forensic investigation and
compliance audits.

5.3 User Manual

The system uses blockchain technology and machine learning to securely identify
insider threats. Logging in is the first step in the authorization process, which includes
digital identity verification and biometric authentication. After importing datasets in
CSV or JSON format, users can train and test their models using popular methods such
as LightGBM, Random Forest, and SVM. Accuracy metrics and visual representations of
the results are provided. The detection module records any suspicious insider activity in
real time and saves the results in a database. Users can generate reports, view trends,
and monitor threats efficiently. For more information, see the built-in help area.

CONCLUSIONS AND FUTURE SCOPE OF WORK

6.1 Conclusions
Malicious insiders pose a significant threat to an organization because of their
elevated access and their capacity to cause serious damage. Unlike outsiders, insiders
have authorized, privileged access to data and resources. This project set out to
develop machine learning models capable of detecting and classifying insider attacks.
The research used a dataset assembled from several files of the CERT dataset. Four
supervised machine learning methods were applied to it: LightGBM, XGBoost,
Random Forest (RF), and AdaBoost. According to the experimental results, these
methods improved classification accuracy. LightGBM achieved the highest accuracy
at 97%, followed by XGBoost at 88.27%, AdaBoost at 88%, and RF at 86%. The
performance of the proposed models may improve further if the dataset is expanded
to include additional variables and new insider threat patterns are tracked. This could
open new research directions for identifying and classifying insider attacks in various
organizational contexts.
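
For reference, the accuracy figures above are ratios of correct predictions to total predictions. The following minimal sketch shows how accuracy, together with precision and recall, can be computed from true and predicted labels. The label vectors are made-up illustrative values, not results from this project, and the function name is an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


# Illustrative labels only: 1 = insider attack, 0 = normal activity.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + tn + fn)   # share of all predictions that are correct
precision = tp / (tp + fp)                   # share of flagged events that are real attacks
recall = tp / (tp + fn)                      # share of real attacks that were flagged
```

In this toy example accuracy is 0.8 while precision and recall are both 0.5, which is why precision and recall are worth reporting alongside accuracy when attack samples are rare.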
Organizations that use machine learning models can make more reliable decisions,
and improved model results make those decisions better. Increasing model accuracy
reduces the potentially substantial cost of mistakes. ML allows large amounts of data
to be fed to algorithms, which can then use that data to provide suggestions,
assessments, and conclusions.

6.2 Future Scope of Work

Possible future developments include improving real-time threat identification, using
deep learning for anomaly detection, improving generalization, and adapting to
changing threats through adaptive learning. To enhance privilege escalation detection
in dynamic cloud environments, it is suggested that cross-cloud security frameworks
be developed, explainability be improved, and the system be integrated with advanced
SIEM systems.

REFERENCES

[1] U. A. Butt, R. Amin, H. Aldabbas, S. Mohan, B. Alouffi, and A. Ahmadian, ‘‘Cloud-based
email phishing attack using machine and deep learning algorithm,’’ Complex Intell. Syst.,
pp. 1–28, Jun. 2022.
[2] D. C. Le and A. N. Zincir-Heywood, ‘‘Machine learning based insider threat modelling
and detection,’’ in Proc. IFIP/IEEE Symp. Integr. Netw. Service Manag. (IM), Apr. 2019, pp. 1–
6.
[3] P. Oberoi, ‘‘Survey of various security attacks in clouds based environments,’’ Int. J.
Adv. Res. Comput. Sci., vol. 8, no. 9, pp. 405–410, Sep. 2017.
[4] A. Ajmal, S. Ibrar, and R. Amin, ‘‘Cloud computing platform: Performance analysis of
prominent cryptographic algorithms,’’ Concurrency Comput., Pract. Exper., vol. 34, no. 15,
p. e6938, Jul. 2022.
[5] U. A. Butt, R. Amin, M. Mehmood, H. Aldabbas, M. T. Alharbi, and N. Albaqami,
‘‘Cloud security threats and solutions: A survey,’’ Wireless Pers. Commun., vol. 128, no. 1,
pp. 387–413, Jan. 2023.
[6] H. Touqeer, S. Zaman, R. Amin, M. Hussain, F. Al-Turjman, and M. Bilal, ‘‘Smart home
security: Challenges, issues and solutions at different IoT layers,’’ J. Supercomput., vol. 77,
no. 12, pp. 14053–14089, Dec. 2021.
[7] S. Zou, H. Sun, G. Xu, and R. Quan, ‘‘Ensemble strategy for insider threat detection from
user activity logs,’’ Comput., Mater. Continua, vol. 65, no. 2, pp. 1321–1334, 2020.
[8] G. Apruzzese, M. Colajanni, L. Ferretti, A. Guido, and M. Marchetti, ‘‘On the
effectiveness of machine and deep learning for cyber security,’’ in Proc. 10th Int. Conf. Cyber
Conflict (CyCon), May 2018, pp. 371–390.
[9] D. C. Le, N. Zincir-Heywood, and M. I. Heywood, ‘‘Analyzing data granularity levels
for insider threat detection using machine learning,’’ IEEE Trans. Netw. Service Manag., vol. 17,
no. 1, pp. 30–44, Mar. 2020.
[10] F. Janjua, A. Masood, H. Abbas, and I. Rashid, ‘‘Handling insider threat through
supervised machine learning techniques,’’ Proc. Comput. Sci., vol. 177, pp. 64–71, Jan. 2020.
[11] R. Kumar, K. Sethi, N. Prajapati, R. R. Rout, and P. Bera, ‘‘Machine learning based
malware detection in cloud environment using clustering approach,’’ in Proc. 11th Int. Conf.
Comput., Commun. Netw. Technol. (ICCCNT), Jul. 2020, pp. 1–7.
[12] D. Tripathy, R. Gohil, and T. Halabi, ‘‘Detecting SQL injection attacks in cloud SaaS
using machine learning,’’ in Proc. IEEE 6th Int. Conf. Big Data Secur. Cloud (BigDataSecurity),
Int. Conf. High Perform. Smart Comput. (HPSC), IEEE Int. Conf. Intell. Data Secur. (IDS), May
2020, pp. 145–150.
[13] X. Sun, Y. Wang, and Z. Shi, ‘‘Insider threat detection using an unsupervised learning
method: OPOD,’’ in Proc. Int. Conf. Commun., Inf. Syst. Comput. Eng. (CISCE), May 2021, pp.
749–754.
[14] J. Kim, M. Park, H. Kim, S. Cho, and P. Kang, ‘‘Insider threat detection based on user
behavior modeling and anomaly detection algorithms,’’ Appl. Sci., vol. 9, no. 19, p. 4018, Sep.
2019.
[15] L. Liu, O. de Vel, Q.-L. Han, J. Zhang, and Y. Xiang, ‘‘Detecting and preventing cyber
insider threats: A survey,’’ IEEE Commun. Surveys Tuts., vol. 20, no. 2, pp. 1397–1417, 2nd
Quart., 2018.
[16] P. Chattopadhyay, L. Wang, and Y.-P. Tan, ‘‘Scenario-based insider threat detection
from cyber activities,’’ IEEE Trans. Computat. Social Syst., vol. 5, no. 3, pp. 660–675, Sep.
2018.
[17] G. Ravikumar and M. Govindarasu, ‘‘Anomaly detection and mitigation for wide-area
damping control using machine learning,’’ IEEE Trans. Smart Grid, early access, May 18, 2020,
doi: 10.1109/TSG.2020.2995313.
[18] M. I. Tariq, N. A. Memon, S. Ahmed, S. Tayyaba, M. T. Mushtaq, N. A. Mian, M.
Imran, and M. W. Ashraf, ‘‘A review of deep learning security and privacy defensive
techniques,’’ Mobile Inf. Syst., vol. 2020, pp. 1–18, Apr. 2020.
[19] D. S. Berman, A. L. Buczak, J. S. Chavis, and C. L. Corbett, ‘‘A survey of deep learning
methods for cyber security,’’ Information, vol. 10, no. 4, p. 122, 2019.
[20] N. T. Van and T. N. Thinh, ‘‘An anomaly-based network intrusion detection system
using deep learning,’’ in Proc. Int. Conf. Syst. Sci. Eng. (ICSSE), 2017, pp. 210–214.
[21] G. Pang, C. Shen, L. Cao, and A. V. D. Hengel, ‘‘Deep learning for anomaly detection: A
review,’’ ACM Comput. Surv., vol. 54, no. 2, pp. 1–38, Mar. 2021.
[22] A. Arora, A. Khanna, A. Rastogi, and A. Agarwal, ‘‘Cloud security ecosystem for data
security and privacy,’’ in Proc. 7th Int. Conf. Cloud Comput., Data Sci. Eng., Jan. 2017, pp. 288–
292.
[23] L. Coppolino, S. D’Antonio, G. Mazzeo, and L. Romano, ‘‘Cloud security: Emerging
threats and current solutions,’’ Comput. Electr. Eng., vol. 59, pp. 126–140, Apr. 2017.
[24] M. Abdelsalam, R. Krishnan, Y. Huang, and R. Sandhu, ‘‘Malware detection in cloud
infrastructures using convolutional neural networks,’’ in Proc. IEEE 11th Int. Conf. Cloud
Comput. (CLOUD), Jul. 2018, pp. 162–169.
[25] F. Jaafar, G. Nicolescu, and C. Richard, ‘‘A systematic approach for privilege escalation
prevention,’’ in Proc. IEEE Int. Conf. Softw. Quality, Rel. Secur. Companion (QRS-C), Aug. 2016,
pp. 101–108.
[26] N. Alhebaishi, L. Wang, S. Jajodia, and A. Singhal, ‘‘Modeling and mitigating the insider
threat of remote administrators in clouds,’’ in Proc. IFIP Annu. Conf. Data Appl. Secur. Privacy.
Bergamo, Italy: Springer, 2018, pp. 3–20.
[27] F. Yuan, Y. Cao, Y. Shang, Y. Liu, J. Tan, and B. Fang, ‘‘Insider threat detection with
deep neural network,’’ in Proc. Int. Conf. Comput. Sci. Wuxi, China: Springer, 2018, pp. 43–54.
[28] I. A. Mohammed, ‘‘Cloud identity and access management – A model proposal,’’ Int. J.
Innov. Eng. Res. Technol., vol. 6, no. 10, pp. 1–8, 2019.
[29] F. M. Okikiola, A. M. Mustapha, A. F. Akinsola, and M. A. Sokunbi, ‘‘A new framework
for detecting insider attacks in cloud-based e-health care system,’’ in Proc. Int. Conf. Math.,
Comput. Eng. Comput. Sci. (ICMCECS), Mar. 2020, pp. 1–6.
[30] G. Li, S. X. Wu, S. Zhang, and Q. Li, ‘‘Neural networks-aided insider attack detection for
the average consensus algorithm,’’ IEEE Access, vol. 8, pp. 51871–51883, 2020.
[31] A. R. Wani, Q. P. Rana, U. Saxena, and N. Pandey, ‘‘Analysis and detection of DDoS
attacks on cloud computing environment using machine learning techniques,’’ in Proc. Amity Int.
Conf. Artif. Intell. (AICAI), Feb. 2019, pp. 870–875.
[32] N. M. Sheykhkanloo and A. Hall, ‘‘Insider threat detection using supervised machine
learning algorithms on an extremely imbalanced dataset,’’ Int. J. Cyber Warfare Terrorism, vol.
10, no. 2, pp. 1–26, Apr. 2020.
[33] M. Idhammad, K. Afdel, and M. Belouch, ‘‘Distributed intrusion detection system for
cloud environments based on data mining techniques,’’ Proc. Comput. Sci., vol. 127, pp. 35–41,
Jan. 2018.
[34] P. Kaur, R. Kumar, and M. Kumar, ‘‘A healthcare monitoring system using random
forest and Internet of Things (IoT),’’ Multimedia Tools Appl., vol. 78, no. 14, pp. 19905–19916,
2019.
[35] J. L. Leevy, J. Hancock, R. Zuech, and T. M. Khoshgoftaar, ‘‘Detecting cybersecurity
attacks using different network features with LightGBM and XGBoost learners,’’ in Proc. IEEE
2nd Int. Conf. Cognit. Mach. Intell. (CogMI), Oct. 2020, pp. 190–197.
[36] R. A. Alsowail and T. Al-Shehari, ‘‘Techniques and countermeasures for preventing
insider threats,’’ PeerJ Comput. Sci., vol. 8, p. e938, Apr. 2022.
[37] B. Alouffi, M. Hasnain, A. Alharbi, W. Alosaimi, H. Alyami, and M. Ayaz, ‘‘A
systematic literature review on cloud computing security: Threats and mitigation strategies,’’
IEEE Access, vol. 9, pp. 57792–57807, 2021.

PUBLICATION
Kande. Vijaya Santhi and P. Neela Sundari (2025). Privilege Escalation Attack Detection
and Mitigation in Cloud Using Machine Learning. Journal of Neonatal Surgery, 14(4),
536–541.

BIBLIOGRAPHY

1. W. Stallings, Network Security Essentials: Applications and Standards, 6th ed.
Pearson Education, 2017.
2. R. L. Krutz and R. D. Vines, Cloud Security: A Comprehensive Guide to
Secure Cloud Computing. Wiley Publishing, 2010.
3. M. Bishop, Computer Security: Art and Science, 2nd ed. Addison-Wesley, 2018.
4. C. P. Pfleeger, S. L. Pfleeger, and J. Margulies, Security in Computing, 5th ed.
Pearson Education, 2015.
5. K. Hashizume, D. G. Rosado, E. Fernández-Medina, and E. B. Fernandez, Cloud
Computing Security Issues and Solutions. Springer Publishing, 2014.

WEBSITES VISITED

[1] Wikipedia. (2024). Adversarial machine learning. In Wikipedia. [Link]
[2] Elastic. (n.d.). privilege escalation ml windows rare user run as event detection
rule. GitHub. [Link]
