Python Algorithms for Cybersecurity

The document discusses the use of Python for algorithm design in cybersecurity, highlighting its advantages such as extensive libraries, clean syntax, and dynamic typing. It emphasizes the importance of algorithm design for predictive policing and cybersecurity applications, detailing various algorithms like quicksort and binary search, and their implementations. The paper also explores the performance of algorithms across different programming languages, showcasing Python's efficiency in handling cybersecurity tasks.

Algorithm design in Python for cybersecurity

Conference Paper · September 2019

All content following this page was uploaded by Petar Milic on 10 October 2019.


Algorithm design in Python for cybersecurity
Kristijan Kuk¹, Petar Milić², Petar Spalević², Milan Gocić³
¹ University of Criminalistic and Police Studies, Department of Computer Science
² University of Priština – Kosovska Mitrovica, Faculty of Technical Sciences
³ University of Niš, Faculty of Civil Engineering and Architecture
E-mail: [Link]@[Link]

Abstract. Python is one of the most sought-after programming languages for cybersecurity, owing to its extensive library of powerful packages that support rapid application development, its clean syntax and modular design, and its automatic memory management and dynamic typing. A good understanding of algorithm design is very important, because when a flaw is found in the written code, it is necessary to step back to the design phase and redesign the algorithm. In this paper, we describe how Python can be used to develop algorithms and implement its modules so that they are applicable in the cybersecurity domain. Using the suggested approaches to algorithm design for cybersecurity, practitioners increase their existing capabilities for tweaking, customizing, or outright developing their own tools.

1 Introduction

Programming is one of the most important parts of cybersecurity. Today, the need for cybersecurity professionals is more than obvious. Cybersecurity is not just about using a customised operating system with hundreds of tools to find vulnerabilities; it is something more than that. Programming skills are always welcome: to be a top-notch professional, you have to think like a hacker and look for all the possible ways a hacker could exploit a system, and this includes development too. The field of cybersecurity is huge and cannot be reduced to a number of predefined fundamentals for forensic investigators, ethical hackers, security analysts, etc. Programming knowledge proves essential for analyzing software for vulnerabilities, identifying malicious software, and other tasks required of cybersecurity analysts. What to draw from this advice is that programming knowledge gives you an edge over security professionals without those skills. Cybersecurity professionals might use their coding skills to write tools that automate certain security tasks.

2 Areas of application of algorithms

Computer science is the study of problems, problem-solving, and the solutions that come out of the problem-solving process. Given a problem, a computer scientist’s goal is to develop an algorithm, a step-by-step list of instructions for solving any instance of the problem that might arise. Algorithms are finite processes that, if followed, will solve the problem. Algorithms should be taught using the native language of the students. Before the algorithmic implementation, it is necessary to choose the programming language.

The police apply statistical or machine learning algorithms to data from police records on the time, location, and nature of past crimes, looking for patterns that indicate where crime may occur in the future. Predictive policing does not replace conventional policing methods (e.g. problem-oriented policing, intelligence-led policing or hotspot policing) but enhances these traditional practices by applying advanced statistical models and algorithms, according to the National Institute of Justice (NIJ) in 2014. Perry (2013) claims that predictive algorithms [1] can be used to identify members of criminal groups that show an elevated risk of an outbreak of violence between them [2].

A malicious pattern detection engine is a real-life example of searching and sorting algorithms [3]. SQL injection is perhaps one of the most common application-layer attacks. In the 2014 study by Liban and Hilles [4], a SQL-injection vulnerability scanning tool named MYSQL Injector was developed for the automatic creation of SQL-injection attacks, using a time-based attack with the Inference Binary Search Algorithm. Denial of Service (DoS) is a very important problem that needs to be dealt with seriously in the security domain. Khan and Traore propose a regression-analysis-based model [5] that can prevent algorithmic complexity attacks, and they demonstrate their model on the quick sort algorithm in the case of DoS attacks.

3 Using traditional data structures and algorithms as part of the AI systems for cybersecurity

The strategy for learning an algorithm follows these steps: write pseudocode, implement it in any programming language following the pseudocode, test correctness by checking the output for different inputs, and analyse complexity. Development and usage of data structures and algorithms represents an initial step in forensic investigation, which further reveals holes in cyberspace and points out which places in the code and in the infrastructure need to be fixed [6]. It should be kept in mind that attackers have a lot of tools to make attacks more and more sophisticated. This is where Python comes to the fore, with its ease of coding, which reads mostly like plain English, and with the fact that most activities can be automated using Python scripts. In the following paragraphs we point out which data structures and algorithms can be implemented in Python in the area of cybersecurity [7].
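
As an illustration of the four learning steps (pseudocode, implementation, testing with different inputs, complexity analysis), the following minimal sketch applies them to quick sort, whose pseudocode is given in Table 1. The function names (`partition`, `quicksort`, `count_comparisons`), the test inputs and the input size are assumptions chosen for this example and are not taken from the cited works. The last step also hints at why quick sort is a natural target for the algorithmic complexity attacks studied in [5]: with the last element as pivot, an already sorted input forces the quadratic worst case.

```python
import random


def partition(a, start, end, counter):
    """Lomuto partition around the last element; counts element comparisons."""
    pivot = a[end]
    i = start - 1
    for j in range(start, end):
        counter[0] += 1  # one comparison of a[j] with the pivot
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[end] = a[end], a[i + 1]  # move the pivot into place
    return i + 1


def quicksort(a, start=0, end=None, counter=None):
    """Step 2: implementation that follows the pseudocode (sorts in place)."""
    if end is None:
        end = len(a) - 1
    if counter is None:
        counter = [0]
    if start < end:
        q = partition(a, start, end, counter)
        quicksort(a, start, q - 1, counter)
        quicksort(a, q + 1, end, counter)
    return a


def count_comparisons(values):
    """Step 4: measure the work done on one input instead of wall-clock time."""
    counter = [0]
    quicksort(list(values), counter=counter)
    return counter[0]


# Step 3: test correctness by checking the output for different inputs.
rng = random.Random(0)  # fixed seed so the check is repeatable
test_inputs = [[], [7], [2, 1], [3, 3, 3], list(range(20)), list(range(20, 0, -1))]
test_inputs += [[rng.randint(-50, 50) for _ in range(30)] for _ in range(10)]
for values in test_inputs:
    assert quicksort(list(values)) == sorted(values)

# Step 4: analyse complexity on a random and on an already sorted input.
n = 200  # kept small: the sorted case recurses n levels deep
random_cost = count_comparisons(rng.sample(range(10 * n), n))
sorted_cost = count_comparisons(range(n))
print("random input:", random_cost, "comparisons")
print("sorted input:", sorted_cost, "comparisons")  # n*(n-1)/2 for this pivot rule
```

Counting comparisons rather than timing the run keeps the measurement independent of the machine. The sorted input costs exactly n(n-1)/2 comparisons with this pivot rule, while a random input typically needs far fewer.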

ERK'2019, Portorož, 243-246


3.1 Algorithms

3.1.1 Quick sort

Quick sort belongs to the family of sorting algorithms. It first chooses a pivot and then partitions the array around this pivot. In the partitioning process, all the elements smaller than the pivot are put on one side of the pivot and all the elements larger than it on the other side. The two parts are then sorted recursively in the same way.

Figure 2. Visual representation of the Quick sort algorithm with example

Table 1. The pseudocode for Quick sort algorithm

PARTITION(A, start, end)
    pivot = A[end]
    i = start-1
    for j in start to end-1
        if A[j] <= pivot
            i = i+1
            swap(A[i], A[j])
    swap(A[i+1], A[end]) // swapping pivot
    return i+1

QUICKSORT(A, start, end)
    if start < end
        q = PARTITION(A, start, end)
        QUICKSORT(A, start, q-1)
        QUICKSORT(A, q+1, end)

3.1.2 Binary search

In binary search, we look through a sorted array to find whether an element is present. If the element to be found is equal to the middle element, we have found it; otherwise, if it is smaller than the middle element it must lie in the left half, and if it is larger it must lie in the right half. The idea is to keep comparing the element with the middle value of the remaining range. This way, each comparison eliminates one half of the list.

Figure 3. Visual representation of the Binary Search algorithm with example

Table 2. The pseudocode for Binary search algorithm

BINARYSEARCH(A, start, end, x)
    if start <= end
        middle = floor((start+end)/2)
        if A[middle] == x
            return middle
        if A[middle] > x
            return BINARYSEARCH(A, start, middle-1, x)
        if A[middle] < x
            return BINARYSEARCH(A, middle+1, end, x)
    return FALSE // in case the element is not in the array

3.1.3 Adjacency Matrix / Storing Graphs

We can represent a graph in many ways. Examples of suitable data structures are the adjacency matrix and the adjacency list. An adjacency matrix is a way of representing a graph G = (V, E) as a matrix of booleans. A matrix is a two-dimensional array. The idea here is to fill the cells with a 1 or 0 depending on whether two vertices are connected by an edge. Figure 4 shows a graph and its equivalent adjacency matrix.

Figure 4. Visual representation of the Adjacency Matrix construction with example

Table 3. The pseudocode for storing graphs algorithm

ADJACENCYMATRIX(n)
    for i in 1 to n
        for j in 1 to n
            A[i][j] = 0 // create empty matrix (2D array)

MAKEEDGE(to, from, undirected)
    A[from][to] = 1
    if undirected == true
        A[to][from] = 1 // reverse edge only for undirected graphs

3.1.4 Eratosthenes - Prime Numbers

The sieve of Eratosthenes is one of the most efficient ways to find all primes smaller than n when n is smaller than 10 million or so. A prime number is a number greater than 1 that is divisible only by 1 and itself. To test a number, we have to find out whether it has any other divisors and is therefore composite; in that case, we discard it as a prime candidate. The Greek mathematician Eratosthenes designed a quick way to find all the prime numbers up to a given limit: a process called the Sieve of Eratosthenes, which repeatedly crosses out the multiples of each prime found so far.
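
The sieve can be sketched in Python as follows. This is a minimal sketch, not the implementation measured in the experiments; the function name `sieve_of_eratosthenes` and its `limit` parameter are assumptions made for illustration. Starting the crossing-out at i*i instead of 2*i is a common refinement, since smaller multiples of i have already been removed by smaller primes.

```python
import math


def sieve_of_eratosthenes(limit):
    """Return a list of all primes p with 2 <= p <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False  # 0 and 1 are not prime
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            # Cross out every multiple of i, starting at i*i.
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [n for n, flag in enumerate(is_prime) if flag]


print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The outer loop stops at the integer square root of the limit because every composite number up to the limit has a prime factor no larger than that.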

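Returning to binary search (Table 2), the recursive pseudocode can also be written as a loop. The sketch below is illustrative only: the names `binary_search` and `bisect_search` and the sample data are assumptions. It returns None instead of FALSE for a missing element, because in Python `0 == False` and a hit at index 0 could otherwise be mistaken for a miss. The result is cross-checked against `bisect_left` from the standard `bisect` module.

```python
from bisect import bisect_left


def binary_search(a, x):
    """Return an index of x in the sorted list a, or None if x is absent."""
    start, end = 0, len(a) - 1
    while start <= end:
        middle = (start + end) // 2
        if a[middle] == x:
            return middle
        if a[middle] > x:
            end = middle - 1  # x can only be in the left half
        else:
            start = middle + 1  # x can only be in the right half
    return None


def bisect_search(a, x):
    """Same lookup built on the standard library's bisect_left."""
    i = bisect_left(a, x)
    return i if i < len(a) and a[i] == x else None


ports = [21, 22, 25, 80, 110, 143, 443, 3306]  # hypothetical sorted data
for value in (21, 443, 3306, 8080):
    assert binary_search(ports, value) == bisect_search(ports, value)
print(binary_search(ports, 443), binary_search(ports, 8080))  # 6 None
```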
Figure 5. Visual representation of the Eratosthenes/Prime algorithm with example

Table 4. The pseudocode for Eratosthenes/Prime algorithm with natural language description

logical_type A[101]
A[0] = false, A[1] = false
for i = 2,..,100 repeat
    A[i] = true
for i = 2,..,sqrt(100) repeat
    if A[i] == true
        j = i+i
        while j <= 100 repeat
            A[j] = false
            j = j+i
for i = 0,..,100 repeat
    if A[i] == true
        print i

3.2 Measuring Performance of Algorithms: Use case on Eratosthenes/Prime algorithm

Prime numbers are fundamental to the most common type of encryption used today: the RSA algorithm. The RSA and Elliptic Curve asymmetric algorithms are based on prime numbers. These numbers have interesting properties that make them well suited to cryptography. Cryptography is all about number theory, and every integer greater than 1 is made up of primes, so you deal with primes a lot in number theory. Fermat's Little Theorem states that the following equation is true for any prime number m and any whole number a that is not a multiple of m:

$a^{m-1} \equiv 1 \pmod{m}$ (1)

According to the description given in the previous paragraphs, we present the results of experiments obtained by testing the Eratosthenes/Prime algorithm implementation in different programming languages. The purpose of this experiment was to check which programming language gives the best performance. The experiment was performed via the competitive programming website CodeChef. This platform supports a large number of programming languages, without the need for a user to install support for them on a local computer.

Table 5. Results of measurements for Eratosthenes/Prime algorithm

Language      Time in seconds   Memory used
C#            0,02              131,136
C++           0,02              15,224
Java          0,08              3153,92
JavaScript    0,37              2370,56
PHP           0,04              82,88
Python 2      0,01              23,336
Python 3      0,03              17,96

Table 5 gives the execution time of the algorithm and the memory consumed for each language. For Python, we used both version 2 and the newer version 3 in our experiments, in order to check whether there is any progress in the speed of program execution. The obtained results suggest that Python 3 consumes less memory than Python 2, although in this measurement it did not execute the algorithm faster (0,03 against 0,01 seconds). Among all tested programming languages, Python is among those that use the least memory. JavaScript, Java and C# showed the worst results, which is not surprising given that these programming languages are executed on the client side. JavaScript had the longest execution time, while Java used the most memory.

Libraries always make code easier to develop, so here we discuss some Python library functions for working with prime numbers. SymPy is a Python module that contains some really practical library functions related to prime numbers. The SymPy function primerange(a, b) generates all prime numbers in the range [a, b). Moreover, Python is a cross-platform programming language that can be used for scripts as well as applications. It comes with the pip package manager, which allows easy installation and exchange of packages, that is, useful library modules available in the Python Package Index (PyPI) repository [8]. By using Python, a cybersecurity professional can create a quick response to a cyber attack through its vast treasure of libraries.

4 Using sequential data structures in Python and Python extension modules

Data structures are complex data types that group multiple elements with different names and different types. Their elements are located in the same block of memory as the main program, and are accessed through a single cursor or identifier. For example, if we have a simple data structure with two elements, the title and the length of a song, the identifier of the structure is the title field.

Knowledge of Python sequential data structures is necessary for creating functional penetration testing
tools and applications in cybersecurity [9]. A fundamental understanding of the language’s data structures that are applicable to penetration testing and cybersecurity is a must for those who want to advance in this area. Some of the fundamental data structures that can be implemented in Python are the array, the queue and the linked list.

Python provides extension modules and libraries that add further possibilities for using data structures. Some of these modules and libraries are [10]:

• NumPy module: Provides efficient operations on arrays of homogeneous data. One of the operations is [Link]. Python has built-in sort and sorted functions for lists, whereas [Link] uses a quicksort algorithm.
• Pandas: Includes sophisticated methods for data structure manipulation. Through hierarchical axis indexing, it provides an intuitive way of working with high-dimensional data in a lower-dimensional data structure.
• Bisect module: Binary search is a fast algorithm for searching sorted sequences. It is available in a standard Python module for binary searches, named 'bisect'. Module [Link] provides support for maintaining a list in sorted order without having to sort the list after each insertion.
• NetworkX module: A Python library for studying graphs and networks. The operation from_numpy_matrix returns a graph from a NumPy matrix. NetworkX uses an adjacency dictionary representation. The main emphasis of NetworkX is to avoid the whole issue of hairballs. The use of simple calls hides much of the complexity of working with graphs and adjacency matrices from view.

5 Conclusion

Python has grown to become one of the top programming languages in the world, with more developers than ever now using it in the IT sector. In addition, Python is predominantly used in cybersecurity because of the powerful packages it has to offer. Traditional programming languages like C, C++ etc. force programmers to deal with details of data structures and supporting routines, rather than algorithm design. Python represents an algorithm-oriented language that has been sorely needed in cybersecurity education. Our research showed that implementation of the analyzed algorithms in Python enables development of flexible and functional applications that deal with most of today's cybersecurity issues.

Technology and threats are constantly evolving; if the skills of cybersecurity professionals don't evolve with them, they will become ineffective and irrelevant, unable to provide the vital defenses that organizations increasingly require.

Acknowledgment

This paper is part of the research projects no. TR 32023, TR 35026 and subproject 3 in project no. III 47016, supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia.

Literature

[1] W. L. Perry et al.: Predictive policing: the role of crime forecasting in law enforcement operations. Santa Monica, CA: RAND, 2013.
[2] L. Bennett Moses, & J. Chan: Algorithmic prediction in policing: assumptions, evaluation, and accountability. Policing and Society, 28(7), 806-822, 2018.
[3] N. Singh, C. B. Kaverappa, J. D. Joshi: Data Mining for Prevention of Crimes. In: Yamamoto S., Mori H. (eds) Human Interface and the Management of Information. Interaction, Visualization, and Analytics. HIMI 2018. Lecture Notes in Computer Science, vol 10904. Springer, Cham.
[4] A. Liban, & S. M. Hilles: Enhancing Mysql Injector vulnerability checker tool (Mysql Injector) using inference binary search algorithm for blind timing-based attack. In 2014 IEEE 5th Control and System Graduate Research Colloquium (pp. 47-52). IEEE.
[5] S. Khan, & I. Traore: A prevention model for algorithmic complexity attacks. In 2005 International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (pp. 160-173). Springer, Berlin, Heidelberg.
[6] J. M. de Fuentes, L. González-Manzano, J. Tapiador, & P. Peris-Lopez: PRACIS: Privacy-preserving and aggregatable cybersecurity information sharing. Computers & Security, 69, 127-141, 2017.
[7] A. Epishkina, & S. Zapechnikov: A syllabus on data mining and machine learning with applications to cybersecurity. In 2016 Third International Conference on Digital Information Processing, Data Mining, and Wireless Communications (DIPDMWC) (pp. 194-199). IEEE.
[8] S. Šandi, T. Popović, & B. Krstajić: Python implementation of IEEE C37.118 communication protocol. ETF Journal of Electrical Engineering, 21(1), 108-117, 2015.
[9] K. J. Knapp, C. Maurer, & M. Plachkinova: Maintaining a Cybersecurity Curriculum: Professional Certifications as Valuable Guidance. Journal of Information Systems Education, 28(2), 101-114, 2017.
[10] [Link] accessed July 2019.
