Open Source Tools for SecDevOps Pipeline
A Master's Thesis
Submitted to the Faculty of the
Escola Tècnica d'Enginyeria de Telecomunicació de
Barcelona
Universitat Politècnica de Catalunya
by
Martí Canyelles Toledano
In partial fulfilment
of the requirements for the degree of
MASTER IN TELECOMMUNICATIONS ENGINEERING
The objective of this project is to study the open source tools that an operations team following
a SecDevOps methodology should implement in its development infrastructure to increase
security. The project is based on a simple application that did not follow security best practices,
around which an infrastructure has been built to remedy this situation.
The idea arises from the concern that, in a context of global digitalization where security
mechanisms have often not been applied in the best possible way, certain technologies and
best practices must be adopted to raise the level of security of applications.
The work is based on a microservices architecture, using containers to implement the
applications and thus obtaining efficient, secure and scalable systems. In addition, a single
sign-on (SSO) authentication system based on OpenID Connect has been added, along with
a container monitoring system, a log centralization system, and code and container scanners,
among other tools.
The project shows the change that the application has undergone over the course of this
process and concludes with an assessment of the performance of each of the tools used.
Possibly,
the most common error of a smart engineer
is to optimize a thing that should not exist.
Elon Musk
Acknowledgments
I would like to thank Josep Pegueroles, my supervisor for this project, for the help he has
given me. Thanks also to my colleagues Manel, Lluís and Enric for their support, help, and the
time they have devoted to me whenever I have needed it.
Review history and approval record
Table of contents
Abstract .................................................................................................................................................. 2
Acknowledgments .................................................................................................................................. 4
Review history and approval record ....................................................................................................... 5
Table of contents .................................................................................................................................... 6
List of figures .......................................................................................................................................... 8
List of tables ......................................................................................................................................... 10
1 Introduction........................................................................................................................................ 11
1.1 Contextualization and objectives ................................................................................................ 11
1.2 Requirements and specifications ............................................................................................... 12
1.3 Methods and procedures ............................................................................................................ 12
1.4 Work plan ................................................................................................................................... 13
1.5 Deviations from the initial plan and incidents ............................................................................. 16
2 State of the art ................................................................................................................................... 18
2.1 DevOps and SRE ....................................................................................................................... 18
2.2 SecDevOps ................................................................................................................................ 19
2.3 Microservices: the new architecture .......................................................................................... 22
2.3.1 Containers .......................................................................................................................... 22
2.3.2 Microsegmentation ............................................................................................................. 24
2.4 Best Practices, methodologies and new technologies ............................................................... 25
2.4.1 OWASP ............................................................................................................................... 25
2.4.2 CI/CD ................................................................................................................................... 25
2.4.3 Docker Containers ............................................................................................................... 26
2.4.4 GitHub Security ................................................................................................................... 29
2.4.5 Code Scanning .................................................................................................................... 29
2.4.5.1 CodeQL ........................................................................................................ 30
2.4.5.2 Semgrep ....................................................................................................... 30
2.4.5.3 SonarQube ................................................................................................... 31
2.4.5.4 Bridgecrew.................................................................................................... 31
2.4.6 Secrets Management with Docker Swarm .......................................................................... 31
2.4.7 Self update dependencies ................................................................................................... 33
2.4.8 Container Scanning ............................................................................................................. 33
2.4.9 Docker Monitoring and centralized logging ......................................................................... 34
2.4.10 LOGLEVEL ........................................................................................................................ 35
2.4.11 SSO Login ......................................................................................................................... 36
2.4.12 WAF................................................................................................................................... 38
3 Project Development ......................................................................................................................... 40
3.1 ToDo App ................................................................................................................................... 40
3.2 GitHub Security .......................................................................................................................... 41
3.2.1 Security Policy ..................................................................................................................... 41
3.2.2 Securing every step of the development process ............................................................... 41
3.3 Code Scanning ........................................................................................................................... 43
3.3.1 CodeQL ............................................................................................................................... 43
3.3.2 Semgrep .............................................................................................................................. 43
3.3.3 SonarQube .......................................................................................................................... 47
3.3.4 Bridgecrew........................................................................................................................... 53
3.4 Application of best practices for secrets ..................................................................................... 57
3.5 Secrets extraction from Docker images ..................................................................................... 60
3.6 Self update dependencies .......................................................................................................... 64
3.7 Container Scanning .................................................................................................................... 66
3.7.1 Grype of Anchore: ............................................................................................................... 66
3.7.2 Trivy of Aqua Security: ........................................................................................................ 67
3.8 Docker Monitoring and centralized logging ................................................................................ 70
3.9 LOGLEVEL ................................................................................................................................. 76
3.10 SSO Login ............................................................................................................................... 77
3.11 Two-Factor Authentication ....................................................................................................... 81
3.12 WAF ......................................................................................................................................... 83
3.13 Cracking users passwords ....................................................................................................... 89
4 Results .............................................................................................................................................. 92
5 Budget ............................................................................................................................................... 94
6 Conclusions and future development ................................................................................................ 95
Bibliography .......................................................................................................................................... 98
Appendices ......................................................................................................................................... 103
Appendix 1 ..................................................................................................................................... 103
Appendix 2 ..................................................................................................................................... 105
Appendix 3 ..................................................................................................................................... 106
Appendix 4 ..................................................................................................................................... 107
Appendix 5 ..................................................................................................................................... 108
Appendix 6 ..................................................................................................................................... 109
Appendix 7 ..................................................................................................................................... 110
Appendix 8 ..................................................................................................................................... 112
Appendix 9 ..................................................................................................................................... 113
Appendix 10 ................................................................................................................................... 114
Appendix 11 ................................................................................................................................... 115
Appendix 12 ................................................................................................................................... 117
Appendix 13 ................................................................................................................................... 118
Appendix 14 ................................................................................................................................... 119
Appendix 15 ................................................................................................................................... 120
Appendix 16 ................................................................................................................................... 121
Appendix 17 ................................................................................................................................... 122
Glossary ............................................................................................................................................. 127
List of figures
Figure 1: Planned Gantt chart of Work Package 1 ............................................................................... 15
Figure 2: Planned Gantt chart of Work Package 2 ............................................................................... 15
Figure 3: Planned Gantt chart of Work Package 3 ............................................................................... 16
Figure 4: Planned Gantt chart of Work Package 4 ............................................................................... 16
Figure 5: Final Gantt chart of Work Package 1 .................................................................................... 16
Figure 6: Final Gantt chart of Work Package 2 .................................................................................... 16
Figure 7: Final Gantt chart of Work Package 3 .................................................................................... 17
Figure 8: Final Gantt chart of Work Package 4 .................................................................................... 17
Figure 9: Application Life Cycle [own source based on [1]].................................................................. 19
Figure 10: Application Life Cycle [own source based on [1]]................................................................ 20
Figure 11: SecDevOps cyclic process [2]............................................................................................. 20
Figure 12: Virtual Machines vs Containers [own source] ..................................................................... 23
Figure 13: Top 10 list 2017 and 2021 from OWASP[own source based on [6]] ................................... 25
Figure 14: First look to the ToDo App .................................................................................................. 40
Figure 15: GitHub Security options ...................................................................................................... 41
Figure 16: Security policy of the repository .......................................................................................... 41
Figure 17: GitHub Dependency Graph part 1/2.................................................................................... 42
Figure 18: GitHub Dependency Graph part 2/2.................................................................................... 42
Figure 19: Dependabot alert for jinja2 version ..................................................................................... 42
Figure 20: GitHub result for Code Scanning ........................................................................................ 43
Figure 21: GitHub workflows view ........................................................................................................ 43
Figure 22: Semgrep token on Repository Secrets ............................................................................... 44
Figure 23: GitHub workflows with Semgrep ......................................................................................... 44
Figure 24: Logs of Semgrep workflow .................................................................................................. 44
Figure 25: Semgrep rules ..................................................................................................................... 44
Figure 26: Semgrep Dashboard ........................................................................................................... 45
Figure 27: Semgrep alerts .................................................................................................................... 45
Figure 28: Semgrep alert 1................................................................................................................... 45
Figure 29: Semgrep alert 2................................................................................................................... 46
Figure 30: Semgrep alert 3................................................................................................................... 46
Figure 31: Semgrep alert 4................................................................................................................... 46
Figure 32: Created SonarQube app on GitHub ................................................................................... 47
Figure 33: Details of SonarQube app on GitHub.................................................................................. 47
Figure 34: Secrets added in the repository .......................................................................................... 48
Figure 35: Configuration steps for SonarQube..................................................................................... 48
Figure 36: SonarQube dashboard ........................................................................................................ 48
Figure 37: SonarQube alert 1: where is the risk? ................................................................................. 49
Figure 38: SonarQube alert 1: what is the risk? ................................................................................... 49
Figure 39: SonarQube alert 1: Assess the risk..................................................................................... 50
Figure 40: SonarQube alert 1: how can you fix it? ............................................................................... 50
Figure 41: SonarQube alert 2: where is the risk? ................................................................................. 51
Figure 42: SonarQube alert 2: what is the risk? ................................................................................... 51
Figure 43: SonarQube Code Smell alerts ............................................................................................ 52
Figure 44: SonarQube alert 3: where is the issue? .............................................................................. 52
Figure 45: SonarQube alert 3: why is this an issue?.............................................................. 52
Figure 46: SonarQube dashboard after fixing issues ........................................................................... 53
Figure 47: Bridgecrew alert dashboard ................................................................................................ 53
Figure 48: Bridgecrew supply chain graph ........................................................................................... 54
Figure 49: Bridgecrew alert 1 ............................................................................................................... 54
Figure 50: Bridgecrew alert 2 ............................................................................................................... 54
Figure 51: Bridgecrew alert 3 ............................................................................................................... 55
Figure 52: Check for docker user ......................................................................................................... 55
Figure 53: Bridgecrew alert 3 ............................................................................................................... 56
Figure 54: Docker inspect output ......................................................................................................... 57
Figure 55: Secrets exposed with Docker inspect ................................................................................. 58
Figure 56: Docker secret creation ........................................................................................................ 58
Figure 57: Docker secret verification .................................................................................................... 58
Figure 58: Dive dashboard ................................................................................................................... 60
Figure 59: First 2 Docker images analysed .......................................................................................... 61
Figure 60: GitGuardian shield results of image 1 ................................................................................. 61
Figure 61: GitGuardian shield results of image 2 ................................................................................. 62
Figure 62: Last 3 Docker images analysed .......................................................................................... 62
Figure 63: GitGuardian shield results of image 3 ................................................................................. 62
Figure 64: GitGuardian shield results of image 4 ................................................................................. 62
Figure 65: GitGuardian shield results of image 5 part 1 ....................................................................... 62
Figure 66: GitGuardian shield results of image 5 part 2 ....................................................................... 63
Figure 67: Issue opened by Renovate with all the possible updates ................................................... 64
Figure 68: Grype output of database ToDo App .................................................................................. 66
Figure 69: Grype output of webservice ToDo App ............................................................................... 66
Figure 70: Trivy output of database ToDo App .................................................................................... 67
Figure 71: Trivy output of webservice ToDo App ................................................................................. 68
Figure 72: Trivy output of webservice ToDo App after Python update ................................................. 69
Figure 73: Grype output of webservice ToDo App after Python update ............................................... 69
Figure 74: Trivy output of webservice ToDo App after Python 2nd update .......................................... 69
Figure 75: Grafana dashboard ............................................................................................................. 73
Figure 76: Grafana dashboard, upper left part ..................................................................................... 74
Figure 77: Grafana dashboard, upper right part ................................................................................... 74
Figure 78: Grafana dashboard, medium left part ................................................................................. 74
Figure 79: Grafana dashboard, medium right part ............................................................................... 75
Figure 80: Grafana dashboard, bottom left part ................................................................................... 75
Figure 81: Grafana dashboard, bottom right part ................................................................................. 75
Figure 82: Keycloak admin login .......................................................................................................... 77
Figure 83: Keycloak dashboard............................................................................................................ 77
Figure 84: Keycloak user test creation ................................................................................................. 77
Figure 85: Keycloak oauth2-proxy client creation ................................................................................ 78
Figure 86: Keycloak oauth2-proxy client credentials ............................................................................ 78
Figure 87: Oauth2-proxy login .............................................................................................................. 79
Figure 88: Keycloak user login ............................................................................................................. 79
Figure 89: ToDo App after being authenticated ................................................................................... 79
Figure 90: [Link] ......................................................................................................................... 80
Figure 91: ToDo app with logout button ............................................................................................... 80
Figure 92: End of logout process ......................................................................................................... 80
Figure 93: Logout from Keycloak ......................................................................................................... 80
Figure 94: Keycloak 2FA configuration ................................................................................................ 81
Figure 95: First time 2FA Keycloak user .............................................................................................. 81
Figure 96: Keycloak user login with 2FA .............................................................................................. 82
Figure 97: Shadow Daemon dashboard............................................................................................... 83
Figure 98: Shadow Daemon profile for the ToDo APP part 1 ............................................................. 84
Figure 99: Shadow Daemon profile for the ToDo APP part 2 ............................................. 84
Figure 100: ToDo APP requests captured by the WAF........................................................................ 87
Figure 101: GitHub Issue about json bug in Python Connector ........................................................... 88
Figure 102: Keycloak password policies .............................................................................................. 89
Figure 103: Keycloak credential database ........................................................................................... 89
Figure 104: Keycloak user_entity database ......................................................................................... 90
Figure 105: Hashcat output for password cracking .............................................................................. 91
Figure 106: Cracked hashes with the clear passwords ........................................................................ 91
Figure 107: Initial infrastructure ............................................................................................................ 92
Figure 108: Final infrastructure ............................................................................................................ 93
Figure 109: Final infrastructure ............................................................................................................ 97
List of tables
Table 1: Review history and approval record ......................................................................................... 5
Table 2: Work Packages and Internal Tasks........................................................................................ 15
1 Introduction
The work is mainly divided into two large blocks. The first block, the State of the Art, puts the
existing technologies into context: it studies the methodologies that IT teams around the world
are using and the elements that should exist in a SecDevOps pipeline, explaining the best
practices that must be followed to ensure a high level of security throughout the process. The
second block consists of implementing the concepts learned in the first block in order to
develop a secure infrastructure around the application.
The main objectives of the project are:
● Analyse the DevOps and SecDevOps methodologies and the Site Reliability
Engineering (SRE) role.
● Analyse the best security practices for development teams.
● Study the tools to add to a SecDevOps pipeline.
● Implement a SecDevOps pipeline.
● Secure an application with the pipeline implemented.
● Semgrep: an open source code scanner characterized by its scanning speed.
● SonarQube: a static analysis tool compatible with 29 programming languages that can
be integrated with CI/CD tools such as GitHub Actions, Azure DevOps or Jenkins.
● Bridgecrew: a platform that brings together a set of tools, among which are static code
scanners such as Checkov.
● Dive: an open source tool that shows all the layers of a scanned container image,
indicating the files contained in each layer.
● GitGuardian Shield: an open source program that detects secret patterns both in
repositories and directly in Docker images.
● Renovate: an open source bot that analyses the dependencies of a repository in
search of possible updates.
● Grype: a container scanner that performs a deep inspection of images and reports all
identified vulnerabilities.
● Trivy: a container scanner that analyses operating system packages and application
dependencies.
● Grafana: an open source application for visualizing metrics from large amounts of
data through its dashboard interface.
● InfluxDB: an open source database management system, developed in Go.
● Telegraf: an open source server agent that collects metrics such as RAM usage or
CPU load.
● Grafana Loki: an open source log aggregation system that centralizes the logs
received from an exporter.
● Promtail: an open source agent that collects the content of log files.
● Keycloak: open source software that manages users, credentials and permissions,
among other features, and is compatible with SSO.
● OAuth2 Proxy: a reverse proxy that provides authentication through external
providers such as Keycloak, Google or GitHub.
● Shadow Daemon: a hybrid, open source WAF that detects, records and blocks attacks
on web applications.
● Hashcat: an open source password recovery program.
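To illustrate how scanners like these plug into a pipeline, the following is a minimal sketch of a GitHub Actions workflow that runs Semgrep on every push. The workflow file path and the secret name SEMGREP_APP_TOKEN are assumptions for this example; a real setup should follow Semgrep's own CI documentation.

```yaml
# .github/workflows/semgrep.yml (hypothetical path)
name: Semgrep
on: [push, pull_request]

jobs:
  semgrep:
    runs-on: ubuntu-latest
    container:
      image: returntocorp/semgrep   # official Semgrep container image
    steps:
      - uses: actions/checkout@v3
      - run: semgrep ci             # scans the repository and reports findings
        env:
          # token stored as a repository secret (assumed name)
          SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}
```

Running the scan inside the vendor's container keeps the tool version pinned by the image rather than by the runner, which is one of the container best practices discussed later.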
IT1.3: Microservices
IT1.4: Containers
IT1.5: Microsegmentation
IT1.6: OWASP
IT1.7: CI/CD
IT1.8: Containers: Best Practices
IT1.9: GitHub Security
IT2.1: Vault
IT2.2: Research Code Scanning
IT2.3: CodeQL
IT2.4: Semgrep
IT2.5: SonarQube
IT2.6: Research Secret management
IT2.7: Docker Swarm and .env file
IT2.8: GitGuardian
IT2.9: Self update Dependencies
IT2.10: Research Container Scanning
IT2.11: Trivy
IT2.12: Grype
Final date: 08/08/2022
Figure 3: Planned Gantt chart of Work Package 3
Figure 7: Final Gantt chart of Work Package 3
1. WP1 finished 3 days late with respect to the scheduled final date, due to the extra days
needed to finish IT1.8: understanding and testing the best practices for Docker took longer
than expected. This delay shifted all subsequent dates by 3 days.
2. WP2 was delayed by 6 days from the expected completion date, 3 days due to the
accumulated delay of WP1 and 3 more days due to changes in the expected duration of the
IT, such as:
● IT2.1 was to study and integrate the secrets management software Vault; however,
after research and testing, the idea was dropped because it did not integrate well with
Docker Compose. Even so, more time was spent on it than expected.
● IT2.3 and IT2.6 also took longer than expected.
● An unplanned IT, IT2.13, was added because scheduled code scanners had not
returned the expected results for the [Link] files.
● On the other hand, IT2.10 could be done in less time than expected. In addition, some
tasks of this WP could be parallelized, reducing the total time of WP2.
The accumulated delay meant a change of 6 days in all subsequent dates.
3. WP3 had a delay of 8 days from the expected completion date, 6 days due to the
accumulated delay of WP1-WP2 and 2 more days due to the problems found in IT3.5 with the
implementation of Docker and the addition of the logout button. However, IT3.4 was able to
be completed faster than expected.
The accumulated delay meant a change of 8 days in all subsequent dates.
4. WP4 had a delay of 2 days from the expected completion date: although there were 8 days
of accumulated delay, it could be carried out in parallel with WP3. So, despite IT4.2 taking
longer than expected due to WAF issues, the overall result was a reduction of the expected
time.
The modifications that were made in the planning of the tasks were expected, given the lack
of knowledge about the difficulty of each task. It has not been necessary to modify the
objective of any task, although IT2.13 had to be added and, in the end, IT2.1 was discarded
and not implemented.
2 State of the art
1 References in this section [66], [67]
2 References in this section [68]–[70]
In the same way as with the DevOps methodology, an SRE advocates for automation and
monitoring, in order to reduce the time from when a developer applies a code change until it
is finally applied in production.
2.2 SecDevOps3
SecDevOps is a methodology that seeks to integrate security throughout the life cycle of an
application, while maintaining the advantages of the DevOps methodology. With the above
in mind, let's imagine that a company has applied the DevOps methodology and is
developing an application. This development consists of different stages, which can be seen
in figure 9.
● Plan: the stage in which the requirements of the application are specified.
● Code: the stage in which the code is developed.
● Test: the stage in which the tests are carried out.
● Package: the stage in which the app is packaged.
● Release: the stage in which the new version is published.
● Deploy: the stage in which the application is deployed or installed in one or more
environments prepared for testing; security checks should be passed here.
This is where the problems arise. Very likely, the security team will encounter problems that
force a return to the "Code" stage, with the consequent delays in completion times. In
addition, once the "Deploy" stage is reached again, the security team must analyse the
application again, to ensure that the changes have been applied correctly. The SecDevOps
methodology arises to solve this inefficient process.
The change in the life cycle of an application following the SecDevOps methodology consists
of applying highly precise security controls, in each “shift left”, that is, in each stage change.
These security controls in each “shift left” can be seen in figure 10.
3 References in this section [1]
Figure 10: Application Life Cycle [own source based on [1]]
1. The architecture of the application is analysed, such as the authentication process, the
connection with external servers, the encryption, etc. "Tickets/stories" specifically
dedicated to security are generated.
2. The security team trains the development team in security tools such as static code
analysis and, above all, in typical attacks on the type of application being developed
(for example, DDoS or SQL injection). Automatic bad-practice detection tools can also
be applied, such as Git-Secrets, which checks that a git commit does not contain
critical data such as passwords or tokens.
3. Automatic tests are run that seek vulnerabilities putting the application's security at
risk.
4. Analysis of the packaged application. For example, vulnerability scanning of external
libraries, or in the case of using Docker, security analysis of the images used.
5. On many systems, the analysis from the previous point is executed at this point.
6. A dynamic analysis of the application is performed.
7. Once the application is already deployed, the security tests carried out are Red Team
exercises.
8. Log analysis to detect possible attacks.
In SecDevOps, the use of the following vulnerability analysis models is common: [4], [5]
● SAST: Static Application Security Testing, analysis of the source code, which can be
carried out during the development stage, to notify the programmer that a vulnerability
has been detected so that it can be solved before changing stage. Since in this type of
analysis the program has access to the code, it is called "White Box Testing".
● DAST: Dynamic Application Security Testing, application monitoring at runtime in
pre-production environments. This vulnerability analysis differs from SAST because it
does not have access to the source code, which is why it is called "Black Box Testing".
● IAST: Interactive Application Security Testing, application monitoring at runtime in
production environments. This analysis is executed in the same way as the Black Box,
but with knowledge of the source code, which is why it is called “Grey Box Testing”.
2.3 Microservices: the new architecture 4
Microservices is an architectural approach based on dividing what could previously be a single
application into as many parts as possible (each of these parts is called a microservice). The
advantage of making this division is to obtain independent services that work through
well-defined APIs, so that they can be reused in other circumstances. In addition, being
much smaller blocks, they are simpler, more maintainable, faster to develop and, above all,
very scalable.
Scalability is key in today's application development, especially in the cloud. With the
monolithic architecture (pre-microservices), if a particular process in an application
experienced a spike in demand, the entire application had to be scaled. With this new
architecture, scaling just the affected microservice is enough. In addition, the reliability of
the system improves (a key characteristic for the DevOps and SRE methodologies), since a
basic factor in reducing the impact of an error in a process is the reduction of dependent
processes in the application.
The main advantages of applying this new architecture are:
● Simplicity: the simplicity of the code allows testing new ideas easily, achieving a good
implementation of Continuous Integration and Continuous Delivery (CI/CD).
● Reusability: each service becomes a block with a very specific task. This block can be
used in different applications to solve the same problem. Thanks to this feature,
innovation is increased, since the code does not have to be written from scratch, and
it also improves security, since it prioritizes general implementations over local
implementations. It is much more difficult to update security policies and keep code
free of vulnerabilities in the local code of an application, compared to general code,
shared and used by most applications.
● Scalability: each service can be scaled independently.
● Reliability: reduces the impact of errors by reducing the number of dependent
processes.
● Agility: work is divided into smaller teams responsible for each of the microservices.
Each team has a much better understanding of how their microservices work, and
therefore development is much faster.
2.3.1 Containers 5
Containers are an ideal tool for developing applications with microservices architecture.
To understand what containers are and why they are replacing the use of virtual machines,
we must first understand what a virtual machine is. A virtual machine, in general terms, is
software capable of emulating a real computer. This software allows operating systems to be
installed on top of another operating system, or a system to be isolated from the physical
machine or from other virtual machines. To achieve this, the virtual machine emulates virtual
hardware on top of the machine's operating system, in order to run another kernel and
another operating system on top of this emulation. Each virtual machine has its own
operating system and does not share even the libraries with the other virtual machines or
with the host operating system. Obviously, this process turns out to be very inefficient.
4 References in this section [71]
5 References in this section [72]
In this scenario containers appear: isolated software units that are nevertheless capable of
sharing libraries with other containers and with the operating system. Containers do not need
to run a complete operating system; they simply share the kernel, together with essential
parts of the operating system, and run the chosen application with its dependencies. This
makes containers a compact and faster solution.
In figure 12, the differences between the architecture of virtual machines and of containers
can be seen. As reflected, the second case is much more compact and therefore much more
efficient with resources.
5. Smaller attack surface: a best practice in the use of containers is to use only the
necessary libraries for the service that is going to be executed in that container. To this
end, it is common to use images such as Alpine or Busybox, very minimalist images,
which usually weigh around 5 MB. By having very few libraries, it is more difficult for
vulnerabilities to appear, and if they do appear, they are fixed quickly.
6. Portability: containers are self-contained and can be defined, for example, within a
YAML file, ensuring that the same container can be executed on any other machine.
7. Isolation: although a server can have dozens of containers running, if not configured
otherwise, they are completely isolated from each other. Therefore, reliability is
guaranteed (if a container goes down, the others can continue), as well as security (a
compromised container should not pose a security problem for other containers, as
long as best practices and micro-segmentation are applied, as explained later).
8. Improved developer productivity: containers allow developers to have predictable,
defined and isolated environments where they can run their code, knowing that they
will get the same results as on any other machine running that same container. This
solves the famous "it works on my machine" problem. The time developers spend
debugging errors and bugs is reduced, so more time can be spent optimizing and
improving features.
In this work, Docker will be used. Docker is open source software to automate the creation of
containers. To manage the containers, the Docker Compose orchestrator will be used.
2.3.2 Microsegmentation 6
An application with a microservices architecture also has its complexities. Faced with a large
set of containers that must provide a service for an application, connectivity between the
different services is very important. Consequently, the application may be susceptible to
lateral movement within the container cluster, which means that an attacker could gain
illegitimate access to containers through a vulnerable container.
Micro-segmentation is a solution, or at least a mitigation, to this problem. It consists of
implementing granular control of access to the network. With it, network administrators
define security policies that limit traffic based on the principles of zero trust and least
privilege. In this way, the aim is to minimize the attack surface, reduce damage and speed up
recovery.
Generalizing, the great advantage of micro-segmentation is that it forces network
administrators to know the network and the dependencies between services very well. Each
service should only be able to do what it really needs to work. This allows having a dependency
scheme and recovery plans against attacks. Resilience is the ability to quickly recover from
an attack or failure. To create resilient microservices, it is necessary to understand which key
services we want to protect against an attack, and which of them contain essential data or
are critical, since many others depend on them. In this way, strategies can be generated to
protect the network against an attack and reduce the consequences.
An example of a micro-segmentation implementation, taking advantage of the disposability of
containers, is to delete a container and create a new instance of it if it is detected that the
container has violated the established network security rules.
6 References in this section [73], [74]
2.4 Best Practices, methodologies and new technologies
2.4.1 OWASP7
The Open Web Application Security Project (OWASP) is an open source project that aims to
find the main causes that make web applications insecure and combat them.
The OWASP Foundation is made up of businesses, educational organizations, and individuals
from around the world, and supports and manages OWASP projects. The results of this
foundation are articles, methodologies, documentation, and tools that are published for free to
improve Internet security.
One of the most recognized OWASP documents is the OWASP Top 10, which lists the ten
most important security risks in web applications. It is updated every three to four years. This
document aims to raise awareness of the main problems that affect the security of web
applications, so that developers can take measures to solve them.
Figure 13 shows the Top 10 of the 2017 document and the 2021 update.
Figure 13: Top 10 list 2017 and 2021 from OWASP[own source based on [6]]
In order to develop secure web applications, it is essential to know and understand the 10
most relevant risks related to the security of web applications.
2.4.2 CI/CD8
Continuous Integration (CI) is a development practice where developers integrate code into
shared repositories multiple times a day. In addition, these code deliveries (pushes) include
automatisms to test the quality of the code, looking for compilation issues, security concerns,
etc.
Continuous Delivery (CD) is a development practice that consists of periodically introducing
changes to the code in production. These code deliveries must be secure, but above all, they
must be deployed quickly, so that the code in production is always up to date.
7 References in this section [75]
8 References in this section [76]
This methodology has been very successful thanks to the advantages it brings, such as the
ability to develop continuously with user feedback, the reduction of errors when bringing
applications to production, or the better collaboration between development and operations
teams.
Following a SecDevOps methodology, security can and should be applied to the CI/CD
cycle. The continuous application of tests, searches for vulnerabilities in external libraries,
code scanners, container scanners and scanners for exposed secrets, among many other
checks, improves the security of applications, while reducing development time by detecting
errors in early stages.
This integration saves the developer work, since the checks are executed automatically. In
addition, it ensures that the developer receives almost instantaneous alerts produced by the
changes that he/she has just made in the code, which allows them to be corrected much
more easily. The automation of the deployment of applications to production also eliminates
human error in this step, removing a point of failure.
The security measures applied must be adapted to each environment. In a development
environment, for example, it is acceptable to enable debug options to obtain the maximum
information from the application, whereas when the application is to be moved to production,
the security automations should not allow it if debug options are still enabled.
9 References in this section [77]–[80]
Update image version
Although it is recommended to pin the version of the base image, when new versions appear,
they should be tested and updated, to add the latest security patches.
Small-footprint
It is recommended to reduce the size of the containers as much as possible. It should be
checked whether there are images with the alpine or slim flavour, which are much smaller
than usual. In addition, when new packages are installed, flags such as
--no-install-recommends can be applied, which prevents extra packages from being installed
with apt, as well as commands such as "rm -rf /var/lib/apt/lists/*" to remove the package lists.
The following command can help identify the layers that occupy the most space in each
image, making it possible to reduce them.
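The command itself is missing from this copy of the document. A likely candidate (an assumption, since the original is not reproduced) is docker history, which lists every layer of an image together with its size:

```
docker history python:3.9-slim
```

Tools such as dive offer a more interactive view of the same layer information.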
Multi-stage Builds
Linked to the previous point, the use of multi-stage is recommended to achieve containers
without unnecessary elements and as small as possible.
# Builder stage (assumed; not shown in this copy of the document)
FROM python:3.9-slim AS builder
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

FROM python:3.9-slim
WORKDIR /notebooks
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir /wheels/*
Non-root user
By default, Docker containers run as the root user. If the application running inside the
container is compromised, the attacker can gain control over the root user, which would be a
serious security problem.
It is recommended to create a user:
Or in Alpine images:
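The commands themselves are missing from this copy of the document; typical forms (assumed, with a hypothetical user name appuser) are the following:

```
# Debian/Ubuntu-based images:
RUN useradd --create-home appuser
# Alpine images (adduser is provided by BusyBox):
# RUN adduser -D appuser
USER appuser
```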
No new privileges
Docker makes use of Linux capabilities to reduce root user permissions to only those it
really needs. Even so, whether the container is using a root user or a non-root user, if there is
a compromised application, an attacker could escalate privileges, so the use of the
no-new-privileges flag is recommended, to deny any request for new privileges while the
container is running. [7], [8]
Read Only
Whenever possible, it is recommended to take advantage of the fact that Docker allows the
container's filesystem to be limited to read-only permissions, through the read_only flag. This
practice can further reduce the options an attacker would have after gaining access inside
the container [7].
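As a sketch, this recommendation can be combined with the no-new-privileges flag in a Compose service definition (service and image names are hypothetical):

```yaml
services:
  web:
    image: myapp:latest
    read_only: true              # root filesystem mounted read-only
    security_opt:
      - no-new-privileges:true   # deny privilege escalation requests
    tmpfs:
      - /tmp                     # writable scratch space, kept in RAM
```

A tmpfs mount is commonly added so the application still has somewhere to write temporary files.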
Healthcheck
The use of the healthcheck instruction is recommended to detect a malfunction of the running
container. With this instruction, Docker periodically checks the correct functioning of the
container and in case of failure, it can discard the container and create another one.
healthcheck:
  test: curl --fail [Link] || exit 1
  interval: 10s
  timeout: 10s
  start_period: 10s
  retries: 3
Options:
● test: the verification command.
● interval: time between consecutive checks.
● timeout: maximum time to wait for the response.
● start_period: how long to wait before starting the healthchecks.
● retries: maximum number of retries before marking the container as unhealthy.
COPY instead of ADD
The use of COPY instead of ADD is recommended, unless the second instruction is
specifically required. Although they look similar, COPY copies local files or directories from
the Docker host to the image, while ADD not only does the same, but also decompresses
compressed files and can download external files.
Only One Process Per Container
To take advantage of the main benefits of containers, such as scalability or reusability, only
one process should be executed in each container.
Don't Store Secrets in Images
Secrets in Docker images remain accessible to specialized tools, even if they are deleted
during the image build process. In section 2.4.6 the dangers of this bad practice and the
solutions are explained in more detail.
10 References in this section [81]
time it takes to bring an application to production. They are not a magic solution, manual
reviews are still necessary, but they add another layer of security to the development process.
[Link] CodeQL11
GitHub integrates a code scanner called CodeQL into its own platform. Once configured, it
scans every push for security vulnerabilities and alerts the developer before the code goes
live. Using GitHub Actions, a CodeQL scan can be applied with any desired trigger. CodeQL
is a semantic code analyser that treats code like a database, modelling bugs and
vulnerabilities as queries in order to detect erroneous patterns. It is capable of increasing the
number of vulnerabilities found thanks to the application of machine learning when making
these queries.
Although it is not open source, it has been used because it is free for open source projects,
and it integrates perfectly with GitHub, which is the most used code repository platform. For
this project the default configuration has been used (which includes more than 2000 queries
written by GitHub Security Lab), but there is the possibility to customize the queries to focus
specifically on the main concerns of each project.
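As an illustration, a minimal GitHub Actions workflow for CodeQL might look like the following (a sketch based on the publicly documented codeql-action; the action versions and language are placeholders):

```yaml
name: codeql
on: [push, pull_request]
jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write      # needed to upload the scan results
    steps:
      - uses: actions/checkout@v3
      - uses: github/codeql-action/init@v2
        with:
          languages: python       # the language of this project
      - uses: github/codeql-action/analyze@v2
```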
GitHub also provides a code scan to detect secrets, tokens, and other sensitive data, which
should never be included in the code for security reasons, but which is often included during
the testing phase.
[Link] Semgrep12
CodeQL is not the only code scanner out there. Semgrep is a static analysis tool for code,
characterized by its scanning speed and its open source nature. It contains more than 2000
default scan rules, and it is possible to add custom rules for each project. To create new rules,
it has a graphical interface (Semgrep Playground) where the code can be added and the rule
written, to check its correct behaviour and adapt it so that it specifically detects the required
patterns. Semgrep can analyse more than 20 programming languages, among which is
Python, the one used in this work. It is also capable of analysing Dockerfile and
docker-compose files, although this analysis is still in an experimental phase.
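To illustrate the rule syntax, a minimal custom rule might look like this (a sketch; the rule id and message are invented for the example):

```yaml
rules:
  - id: avoid-eval
    languages: [python]
    severity: ERROR
    message: eval() on untrusted input can lead to code injection
    pattern: eval(...)
```

The pattern syntax mirrors the target language, which is what makes writing Semgrep rules accessible to developers.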
Semgrep can be used locally, from the terminal, although for this work the goal is to take
advantage of the integration with GitHub Actions and use its web platform with its dashboard.
There, all the security alerts from all the linked repositories can be seen, and it provides very
useful statistics for SecDevOps teams. It allows viewing metrics such as the rate of alerts
fixed by developers or the rate of ignored alerts. Detected vulnerability alerts are classified
into four impact categories: Low, Medium, High and Critical. Those with the lowest impact are
information or warning alerts, while the highest are errors or serious security vulnerabilities.
Differences with CodeQL
CodeQL requires a buildable environment, while Semgrep runs directly on the source code,
and this increases its scanning speed. On the downside, it does not have some advanced
analysis features, such as the interprocedural data flow analysis that CodeQL offers.
Semgrep is open source, while CodeQL is not.
Regarding ease of use, Semgrep rules are similar to the code that the developer uses and
CodeQL rules have a specific language that must be learned.
11 References in this section [81], [82]
12 References in this section [83]
[Link] SonarQube13
SonarQube is a static scanning tool, compatible with 29 programming languages and that
allows integration with the main CI/CD systems, such as GitHub Actions, Azure DevOps or
Jenkins.
Differences with CodeQL and Semgrep14
Semgrep is open source, SonarQube has an open source version (without all the languages,
or the more advanced scanning features), and CodeQL is not open source.
The syntax for writing custom scan rules is much easier with Semgrep, as they are similar to
the code that the developer is using. With CodeQL it is necessary to learn a specific language,
and with SonarQube writing rules is restricted to a few specific languages and requires
knowledge of Java.
Semgrep is faster than the other two tools.
CodeQL can be used for free in the GitHub cloud with GitHub Actions; Semgrep also offers
this option, as well as an online dashboard. SonarQube does not allow this option, and a
prior installation of the scanning server must be done on a local server, on a cloud server
or on SonarQube's own servers with the paid plan.
SonarQube is a more mature software than CodeQL and Semgrep and has differentiating
features that facilitate the work of developers. Features that stand out include the detection of
technical debt and "code smells", to encourage best practices in programming, and metrics
such as duplicate lines of code, as well as the ability to estimate the time required to solve
each alert. Another great advantage of SonarQube is the information that the developer
receives with each alert: the possible risks associated with it, an explanation of how to make
a risk assessment specifically for that project and, finally, a proposed solution.
[Link] Bridgecrew15
Bridgecrew is a software suite that brings together a set of tools to automate the security of
code and its infrastructure. Among its tools, there are static scanners like Checkov. Bridgecrew
is compatible with 29 programming languages, has integration with the main cloud services
and has an open source version, which is the one that will be used. After the analysis, it
performs a periodic scan and sends the results by email.
Among its features, the detection of errors and vulnerabilities stands out, together with the
detection of bad practices in the code and a graphical interface showing all the dependencies
of the applications, which allows an overview of the points of failure of each app.
13 References in this section [84]
14 References in this section [85]
15 References in this section [86]
Viable options to secure the management of secrets will be discussed below.
Environment variable in the [Link]
As explained, this is not a best practice (although it is very common), as it commits sensitive
information to the version control system, making this information visible to anyone with
access to the repository or file. Also, environment variables are not the safest method, as
explained in the third option. With this setup, a container's secrets can be read using the
docker inspect command.
Embedded in Docker images
This option is also a bad practice. Container images must be reusable and secure. Docker
images are made up of layers, so even if secrets are removed during the image creation
process, they can still be accessed using analysis tools, as can be seen in section 3.5.
Environment variables file
By creating a .env file, environment variables can be defined. This file should be ignored in
version control systems, for instance with a .gitignore entry. This prevents users with access
to the repository from reading the secrets.
As an advantage, it is a very simple and functional system. It is compatible with third-party
images that have not applied best practices, since it also uses environment variables. As a
disadvantage, for maximum-security cases this system is not the most secure, since all the
environment variables, and with them the secrets, can be accidentally exposed in a memory
dump or in the logs. With this setup, a container's secrets can be read using the docker
inspect command, which can be used by users with root permissions or with permissions to
use Docker.
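A minimal sketch of this setup (the variable and service names are hypothetical):

```yaml
# .env  (listed in .gitignore, never committed):
#   DB_PASSWORD=s3cret
# docker-compose.yml: the variable is substituted at deploy time
services:
  db:
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
```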
Docker Secrets16
Docker Swarm is a container orchestrator that, among other tools, contains Docker Secrets,
which allows secrets to be managed securely. Docker Secrets encrypts secrets both in transit
and at rest and only makes them accessible to the specified services, via a temporary file at
/run/secrets/<secret_name>. This file is a tmpfs, stored only in RAM, not on disk. With this
configuration, a container's secrets cannot be read with the docker inspect command. In
addition, it is possible to limit the users that can read these secrets (for example, to the root
user), so this user can read the secret to perform a task, like downloading code from a
repository, and then switch to a user without permissions to run the application; this way, if
an attacker manages to break into the container, he cannot read the secret. [9]
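A Compose sketch of this mechanism, assuming a secret previously created with docker secret create (the service and secret names are hypothetical):

```yaml
services:
  app:
    image: myapp:latest
    secrets:
      - db_password      # readable inside at /run/secrets/db_password
secrets:
  db_password:
    external: true       # created beforehand on a Swarm manager
```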
Docker Swarm contains two types of nodes: the managers, which manage the state of the
cluster, and the workers, which execute the tasks. Docker Swarm uses the Raft distributed
consensus algorithm for communication between the different nodes and for saving the
system state. The Raft logs are encrypted, since they contain, among other variables, the
secrets of Docker Secrets. To increase security, Swarm can be locked, to prevent leaks in
case a manager node is compromised. When Swarm is locked, a key is generated that must
be saved and manually supplied when the Docker daemon is restarted (or to decrypt the logs).
16 References in this section [87]
To lock Swarm, it is only necessary to use the following command: [10]
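The command is not reproduced in this copy of the document; the documented way to enable autolock on an existing Swarm is:

```
docker swarm update --autolock=true
```

Running it prints the unlock key, which must be stored safely and supplied after each Docker daemon restart.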
However, Docker Swarm is complex to use and does not support some Docker features, such
as the creation of certain internal networks. For this reason, in this work Docker Swarm will
be used to check its behaviour, but in the final version an .env file will be used to manage the
secrets.
17 References in this section [64], [88], [89]
Grype supports the following image operating systems: Alpine, Amazon Linux, BusyBox,
CentOS, Debian, Distroless, Oracle Linux, Red Hat (RHEL), and Ubuntu.
It supports the following programming languages: Ruby, Java, JavaScript, Python, Dotnet,
Golang, PHP, and Rust.
Trivy of Aqua Security:
Trivy is a less-known tool than others in container scanning, but it has been chosen because
it gave the best results in the comparisons found. Trivy analyses operating system packages
and application dependencies. Among its features, it stands out for being simple to install,
compatible with CI/CD and highly precise on images based on Alpine and CentOS. Trivy is
also capable of detecting misconfigurations in different systems, such as Terraform or
Kubernetes, in addition to detecting exposed secrets [14], [15].
18 References in this section [90]–[92]
InfluxDB: is an open source database management system, developed in Go. InfluxDB can
be paired with Telegraf, which is why it has been chosen for this work. [17]
Telegraf: is an open source server agent that collects metrics such as RAM usage or CPU
load. It allows this data to be grouped and sent, usually to a database, such as InfluxDB. It
has a large number of plugins to increase its functionality. [18]
Grafana Loki: is an open source log aggregation system, to centralize the logs received by an
exporter, such as Promtail. In this work, it serves as a data source for Grafana, to be able to
centralize and visualize the logs in the Grafana dashboard, along with the metrics. [19]
Promtail: is an open source server agent that collects the content of the logs and allows this
data to be sent to Grafana Loki. [20]
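As a sketch, the pipeline described above can be wired together with Compose (image tags, ports and the absence of volumes or configuration files are simplifying assumptions):

```yaml
services:
  influxdb:
    image: influxdb:1.8
  telegraf:
    image: telegraf:latest            # pushes host metrics to InfluxDB
    depends_on: [influxdb]
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"                   # dashboard UI
  loki:
    image: grafana/loki:latest
  promtail:
    image: grafana/promtail:latest    # ships container logs to Loki
```

In a real deployment, each agent would also mount a configuration file and Grafana would register InfluxDB and Loki as data sources.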
2.4.10 LOGLEVEL
Logs are an essential part of programming; however, bad practices in their use are common.
Although there are many aspects related to the good implementation of logs, this section will
focus on their security: on how to ensure that the logs do not leak sensitive information in a
production environment.
Use log libraries:
Using printf or similar commands to display information in the logs is a bad practice. There
are professional logging libraries with all the tools necessary to improve security, as can be
seen in the next paragraph.
Set levels in each log:
With professional libraries, it is possible to indicate the level of each log. This practice is
essential in order to limit the logs depending on the environment. In a debug environment, a
large number of logs should be visible, while a production environment should only show
logs with priority equal to warning/notice or higher.
The levels that normally exist are the following, although each library uses its own variant:
● TRACE: for trace logs. These should only be used in a development environment, to
track down errors.
● DEBUG: to record what is happening inside the application. These should also only be
used in a development environment, for troubleshooting.
● INFO: to record user actions or application states. They are purely informative and can
be ignored.
● NOTICE: the default level when the program runs in production. Logs at this level
record notable events that are not considered errors.
● WARNING: for errors or unexpected events that could lead to failures but still allow
the application to keep working.
● ERROR: for errors that make certain functions unavailable.
● FATAL/CRITICAL: for events that crash a crucial service, or the whole application, so
that it can no longer fulfil its purpose.
Select the log level:
Logging libraries allow the log level to be set through different mechanisms, such as
environment variables. For example, in a development environment a Debug or Trace level
would be used, while in a production environment a Notice or Warning level would be used.
This way, the developer automatically sees all the logs they need, and when the same code
is executed in production, only the necessary logs are emitted.
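This practice can be sketched with Python's standard logging library (the variable name LOG_LEVEL and the WARNING default below are illustrative choices, not a fixed convention):

```python
import logging
import os

# Read the desired level from an environment variable; default to WARNING,
# a reasonable production setting (the variable name is an assumption).
level_name = os.environ.get("LOG_LEVEL", "WARNING").upper()
logging.basicConfig(
    level=getattr(logging, level_name, logging.WARNING),
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

log = logging.getLogger("todo-app")
log.debug("Only visible when LOG_LEVEL=DEBUG")   # development detail
log.warning("Visible in production as well")     # notable problem
```

Running the same code with LOG_LEVEL=DEBUG during development, and with no variable at all in production, yields exactly the filtering described above.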
Do not log sensitive information:
It is essential to ensure that an application does not log sensitive information: obviously
passwords, but also session identifiers, authorization tokens and any PII (Personally
Identifiable Information). [21], [22]
References in this section: [93]
OIDC (OpenID Connect):
OpenID Connect is an identity layer on top of the OAuth 2.0 protocol that allows users to
verify their identity through an external server. It is much easier to configure than its main
alternative (SAML), and errors are easier to trace to their origin. Thanks to OAuth, the
system can also grant or deny the user access to certain resources.
There are two key elements in the communication in an SSO system:
OpenID Provider (OP): the party that ultimately performs user authentication. It has access
to the user database and confirms that users are who they claim to be. The user is
redirected from the Relying Party (RP) to the OP to log in.
In this work, the open source program Keycloak is proposed as the OP. This program
allows the management of users, credentials and permissions, among other features, and is
compatible with SSO. By default, it supports SAML v2 and OAuth 2.0 with OpenID Connect
as identity and authorization federation protocols. [24], [25]
Relying Party (RP): an OAuth 2.0 application client or proxy that requires user
authentication and claims from an OpenID Connect provider. The RP is responsible for
redirecting the user to the OpenID Provider (OP) and, once the user has been authenticated
and a successful response received, the RP grants the user access to its service. [25], [26]
In this work, the proposed RP is oauth2-proxy, which will control access to the ToDo
application. OAuth2 Proxy is a reverse proxy that provides authentication through external
providers such as Keycloak, Google or GitHub. [27]
In OAuth 2.0, there are different grant types, which are how an application obtains an access
token. Each grant type is optimized for a particular use case.
● Authorization code: server-side applications
● Implicit: applications running on the user's device
● Resource Owner Password Credentials: trusted applications
● Client Credentials: application API Access
The authorization code grant is the most widely used and is the one implemented in this
project. Its flow is as follows: [28]
1. The user navigates through the application, proxy or website they want to access (RP).
2. The RP redirects the user to the OP, where the user logs in with their credentials.
3. Once the user has logged in, they are redirected back to the RP along with an authorization
code.
4. The RP sends the authorization code to the OP.
5. The OP responds to the RP by sending the access token and the ID token.
6. Finally, if the RP has all the information it needs, the user is given access. Otherwise, it
uses the access token to access the UserInfo Endpoint.
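The RP's side of this flow can be sketched as follows. All endpoint URLs, client identifiers and secrets below are hypothetical placeholders for illustration, not values from this project:

```python
# Sketch of the authorization code grant from the Relying Party's side.
from urllib.parse import urlencode

AUTH_ENDPOINT = "https://op.example.com/auth"    # assumed OP authorize URL
TOKEN_ENDPOINT = "https://op.example.com/token"  # assumed OP token URL

def build_authorization_url(client_id, redirect_uri, state):
    """Step 2: URL the RP redirects the user to, to log in at the OP."""
    params = {
        "response_type": "code",       # selects the authorization code grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # where the OP sends the user back
        "scope": "openid profile",     # 'openid' is mandatory for OIDC
        "state": state,                # CSRF protection token
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"

def build_token_request(client_id, client_secret, code, redirect_uri):
    """Steps 4-5: form body the RP POSTs to exchange the code for tokens."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    }
```

In practice, oauth2-proxy performs these two steps transparently, so the application behind it never handles credentials directly.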
References in this section: [94]
2.4.12 WAF
A Web Application Firewall, or WAF, is software that protects web applications by filtering
and monitoring the HTTP traffic destined for the application.
A WAF protects against the main application layer attacks, such as cross-site request
forgery (CSRF), cross-site scripting (XSS), file inclusion and SQL injection, among others. It
works at layer 7 of the OSI model, so it is only designed to deal with attacks at this layer. A
WAF must be accompanied by other tools that protect against attacks focused on other
layers of the OSI model, such as layer 3.
A WAF's protection is based on applying rules or policies to determine whether traffic is safe
or malicious. These rules can be general, or they can be customized for each application.
WAFs can be categorized in multiple ways. If we categorize them by the blocking system,
there are 3 types of WAF. Blocklist WAFs protect against known attacks. They seek to detect
certain patterns in the requests that indicate that it is a malicious request. On the other hand,
allowlist WAFs are based on a list of allowed options, and only accept traffic that matches their
rules. Both types of WAFs have their advantages as well as their disadvantages, which is why
hybrid WAFs, which combine both blocking systems, are common.
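The two blocking approaches can be illustrated with a toy sketch; the patterns and rules below are deliberately naive examples for illustration, far simpler than real WAF rule sets:

```python
import re

# Blocklist: known attack signatures (naive illustrations only)
BLOCKLIST_PATTERNS = [
    re.compile(r"(?i)<script\b"),            # crude XSS signature
    re.compile(r"(?i)\bunion\s+select\b"),   # crude SQL injection signature
]

# Allowlist: per-parameter rules; only matching input is accepted
ALLOWLIST_RULES = {
    "user_id": re.compile(r"\d{1,10}"),
    "username": re.compile(r"[a-zA-Z0-9_]{3,20}"),
}

def blocklist_check(value: str) -> bool:
    """Blocklist approach: reject input matching a known attack pattern."""
    return not any(p.search(value) for p in BLOCKLIST_PATTERNS)

def allowlist_check(param: str, value: str) -> bool:
    """Allowlist approach: accept only input matching its parameter's rule."""
    rule = ALLOWLIST_RULES.get(param)
    return rule is not None and rule.fullmatch(value) is not None

def hybrid_check(param: str, value: str) -> bool:
    """Hybrid approach: input must pass both systems."""
    return blocklist_check(value) and allowlist_check(param, value)
```

The blocklist misses attacks it has no pattern for, while the allowlist rejects any legitimate input its rules did not anticipate, which is why hybrids combine them.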
Another way to categorize them is according to their implementation. Network-based WAFs,
usually hardware-based, achieve very low latency, although they are very expensive. Host-
based WAFs allow for complete integration within the code of each application. They are less
expensive than the previous ones and allow a higher level of customization, although they are
more complex to implement and maintain. Finally, cloud-based WAFs offer a solution that is
very easy to implement and that is updated automatically, although they involve a monthly or
annual cost.
Differences between an Intrusion Prevention System (IPS), a Web Application Firewall (WAF),
and a Next Generation Firewall (NGFW):
An IPS is generally focused on layers 3 and 4 of the OSI model, although it can also analyse
layer 7. It has a much broader goal, wanting to detect many types of attacks on many
protocols, such as DNS, SMTP, TELNET, RDP, SSH and FTP. The detection system can be
based on signatures, rules or network anomaly detection.
A WAF protects the application layer and is specifically designed to analyse all HTTP and
HTTPS requests. It analyses the requests before they reach the application, preventing the
application from being compromised. It is a great option to defend against most of the
OWASP Top 10, the list of the most common web application vulnerabilities.
An NGFW, unlike the previous two, seeks to secure a local-area network against
unauthorized access to prevent the risk of attacks. In other words, it protects the user from
the application, rather than the application from the user.
Shadow Daemon:
Shadow Daemon is an open source WAF that detects, logs and blocks attacks on web
applications. It is a hybrid WAF, combining a blacklist (blocklist) and a whitelist (allowlist). It
ships with default blacklist rules, and custom rules can be added. It also offers a learning
mode: while this mode is active, it scans incoming requests and, treating them as
non-malicious, automatically creates rules for the whitelist.
References in this section: [95] (WAF), [96] (IPS/WAF/NGFW comparison), [97], [98]
(Shadow Daemon)
Among its characteristics, the following stand out:
● Broad language coverage. It was the only open source WAF found that supports
Python, and specifically Flask. It supports PHP, Perl (CGI, Mojolicious and
Mojolicious::Lite) and Python (CGI, Django, Werkzeug and Flask).
● Accurate detection through its hybrid system of blacklist, whitelist and integrity
verification.
○ Blacklist: search for known attack patterns using regular expressions.
○ Whitelist: search for irregularities in user input, according to rules.
○ Integrity verification: verification of the checksums of the executed scripts.
● It only blocks the dangerous part of malicious requests, to reduce the impact of false
positives.
● It is embedded within the application, to ensure that the scanned data is exactly the
same as the application's input data and prevent obfuscation from the attacker.
● It contains a “Generator”, which, together with the “Learning” mode, allows the program
to automatically create rules based on the example requests.
Some of the attacks that Shadow Daemon protects from are:
● SQL injections
● XML injections
● Code injections
● Command injections
● Cross-site scripting
● Local/remote file inclusions
● Backdoor access
Given the increase in cyberattacks in recent years, and although the application should be
secure by itself, a WAF has been added to this project to obtain an additional layer of
security. [29]
3 Project Development
The [Link] file used to run the first model of the application can be consulted in
Appendix 1. It is a common initial configuration in a first phase of development, although in
many cases, bad practices extend over a long period of the life of applications, leading to
serious security problems in the future. During the course of this work, the evolution of this
configuration can be seen, giving way to the incorporation of best practices and security
measures with which the applications should be developed.
3.2 GitHub Security
For this project, it has been decided to use Git as the version control tool and GitHub as the
code repository. The repository is public ([Link]). In order to obtain a secure life cycle
throughout the development of the application, security measures must be applied from the
beginning. That is why the following GitHub Security options have been enabled, as can be
seen in figure 15.
References in this section: [81]
The dependency graph has been enabled, which allows all the application's dependencies to
be seen quickly. Dependabot has also been enabled; it compares the dependency graph
against the GitHub Advisory Database and not only flags the affected repositories, but also
opens a pull request with a possible fix, making it easier for developers to keep the
application updated and reducing the time spent on these tasks.
Figures 17 and 18 show the dependency graph that has appeared in the project, after
activating this feature.
Figure 19 shows the alert that Dependabot raised after being enabled. Initially, it was
planned to force these alerts artificially, but this was not necessary, since the base
application already triggered them.
Dependabot itself offers the solution to this version problem: updating the jinja2 version in
the [Link] file. The pull request has been accepted, and the vulnerability has been
correctly patched.
3.3 Code Scanning
3.3.1 CodeQL
CodeQL has been configured in the project repository using GitHub Actions, so that CodeQL
scans the code on each push or pull request.
In figures 20 and 21, it can be seen how after the first code scan, a vulnerability has been
detected in the code.
This security alert has a high severity, because it may allow an attacker to execute arbitrary
code through the debugger. It is common to use debugger mode while an application is being
developed, but it is essential to change this mode when the application goes into production.
That is why it is very important to create these automatic analysis systems, to avoid human
errors. Thanks to the alert, the debug mode has been changed to False, and if necessary, it
will be changed again.
3.3.2 Semgrep
CodeQL is not the only static scanning tool available, as explained in section 2.4.5. In order
to build a complete SecDevOps pipeline, a second scanning tool, Semgrep, has been added.
The automatic scanning process has been implemented with GitHub Actions. The code is
executed and analysed on the configured servers (in this case, GitHub's), so Semgrep itself
never obtains the code.
Semgrep has been installed through the GitHub marketplace, so it automatically creates the
GitHub Action configuration file that will be executed in each push. The GitHub Action
configuration code can be found in Appendix 2. [31]
Figure 22: Semgrep token on Repository Secrets
The SEMGREP_APP_TOKEN secret has also been added to the repository configuration, as
can be seen in figure 22. After adding the [Link] file, the two expected static scans are
executed, as shown in figure 23.
Semgrep offers a dashboard for analysing the results. The results are also shown in the
Semgrep GitHub Action logs, as can be seen in figure 24. Moreover, notifications can be
configured via Slack or email.
In the rule board, extra rules have been added on top of the default ones. The Python,
Docker and Secret Detection rule packs have been chosen, as shown in figure 25.
Figures 26 and 27 show the Semgrep dashboard and alerts.
Following the alerts provided by Semgrep, the following code changes have been applied:
Avoid_app_run_with_bad_host (figure 28):
This warning says that running the Flask application with host = [Link] can publicly expose
the server, allowing other network users to access it. Since Docker containers are used to run
the application, this configuration must be kept in order to reach the server from the host
operating system; without this parameter, the server can only be accessed from within the
container itself.
use-frozen-lockfile-pip (figure 29):
This alert, categorized as “information”, explains how to use the “pip install” command to
guarantee reproducible and deterministic builds. However, the proposed flag is not
necessary, since a [Link] file is used, which already pins the version of each
library.
no-new-privileges (figure 30):
This warning says that the no-new-privileges flag is not being used. As explained in section
2.4.3, it is a best security practice to apply this flag.
The [Link] code has been updated with the new flag in both services.
security_opt:
- no-new-privileges:true
This second warning says that the read_only flag is not being used. As explained in section
2.4.3, it is a best security practice to apply this flag whenever possible. The database
container obviously needs write permissions, but the webservice container does not.
The [Link] code has been updated with the new flag in the demo_webservice
service.
read_only: true
There are two alerts about the root user, which have been left for later, in the Bridgecrew
section.
3.3.3 SonarQube
In addition to CodeQL and Semgrep, in order to go one step further in building a complete
SecDevOps pipeline, a third tool has been added that complements the scanning of the two
previous ones.
Unlike Semgrep and CodeQL, SonarQube does not have an automatic installation process;
the files have been configured manually.
To use SonarQube in its open source version, it must run locally or on one's own server.
The desired result is that each push triggers the code analysis on GitHub's servers.
However, the only solution SonarQube offers in the free version is to send the source code
from GitHub to one's own server with SonarQube already installed. Such a server, which
would need a public domain, is not available for this project. Even so, in order to have an
ecosystem as realistic as possible, it has been decided to configure this non-viable option
(without being able to test it) and also to implement a local solution that does not depend on
GitHub Actions.
In order to integrate SonarQube with GitHub, a GitHub app has been created for
authentication, as can be seen in figures 32 and 33. It has been configured following the steps
indicated in the official documentation. [32], [33]
The workflow has also been configured in GitHub Actions and the SONAR_TOKEN has been
added in GitHub secrets, as it is shown in figure 34. The GitHub Actions configuration code
can be found in Appendix 3.
As explained, this configuration would be ideal, but for resource reasons it cannot be carried
out; therefore, the workflow has been deactivated so that it does not try to connect to a
non-existent server on each push. The local installation has been carried out instead. It
consists of two parts, the server and the client. For the server, SonarQube has been installed
using Docker Compose; the resulting [Link] file can be consulted in Appendix 4. [34]
Once installed, a new project has been added and configured to analyse the project's code,
as can be seen in figure 35. [32]
For the installation of the client part, SonarQube client has been downloaded through a ZIP
file and the following command has been executed from the bin folder: [35]
./sonar-scanner -[Link]=sqp_9995ea7f8a6c2eb0319192d0ed96c7acdd9a88eb
-[Link]=Thesis-Locally -[Link]=[Link]
-[Link]=/Users/Admin/Desktop/Thesis/ -[Link]=/Users/Admin/Desktop/
Once the scan has been executed, the results in figure 36 have been seen when accessing
the local SonarQube server:
The issues found in the Security Hotspots section have started to be patched.
The first alert is shown in figures 37-40:
Figure 39: SonarQube alert 1: Assess the risk
As can be seen, SonarQube shows the place in the code where the vulnerability was
detected, explains the risks associated with it, indicates how to assess those risks in this
project and, finally, proposes a solution.
SonarQube facilitates the work of developers, increasing their productivity and reducing the
toil associated with each task, which lowers the chance that alerts are ignored due to lack of
time.
The following solution has been applied:
The second alert is shown in figures 41 and 42.
To avoid lengthening this work, the "Assess the risk" and "How can you fix it" tabs will not be
shown again for this alert.
An attempt has been made to apply CSRF protection, using the following code in [Link]:
app = Flask(__name__)
csrf = CSRFProtect()
csrf.init_app(app)
SECRET_KEY = [Link](32)
[Link]['SECRET_KEY'] = SECRET_KEY
However, with this configuration applied, POST requests to the application stop working, and
no solution has been found. This is probably because the way the requests are handled
(through fetch) is outdated and should be modernized, although that is out of the scope of
this project.
After addressing the Security Hotspot alerts, work has started on the Code Smell alerts. In
this case, SonarQube estimates the time needed to resolve all the alerts, 20 minutes, as can
be seen in figure 43.
The first and second alerts are shown in figures 44 and 45:
In this case, it warns of a possible bad practice in using “raise” to throw exceptions, since the
exceptions are not caught in the same function and can lead to runtime errors if the functions
involved are not called properly. The code has been reviewed, and it has been decided to
ignore the alerts, since the functions are called correctly in the [Link] file, with the
corresponding “catch”, and this exception is used to mark the parameter boxes in red when
the input is not valid.
The last alert refers to the comment “TODO”, which is usually used for pending tasks, although
in this case it refers to the name of the application. Therefore, the alert has been ignored.
After applying the solutions described, the result of running the analysis again is shown in
figure 46.
3.3.4 Bridgecrew
With the previously implemented scanning tools, many of the bad practices in Dockerfiles have
not been detected. For this reason, scanning with Bridgecrew has also been implemented.
Bridgecrew has been installed through GitHub Actions, although it does not require a
configuration file. By default, when it has access to the repository, it performs a code scan
once a day. [36]
The dashboard has been accessed, and the first scan produced the alerts shown in figure
47, as well as the dependency graph in figure 48.
Figure 48: Bridgecrew supply chain graph
Bridgecrew alerts of a vulnerability in the Click package, which is used in version 7.0.0. The
alert indicates the identifier PRISMA-2021-0020. Identifiers containing the word PRISMA
denote vulnerabilities that have no CVE ID: usually publicly discussed vulnerabilities that are
never assigned a CVE, which is why some security organizations (such as Bridgecrew)
assign them a PRISMA code.
In the case of PRISMA-2021-0020, it is based on a GitHub Issue opened by a user in the Click
repository. Reviewing the changes in version 8.0.0 of Click, it can be seen that issue 1752,
corresponding to this vulnerability, is fixed. After this investigation, the project's Click version
has been updated to resolve the alert. [37]–[39]
The second alert, in figure 50, warns about a misuse of the ADD instruction in the webservice's
Dockerfile. As explained previously, it is preferable to use COPY whenever the ADD is not
specifically required.
Fixed Dockerfile:
COPY [Link] /
A similar alert about the database Dockerfile ADD command has also been fixed.
The third alert, in figure 51, warns that the container is running as the root user. As explained
in section 2.4.3, if the application running inside the container is compromised, the attacker
can gain control of the root user, which would be a serious security problem.
A new user without root privileges has been created, and the user has been changed at the
end of the configuration. [40]
USER app
It has also been verified that the end user is the expected one (with a uid=1000, instead of 0),
as it is shown in figure 52.
The fourth alert, in figure 53, warns that no healthcheck is configured to verify the correct
behaviour of the webservice containers and the database container.
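A healthcheck could be added to the compose file along these lines; the test command, the port and the timings below are illustrative assumptions, not the project's actual configuration:

```yaml
services:
  demo_webservice:
    # ...
    healthcheck:
      # assumed: the webservice answers HTTP on port 5000 inside the container
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:5000/"]
      interval: 30s
      timeout: 5s
      retries: 3
```

With this in place, Docker marks the container as unhealthy after three consecutive failed probes, instead of reporting it as running regardless of the application's state.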
3.4 Application of best practices for secrets
In containers until now, the secrets have been passed via environment variables written in
plain text in the [Link]. As explained in section 2.4.6, this is a clear security
problem, since any user with access to the GitHub repository, can read all the secrets used.
The .env file system has been applied to store secrets, for the ease of use and because it
already represents an improvement in security compared to what was previously available.
The process that has been followed to apply this method, for instance, with SonarQube
containers, is explained below, although the same procedure has been applied to all
containers.
The .env file has been created:
SONAR_JDBC_PASSWORD=sonar
POSTGRES_PASSWORD=sonar
And the syntax of the [Link] has been modified so that, instead of holding the
secret values in plain text, the container now obtains them from environment variables. [42]
SONAR_JDBC_PASSWORD: "$SONAR_JDBC_PASSWORD"
POSTGRES_PASSWORD: "$POSTGRES_PASSWORD"
As explained above, when the .env file is used to manage sensitive information, the secrets
can still be read with the docker inspect command, as can be seen in figures 54 and 55.
Figure 55: Secrets exposed with Docker inspect
For the ToDo App, Docker Secrets have been tested using the Docker Swarm orchestrator,
although later an .env file is also used, for convenience and because of the incompatibilities
that Docker Swarm has with some features.
First, Swarm has been activated.
The secret has been created from a file, so that it does not remain in the shell history, as
shown in figure 56.
The created secret has been verified, as it can be seen in figure 57:
environment:
  MYSQL_ROOT_PASSWORD_FILE: /run/secrets/todo_mysql_root_pass
  MYSQL_DATABASE: demo

environment:
  MYSQL_USER: root
  MYSQL_PASSWORD_FILE: /run/secrets/todo_mysql_root_pass
  MYSQL_DATABASE: demo

secrets:
  todo_mysql_root_pass:
    external: true
For the webservice, the [Link] code has had to be adapted: the password is now read from
the temporary file created by Docker Secrets instead of from an environment variable:
[Link] = connect(host="demo_db",
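A minimal sketch of this change, assuming the secret is mounted at the default /run/secrets path (the helper name and the exact path are illustrative, not the project's exact code):

```python
# Read a Docker secret from its mounted file instead of an env variable.
def read_secret(path="/run/secrets/todo_mysql_root_pass"):
    with open(path) as f:
        # secrets created from files often end with a newline; strip it
        return f.read().strip()
```

The value returned by this helper can then be passed as the password argument of the database connect call.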
And to run and remove the containers with Docker Swarm, the following two commands have
been used:
3.5 Secrets extraction from Docker images
Docker containers work in separate layers, each one applying changes to the previous one
and each one being cached. If an instruction that affects the fifth layer is changed, the cache
is used until layer four, and from the fifth, all are executed again until the end. Due to the lack
of knowledge about the internal workings of Docker containers, serious security errors are
made, such as using clear secrets, with the peace of mind that they will be eliminated before
the container is finished to build. For what has been explained, this process does not work like
this, each layer is cached, so the values of the previous layers can be accessed and, therefore,
the secrets can be accessed. Other times the secrets end up in the containers by mistake,
because the relevant tests and reviews are not carried out.
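As an illustration of this pitfall, a Dockerfile along these lines (the file names are hypothetical) still leaks the key, because the layer created by the COPY instruction is preserved even after a later RUN deletes the file:

```dockerfile
# ANTI-PATTERN: the private key is stored in the COPY layer permanently
COPY id_rsa /root/.ssh/id_rsa
RUN ./build_that_needs_ssh.sh
# Deleting the file only hides it in the final layer; tools such as
# dive can still extract it from the earlier cached layer
RUN rm /root/.ssh/id_rsa
```

The correct approaches are to keep the secret out of the build context entirely, or to use a mechanism designed for this purpose, such as build secrets.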
There are programs that help understand this operation, as well as being excellent tools for
analysing Docker images. In this work, it has been decided to use the Dive program. This open
source software shows all the layers of the scanned container, indicating in each layer all the
files it contains. [45], [46]
Command used for the installation (for macOS):
Figure 58 shows the Dive interface. At the top left are all the layers; in the middle left, the
details of the selected layer; and at the lower left, the general details of the container image.
On the right side, all the files and directories of the selected layer are shown.
A real-life example of the danger of exposing secrets within Docker images is the supply
chain attack on Codecov. On January 31, 2021, attackers managed to modify Codecov's
official code, thanks to credentials they extracted from a Docker image. Codecov is a tool
that helps test applications in a CI/CD infrastructure. The attackers went unnoticed,
harvesting the environment variables of all clients using Codecov, until April 1, when one of
Codecov's clients alerted the company, which finally discovered the attack. The stolen
environment variables included server credentials, database information, GitHub tokens and
more. The attackers managed, among other things, to enter the GitHub repositories of
companies such as Twilio and clone all their information. [47]
Docker Hub is a good example of the bad practices that are being used in containers. Many
images in the world's largest Docker image repository have serious vulnerabilities.
GitGuardian:
In this project, it was decided to make a small check of the exposed secrets that can be found
in randomly chosen containers in Docker Hub. To do this scan, GitGuardian Shield has been
used, an open source program that detects secret patterns both in repositories and directly in
Docker images.
For the installation, it has been used:
As a random selection method, it was decided to go to the most recent images section and
scan the newest ones, as can be seen in figure 59. None of the scanned images was an
official or verified Docker image. Obviously, official and verified images have passed tests
and offer better security, although the corresponding security checks should still be carried
out on them.
Surprisingly, an exposed secret has been found in the very first scanned image: a repository
token, as can be seen in figure 60. The content of this image and the file containing the
token can be seen in figure 58.
References in this section: [46], [99]–[102]
In the second scanned image, no secret has been found, as it can be seen in figure 61.
After finding an exposed secret in one of the first two scanned containers, it was decided to
analyse another 3. To choose them, the Docker Hub page was reloaded and the 3 most
recent images were used, which are shown in figure 62.
In the first two images, no secret has been found, as can be seen in figures 63 and 64.
However, in the third one, although errors appeared while scanning the image, what appears
to be an authentication token has been detected. The results can be seen in figures 65
and 66.
Figure 66: GitGuardian shield results of image 5 part 2
Of course, this is not a representative test of the reality of Docker Hub, but it is consistent
with what studies have shown. GitGuardian conducted a study on secrets exposed in Docker
Hub: after analysing 2,000 Docker Hub images, they found that 7% of them contained at
least one exposed secret. [48]
Although some of the previously integrated tools, such as GitHub's secret scanning or
Semgrep, already have secret scanning functionality, GitGuardian Shield has been integrated
into GitHub Actions, to increase detection capabilities.
The [Link] file has been created in .github/workflows and can be found in Appendix 5.
A GitGuardian account and a token have been created. The token has been used to create a
GitHub Secret named GITGUARDIAN_API_KEY. With the first push, no exposed secrets have
been detected.
The setup has been tested by exposing a fake AWS secret, which has been added in a new
file called [Link]:
[default]
aws_access_key_id = AKIAYVP4CIPPIM3C4IM2
aws_secret_access_key = Iwr51xvNbnSX0PcZdcSV8jIF9REPqri00SVgRd4n
output = json
region = us-east-2
The secret has only been detected by Semgrep and by GitGuardian Shield, but not by GitHub
Secret Scanning, which demonstrates the need to use more than one tool for the same
purpose, to reduce false negatives.
A false password has also been added in the same file as a comment.
#pass=1234
#pass 5678
But in this case, no tool has detected it, which shows that the system is not perfect and that
basic leaks can still go undetected.
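A custom detection rule for this kind of pattern could be sketched as a simple regular expression scan; the pattern below is a naive illustration, far less sophisticated than the rules real secret scanners use:

```python
import re

# Naive rule: flag lines where a password-like keyword is followed by a value
PASSWORD_PATTERN = re.compile(r"(?i)\b(pass(word)?|pwd)\b\s*[:=]?\s*(\S+)")

def find_password_candidates(text: str):
    """Return (line_number, line) pairs that look like hardcoded passwords."""
    hits = []
    for n, line in enumerate(text.splitlines(), start=1):
        if PASSWORD_PATTERN.search(line):
            hits.append((n, line.strip()))
    return hits
```

A rule this broad would catch both commented passwords above, at the cost of many false positives, which is exactly the blocklist trade-off discussed in section 2.4.12.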
3.6 Self-updating dependencies
To keep the dependencies up to date, this work uses Renovate, an open source tool that
automatically reviews the code on each push and checks that the dependencies are current.
When it detects a newer version of a dependency, Renovate opens a pull request or a
GitHub Issue with the proposed updates.
Renovate has been installed via the GitHub Marketplace. [49]
In its pull requests, Renovate recommends updating to minor versions first, for example from
1.1.5 to 1.9.0, and then proposes major version updates, for example from 1.9.0 to 2.2.1.
This gives the developer the chance to check that the program keeps working after every
update. The GitHub Issue created by Renovate is shown in figure 67.
Thanks to Renovate pull requests, the following dependencies have been updated:
● mysqlclient to v1.4.6
● WTForms-JSON to v0.3.5
● WTForms to v2.3.3
● Werkzeug to v0.16.1
● mariadb to v10.5.3
● six to v1.16.0
● actions/checkout action to v3
● Flask to v2
● Jinja2 to v3
● MarkupSafe to v2
● WTForms to v3
● Werkzeug to v2
● itsdangerous to v2
● mysqlclient to v2
The WTForms upgrade to v3 has not been applied, because it is incompatible with the
existing code. Knowing this, the development team could start adapting the code to the new
version in order to update soon, while the version in use has no known security problems.
Figure 67: Issue opened by Renovate with all the possible updates
During the development of the project, new alerts have appeared to update the new versions
that had come out, this has allowed the project to be kept up to date.
● For example, on August 02, 2022, 3 alerts were received about:
○ Updated Flask from 2.1.3 to 2.2.0
○ Updated Python image from 3.10.5-alpine3.16 to 3.10.6-alpine3.16
○ Updated Grafana image from 9.0.5 to 9.0.6.
● On August 04, 2022, an alert to update Flask to 2.2.1.
● While doing Docker monitoring, an alert to update the influxDB image from 1.8.3 to 1.8.10.
● On August 09, 2022, alerts to update Werkzeug to 2.2.2 and Flask to 2.2.2.
● On August 10, 2022, after doing Docker monitoring, an alert to update the Grafana image to 9.0.7.
More alerts were also received, but due to incompatibility issues, they were not implemented.
3.7 Container Scanning
The results are a very long and detailed JSON file that is difficult to read quickly. No built-in counter of the vulnerabilities found could be located, but a manual analysis shows that 627 vulnerabilities have been detected in the database scan, while 25 vulnerabilities have been detected in the Webservice image.
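As a workaround for the missing counter, the vulnerabilities could be tallied from the JSON itself. The sketch below assumes Grype's output schema, where findings are listed in a top-level `matches` array with a `vulnerability.severity` field; a tiny fabricated report in the same shape is used for the demonstration:

```python
import json
from collections import Counter

def count_vulnerabilities(report_json):
    """Count matches per severity. Assumes Grype's JSON schema, where
    each entry of the 'matches' array has a vulnerability.severity field."""
    report = json.loads(report_json)
    severities = Counter(
        m["vulnerability"].get("severity", "Unknown")
        for m in report.get("matches", [])
    )
    return sum(severities.values()), dict(severities)

# Tiny fabricated report in the same shape as Grype's output:
sample = json.dumps({
    "matches": [
        {"vulnerability": {"id": "CVE-2022-0001", "severity": "High"}},
        {"vulnerability": {"id": "CVE-2022-0002", "severity": "Critical"}},
    ]
})
total, by_severity = count_vulnerabilities(sample)
print(total, by_severity)  # 2 {'High': 1, 'Critical': 1}
```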
Webservice container vulnerabilities found by Grype: the raw "vulnerabilities" JSON output is too long to reproduce here.
Figure 71: Trivy output of webservice ToDo App
In this case, it can be seen that the results displayed are much simpler and easier to read. In addition, these are the default results, obtained without any configuration process.
Comparison between Trivy and Grype:
As expected from the published comparisons and reviews, there is a big difference between the vulnerability detection of the two tools.
Trivy has found 6 vulnerabilities in the Webservice and 8 in the database, while Grype has
found 25 in the webservice and 627 in the database.
Regarding those of the Webservice, Trivy has found 4 with High severity and 2 with Critical severity, while Grype has found 4 with Medium severity, 19 with High severity and 2 with Critical severity. Besides finding a different number of vulnerabilities, the findings themselves differ: of the 6 vulnerabilities found by Trivy, Grype has only found 3 (and they are the least severe).
Given these results, it seems clear that container scanning must be carried out with multiple tools, to expand the detection capacity.
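The partial overlap described above can be made explicit by comparing the two result sets. This is a small sketch with hypothetical CVE identifiers, used only to illustrate the union-and-overlap idea:

```python
def compare_scanners(trivy_cves, grype_cves):
    """Union and overlap of two scanners' findings, as sets of CVE IDs."""
    trivy, grype = set(trivy_cves), set(grype_cves)
    return {
        "union": trivy | grype,        # report everything either tool found
        "overlap": trivy & grype,      # agreed-on findings
        "trivy_only": trivy - grype,   # candidates the other tool missed
        "grype_only": grype - trivy,
    }

# Hypothetical IDs, only to illustrate the partial-overlap situation:
result = compare_scanners(
    ["CVE-A", "CVE-B", "CVE-C"],
    ["CVE-B", "CVE-C", "CVE-D"],
)
print(sorted(result["overlap"]))  # ['CVE-B', 'CVE-C']
```

Reporting the union of both scanners maximizes coverage, at the cost of having to triage more potential false positives.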
Given the large number of vulnerabilities found, it has been decided to update the base images of the containers. New versions existed, but they had not been detected by Renovate. The images have been updated to the latest stable versions: the Webservice Docker base image to python:3.10.5-alpine3.16 and the database Docker image to mariadb:10.8.3. With the update, an error arises in the database, which is solved by running the "mysql_upgrade" command from within the container. [50]
After the update, the images have been analysed again and the result can be seen in images
72 and 73.
Figure 72: Trivy output of webservice ToDo App after Python update
Figure 73: Grype output of webservice ToDo App after Python update
Thanks to the update, Trivy has not detected any vulnerability in the Webservice and only one in the database (CVE-2022-29162). Grype has detected two alerts in the Webservice (both related to the same CVE, CVE-2022-28391) and 52 in the database.
Based on these alerts, the development team should check which ones are real and which are false positives, and correct the corresponding ones. Even so, it is surprising that so many alerts appear when using the latest stable versions of official Docker images. These vulnerabilities may already be fixed in non-stable versions, as python:3.11.0 and mariadb:10.9.0 are currently under development. Or perhaps these alerts are false positives.
On 08/02/2022, as explained previously, the Renovate bot warned that there was a new version of the Python Alpine image, 3.10.6-alpine3.16. After updating, both Trivy and Grype detect a new vulnerability, CVE-2021-46828, as can be seen in figure 74. No change was detected in the other vulnerabilities.
Figure 74: Trivy output of webservice ToDo App after Python 2nd update
However, in subsequent scans, without changing anything in the image, both tools failed to detect the new vulnerability. Therefore, it could be a false positive.
3.8 Docker Monitoring and centralized logging 26
This project seeks a joint solution to the two problems explained in section 2.4.9, Docker monitoring and log centralization. Given the desire to group all the infrastructure metrics and logs in one place, the following open source tools have been used: Grafana, InfluxDB, Telegraf, Grafana Loki and Promtail. It has been decided to start by developing a test model in a virtual machine with Ubuntu, given the doubt as to whether the system would work as expected.
Telegraf’s installation: [51]
$ dpkg -L influxdb
$ influx version
26 References in this section: [103], [104]
The /etc/telegraf/telegraf.conf file has been edited, and the following lines have been uncommented:
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  container_name_include = []
  container_name_exclude = []
$ influx
Next, it has been verified that, by configuring the Grafana panels, it was possible to achieve the expected results, having previously configured InfluxDB as a data source.
Once the monitoring model was successfully completed, the log centralization system has been configured. To do this, the aim was to keep it integrated with the rest of the work and use a Docker Compose configuration, which allows easy execution in any environment.
The following directory structure has been created:
$ mkdir docker_volumes/grafana
$ mkdir docker_volumes/loki
$ mkdir docker_volumes/promtail
$ cd docker_volumes
The docker-compose.yml file has been created, which can be consulted in Appendix 8.
The configuration files loki/[Link] and promtail/[Link] have been created
and can be consulted in Appendix 9 and Appendix 10.
The previously installed version of Grafana has been stopped, since now the Dockerized
version will be used.
Finally, the docker-compose.yml file has been executed, and its correct operation has been verified by visiting the addresses [Link] and [Link]:
$ docker compose up -d
The Docker daemon configuration has been edited to use the Loki logging driver for the desired containers:
{
  "log-driver": "loki",
  "log-opts": {
    "loki-url": "[Link]",
    "loki-batch-size": "400"
  }
}
And finally, the Docker service has been restarted to apply the changes.
At this point, it has been possible to centralize the logs of the chosen containers in the same place as the Docker metrics. However, InfluxDB and Telegraf were installed as a test mock-up on the host and not in the docker-compose.yml file. In order to follow best practices and unify the installation, a docker-compose.yml file has been configured with the entire monitoring and log centralization system.
The docker-compose.yml file resulting from adding InfluxDB and Telegraf can be found in Appendix 11. [54]
The following files have been added:
An environment file for Grafana: [54]
GF_INSTALL_PLUGINS=grafana-clock-panel,briangann-gauge-panel,natel-plotly-panel,grafana-simple-json-
datasource
An environment file for InfluxDB: [54]
INFLUXDB_DATA_ENGINE=tsm1
INFLUXDB_REPORTING_DISABLED=false
$ mkdir data_influxdb
$ docker-compose up -d
Next, the Grafana dashboard has been configured to display all the desired graphs. In order not to lengthen this work, its configuration will not be detailed, but the repository contains the JSON file that describes the Grafana dashboard [55]. It has not been included in the appendices due to its length.
The result is shown in figure 75.
In the upper left part, shown in figure 76, the number of existing containers can be seen: those that are running, those that are stopped, and the total number of downloaded images.
In the upper right part, shown in figure 77, the running containers and their uptime can be seen.
In the middle left part, shown in figure 78, there is the graph of CPU usage by the containers, as a percentage. Below the graph, a table indicates the minimum, maximum and average consumption of each container.
In the middle right part, shown in figure 79, there is the graph of memory usage by the containers. Below the graph, a table indicates the minimum, maximum and average consumption of each container.
At the bottom left, in figure 80, there are the logs of the ToDo App webservice container. At the bottom right, in figure 81, there are the logs of the ToDo App database container.
3.9 LOGLEVEL 27
In the configuration used until now, no loglevel was generally applied, so no distinction was made between the logs that appeared in the development environment and those that appeared in production. In addition, a development server, rather than a production server, was being used for the webservice.
To solve this situation and apply the measures explained in section 2.4.10, a production server (Waitress) has been configured and a loglevel has been applied in the app.py file of the webservice. The code detects whether there is an environment variable named LOGLEVEL that indicates the level to be used; otherwise, it applies the production configuration, with WARNING level by default.
The library used is "logging", which has the levels CRITICAL, ERROR, WARNING, INFO and DEBUG. An .env file has been created to set the DEBUG level in the development environment. A 2-second wait has been applied before starting the connection to the database, to avoid connection errors when the database is not ready. Although Docker has the "depends_on" option, it does not wait for a container to be ready, only for it to be running; for this reason, the wait has been implemented. [56]
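As an alternative to the fixed 2-second wait, a retry loop that polls the database until it accepts connections is often more robust. This is only a sketch: `connect` stands in for the real database connection call, which is not shown here:

```python
import time

def wait_for(connect, retries=10, delay=2):
    """Retry `connect` until it succeeds or the retries run out.
    `connect` stands in for the real database connection call."""
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except Exception:
            if attempt == retries:
                raise  # give up and surface the last error
            time.sleep(delay)

# Example with a stub that fails twice before succeeding:
state = {"calls": 0}
def flaky_connect():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("database not ready")
    return "connection"

print(wait_for(flaky_connect, retries=5, delay=0))  # connection
```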
The configuration applied to the app.py file:
import logging
import os
import time

try:
    LOGLEVEL = os.environ.get('LOGLEVEL').upper()
    logging.basicConfig(level=LOGLEVEL)
except Exception:
    # No LOGLEVEL variable set: fall back to the production default
    logging.basicConfig(level='WARNING')

time.sleep(2)
environment:
  LOGLEVEL: "$LOGLEVEL"
An environment variable has been added to the .env file, to establish the default debug level in the development environment:
LOGLEVEL=DEBUG
27 References in this section: [105], [106]
3.10 SSO Login 28
As explained in section 2.4.11, an SSO system is an ideal authentication system for this
project. The protocol that will be used to implement it will be OpenID Connect.
To implement this system, Keycloak has been installed as a user, credential and permission manager. For the installation, it has been decided to use Docker Compose, to give continuity to the rest of the project and keep the system implemented in containers. The docker-compose.yml file used can be consulted in Appendix 12.
Once Keycloak has been installed, the control panel can be accessed through port 8080. The Keycloak login page and dashboard are shown in figures 82 and 83.
A user named test has been created, as can be seen in figure 84.
28 References in this section: [107], [108]
A client has been configured for OAuth2, as can be seen in figure 85, adding the redirection URL and enabling client authentication, so that Keycloak asks for credentials when logging in. The client secret has been queried in order to configure the OAuth2 proxy, as shown in figure 86.
For the installation of the OAuth2 proxy, an attempt was made to also run it in a container, using Docker Compose. The file that has been used can be consulted in Appendix 13.
However, the connection could not be established. As has been found after investigating and trying different options, there is a problem between Keycloak and the OAuth2 proxy in container format, which does not allow the connection between the two.
Unable to find a solution to this problem, it has been decided to install the OAuth2 proxy locally (on the host). In this case, the host is a macOS computer; however, the configuration would be the same on Linux, and only the way of downloading the proxy would change.
$ /usr/local/opt/oauth2_proxy/bin/oauth2-proxy --config=/usr/local/etc/oauth2-proxy/oauth2-[Link]
The login button from figure 87 has been selected, and the credentials of the created user have been entered, as shown in figure 88.
The credentials have been accepted, and Keycloak has given access to the application, as seen in figure 89.
A logout button has also been added. To log out, the oauth2_proxy cookie must be removed and the user must be redirected to the Keycloak logout page. If only one of these steps is done, by going back to http://localhost:4180 the user might be able to log back in without credentials.
However, redirecting the user to Keycloak and deleting the cookie in the same step did not delete the cookie. Therefore, an intermediate step with a logout page has been created.
The logout page has been added to the app.py file:
@app.route("/logout")
def logout():
    resp = make_response(render_template("logout.html"))
    return resp
<button onclick="location.href='[Link]'">
  logout
</button>
The logout.html file has been created, which can be consulted in Appendix 15.
With the changes made, the logout procedure works as follows: the cookie is removed, the user passes through the intermediate logout page, and the user is finally redirected to Keycloak. Then, if the user tries to access the ToDo App, the credentials are requested again.
3.11 Two-Factor Authentication
To increase the security of user authentication, it has been decided to enable a two-factor authentication (2FA) system by default. The system works by requesting a unique, temporary code each time the user logs in with their credentials. This code can only be received on a device, through an app, an SMS message, etc.
With this authentication method, security improves considerably, because a potential cybercriminal must not only know the user's credentials but also have access to their device (usually the mobile phone). In this case, the Google Authenticator application will be used to test the operation.
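The codes generated by applications such as Google Authenticator follow the TOTP algorithm (RFC 6238): an HMAC over a time-step counter, truncated to a short numeric code. The following is a self-contained sketch of that mechanism, not Keycloak's actual implementation; the final call uses the test vector from the RFC:

```python
import base64
import hmac
import struct
import time

def totp(secret_b32, timestamp=None, period=30, digits=6):
    """RFC 6238 TOTP: HMAC-SHA1 over the time-step counter, dynamically
    truncated to a short numeric code (what authenticator apps display)."""
    key = base64.b32decode(secret_b32, casefold=True)
    now = timestamp if timestamp is not None else time.time()
    counter = int(now // period)
    mac = hmac.new(key, struct.pack(">Q", counter), "sha1").digest()
    offset = mac[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test vector: secret "12345678901234567890" (base32), time 59 s
print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", timestamp=59, digits=8))  # 94287082
```

The QR code shown by Keycloak simply conveys the shared secret to the phone; from then on, both sides compute the same code from the current time.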
To configure this option, the Authentication tab in Keycloak has been accessed, the Required Actions option has been selected, and the Default Action box has been activated in the Configure OTP line, as shown in figure 94. [57]
From now on, when new users log in for the first time, the prompts shown in figure 95 will appear.
After scanning the QR code and entering the code and the name of the device, the user can access the application. The next time the user logs in, the message shown in figure 96 will appear.
After entering the code, the user can access the application.
3.12 WAF 29
As explained in section 2.4.12, although applications should be secure by themselves, a WAF has been added in this project to obtain an additional layer of security. The WAF that has been installed is Shadow Daemon.
The official GitHub repository has been downloaded:
$ cd shadowctl
$ sudo ./shadowdctl up -d
However, an error has occurred with the shadowctl file. The rest of the installation and the execution of the Docker Compose worked correctly but, since the execution of the shadowctl file did not complete, no environment variables were created. The problem has been solved by manually creating the environment variables in the docker-compose.yml file.
This is also a better implementation, since all the necessary configuration is grouped in a single file. Taking advantage of the modification, the Shadow Daemon docker-compose.yml has been merged with the ToDo application docker-compose.yml, obtaining a single file from which to manage all the configuration. The Shadow Daemon UI port has also been changed, as it was the same one used by Keycloak.
A user for Shadowd has been created with the corresponding command from inside the container.
The containers have been restarted and localhost:8200 has been accessed, where the credentials of the created user have been entered. With this user, it has been possible to access the Shadow Daemon dashboard, which is shown in figure 97.
29 References in this section: [109]–[114]
The Management tab has been accessed and the Profiles option selected; there, the profile has been configured with the corresponding data, as can be seen in figures 98 and 99.
[shadowd_python]
profile=1
key=test
The file template is located in the downloaded Python connector, in the misc/examples directory. The rest of the options have been left commented.
The Shadowd library has been added to requirements.txt:
shadowd==3.0.2
The necessary configuration has also been added to app.py, so that the webservice connects to the Shadow Daemon:
from shadowd.flask_connector import InputFlask, OutputFlask, Connector

@app.before_request
def before_req():
    input = InputFlask(request)
    output = OutputFlask()
    Connector().start(input, output)
Once the containers had been rebuilt, an attempt was made to access the ToDo app, but it was not possible, due to an error accessing the Shadow Daemon connector configuration. The volume was missing from the docker-compose.yml file, so that this file could be accessed. It has also been necessary to give read permissions to this file.
volumes:
  - /etc/shadowd:/etc/shadowd
When trying to access the application again, an error occurs once more. This time, only "Internal Error" could be seen through the browser, and there is no log in the webservice container. Ways to increase the verbosity level were sought, in order to detect the problem. Finally, the following changes have been applied.
Debug mode has been activated in the connector configuration file, and the path for the logs has been enabled:
; Possible Values: 0 or 1
; Default Value: 0
debug=1
; Sets the log file, but it is only used if debug is enabled.
log=/User/ShadowdLogs/[Link]
…
A volume for the logs has been created in the docker-compose.yml file:
volumes:
  - ../../ShadowdLogs/:/user/ShadowdLogs/
Finally, the source of the error has been detected, although it only appeared in the connector log file. The logged traceback shows the connector failing on its socket connection call, connect((host, port)).
The configuration has been verified to ensure that it is correct. A solution to this problem has been searched for in the official documentation, without success. No information about a similar problem has been found on the internet either. After several attempts, configuration changes and log searches, it has been decided to open a GitHub Issue in the Python connector repository ([Link]).
It has been possible to speak with the main developer, and after several attempts the problem has been detected. In the connector configuration, the default IP is [Link]. Although this IP is correct and both the webservice container and the Shadow Daemon have the corresponding ports open, the connection cannot be made with this IP. In order to establish the connection, it is necessary to use the internal IP of the container.
Even if the Shadow Daemon containers are assigned to the same virtual network as the ToDo App containers and the IP in the connector configuration is updated with the IP that Docker assigns to the Shadow Daemon (which can be queried, for example, with docker inspect), it will not keep working, since the containers must be rebuilt and the internal IP of the containers is dynamically assigned by Docker and is not always the same.
To work around this issue, static IP addresses have been assigned to each container. In the case of the Shadow Daemon container, the following configuration has been added to the docker-compose.yml file:
networks:
  demo_network:
    ipv4_address: [Link]
The connector configuration has been updated with the assigned static IP:
[shadowd_python]
profile=1
key=test
host=[Link]
Finally, it has been possible to access the application without problems. In addition, from the Shadow Daemon dashboard, it has been possible to see the requests that the webservice has received, as can be seen in figure 100.
However, when trying to make a POST request (for example, writing a new task in the ToDo App), the server responds with a 500 error. After activating all the debug parameters again, the following error appears, a traceback that ends in Python's JSON encoder while the connector serializes the input data:
json_data = json.dumps(input_data)
...
return _default_encoder.encode(obj)
...
return _iterencode(o, 0)
This error comes from the shadowd library used. An issue has been found in the Python connector repository ([Link]) which explains this problem.
The main developer confirmed the bug on November 24, 2021 and said it would be fixed soon, as can be seen in figure 101, but it has not been resolved yet.
No solution has been found for this bug. The main developer has been asked for more information, and his answer has been that he will try to solve it soon. The connector code in the app.py file has been commented out, to avoid the error in the HTTP POST of the ToDo App.
3.13 Cracking users' passwords 30
The poor choice of passwords by users is one of the main causes of cyberattacks. Many users use common passwords, which are easily cracked. The following statistics expose the current problem with passwords:
● 24% of Americans have used passwords like “password”, “Qwerty” and “123456”. [58]
● 67% of all Americans use the same password for different online accounts. [58]
● The password “123456” is now used by more than 23 million people. [58]
● More than 60% of workers use the same password for their work and personal
applications. [58]
For a company, this may be the number one vulnerability it faces, so it is very important to take action. Keycloak offers the possibility of adding many policies to passwords, as shown in figure 102. This is a way of making it difficult for users to use weak passwords: for instance, forcing them to change the password, not allowing a recently used password, not letting the password be the username, etc. It is essential to use these policies to strengthen security against cyberattacks. [59]
Even then, it is hard to prevent workers from following bad practices in password creation and management. Training on how passwords and password managers should be used is essential. This project goes one step further and develops a system to detect the users with the most vulnerable passwords, in order to force them to change them in time. The system consists of obtaining the Keycloak password hashes and trying to crack them using a large database of leaked passwords. Of course, this practice is considered an educational exercise; if it were carried out in a real environment, it could constitute a violation of users' privacy, and its legality should be studied. A similar result, without violating users' privacy, can be achieved by configuring the Keycloak "Password Blacklist" policy, which does not allow users to choose a banned password.
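The essence of such a blacklist policy is a simple membership test against a list of known-bad passwords. A minimal sketch of that idea (the list here is a stand-in for a real leaked-password file):

```python
def is_banned(password, banned_list):
    """Case-insensitive membership check against a banned-password list,
    the kind of test a blacklist policy performs at registration time."""
    banned = {p.strip().lower() for p in banned_list}
    return password.lower() in banned

banned_list = ["123456", "password", "qwerty"]  # stand-in for a leaked list
print(is_banned("Qwerty", banned_list))               # True: rejected
print(is_banned("c0rrect-horse-battery", banned_list))  # False: accepted
```

With a top-1,000,000 list loaded into a set, this check remains a constant-time lookup per password.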
The steps followed have been the following:
The Keycloak database has been accessed and the tables it contained have been analysed, as can be seen in figures 103 and 104. After examining them, the two necessary tables have been found: credential and user_entity.
30 References in this section: [115]
Figure 104: Keycloak user_entity database
A volume has also been created in the Keycloak database container, in order to access the "/dump" directory where the dumps will be saved.
Once the names of the tables and the values of their columns were known, the data has been downloaded.
Dump of the credentials (hash and salt) of all “password” type entries: [60]
$ psql -d keycloak -U keycloak -c "copy(SELECT secret_data FROM credential WHERE type = 'password') to
stdout" > /dump/dump_hash
Dump of the user_id of all the “password” type entries, in order to associate the cracked
passwords with the user ID:
$ psql -d keycloak -U keycloak -c "copy(SELECT user_id FROM credential WHERE type = 'password') to stdout"
> /dump/dump_id
Dump of user data, in order to associate their ID, with their username and email address:
$ psql -d keycloak -U keycloak -c "copy(SELECT id, email, first_name, last_name, username FROM user_entity)
to stdout" > /dump/dump_username_id
The "passwords" file has also been created, with some test passwords and the real passwords of the Keycloak users. For this test, the users' passwords have been configured to be the same as their usernames. In a realistic case, where the users' passwords were not known, one of the most common top-100,000 or top-1,000,000 password files would be downloaded and used, for example the SecLists from Daniel Miessler. [61]
A Python file called hash_parser.py has been developed. It formats the information in the dump_hashes file so that the Hashcat program can read it, and can be found in Appendix 16. The hash_parser.py script has been executed, and the resulting file has been named format_hashes.
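The transformation performed by hash_parser.py can be illustrated as follows. Keycloak stores each credential's secret_data as JSON with base64-encoded value and salt fields, and Hashcat's mode 10900 expects lines of the form sha256:&lt;iterations&gt;:&lt;base64 salt&gt;:&lt;base64 digest&gt;. The sketch below assumes that schema and 27500 iterations (a common Keycloak default); the exact fields may vary between Keycloak versions:

```python
import base64
import hashlib
import json

def to_hashcat_10900(secret_data_json, iterations=27500):
    """Convert one Keycloak secret_data row into Hashcat's mode-10900
    format: sha256:<iterations>:<base64 salt>:<base64 digest>.
    Assumes secret_data is JSON with base64 'value' and 'salt' fields."""
    data = json.loads(secret_data_json)
    return f"sha256:{iterations}:{data['salt']}:{data['value']}"

# Self-check: hash a known password and build the corresponding line.
password, salt, iterations = b"test", b"0123456789abcdef", 27500
digest = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
secret_data = json.dumps({
    "value": base64.b64encode(digest).decode(),
    "salt": base64.b64encode(salt).decode(),
})
print(to_hashcat_10900(secret_data, iterations))
```

Each line produced this way is one entry of the format_hashes file that Hashcat consumes.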
The Hashcat program has been executed, which tries to crack the hashes.
● The -m 10900 option indicates that the hashes are calculated with the pbkdf2-sha256 algorithm, as indicated by the credential table in the Keycloak database.
● The -a 0 option indicates that a wordlist attack is wanted.
● The -r /usr/share/hashcat/rules/[Link] option indicates the path of the file with the permutation rules that will be applied to the input list of passwords.
● -o cracked indicates that the result should be saved in the cracked file.
● passwords is the file with all the possible passwords.
● --force is an option needed on the computer used for the program to work.
After executing the command, all the hashes have been cracked, as can be seen in figure 105, and the results have been saved to the cracked file, in "hash:password" format, as shown in figure 106.
4 Results
Instead, figure 108 shows the final state of the infrastructure, with all the tools added both
locally, so that developers can use them manually, and in the cloud, so that they are executed
automatically with each push. In addition, the SSO integrated in the application can be seen.
The WAF appears, although as explained above, until the bug with the Python connector is
fixed, it cannot be used.
Figure 108: Final infrastructure
5 Budget
Below is a brief study of the cost of carrying out the project. The calculated cost is based on the time estimated before carrying out the work, although an assessment is also made of whether the initially proposed value matched reality.
Practically all the cost is due to the hours dedicated to the development of the project. Considering the minimum salary indicated by the UPC internship agreements for master's degrees, €10 per hour, and a plan of 80 working days with a 5-hour workday, the labour cost is €4,000.
In addition to the labour cost, it should be taken into account that a computer with an approximate value of €2,400 has been used. Assuming a useful life of 5 years and taking amortization into account, the cost of the computer during the 3 months of investigation and implementation is €120.
The software tools used do not imply a cost, because they are free and open source programs.
It has not been necessary to rent a workspace, but electricity has been used for lighting and for the computer. The lighting consumes 15 W and the computer 60 W; taking into account the planning of 80 days at 5 hours a day, the estimated consumption is 30 kWh. With the average price of electricity in Spain between May and July at €0.30/kWh [62], the cost is €9. The cost of the Internet connection, €30/month, must also be added, which amounts to a total of €90.
Consequently, the total cost of the project with the initial planning is €4,219.
With the planning readjustments, the project was finished two days later than expected, increasing the labour cost to €4,100 and the electricity cost to €9.23; therefore, the final cost of the project has been €4,319.23.
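The figures above can be cross-checked with a few lines of arithmetic; all values are taken from the text, and Decimal is used to avoid floating-point rounding surprises:

```python
from decimal import Decimal, ROUND_HALF_UP

# Cross-check of the budget figures stated in the text.
hours = (80 + 2) * 5                      # 80 planned days + 2 extra, 5 h/day
labour = hours * Decimal("10")            # 10 EUR/h  -> 4100 EUR
computer = Decimal("2400") / 5 / 12 * 3   # 5-year amortization, 3 months -> 120 EUR
kwh = Decimal("0.075") * hours            # 15 W lighting + 60 W computer = 75 W
electricity = (kwh * Decimal("0.30")).quantize(Decimal("0.01"), ROUND_HALF_UP)
internet = Decimal("30") * 3              # 30 EUR/month for 3 months
total = labour + computer + electricity + internet
print(labour, electricity, total)  # 4100 9.23 4319.23
```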
6 Conclusions and future development
With the results presented, it can be verified that the proposed objectives have been achieved:
● Analyse DevOps, SecDevOps methodologies and the SRE role.
● Analyse the best security practices for development teams.
● Study the tools to add to a SecDevOps pipeline.
● Implement a SecDevOps pipeline.
● Secure an application with the pipeline implemented.
It has been possible to study the open source tools that an operations team following a SecDevOps methodology must implement in its development infrastructure to increase security. These tools have been successfully tested and have helped to secure the test application.
It has also been possible to see the changes that the application has undergone during the course of this process. The code in the repository with the base application was 412 lines long. After the project, using GitHub's "Comparing Changes" tool, it can be seen that 32 lines have been removed and 3,445 new lines have been added. That is, the resulting repository has only 11% in common with the initial repository. [63]
The technologies added to the work have covered many technological disciplines: scanners of all kinds have been used, but also a WAF and SSO authentication with OIDC; databases have been accessed and password hashes have been cracked; Docker Hub's public container layers have been analysed and secrets extracted from them; and monitoring and log centralization systems for containers have also been implemented. This is a consequence of where the trend is heading in the cybersecurity sector. Security systems, and therefore the engineers who must develop and maintain them, must have a comprehensive vision of the technologies and protect applications in all their aspects. Security engineers have to predict, as much as possible, where the market is headed and what the next challenges will be. This is only possible with precise knowledge of a large set of the technologies in use today.
I would also like to highlight the use of best practices in the implementation of the project: a general use of containers has been made (through Docker Compose), with all the advantages this entails, such as standardization, scalability, isolation and ease of installation in other environments. Also, an effort has been made to create the mock-ups in a real environment, using a platform like GitHub Actions instead of doing all the tests locally. Doing it locally might not have shown some of the errors or inconveniences of the technology used, in addition to not being able to take advantage of the automatisms. On the other hand, I want to highlight the high level of modularity of the system used, which allows tools to be added or removed without any negative impact on the pipeline. It is possible to completely change the application, and the system would still work. It is also possible to change Keycloak for an OP in the cloud, like JumpCloud or Google, and the authentication system would still work without any changes. This feature is very important to ensure the long-term viability of these systems, by not depending exclusively on a single supplier or product such as Keycloak.
Here is a brief assessment of the tools used:
● Cloud code scanners: CodeQL has not given the expected results; it has only been able to detect one basic vulnerability in the entire project. Both Semgrep and Bridgecrew have been able to detect many, but not all, of the Docker Compose misconfigurations. Bridgecrew has been more precise than Semgrep, although the latter offers extra features such as the secret scanner. Also, the alerts of both are not accurate: for example, if the user inside the container has write permissions, an alert will appear regardless of the type of container, whether it is a web server or a database, creating many false positives.
● Container scanners: both tools have been useful, although they are not perfect: neither is capable of detecting all vulnerabilities. Trivy detects much less, although it is much easier to use. Grype detects many vulnerabilities but, judging from Trivy's results, it does not seem to detect all of them either. In addition, it has not been possible to verify whether they are false positives, so it is difficult to compare their results.
● Metrics and logs: the five tools have made it possible to obtain very good results that can greatly help the development team to easily visualize its infrastructure, as well as to detect possible failures before they occur.
● Other tools added in GitHub Actions: Renovate has proven to be very useful, despite being a simple tool. GitGuardian has not fully met expectations, since it is not capable of detecting passwords in comments, despite being a specific tool for detecting secrets. Shadow Daemon has not been sufficiently tested, due to the bug found. Choosing open source applications is sometimes a limitation, as they may not give the expected results. There are free but not open source WAF options that could be a great alternative, such as Cloudflare.
● SSO: the results of both tools have been as expected. They have worked correctly and provide great added value by centralizing the authentication of every service in a single system.
● Local tools: both Dive and Hashcat have fulfilled their objectives. SonarQube has the disadvantage that it cannot be run on GitHub Actions servers like the other code scanners, but it can be automated if it is installed on its own server, which should not be a problem for a company. Although it is not the scanner that raised the most alerts, it detected very specific security vulnerabilities of great importance, and the information reported in each alert is very complete.
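As a side note on the container scanner comparison above, a quick way to see where Trivy and Grype disagree is to diff their reported CVE sets. A minimal sketch (the CVE identifiers below are invented placeholders, not real findings from this project):

```python
# CVE IDs reported by each scanner (placeholder values for illustration).
trivy_findings = {"CVE-2021-1111", "CVE-2021-2222"}
grype_findings = {"CVE-2021-2222", "CVE-2021-3333", "CVE-2021-4444"}

# Set operations separate agreed-upon findings from tool-specific ones.
common = trivy_findings & grype_findings
only_trivy = trivy_findings - grype_findings
only_grype = grype_findings - trivy_findings

print("both tools:", sorted(common))
print("only Trivy:", sorted(only_trivy))
print("only Grype:", sorted(only_grype))
```

The CVEs reported by only one tool are the candidates to inspect manually when deciding which scanner to trust.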
Regarding future developments, the next step in the project would be to go one step further with the technologies used: migrating the system to a more complex container orchestrator such as Kubernetes, which offers many more features. With Kubernetes, it would be possible to integrate Vault, a tool to store and protect secrets that enables a high level of security. Among its functions, the possibility of creating rotating secrets stands out: secrets that change automatically after a given period of time, or that are created only at the moment they are needed and self-destruct after use, transparently to the user.
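As a toy illustration of the rotating-secret idea (this is a conceptual sketch only, not Vault's actual API; the class and its fields are invented for this example):

```python
import secrets
import time

class RotatingSecret:
    """Toy rotating secret: the value is regenerated automatically
    once its time-to-live (TTL) expires, transparently to the caller."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._rotate()

    def _rotate(self):
        # Generate a fresh random token and restart the TTL clock.
        self._value = secrets.token_hex(16)
        self._created = time.monotonic()

    def get(self) -> str:
        # Rotate transparently if the current value has expired.
        if time.monotonic() - self._created > self.ttl:
            self._rotate()
        return self._value

s = RotatingSecret(ttl_seconds=0.1)
first = s.get()
time.sleep(0.2)      # wait past the TTL
second = s.get()     # a new secret has been generated in the meantime
```

A real deployment would delegate all of this to Vault, which also handles access control and audit logging around each secret read.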
Tools that have not given the expected results, such as the container scanners, should also be replaced with more reliable options. Likewise, an alternative WAF should be implemented if the Shadow Daemon bug remains unresolved.
Controls against SQL injection and log injection attacks could be added to further protect the application. Another very interesting security measure would be the implementation of honeypot passwords, so that Keycloak would store several passwords together with the real password of each user. The extra passwords would be known only to Keycloak, which would keep track of which ones are false and which one is true. If someone ever tries to log in with one of those false passwords, it would mean that the database has been leaked and an attacker is trying to impersonate a user.
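The honeypot-password (often called "honeyword") idea above can be sketched in a few lines. This is a conceptual illustration only: Keycloak does not provide this out of the box, and all names and passwords below are invented for the example.

```python
import hashlib

def sha256(pw: str) -> str:
    return hashlib.sha256(pw.encode()).hexdigest()

# Stored per user: several password hashes, only one of which is real.
# The index of the real one is kept in a separate, harder-to-leak store
# (a "honeychecker"), so a leaked user database alone does not reveal it.
stored_hashes = [sha256(p) for p in ["tulip42", "S3cret!", "mango77"]]
real_index = 1  # held by the honeychecker, not in the user database

def check_login(password: str) -> str:
    h = sha256(password)
    if h not in stored_hashes:
        return "wrong password"
    if stored_hashes.index(h) == real_index:
        return "login ok"
    # A honeyword matched: it could only come from the leaked database.
    return "ALERT: honeyword used, possible database breach"

print(check_login("S3cret!"))   # the real password
print(check_login("tulip42"))   # a honeyword, which raises the alarm
```

The key design point is the split: an attacker who dumps the hashes cannot tell the real password from the decoys without also compromising the honeychecker.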
Finally, the infrastructure resulting from this work, with all the technologies implemented, is shown again in figure 109, in order to visualize the technologies discussed in this section.
Bibliography
[1] ‘Un informático en el lado del mal: SecDevOps: Una explicación en cinco minutos (o poco más)
#SecDevOps’. [Link]
(accessed May 13, 2022).
[2] ‘SecDevOps: A Practical Guide to the What and the Why - Plutora’.
[Link] (accessed May 13,
2022).
[12] ‘Docker Image Security: Static Analysis Tool Comparison | [Link] | Alfredo Pardo’.
[Link]
vs-clair-vs-trivy/ (accessed Jul. 09, 2022).
[13] ‘Open Source CVE Scanner Round-Up: Clair vs Anchore vs Trivy | BoxBoat’.
[Link] (accessed Jul. 10, 2022).
[15] ‘aquasecurity/trivy-action: Runs Trivy as GitHub action to scan your Docker container image for
vulnerabilities’. [Link] (accessed Jul. 29, 2022).
[16] ‘What is Grafana? Why Use It? Everything You Should Know About It | [Link]’.
[Link] (accessed
Jul. 10, 2022).
[21] ‘Logging Best Practices: The 13 You Should Know | DataSet’. [Link]
commandments-of-logging/ (accessed Jul. 10, 2022).
[22] ‘Logging Levels: What They Are & How to Choose Them - Sematext’. [Link]
levels/ (accessed Jul. 10, 2022).
[23] ‘Autenticación - Wikipedia’. [Link] (accessed Jul. 16, 2022).
[25] ‘What is the difference between SP- and IdP-Initiated SSO? - Procore’.
[Link] (accessed Jul.
18, 2022).
[27] ‘Koyeb - Add Authentication to your Apps using OAuth2 Proxy’. [Link]
authentication-to-your-apps-using-oauth2-proxy (accessed Jul. 18, 2022).
[28] ‘OAuth 2.0 & OpenID Connect (OIDC): Technical Overview - YouTube’.
[Link] (accessed Aug. 02, 2022).
[29] ‘2022 Cyber Security Statistics: The Ultimate List Of Stats, Data & Trends | PurpleSec’.
[Link] (accessed Jul. 31, 2022).
[40] ‘Docker Tips: Running a Container With a Non Root User | by Luc Juggery | Better Programming’.
[Link] (accessed Jul.
26, 2022).
[42] ‘How to pass variables to a [Link] file using another .env file - YouTube’.
[Link] (accessed Jul. 18, 2022).
[45] ‘Wagoodman/dive: A tool for exploring each layer in a docker image’. [Link]
(accessed Jul. 20, 2022).
[46] ‘Finding leaked credentials in Docker images - How to secure your Docker images - YouTube’.
[Link] (accessed Jul. 20, 2022).
[47] ‘Supply Chain Attack - The Codecov case | Play by play - YouTube’.
[Link] (accessed Jul. 20, 2022).
[48] ‘Secrets exposed in Docker images: Hunting for secrets in Docker Hub’.
[Link] (accessed Jul. 20, 2022).
[50] ‘How to fix docker MariaDB correct definition of table mysql.column_stats – TechOverflow’.
[Link]
column_stats-expected-column-hist_type-at-position-9/ (accessed Jul. 29, 2022).
[60] ‘Postgresql what is the best way to export specific column from specific table from a DB to another -
Stack Overflow’. [Link]
export-specific-column-from-specific-table-fr (accessed Aug. 09, 2022).
[68] ‘Site Reliability Engineering in the Cloud’. [Link] (accessed May 12, 2022).
[69] ‘¿Qué es SRE?: Ingeniería de fiabilidad del sitio | NetApp’. [Link]
solutions/what-is-site-reliability-engineering/ (accessed May 12, 2022).
[71] ‘¿Qué son los microservicios? | AWS’. [Link] (accessed May 16,
2022).
[81] T. E. De, ‘The complete guide to developer-first application security’, Resource Libr., pp. 1–35, 2021 (accessed Jun. 16, 2022).
[84] ‘Code Quality and Code Security | SonarQube’. [Link] (accessed Jul. 06, 2022).
[88] ‘Open Source Container Security with Syft & Grype • Anchore’. [Link] (accessed Jul. 28, 2022).
[90] ‘Container Monitoring: Why, how, and what to look out for - Amazon Web Services’.
[Link] (accessed Jul. 10, 2022).
[91] ‘Docker monitoring tutorial - How to monitor Docker with Telegraf and InfluxDB | Cloud Native Computing
Foundation’. [Link]
telegraf-and-influxdb/ (accessed Jul. 11, 2022).
[92] ‘ABOUT THE SURVEY METHODOLOGY & RESPONDENTS Geographic Location Size of Organization’
(accessed Jul. 11, 2022).
[93] ‘How Does Single Sign-On Work?’ [Link] (accessed
Jul. 16, 2022).
[98] ‘Best Open Source Web Application Firewall to Secure Web Apps’. [Link]
source-web-application-firewall/ (accessed Aug. 01, 2022).
[100] ‘Secrets detection in the CI/CD pipeline | Detecting credentials with GitHub actions & GGShield -
YouTube’. [Link] (accessed Jul. 28, 2022).
[103] ‘Monitoring #Docker Using #Grafana | Monitor Docker Containers with Grafana - YouTube’.
[Link] (accessed Jul. 11, 2022).
[104] ‘Meet Grafana LOKI, a Log Aggregation System for Everything | Techno Tim Documentation’.
[Link] (accessed Aug. 12, 2022).
[105] ‘Python - Flask at first run: Do not use the development server in a production environment - Stack
Overflow’. [Link]
server-in-a-production-environmen (accessed Aug. 01, 2022).
[108] ‘Identificación y gestión de acceso para tus aplicaciones de microservicios con Keycloak y Oauth2-proxy’
[Link]
microservicios-con-keycloak-y-oauth2-proxy/ (accessed Aug. 03, 2022).
[115] ‘Using Hashcat to Crack User Passwords Stored by Keycloak | by Cyrill Bolliger’.
[Link]
(accessed Aug. 10, 2022).
Appendices
Appendix 1
[Link] file of the first implementation of the application (the version downloaded from the internet).
version: '2'
services:
  demo_db:
    build: ./demo_db
    image: demo_db
    container_name: demo_db
    hostname: demo_db
    restart: always
    networks:
      - demo_network
    volumes:
      - ./data/demo_db:/var/lib/mysql
    environment:
      MYSQL_ROOT_PASSWORD: demo
      MYSQL_DATABASE: demo
    expose:
      - 3306
    ports:
      - 3306:3306
  demo_webservice:
    build: ./demo_webservice
    image: demo_webservice
    container_name: demo_webservice
    hostname: demo_webservice
    restart: always
    networks:
      - demo_network
    depends_on:
      - demo_db
    volumes:
      - ../demo_webservice:/code
    expose:
      - 5000
    ports:
      - 5000:5000
networks:
  demo_network:
Appendix 2
Semgrep GitHub Action configuration code:
on:
  pull_request: {}
  push:
    branches:
      - main
    paths:
      - .github/workflows/[Link]
  schedule:
name: Semgrep
jobs:
  semgrep:
    name: Scan
    runs-on: ubuntu-20.04
    env:
    container:
      image: returntocorp/semgrep
    steps:
      - uses: actions/checkout@v3
      - run: semgrep ci
Appendix 3
SonarQube GitHub Action configuration code:
name: Sonarqube
on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]
  schedule:
jobs:
  build:
    name: Build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - uses: sonarsource/sonarqube-scan-action@master
        env:
Appendix 4
[Link] file for the local installation of SonarQube.
version: "3"
services:
  sonarqube:
    image: sonarqube:community
    depends_on:
      - db
    environment:
      SONAR_JDBC_URL: jdbc:postgresql://db:5432/sonar
      SONAR_JDBC_USERNAME: sonar
      SONAR_JDBC_PASSWORD: sonar
    volumes:
      - sonarqube_data:/opt/sonarqube/data
      - sonarqube_extensions:/opt/sonarqube/extensions
      - sonarqube_logs:/opt/sonarqube/logs
    ports:
      - "9000:9000"
  db:
    image: postgres:12
    environment:
      POSTGRES_USER: sonar
      POSTGRES_PASSWORD: sonar
    volumes:
      - postgresql:/var/lib/postgresql
      - postgresql_data:/var/lib/postgresql/data
volumes:
  sonarqube_data:
  sonarqube_extensions:
  sonarqube_logs:
  postgresql:
  postgresql_data:
Appendix 5
GitGuardian GitHub Action configuration code:
jobs:
  scanning:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2
        with:
      - uses: GitGuardian/ggshield-action@master
        env:
Appendix 6
Grype GitHub Action configuration code: [64]
on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]
  schedule:
name: Anchore
jobs:
  semgrep:
    name: Scan
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3
      - if: always()
        working-directory: docker/demo_db
        uses: anchore/scan-action@v3.0.0
        with:
          image: "todo-database:latest"
          fail-build: true
      - if: always()
        working-directory: docker/demo_webservice
      - if: always()
        uses: anchore/scan-action@v3.0.0
        with:
          image: "todo-webservice:latest"
          fail-build: true
Appendix 7
Trivy GitHub Action configuration code: [15]
name: trivy
on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]
  schedule:
jobs:
  build:
    name: Build
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3
      - working-directory: docker/demo_db
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: 'todo-database:latest'
          format: 'table'
          exit-code: '1'
          ignore-unfixed: true
          vuln-type: 'os,library'
          severity: 'CRITICAL,HIGH'
      - if: always()
        working-directory: docker/demo_webservice
      - name: Run Trivy vulnerability scanner on Webservice Docker
        if: always()
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: 'todo-webservice:latest'
          format: 'table'
          exit-code: '1'
          ignore-unfixed: true
          vuln-type: 'os,library'
          severity: 'CRITICAL,HIGH'
Appendix 8
[Link] file for the installation of Loki, Promtail and Grafana, for Docker monitoring and log centralization:
version: "3"
networks:
  loki:
services:
  loki:
    image: grafana/loki:2.4.0
    volumes:
      - /home/marti/Desktop/docker_volumes/loki:/etc/loki
    ports:
      - "3100:3100"
    restart: unless-stopped
    command: -[Link]=/etc/loki/[Link]
    networks:
      - loki
  promtail:
    image: grafana/promtail:2.4.0
    volumes:
      - /var/log:/var/log
      - /home/marti/Desktop/docker_volumes/promtail:/etc/promtail
    restart: unless-stopped
    command: -[Link]=/etc/promtail/[Link]
    networks:
      - loki
  grafana:
    image: grafana/grafana:latest
    user: "1000"
    volumes:
      - /home/marti/Desktop/docker_volumes/grafana:/var/lib/grafana
    ports:
      - "3000:3000"
    restart: unless-stopped
    networks:
      - loki
Appendix 9
Loki configuration file loki/[Link]:
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
common:
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    instance_addr: [Link]
    kvstore:
      store: inmemory
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
ruler:
  alertmanager_url: [Link]
Appendix 10
Promtail configuration file, promtail/[Link]:
server:
  http_listen_port: 9080
  grpc_listen_port: 0
positions:
  filename: /tmp/[Link]
clients:
  - url: [Link]
scrape_configs:
  - job_name: docker
    pipeline_stages:
      - docker: {}
    static_configs:
      - labels:
          job: docker
          __path__: /var/lib/docker/containers/*/*-[Link]
Appendix 11
[Link] file resulting from adding InfluxDB and Telegraf to the first version of the Docker Compose file for monitoring and centralizing logs:
version: "3"
networks:
  loki:
services:
  loki:
    image: grafana/loki:2.4.0
    volumes:
      - /home/marti/Desktop/docker_volumes/loki:/etc/loki
    ports:
      - "3100:3100"
    restart: unless-stopped
    command: -[Link]=/etc/loki/[Link]
    networks:
      - loki
  promtail:
    image: grafana/promtail:2.4.0
    volumes:
      - /var/log:/var/log
      - /home/marti/Desktop/docker_volumes/promtail:/etc/promtail
    restart: unless-stopped
    command: -[Link]=/etc/promtail/[Link]
    networks:
      - loki
  grafana:
    image: grafana/grafana:latest
    user: "1000"
    volumes:
      - /home/marti/Desktop/docker_volumes/grafana:/var/lib/grafana
    ports:
      - "3000:3000"
    restart: unless-stopped
    env_file:
      - '[Link]'
    networks:
      - loki
  influxdb:
    image: influxdb:1.8.3
    container_name: influxdb
    ports:
      - "8083:8083"
      - "8086:8086"
      - "8090:8090"
    env_file:
      - '[Link]'
    volumes:
      - /home/marti/Desktop/docker_volumes/data_influxdb:/var/lib/influxdb
  telegraf:
    image: telegraf:1.16.3
    container_name: telegraf
    user: telegraf:1000
    links:
      - influxdb
    volumes:
      - /home/marti/Desktop/docker_volumes/[Link]:/etc/telegraf/[Link]:ro
      - /var/run/[Link]:/var/run/[Link]
volumes:
  data_influxdb:
  data_grafana:
  [Link]:
Appendix 12
[Link] file for Keycloak installation:
services:
  postgres:
    image: postgres:14.4
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: "$POSTGRES_DB"
      POSTGRES_USER: "$POSTGRES_USER"
      POSTGRES_PASSWORD: "$POSTGRES_PASSWORD"
    restart: unless-stopped
  keycloak:
    image: [Link]/keycloak/keycloak:19.0.1-legacy
    environment:
      DB_VENDOR: POSTGRES
      DB_ADDR: postgres
      DB_DATABASE: keycloak
      DB_USER: keycloak
      DB_SCHEMA: public
      DB_PASSWORD: "$DB_PASSWORD"
      KEYCLOAK_USER: "$KEYCLOAK_USER"
      KEYCLOAK_PASSWORD: "$KEYCLOAK_PASSWORD"
      #KEYCLOAK_LOGLEVEL: DEBUG
    ports:
      - 8080:8080
      - 8443:8443
    depends_on:
      - postgres
    restart: unless-stopped
volumes:
  postgres_data:
    driver: local
Appendix 13
[Link] file for the Oauth2 proxy installation:
oauth2-proxy:
  container_name: oauth2-proxy
  image: [Link]/oauth2-proxy/oauth2-proxy:v7.2.0
  ports:
    - 4180:4180
  hostname: oauth2-proxy
  restart: unless-stopped
Appendix 14
Configuration file /usr/local/etc/oauth2-proxy/[Link]: [65]
http_address = "[Link]:4180"
cookie_secure = false
cookie_secret = "dskjfhkdjsfhkjds"
provider = "oidc"
client_id = "oauth2-proxy"
client_secret = "7SxEQFyKMj2DyJE0b18R9phghuCxw4hR"
oidc_issuer_url = "[Link]"
insecure_oidc_allow_unverified_email = true
redirect_url = "[Link]"
email_domains = [
  "*"
]
upstreams = [
  "[Link]"
]
Appendix 15
[Link] file to close the Keycloak and Oauth2 proxy session:
<!DOCTYPE html>
<html lang='en'>
<head>
  <title>LOGOUT on KeyCloak</title>
</head>
<body>
  <br>
  <div>
  </div>
  <button
    onclick="[Link]='[Link]openid-connect/logout';">
    logout
  </button>
</body>
</html>
Appendix 16
hash_parser.py file to format Keycloak hashes:
from parse import compile  # "compile" here is assumed to come from the third-party parse library

def main():
    # Read one Keycloak credential record and extract the hash and salt fields.
    f = open("dump/dump_hash", "r")
    p = compile('{"value":"{}","salt":"{}","additionalParameters":{}}')
    result = [Link]([Link]("\n")[0])
    # Write the hash in the "sha256:<iterations>:<salt>:<hash>" format.
    h = open("dump/format_hashes", "a")
    [Link]("sha256:27500:" + result[1] + ":" + result[0] + "\n")
    [Link]
    [Link]()
    print("Done!")

if __name__ == "__main__":
    main()
Appendix 17
Final [Link] file of the ToDo App:
version: '3.8'
services:
  demo_db:
    build: ./demo_db
    container_name: todo_app_db
    hostname: demo_db
    restart: always
    networks:
      demo_network:
        ipv4_address: [Link]
    volumes:
      - ./data/demo_db:/var/lib/mysql
    environment:
      MYSQL_ROOT_PASSWORD: "$MYSQL_ROOT_PASSWORD"
      MYSQL_DATABASE: "$MYSQL_DATABASE"
    expose:
      - 3306
    ports:
      - 3306:3306
    security_opt:
      - no-new-privileges:true
  demo_webservice:
    build: ./demo_webservice
    image: demo_webservice
    container_name: todo_app_webservice
    hostname: demo_webservice
    restart: always
    networks:
      demo_network:
        ipv4_address: [Link]
    depends_on:
      - demo_db
    volumes:
      - ../demo_webservice:/code
      - /etc/shadowd:/etc/shadowd
    environment:
      MYSQL_USER: "$MYSQL_USER"
      MYSQL_PASSWORD: "$MYSQL_ROOT_PASSWORD"
      MYSQL_DATABASE: "$MYSQL_DATABASE"
      LOGLEVEL: "$LOGLEVEL"
    expose:
      - 5000
    ports:
      - 5000:5000
    security_opt:
      - no-new-privileges:true
    read_only: true
  postgres:
    image: postgres:14.4
    container_name: keycloak_postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ../dump:/dump
    environment:
      POSTGRES_DB: "$POSTGRES_DB"
      POSTGRES_USER: "$POSTGRES_USER"
      POSTGRES_PASSWORD: "$POSTGRES_PASSWORD"
    restart: unless-stopped
  keycloak:
    image: [Link]/keycloak/keycloak:19.0.1-legacy
    container_name: keyclaok_logic
    environment:
      DB_VENDOR: POSTGRES
      DB_ADDR: postgres
      DB_SCHEMA: public
      DB_DATABASE: "$POSTGRES_DB"
      DB_USER: "$POSTGRES_USER"
      DB_PASSWORD: "$POSTGRES_PASSWORD"
      KEYCLOAK_USER: "$KEYCLOAK_USER"
      KEYCLOAK_PASSWORD: "$KEYCLOAK_PASSWORD"
    ports:
      - 8080:8080
    depends_on:
      - postgres
    restart: unless-stopped
    security_opt:
      - no-new-privileges:true
  db:
    image: zecure/shadowd_database:12.4
    container_name: shadow_daemon_db
    restart: always
    volumes:
      - "/var/lib/shadowd/db:/var/lib/postgresql/data"
    environment:
      POSTGRES_PASSWORD: "$DB_SHADOWD_PASSWORD"
      SHADOWD_ENV_DB_LOCATION: "$SHADOWD_ENV_DB_LOCATION"
      SHADOWD_DB_LOCATION: "$SHADOWD_DB_LOCATION"
      SHADOWD_ENV_LOCATION: "$SHADOWD_ENV_LOCATION"
    networks:
      demo_network:
        ipv4_address: [Link]
    security_opt:
      - no-new-privileges:true
  web:
    image: zecure/shadowd_ui:2.0.6
    container_name: shadow_daemon_ui
    restart: always
    ports:
      - 8200:80
    links:
      - db
    depends_on:
      - db
    environment:
      SHADOWD_DB_HOST: "$DB_SHADOWD_HOST"
      SHADOWD_DB_PASSWORD: "$DB_SHADOWD_PASSWORD"
      SHADOWD_ENV_DB_LOCATION: "$SHADOWD_ENV_DB_LOCATION"
      SHADOWD_DB_LOCATION: "$SHADOWD_DB_LOCATION"
      SHADOWD_ENV_LOCATION: "$SHADOWD_ENV_LOCATION"
    networks:
      demo_network:
        ipv4_address: [Link]
    security_opt:
      - no-new-privileges:true
  shadowd:
    image: zecure/shadowd:2.2.0
    container_name: shadow_daemon_logic
    restart: always
    ports:
      - 9115:9115
    links:
      - db
    depends_on:
      - db
    environment:
      SHADOWD_DB_HOST: "$DB_SHADOWD_HOST"
      SHADOWD_DB_PASSWORD: "$DB_SHADOWD_PASSWORD"
      SHADOWD_ENV_DB_LOCATION: "$SHADOWD_ENV_DB_LOCATION"
      SHADOWD_DB_LOCATION: "$SHADOWD_DB_LOCATION"
      SHADOWD_ENV_LOCATION: "$SHADOWD_ENV_LOCATION"
    networks:
      demo_network:
        ipv4_address: [Link]
    security_opt:
      - no-new-privileges:true
networks:
  demo_network:
    driver: bridge
    ipam:
      config:
        - subnet: [Link]/16
          gateway: [Link]
volumes:
  postgres_data:
    driver: local
Glossary