Multi-Tenant Database Architectures

UNIT-III

Chapter

Multi-Tenant Software
The term "software multitenancy" refers to a software architecture in which a single instance
of software runs on a server and serves multiple tenants. A tenant is a group of users who share
common access with specific privileges to the software instance.

The advantages of a multi-tenancy SaaS over a third-party-hosted, single-tenancy application


include the following:
• Lower costs through economies of scale: With a single-tenancy-hosted solution, SaaS
vendors must build out their data center to accommodate new customers.
In contrast, in a multi-tenant environment, new users get access to the same basic
software, so scaling has far fewer infrastructure implications for vendors (depending on
the size of the application and the amount of infrastructure required).
• Shared infrastructure leads to lower costs: SaaS allows companies of all sizes to share
infrastructure and data center operational costs. Users don’t need to add applications and
more hardware to their data centers, and some small- to medium-sized businesses don’t
even need data centers if they utilize SaaS.
• Ongoing maintenance and updates: End users don’t need to pay costly maintenance
fees in order to keep their software up to date. New features and updates are included
with a SaaS subscription and are rolled out by the vendor.
• Configuration can be done while leaving the underlying codebase
unchanged: Although on-premises applications and single-tenant-hosted solutions are
often customized, this endeavour is costly and requires changes to an application’s code.
Additionally, this customization can make upgrades time-consuming, because an upgrade
might not be compatible with your customization.
• Most multi-tenant SaaS solutions are designed to be highly configurable so that
businesses can make the application perform the way they want without changing the
underlying code or data structure. Because the code is unchanged, upgrades can be
performed easily.
• Vendors have a vested interest in making sure everything runs smoothly: Multi-
tenant SaaS providers have, in a sense, all their eggs in one basket. Although this sounds
dangerous, it’s a benefit to end users.
In a single-tenant environment, if there is a service disruption, it may affect only one
customer, meaning that the vendor might be slow to respond and fail to take the
necessary steps to ensure the problem doesn’t recur.
• In contrast, with a multi-tenant solution, a slight glitch could affect all of a vendor’s
customers. It is, therefore, imperative that SaaS vendors invest significant amounts of
money and effort into ensuring uptime, continuity, and performance.

Multi-tenant cloud vs single-tenant cloud

In a single-tenant cloud, only one customer is hosted on a server and is granted access to it.
Because multi-tenant architectures host multiple customers on the same servers, it is
important to fully understand the security and performance guarantees the provider is offering. Single-
tenant clouds give customers more control over the management of data, storage, security
and performance.

Multi-Entity Support:

Figure shown below depicts the changes that need to be made in an application to support basic multi-entity
features, so that users only access data belonging to their own units. Each database table is appended with a
column (OU_ID) which marks the organizational unit each data record belongs to.
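The OU_ID scheme described above can be sketched with an in-memory SQLite database (the table, column and unit names here are illustrative, not from the text):

```python
import sqlite3

# Illustrative single-schema multi-entity table: every record carries an
# OU_ID column identifying the organizational unit that owns it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, ou_id TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (1, "Alice", "branch_north"),
    (2, "Bob",   "branch_south"),
    (3, "Carol", "branch_north"),
])

def customers_for_unit(ou_id):
    # The application appends the OU filter to every query, so users
    # only access data belonging to their own unit.
    rows = conn.execute(
        "SELECT id, name FROM customers WHERE ou_id = ?", (ou_id,))
    return [name for _, name in rows]

print(customers_for_unit("branch_north"))  # ['Alice', 'Carol']
```

The key point is that the filter is applied by the application layer on every query; the schema itself is shared by all units.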

In the single schema model of Figure 9.2, a Custom Fields table stores meta information and data values for all
tables in the application. Mechanisms for handling custom fields in a single schema architecture are usually
variants of this scheme.

Multi-Schema approach

Instead of insisting on a single schema, it is sometimes easier to modify even an existing application to use
multiple schemas, as are supported by most relational databases. In this model, the application computes which
OU the logged-in user belongs to, and then connects to the appropriate database schema. Such an architecture is
shown in Figure 9.3
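The multi-schema model can be sketched as one separate database per organizational unit (SQLite has no schemas, so a separate in-memory database stands in for each schema; in a real RDBMS this would be CREATE SCHEMA and per-connection schema selection):

```python
import sqlite3

# One "schema" (here: one separate SQLite database) per organizational unit.
_schemas = {}

def connection_for(ou_id):
    # The application computes the user's OU, then connects to that unit's
    # own schema; tables need no OU_ID column in this model.
    if ou_id not in _schemas:
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
        _schemas[ou_id] = db
    return _schemas[ou_id]

connection_for("branch_north").execute(
    "INSERT INTO customers VALUES (1, 'Alice')")
# Data inserted for one unit is invisible to any other unit:
rows = connection_for("branch_south").execute(
    "SELECT * FROM customers").fetchall()
print(rows)  # []
```

Isolation comes from the schema boundary rather than from a filter column, which is why even an existing application is often easier to adapt to this model.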

Data Access Control for Enterprise Applications

For the most part, multi-tenancy as discussed above appears to be of use primarily in a software-as-a-service
model, but there are certain cases where multi-tenancy can be useful within the enterprise as well. We have
already seen that supporting multiple entities, such as bank branches, is essentially a multi-tenancy requirement.
Similar needs can arise if a workgroup-level application needs to be rolled out to many independent teams that
usually do not need to share data.

In these cases access to data may need to be controlled based on the values of any field of a table, such as high-
value transactions being visible only to some users, or special customer names being invisible without explicit
permission. Such requirements are referred to as Data Access Control (DAC) needs.
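A field-value-based access rule of this kind can be sketched as a predicate per role (the roles, field names and threshold below are invented for illustration):

```python
# Minimal Data Access Control sketch: each role maps to a predicate over
# record fields. The 100,000 threshold and role names are made up.
RULES = {
    "teller":  lambda rec: rec["amount"] < 100_000,  # no high-value txns
    "manager": lambda rec: True,                     # sees everything
}

def visible(records, role):
    # Filter records through the role's predicate before returning them.
    allow = RULES[role]
    return [r for r in records if allow(r)]

txns = [{"id": 1, "amount": 50_000}, {"id": 2, "amount": 500_000}]
print([t["id"] for t in visible(txns, "teller")])   # [1]
print([t["id"] for t in visible(txns, "manager")])  # [1, 2]
```

In practice such rules are usually enforced in a shared data-access layer (or via database row-level security) so that no application query can bypass them.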

Chapter
Data in the Cloud

Since the 1980s, relational database technology has been the ‘default’ data storage and retrieval mechanism
used in the vast majority of enterprise applications. In the process of creating a planetary scale web search
service, Google in particular has developed a massively parallel and fault tolerant distributed file system (GFS)
along with a data organization (BigTable) and programming paradigm (MapReduce) that is markedly different
from the traditional relational model. Such ‘cloud data strategies’ are particularly well suited for large-volume
massively parallel text processing, as well as possibly other tasks, such as enterprise analytics. At the same time
there have been new advances in building specialized database organizations optimized for analytical data
processing, in particular column-oriented databases such as Vertica.

Relational databases:

Before we delve into cloud data structures we first review traditional relational database systems and how they
store data. Users (including application programs) interact with an RDBMS via SQL; the database ‘front-end’ or
parser transforms queries into memory and disk level operations to optimize execution time. Data records are
stored on pages of contiguous disk blocks, which are managed by the disk-space-management layer.

• Database systems usually do not rely on the file system layer of the OS and instead manage disk space
themselves

• Rows are stored on pages contiguously, also called a ‘row-store’, and indexed using B+-trees

• The database needs to be able to adjust its page replacement policy when needed and pre-fetch pages from
disk based on expected access patterns, which can be very different from file operations.

• Relational records (tabular rows) are stored on disk pages and accessed through indexes on specified
columns, which can be B+-tree indexes, hash indexes, or bitmap indexes.

• B+-tree indexes are well suited for write and transaction processing; for read-dominated analytical
workloads, bitmap indexes, cross-table indexes and materialized views provide more efficient access to
records and their attributes.
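The effect of an index on the access path can be seen with SQLite's EXPLAIN QUERY PLAN (the table and index names below are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.execute("CREATE INDEX idx_customer ON orders (customer)")

def access_path(sql):
    # EXPLAIN QUERY PLAN reports whether a query scans the whole table
    # or searches via an index; the detail text is the last column.
    return " ".join(row[-1] for row in
                    conn.execute("EXPLAIN QUERY PLAN " + sql))

indexed   = access_path("SELECT * FROM orders WHERE customer = 'x'")
unindexed = access_path("SELECT * FROM orders WHERE total > 10")
print(indexed)    # search via idx_customer
print(unindexed)  # full table scan, since total has no index
```

An equality predicate on the indexed column is answered through the B+-tree index, while the predicate on the unindexed column forces a scan of every row.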

• Recently column-oriented storage [61] has been proposed as a more efficient mechanism suited for
analytical workloads

Over the years database systems have evolved towards exploiting the parallel computing capabilities of multi-
processor servers as well as harnessing the aggregate computing power of clusters of servers connected by a
high-speed network.

Figure 10.2 illustrates three parallel/distributed database architectures.



Cloud File Systems: GFS and HDFS

The Google File System (GFS) [26] is designed to manage relatively large files using a very large distributed
cluster of commodity servers connected by a high-speed network.

It is therefore designed to

(a) expect and tolerate hardware failures, even during the reading or writing of an individual file (since files are
expected to be very large)

(b) support parallel reads, writes and appends by multiple client programs.

The Hadoop Distributed File System (HDFS) is an open source implementation of the GFS architecture that is
also available on the Amazon EC2 cloud platform; we refer to both GFS and HDFS as ‘cloud file systems.’

The architecture of cloud file systems is illustrated in Figure 10.3. Large files are broken up into ‘chunks’ (GFS)
or ‘blocks’ (HDFS), which are themselves large (64MB being typical). These chunks are stored on commodity
(Linux) servers called Chunk Servers (GFS) or Data Nodes (HDFS); further each chunk is replicated at least
three times, both on a different physical rack as well as a different network segment in anticipation of possible
failures of these components apart from server failures.
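The chunking and replica-placement idea can be sketched as follows (the chunk size is shrunk to a few bytes and the rack/server names are invented; GFS and HDFS use chunks of around 64 MB):

```python
import itertools

# Toy cloud-file-system sketch: a file is split into fixed-size chunks and
# each chunk is replicated on three servers on distinct racks, so a whole
# rack can fail without losing the chunk.
CHUNK_SIZE = 4
RACKS = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"], "rack3": ["dn5"]}

def split_into_chunks(data, size=CHUNK_SIZE):
    return [data[i:i + size] for i in range(0, len(data), size)]

def place_replicas(n_chunks, replicas=3):
    # Round-robin over racks so the copies of each chunk land on
    # different racks (assumes at least `replicas` racks).
    placement = []
    rack_cycle = itertools.cycle(RACKS.items())
    for _ in range(n_chunks):
        chosen = [next(rack_cycle) for _ in range(replicas)]
        placement.append([servers[0] for _, servers in chosen])
    return placement

chunks = split_into_chunks(b"hello world!")
print(len(chunks))                  # 3 chunks of up to 4 bytes
print(place_replicas(len(chunks)))  # 3 servers per chunk
```

Real placement also considers network segments and server load, but the principle is the same: never put all replicas of a chunk behind one failure domain.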


BigTable, HBase and Dynamo

• BigTable [9] is a distributed structured storage system built on GFS.


• Hadoop’s HBase is a similar open source system that uses HDFS.

• A BigTable is essentially a sparse, distributed, persistent, multidimensional sorted ‘map’.

• Data in a BigTable is accessed by a row key, a column key and a timestamp. Each column can store
arbitrary name–value pairs of the form (column-family:label, string).

• Each BigTable cell (row, column) can contain multiple versions of the data, stored in decreasing
timestamp order.

Example:

Since data in each column family is stored together, using this data organization results in efficient data access
patterns depending on the nature of analysis: For example, only the location column family may be read for
traditional data-cube based analysis of sales, whereas only the product column family is needed for say, market-
basket analysis. Thus, the BigTable structure can be used in a manner similar to a column-oriented database.
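The sparse, versioned map structure can be modeled in a few lines (a toy model with invented keys; real BigTable also partitions the map across servers):

```python
from collections import defaultdict

# Toy BigTable model: a sparse map keyed by (row key, "family:label"),
# where each cell holds its versions in decreasing timestamp order.
class TinyBigTable:
    def __init__(self):
        self._cells = defaultdict(list)  # (row, column) -> [(ts, value)]

    def put(self, row, column, timestamp, value):
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(reverse=True)      # newest version first

    def get(self, row, column):
        # Reads return the most recent version by default.
        versions = self._cells[(row, column)]
        return versions[0][1] if versions else None

t = TinyBigTable()
t.put("sale#001", "location:city", 1, "Pune")
t.put("sale#001", "location:city", 2, "Mumbai")
print(t.get("sale#001", "location:city"))  # 'Mumbai' (latest version)
```

Because absent cells simply have no entry, the map stays sparse: different rows may populate entirely different column labels without any schema change.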

Figure 10.5 illustrates how BigTable tables are stored on a distributed file system such as GFS or HDFS.

- Each table is split into different row ranges, called tablets


- Each tablet is managed by a tablet server that stores each column family for the given row range
in a separate distributed file, called an SSTable.
- Additionally, a single Metadata table is managed by a meta-data server that is used to locate the
tablets of any user table in response to a read or write request.
- The Metadata table itself can be large and is also split into tablets, with the root tablet being
special in that it points to the locations of other meta-data tablets.

BigTable and HBase rely on the underlying distributed file systems GFS and HDFS respectively and therefore
also inherit some of the properties of these systems.

In particular large parallel reads and inserts are efficiently supported, even simultaneously on the same table,
unlike a traditional relational database.

Dynamo

• Developed at Amazon and underlies its SimpleDB key-value pair database.

• Unlike BigTable, Dynamo was designed specifically for supporting a large volume of concurrent
updates, each of which could be small in size, rather than bulk reads and appends as in the case of
BigTable and GFS

• Dynamo also replicates data for fault tolerance, but uses distributed object versioning and quorum-
consistency to enable writes to succeed without waiting for all replicas to be successfully updated,
unlike in the case of GFS.

The architecture of Dynamo is illustrated in Figure 10.6.



- Objects are key-value pairs; keys and values are arbitrary arrays of bytes.


- An MD5 hash of the key is used to generate a 128-bit hash value.
- The range of this hash function is mapped to a set of virtual nodes arranged in a ring, so each
key gets mapped to one virtual node.
- The object is replicated at this primary virtual node as well as at N − 1 additional virtual nodes.
- Notice that the Dynamo architecture is completely symmetric with each node being equal, unlike
the BigTable/GFS architecture that has special master nodes at both the BigTable as well as GFS
layer.
- A write request on an object is first executed at one of its virtual nodes which then forwards the
request to all nodes having replicas of the object.
- Objects are always versioned, so a write merely creates a new version of the object with its local
timestamp (Tx on node X) incremented
- In Dynamo write operations are allowed to return even if all replicas are not updated.
- Dynamo is able to handle transient failures by passing writes intended for a failed node to
another node temporarily.
- Finally, Dynamo can be implemented using different storage engines at the node level, such as
Berkeley DB or even MySQL.
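The ring placement described above can be sketched with consistent hashing (node names, the ring size and the replication factor N=3 are invented for illustration):

```python
import hashlib

# Dynamo-style placement sketch: MD5 maps each key to a point on a ring of
# virtual nodes; the object lives at the first node at or after that point
# (wrapping around), plus the next N-1 nodes clockwise.
VIRTUAL_NODES = sorted(["vnA", "vnB", "vnC", "vnD", "vnE"],
                       key=lambda n: hashlib.md5(n.encode()).hexdigest())
N = 3  # replication factor

def ring_position(key):
    # MD5 yields a 128-bit value; fixed-length hex compares like an integer.
    return hashlib.md5(key.encode()).hexdigest()

def preference_list(key, n=N):
    pos = ring_position(key)
    positions = [ring_position(v) for v in VIRTUAL_NODES]
    start = next((i for i, p in enumerate(positions) if p >= pos), 0)
    return [VIRTUAL_NODES[(start + i) % len(VIRTUAL_NODES)] for i in range(n)]

print(preference_list("cart:user42"))  # 3 distinct virtual nodes
```

Because placement is a pure function of the key, any node can compute where an object lives without consulting a master, which is what makes the architecture symmetric.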

Cloud Data Stores: Datastore and SimpleDB

The Google and Amazon cloud services do not directly offer BigTable and Dynamo to cloud users.

Google and Amazon both offer simple key-value pair database stores, viz. Google App Engine’s Datastore and
Amazon’s SimpleDB.

Chapter
Database Technology

Database in the Cloud

Consumers can avail themselves of database facilities in the cloud in two forms.

• The first one is the general database solution, implemented through the installation of a
database on IaaS (a virtual machine delivered as IaaS).
- Users can deploy database applications on cloud virtual machines like any other
application software.
- Apart from this, ready-made machine images supplied by vendors are also
available with pre-installed and pre-configured databases.
- Example: Amazon provides a ready-made EC2 machine image with pre-installed Oracle
Database.

• The other one is delivered by service providers as Database-as-a-Service (DBaaS), where the
vendor fully manages backend administration jobs like installation, security management
and resource assignment.
- The operational burden of provisioning, configuration and backup is managed by the
service operators.

Data Models

SQL Model or Relational Model

- It was not designed for distributed data storage, which makes scaling a database difficult.
- Oracle Database, Microsoft SQL Server, Open-source MySQL or Open-source PostgreSQL come under
this category.

NoSQL Model or Non-relational Model

- Suitable for building scalable systems.


- A NoSQL database is built to serve heavy read-write loads and is suitable for the storage and retrieval of
unstructured data sets.
- Amazon SimpleDB, Google Datastore and Apache Cassandra are a few examples of NoSQL database
systems.

Database-as-a-service

• It is offered on a pay-per-usage basis and provides on-demand access to a database for the storage of data.
• Database-as-a-Service (DBaaS) is a cloud service offering that is managed by cloud service providers.
• DBaaS has all of the characteristics of cloud services, such as scaling and metered billing.
• Examples of DBaaS for unstructured data include Amazon SimpleDB, Google Datastore and Apache Cassandra.

Relational DBMS in Cloud

There are two ways to use RDBMS on cloud.

- Relational database deployment on cloud


- Relational Database-as-a-Service or fully-managed RDBMS

Relational database deployment on cloud:

There are two ways of deploying an RDBMS on cloud:

1. Users can install a database application on a cloud server just as on a local machine.


2. Many cloud services provide ready-made machine images that already include an installation of a database.

Deploying a relational database on a cloud server is the ideal choice for users who require absolute control over the
management of the database.

Relational Database-as-a-service

Many cloud service providers offer the customary relational database systems as fully-managed services which
provide functionalities similar to what is found in Oracle Server, SQL Server or MySQL Servers.

Relational database management system offerings are fully-managed by cloud providers.

Amazon RDS:

Amazon Relational Database Service, or Amazon RDS, is a relational database service available with AWS.
It supports the capabilities of Oracle Server, Microsoft SQL Server, open-source PostgreSQL and MySQL.

Two different pricing options are available with Amazon RDS

- Reserved DB Instances: require a one-time payment and offer three different DB Instance types (for light,
medium and heavy utilization).
- On-Demand DB Instances: provide the opportunity of hourly payments with no long-term
commitments.

Google Cloud SQL:

• Google Cloud SQL is a MySQL database that lives in Google’s cloud and is fully managed by Google.
• It is very simple to use and integrates very well with Google App Engine applications written in Java, Python,
PHP and Go.
• Google Cloud SQL is also accessible through the MySQL client and other tools that work with MySQL
databases.
• Google Cloud SQL offers updated releases of MySQL.

Google offers two different billing options:

- The Packages option is suitable for users who use the database extensively each month.
- Per-Use billing, charged on an hourly basis, is preferable for lighter usage.

Azure SQL Database:

• Provides functionalities of Microsoft SQL Server.


• Like Microsoft SQL Server, SQL Database uses T-SQL as the query language.
• The service is available in three tiers (Basic, Standard and Premium) with different hourly-basis pricing
options.
• The Premium tier supports mission-critical workloads with high transaction volumes and many concurrent
users, whereas the Basic tier is for small databases with a single operation at a given point in time.

Amazon RDS, Google Cloud SQL and Azure SQL Databases deliver RDBMS as-a-Service.

Non-Relational DBMS in cloud

Non-relational database system is another unique offering in the field of data-intensive computing.

Big Data:

Big data is used to describe both structured and unstructured data that is massive in volume.

The three characteristics of big data are described below.

Volume: A typical PC probably had 10 gigabytes of storage in the year 2000. At that time, excessive data
volume was a storage issue, as storage was not as cheap as it is today. Today, social networking sites generate a few
thousand terabytes of data every day.

Velocity: Data now streams in at an unprecedented rate and must be dealt with in a timely manner. Responding
quickly to customers’ actions is a business challenge for any organization.

Variety: Data of all formats are important today. Structured or unstructured texts, audio, video, image, 3D data and
others are all being produced every day.

NoSQL DBMS:

NoSQL is a class of database management system that does not follow all of the rules of a relational DBMS.

The term NoSQL can be interpreted as ‘Not Only SQL’ as it is not a replacement but rather it is a complementary
addition to RDBMS

NoSQL is not against SQL and it was developed to handle unstructured big data in an efficient way to provide
maximum business value

CAP Theorem:

The abbreviation CAP stands for Consistency, Availability and Partition tolerance of data.

The CAP theorem (also known as Brewer’s theorem) says that it is impossible for a distributed computer system to meet
all three aspects of CAP simultaneously.
█ Consistency: This means that data in the database remains consistent after the execution of an operation. For
example, once data is written or updated, all future read requests will see that data.
█ Availability: It guarantees that the database always remains available, without any downtime.
█ Partition tolerance: The system continues to function even when network partitioning separates one part of
the database from another; if one part becomes unreachable, the other parts remain unaffected and can operate
properly.

Any distributed database system must follow this ‘two-of-three’ philosophy: it can fully provide at most two of the three properties.

CA: This is suitable for systems designed to run on a cluster at a single site, so that all of the nodes always remain
in contact and the worry of the network partitioning problem almost disappears. But if a partition does occur, the
system fails.

CP: This model is tolerant to network partitioning problem, but suitable for systems where 24 × 7 availability is not a
critical issue. Some data may become inaccessible for a while but the rest remains consistent or accurate.

AP: This model is also tolerant to network partitioning problem as partitions are designed to work independently. 24
× 7 availability of data is also assured but sometimes some of the data returned may be inaccurate.
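The CP/AP trade-off can be illustrated with a toy simulation (entirely invented, not from the text): during a partition, a CP-style read refuses to answer from a stale replica, while an AP-style read answers but may return stale data.

```python
# Toy CP-vs-AP illustration: two replicas of one value; a write reaches
# only replica `a` while the network is partitioned, so `b` is stale.
class Replica:
    def __init__(self, value):
        self.value = value

a, b = Replica("v1"), Replica("v1")
partitioned = True
a.value = "v2"   # write succeeds on a; b cannot be updated during partition

def read(replica, mode):
    if partitioned and replica.value != a.value:  # replica is stale
        if mode == "CP":
            # Consistency preserved by refusing to answer (unavailable).
            raise RuntimeError("unavailable during partition")
        return replica.value   # AP: available, but possibly stale
    return replica.value

print(read(b, "AP"))   # 'v1' -- stale but available
try:
    read(b, "CP")
except RuntimeError as e:
    print(e)           # unavailable during partition
```

The comparison against `a.value` is just a stand-in for "this replica has missed an update"; real systems detect staleness with version vectors or quorums rather than by peeking at the other side.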

BASE Principle

A relational database system treats consistency and availability as essential criteria. Fulfillment of these
criteria is ensured by following the ACID (Atomicity, Consistency, Isolation and Durability) properties in an RDBMS.

A NoSQL database tackles the consistency issue in a different way. It is not so stringent on consistency; rather, it
focuses on partition tolerance and availability. Hence, a NoSQL database need not follow the ACID rules.

NoSQL database should be much easier to scale out (horizontal scaling) and capable of handling large volume of
unstructured data. To achieve these, NoSQL databases usually follow BASE principle which stands for ‘Basically
Available, Soft state, Eventual consistency’.

The three criteria of BASE are explained below:


█ Basically Available: This principle states that data should remain available even in the presence of multiple node
failures. This is achieved by using a highly-distributed approach with multiple replications in the database
management.

█ Eventual Consistency: This principle states that immediately after an operation, data may appear inconsistent, but
it should ultimately converge to a consistent state. For example, two users querying the same data
immediately after a transaction (on that data) may get different values, but eventually consistency is regained.

█ Soft State: The eventual consistency model allows the database to be inconsistent for some time. But to bring it
back to consistent state, the system should allow change in state over time even without any input. This is known as
Soft state of system.

BASE does not guarantee strong consistency. The idea behind this is that data consistency is the application developer’s
problem and should be handled by the developer through appropriate programming techniques.

Features of NoSQL Database

Flexible Schemas: Unlike relational database, NoSQL database is schema-free.


Non-relational
Scalability
Auto-distribution : Distribution and replication of data segments are not inherent features of relational database;
these are responsibilities of application developers. In NoSQL, these happen automatically.
Auto-replication
Integrated Caching : This feature reduces latency and increases throughput by keeping frequently-used data in
system memory as much as possible.

Despite its many benefits, NoSQL fails to provide, in specific cases, the rich analytical functionality that an RDBMS serves.

NoSQL Database Types

1. Key-Value Database: Amazon’s DynamoDB and Azure Table Storage


2. Document-Oriented Database :
• It is similar to the Key-Value stores with the values stored in structured documents.
• The documents are schema-free and can be of any format as long as the database application can understand
its internal structure. Generally document-oriented databases use some of the XML, JSON (JavaScript Object
Notation) or Binary JSON (BSON) formats.
• MongoDB, Apache CouchDB, Couchbase

3. Column-Family Database
• A Column-Family Database (or Wide-Column Data Store/Column Store) stores data grouped in columns.
• Each column consists of three elements: a name, a value and a timestamp.
• Columns of a similar type together form a column family, and the columns in a family are often accessed together.
• A column family can contain a virtually unlimited number of columns.
• The difference between column stores and key-value stores is that column stores are optimized to handle
data along columns.
• Column stores show better analytical power and provide improved performance by imposing a certain
amount of rigidity to a database schema
• Hadoop’s Hbase

4. Graph Database

• Data is stored as graph structures using nodes and edges.


• The entities are represented as nodes and the relationships between entities as edges.
• Useful for storing information about relationships when the number of elements is huge, such as social
connections.
• Neo4J, Info-Grid, Infinite Graph
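As a concrete illustration of the key-value and document models above, a minimal schema-free store can be sketched in a few lines (the class, method and field names are invented):

```python
import json

# Toy document store: documents are JSON objects with no fixed schema,
# queried by exact match on field values.
class DocStore:
    def __init__(self):
        self._docs = []

    def insert(self, doc):
        # Round-trip through JSON to ensure the document is serializable,
        # as a real document database would require.
        self._docs.append(json.loads(json.dumps(doc)))

    def find(self, **criteria):
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in criteria.items())]

db = DocStore()
db.insert({"name": "Alice", "city": "Pune"})
db.insert({"name": "Bob", "city": "Pune", "tags": ["vip"]})  # extra field: OK
print(len(db.find(city="Pune")))  # 2
```

Note that the second document carries a field the first one lacks; nothing resembling ALTER TABLE is needed, which is the "flexible schema" property listed earlier.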

Commercial NoSQL Databases

• Apache’s HBase
• Amazon’s DynamoDB
• Apache’s Cassandra
• Google Cloud Datastore
• MongoDB
• Amazon’s SimpleDB
• Apache’s CouchDB
• Neo4j

Content Delivery Network

Content Delivery in the Cloud

Content is any kind of data ranging from text to audio, image, video and so on. Delivering this content to any location at any
time is a critical issue for the success of cloud services.

The Problem

The problem of delivering content in cloud exists due to the distance between the source of the content and the
locations of content consumers.

To meet the business needs and to fulfill application demands, the cloud-based services require a real-time information
delivery system (like live telecasting of events) to respond instantaneously. This is only possible when LAN-like
performance can be achieved in network communication for content delivery.

Cloud computing is basically built upon the Internet, but cloud based services require LAN like performance in
network communication.

The Solution

• Rather than remotely accessing content from data centers, providers started treating content management as a set of
cached services located in servers near consumers.

• The basic idea is that instead of accessing content centrally stored in a few cloud data centers, it is better to
replicate instances of the data at different locations.

• A network of such cached servers, made for the fast and efficient delivery of content, is called a Content Delivery
Network (CDN).

Content Delivery Network

A CDN is a network of distributed servers containing content replicas; each request for web content is served
from a suitable content server, chosen based on the geographic location of the request’s origin and the locations of
the content servers.

The actual or original source of any content is known as content source or origin server.

The additional content servers, placed at different geographic locations, are known as edge servers.

CDN enables faster delivery of content by caching and replicating content from ‘content source’ to multiple ‘edge servers’ or
‘cache servers’ which are strategically placed around the globe.

Content Types:

Digital content can be categorized into two types:

- static content
- Live media or stream

Delivering live streaming media to users around the world is more challenging than the delivery of static content.

The Policy decisions



a. Placement of the edge servers in the network.


b. Content management policy, which decides replication management.
c. Content delivery policy, which mainly depends on the caching technique.
d. Request routing policy, which directs a user’s request for content to the appropriate
edge server.

a. Placement of the edge servers in the network: The placement locations of edge servers are often
determined with the help of heuristic (or exploratory) techniques.
b. Content management policy, which decides replication management:
There are two policies of content management: full-site replication and partial-site replication.
Full-site replication is done when a site is static.
c. Content delivery policy, which mainly depends on the caching technique: The cache update policies,
along with cache maintenance, are important in managing content over a CDN.
Content update policies: on-demand updates and periodic updates.
d. Request routing policy: User requests are directed to the optimally closest edge server in the CDN
that can best serve the request.

Policy decisions play a major role in the performance of any CDN service.
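A distance-based request-routing policy can be sketched as follows (the edge-server names and coordinates are invented; real CDNs route on network measurements such as latency, not straight-line distance):

```python
# Toy request routing: direct each request to the edge server whose
# (latitude, longitude) is closest to the client's location.
EDGE_SERVERS = {
    "edge_eu":   (50.0, 8.0),    # invented coordinates
    "edge_us":   (39.0, -77.0),
    "edge_asia": (19.0, 73.0),
}

def closest_edge(client_lat, client_lon):
    def dist2(pos):
        # Squared Euclidean distance is enough for picking a minimum.
        lat, lon = pos
        return (lat - client_lat) ** 2 + (lon - client_lon) ** 2
    return min(EDGE_SERVERS, key=lambda name: dist2(EDGE_SERVERS[name]))

print(closest_edge(48.8, 2.3))   # client near Paris  -> 'edge_eu'
print(closest_edge(18.9, 72.8))  # client near Mumbai -> 'edge_asia'
```

In deployed CDNs this decision is usually made at DNS-resolution time, so the client simply receives the IP address of the chosen edge server.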

Push and Pull

In the push mechanism, content is automatically transferred to edge servers whenever it is changed.


In the pull mechanism, the latest version of content is transferred to an edge server only when a request is raised for that content.
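The pull mechanism amounts to a pull-through cache, which can be sketched as follows (the origin content and paths are invented):

```python
# Toy pull-through edge cache: on a miss the edge server fetches the
# content from the origin, then serves later requests from its own cache.
ORIGIN = {"/logo.png": b"PNG...", "/index.html": b"<html>...</html>"}

class EdgeServer:
    def __init__(self):
        self.cache = {}
        self.origin_fetches = 0

    def get(self, path):
        if path not in self.cache:        # cache miss: pull from origin
            self.origin_fetches += 1
            self.cache[path] = ORIGIN[path]
        return self.cache[path]           # cache hit: serve locally

edge = EdgeServer()
edge.get("/logo.png")
edge.get("/logo.png")
print(edge.origin_fetches)  # 1 -- the second request was a cache hit
```

A push-based edge would instead have the origin write into `cache` proactively on every content change; pull trades freshness for bandwidth by fetching lazily (real CDNs add expiry/invalidation on top of this).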

The Model of CDN

Advantages of CDN:

- CDNs facilitate proper distribution and optimized routing service of web-content.


- Accommodating Heavy Traffic: A CDN provides mechanisms to manage heavy network
traffic efficiently through the distribution of content delivery responsibility among multiple
servers spread over the network.
- Support for More Simultaneous Users: A CDN enables a cloud service provider, or any other
content provider, to support more simultaneous users consuming their services.
This is because the strategically placed servers in a CDN ensure that the network always
maintains very high data throughput.
- Less Load on Servers
- Faster Content Delivery
- Lower Cost of Delivery
- Controlling Asset Delivery: Depending upon usage statistics, CDN operators decide where to
prioritize extra capacity to ensure the smooth running of all systems.
- Facilitates Scalability
- Better Security: In a CDN, the content files are replicated on multiple content servers. This
makes the recovery of damaged files on a server easier, as other copies of the same data are
stored on additional servers.

Disadvantages of CDN:

- New points of failure


- Additional Content Management Task

CDN Service Provider

CDN services are offered by many vendors, which any content provider can use to deliver content to
customers worldwide.

Cloud service providers sometimes build up their own CDN infrastructure; otherwise they outsource
the content delivery task to some CDN service providers.

CDN service providers are specialists in content delivery and can deliver the highest possible performance
and quality irrespective of delivery location.

CDN Providers:

Akamai

Akamai evolved from project work at the Massachusetts Institute of Technology (MIT).


Akamai’s content delivery network is currently the world’s largest CDN service.
Customers of Akamai include major online players like Facebook, Twitter, Yahoo, ESPN Star and
BBC iPlayer.

Limelight

A provider of content delivery network services with extensive points of presence (PoPs)
worldwide.

Amazon’s CloudFront

CloudFront delivers content through a worldwide network of edge locations.


The service operates from more than 50 edge locations spread throughout the world.

Azure Content Delivery Network

Access to Azure blobs through CDN is preferable over directly accessing them from source
containers.
CDN delivery of blobs stored in containers is enabled through the Microsoft Azure Developer
Portal.
When request for data is made using the Azure Blob service URL, the data is accessed directly
from the Microsoft Azure Blob service.
But if request is made using Azure CDN URL, the request is redirected to the CDN end point closest
to the request source location and delivery of data becomes faster.

CDNetworks

Originally founded in Korea in 2000, CDNetworks currently has offices in Korea, the US, China, the UK
and Japan.

CDNetworks has developed a massive network infrastructure with strong POP (point-of-presence)
coverage on all continents.
Currently it has more than 140 POPs across six continents, including around 20 in China.

Security Issues

Cloud-based security systems need to address all the basic needs of an information system:
confidentiality, integrity and availability of information, along with identity management,
authentication and authorization.

Cloud Security

Cloud computing demands shared responsibility for security. Security should not be left solely
under the purview of the cloud provider; consumers also have major roles to play.

Service-level agreements (SLAs) are used in different industries to establish a trust relationship
between service providers and consumers.

The SLA details the service-level capabilities the provider promises to deliver and the
requirements and expectations stated by the consumer.
Organizations should engage legal experts to review the SLA document during contract negotiation
and before making the final agreement.
The SLAs between cloud service providers (CSPs) and consumers should detail the security
capabilities of the solutions and the security standards to be maintained by the service
providers. Consumers, on the other hand, should give the service providers clear-cut
information about what they consider a breach in security.

The SLA document plays an important role in security management for consumers moving toward cloud
solutions.
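To make such a review concrete, the security clauses of an SLA can be summarised in a machine-checkable form. A minimal sketch; every field name and value below is an illustrative assumption, not a standard SLA schema:

```python
# Sketch: checking a hypothetical SLA summary against a consumer's required
# security clauses. The clause names and sample values are illustrative only.

REQUIRED_CLAUSES = {
    "encryption_at_rest",
    "encryption_in_transit",
    "breach_notification_hours",
    "availability_target",
}

def missing_clauses(sla: dict) -> set:
    """Return the required security clauses the SLA fails to mention."""
    return REQUIRED_CLAUSES - sla.keys()

sla = {
    "availability_target": 0.999,       # promised uptime fraction
    "encryption_at_rest": "AES-256",    # provider's stated standard
    "breach_notification_hours": 24,    # consumer-defined reporting window
}
print(missing_clauses(sla))  # {'encryption_in_transit'}
```

A gap flagged here, such as a missing in-transit encryption clause, is exactly the kind of item legal and security reviewers should raise during contract negotiation.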

Threats, Vulnerability and Risk

A threat is an event that can cause harm to a system. It can damage the system's reliability and
degrade the confidentiality, availability or integrity of information stored in the system.

A vulnerability is a weakness or flaw in a system (hardware, software or process) that a threat
may exploit to damage the system.

Risk is the potential for a threat to exploit a vulnerability and thereby cause harm to the system.
Risk occurs where threat and vulnerability overlap.
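The overlap idea can be sketched in code. The threat catalogue, the likelihood-times-impact scoring rule and all numbers below are illustrative assumptions (a common risk-assessment convention, not a formal standard):

```python
# Sketch: risk exists only where a threat overlaps a vulnerability.
# Catalogue contents and the scoring rule are illustrative assumptions.

threats = {                        # threat -> (likelihood 0..1, impact 1..10)
    "eavesdropping": (0.6, 7),
    "dos_attack": (0.3, 9),
}
vulnerabilities = {"eavesdropping"}  # weaknesses actually present in the system

def risk_score(threat: str) -> float:
    """Score a threat: zero unless a matching vulnerability exists."""
    if threat not in vulnerabilities:
        return 0.0
    likelihood, impact = threats[threat]
    return likelihood * impact

print(risk_score("eavesdropping"))  # 4.2 -- threat and vulnerability overlap
print(risk_score("dos_attack"))     # 0.0 -- no matching vulnerability
```

The key property the sketch captures is that a severe threat with no corresponding vulnerability (here, the DoS attack) contributes no risk, while even a moderate threat becomes a risk once a matching weakness exists.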

■ Eavesdropping: This attack captures data packets during network transmission and scans them for
sensitive information, laying the foundation for a further attack.

■ Fraud: Fraud is carried out through fallacious transactions and the misleading alteration of data
to make illegitimate gains.

■ Theft: In a computing system, theft primarily means the stealing of trade secrets or data for gain.
It also covers the unlawful disclosure of information to cause harm.

■ Sabotage: This can be performed through various means such as disrupting data integrity (referred
to as data sabotage), delaying production, denial-of-service (DoS) attacks and so on.
■ External attack: Insertion of a malicious code or virus to an application or system falls under this
category of threat.

Threats to Cloud Security:

- Threats to Infrastructure
- Threats to Information
- Threats to Access Control

Public cloud deployment is the most instructive case for studying the security concerns of cloud
computing, since it covers all possible security threats to the cloud.

Infrastructure Security:

Infrastructure security covers the issues related to controlling access to the physical resources that
support the cloud infrastructure.

Infrastructure security can be classified into three categories: network level, host level and service level.

Network Level Security:

The network-level security risks exist for all the cloud computing services (e.g., SaaS, PaaS or IaaS).

It is actually not the service being used but rather the cloud deployment type (public, private or hybrid)
that determines the level of risk.

In case of public cloud services, use of appropriate network topology is required to satisfy security
requirements.

Ensuring data confidentiality, integrity and availability is the responsibility of the network-level
infrastructure security arrangement.

Most of the network-level security challenges are not new to cloud; rather, these have existed since the early
days of Internet. Advanced techniques are always evolving to tackle these issues.
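One long-standing network-level control for confidentiality is insisting on verified TLS for all traffic to a cloud service. A minimal sketch using only Python's standard library; a real deployment layers firewalls, VPNs and traffic monitoring on top of this:

```python
# Sketch: Python's default TLS context already enforces the two checks that
# defend against eavesdropping and man-in-the-middle attacks on the network.
import ssl

ctx = ssl.create_default_context()

# Certificate validation: the peer must present a certificate chaining to a
# trusted CA. Hostname checking: the certificate must match the server name.
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
print(ctx.check_hostname)                    # True

# A connection would then be wrapped like:
#   with socket.create_connection(("example.com", 443)) as sock:
#       with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
#           ...  # traffic on `tls` is encrypted in transit
```

Disabling either check (as some quick-fix snippets suggest) reopens exactly the eavesdropping threat described earlier, which is why network-level policy should forbid unverified TLS.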

Host Level Security:
