Data Science Foundations Course Overview

Uploaded by

chinchupandian

SRI VENKATESWARA UNIVERSITY::TIRUPATI

[Link]. Computer Science Honours


III year V Semester
Course 14 B: Foundations of Data Science
(w.e.f. 2025-26)

Theory        Credits: 3        3 hrs/week


Learning Objectives:
To enable students to develop solutions for real-world problems.
Learning Outcomes: On successful completion of the course, students will be able to:
1. Identify the need for data science and understand various data collection strategies.
2. Understand NoSQL and descriptive statistics.
3. Apply NumPy methods to process the data in an array.
4. Summarize and compute descriptive statistics using Pandas.
5. Apply powerful data manipulation and visualization techniques using Pandas.
UNIT-I
Introduction to Data Science: Need for Data Science – What is Data Science - Evolution of Data
Science, Data Science Process – Business Intelligence and Data Science – Prerequisites for a Data
Scientist – Tools and Skills required. Applications of Data Science in various fields – Data Security
Issues. Data Collection Strategies: Data Pre-Processing Overview, Data Cleaning, Data Integration and
Transformation, Data Reduction, Data Discretization.
UNIT-II
Getting Started with Python: Introduction to Python, Python Keywords, Identifiers, Variables,
Comments, Data Types, Operators, Input and Output, Type Conversion, Debugging. Flow of Control,
Selection, Indentation, Repetition, Break and Continue Statements, Nested Loops. Strings: String
Operations, Traversing a String, String Handling Functions.
UNIT-III
Functions: Functions, Built-in Functions, User Defined Functions, Recursive Functions, Scope of
a Variable. Python and OOP: Defining Classes, Defining and Calling Functions, Passing
Arguments, Inheritance, Polymorphism, Modules – datetime, math, Packages. Exception
Handling: Exceptions in Python, Types of Exceptions, User-defined Exceptions.
UNIT-IV
List: Introduction to List, List Operations, Traversing a List, List Methods and Built-in
Functions. Tuples and Dictionaries: Introduction to Tuples, Tuple Operations, Tuple Methods
and Built-in Functions, Nested Tuples. Introduction to Dictionaries, Dictionaries are Mutable,
Dictionary Operations, Traversing a Dictionary, Dictionary Methods and Built-in functions.
UNIT-V
Introduction to NumPy: Array, NumPy Array, Indexing and Slicing, Operations on Arrays,
Concatenating Arrays, Reshaping Arrays, Splitting Arrays, Statistical Operations on Arrays. Data
Handling using Pandas: Introduction to Python Libraries, Series, DataFrame, Importing and
Exporting Data between CSV Files and DataFrames. Plotting Data using Matplotlib:
Introduction, Plotting using Matplotlib –Line chart, Bar chart, Histogram, Scatter Chart, Pie
Chart.
SRI VENKATESWARA UNIVERSITY::TIRUPATI
[Link]. Computer Science Honours
III year V Semester
Course 14 B: Foundations of Data Science
(w.e.f. 2025-26)

Practical        Credits: 1        2 hrs/week

List of Experiments:
1. Write a program to check whether a given number is an Armstrong number or not.
2. Write a program to check whether a given number is perfect or not.
3. Write a program to find the factorial of a given number using a recursive function.
4. Write a program to implement inheritance and polymorphism.
5. Write Python code to demonstrate try, except and finally blocks.
6. Write a program to demonstrate string handling functions.
7. Write a program to input n numbers from the user. Store these numbers in a tuple. Print the
maximum and minimum number from this tuple.
8. Write a program to enter names of employees and their salaries as input and store them in a
dictionary.
9. Write a program to implement statistical operations on arrays using NumPy.
10. Write a program to import a CSV file into a DataFrame and export a DataFrame to a CSV file.
11. Create the DataFrame Sales containing year-wise sales and perform basic operations on it.
12. Visualize the plots using Matplotlib.
UNIT – I

Introduction to Data Science: Need for Data Science – What is Data Science – Evolution of Data
Science, Data Science Process – Business Intelligence and Data Science - Prerequisites for a Data
Scientist - Tools and Skills required, Applications of Data Science in various fields – Data Security
Issues. Data Collection Strategies: Data Pre-Processing Overview, Data Cleaning, Data Integration
and Transformation, Data Reduction, Data Discretization.

1. Introduction to Data Science:

Data Science:

• Data science is an interdisciplinary field focused on extracting knowledge from large and
  complex datasets to solve problems across various application domains.
• It integrates principles from computer science, mathematics, statistics, data visualization,
  communication, and domain-specific expertise to understand and analyze phenomena using
  data.
• The field is often described as a "fourth paradigm" of science, following empirical, theoretical,
  and computational approaches, driven by the vast amounts of data generated by modern
  technology.
• Data science encompasses the entire process, including data collection, storage, and the
  creation of new analytical methods.
• It uses data to make predictions and prescribe optimal actions.
• Data engineers build the infrastructure for data access, while data scientists use that data to
  build models.
• Machine learning is a key technique within data science, used to train algorithms to learn from
  data and make predictions, though data scientists may collaborate with machine learning
  engineers for scaling models.

The techniques used in data science include classification (sorting data into categories), regression
(finding relationships between variables), and clustering (grouping similar data points to discover
patterns).
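As a small illustration of one of these techniques, regression, the sketch below fits a straight line to a handful of points using ordinary least squares. The function name `fit_line` and the advertising/sales figures are made up for this example; in practice a library such as NumPy or scikit-learn would normally be used.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend and resulting sales
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(ad_spend, sales)
print(f"sales = {slope:.1f} * spend + {intercept:.1f}")

# Use the fitted relationship to predict sales for an unseen spend value
predicted = slope * 6 + intercept
print("predicted sales for spend 6:", predicted)
```

The fitted line describes the relationship between the two variables and can then be used to predict a value that was not in the data.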

These methods are applied to answer fundamental business questions: what happened (descriptive
analysis), why it happened (diagnostic analysis), what will happen (predictive analysis), and what
should be done (prescriptive analysis).

The benefits for businesses are significant, including uncovering hidden patterns, driving innovation in
products and services, and enabling real-time optimization of operations.

Data scientists are professionals who combine programming skills, statistical knowledge, and deep
understanding of a specific industry to extract insights.

They must be able to ask pertinent business questions, apply advanced analytical methods, write code
to automate processes, and effectively communicate complex findings to non-technical audiences.

Key Aspects / Features of Data Science:

• Interdisciplinary: Data science draws upon principles and techniques from mathematics,
  statistics, computer science, and domain-specific knowledge.

• Big Data: It often deals with large, complex datasets that require specialized tools and
  techniques for analysis.

• Insights and Actionable Knowledge: Data science aims to transform raw data into
  meaningful insights that can be used to drive business decisions, improve processes, and create
  innovative solutions.

• Problem Solving: Data scientists are essentially problem solvers who use data to understand
  phenomena and help organizations make better decisions.

Data Science Applications:

• Predictive Analytics: Using historical data to predict future outcomes, such as sales forecasts
  or customer behavior.

• Personalization: Tailoring products and services to individual customer preferences based on
  their data.

• Fraud Detection: Identifying fraudulent transactions or activities based on patterns in data.

• Healthcare: Analyzing patient data to improve diagnosis, treatment, and disease prevention.

• Marketing: Optimizing marketing campaigns and targeting the right customers with the right
  messages.

In essence, data science is about turning data into a valuable asset by extracting actionable knowledge
and insights that can be used to solve problems and improve outcomes in various fields.

2. Explain Need for Data Science.

Need for Data Science:

Data science is crucial for businesses and organizations to extract meaningful insights from
the vast amounts of data being generated today. It enables data-driven decision-making, predictive
modeling, personalization, risk management, and improved efficiency across various sectors. Without
data science, much of the valuable information hidden within data would remain untapped, hindering
progress and innovation.

1. Data-Driven Decision Making:

• Data science provides the tools and techniques to analyze data and identify trends, patterns,
  and relationships that can inform strategic decisions.

• This allows businesses to move away from gut feelings and make choices based on concrete
  evidence, leading to better outcomes and improved performance.

2. Predictive Insights and Forecasting:

• Data science enables the development of predictive models that can forecast future trends and
  outcomes.

• This is valuable for various applications, such as predicting customer behavior, anticipating
  market demand, and managing risks.

3. Personalization and Customer Experience:

• In today's digital world, data science is essential for delivering personalized experiences to
  customers.

• By analyzing customer data, businesses can tailor their products, services, and marketing
  efforts to meet individual needs and preferences, enhancing customer satisfaction and loyalty.

4. Efficiency and Optimization:

• Data science can be applied to optimize processes and operations in various industries, such as
  manufacturing, logistics, and supply chain management.

• By analyzing data on resource utilization, workflow patterns, and performance metrics, data
  science can identify areas for improvement and streamline operations, leading to increased
  efficiency and reduced costs.

5. Scientific Discovery:

• Data science plays a vital role in scientific research by enabling the analysis of large datasets in
  fields like genomics, climate science, and drug discovery.

• This accelerates the pace of scientific discovery and leads to breakthroughs that can address
  some of the world's most pressing challenges.

6. Risk Management and Fraud Detection:

• Data science is used in finance and insurance to assess and mitigate risks, improve the
  accuracy of underwriting, and detect fraudulent activities.

• By analyzing historical data and identifying patterns associated with risk and fraud, data
  science helps organizations make more informed decisions and minimize potential losses.

7. Addressing Complex Challenges:

• Data science has the potential to address complex challenges in various domains, such as
  healthcare, environmental sustainability, and urban planning.

• By providing insights into the root causes of these challenges and identifying potential
  solutions, data science can contribute to positive societal impact.

8. Increased Competition:

• In a competitive business environment, organizations that leverage data science are better
  positioned to understand their customers, optimize their operations, and develop innovative
  products and services.

• This gives them a significant advantage over competitors who rely on traditional methods and
  limited data insights.

In essence, data science is not just a field of study; it is a fundamental enabler of progress and
innovation in the modern world. It empowers organizations to make better decisions, improve
efficiency, and create a more personalized and sustainable future.

3. Write about Evolution of Data Science:

Evolution of Data Science:

The field of data science, a discipline concerned with deriving insights and knowledge from data, has
undergone a remarkable transformation from its statistical origins to the integrated, AI-driven
powerhouse it is today.

1. Roots in statistics and early data analysis:

• Ancient Beginnings: The foundations of data analysis can be traced back to ancient
  civilizations that used early forms of statistical thinking to track populations, agricultural
  yields, and other administrative data. For instance, ancient Egyptians conducted periodic
  censuses for various governmental planning activities, including constructing the pyramids.

• 17th - 19th Centuries: Birth of Modern Statistics: Key figures like John Graunt (who analyzed
  mortality data in London) and Blaise Pascal (a pioneer of probability theory) laid the
  groundwork for modern statistics in the 17th century. In the 18th century, statisticians like
  Pierre-Simon Laplace and Thomas Bayes further formalized probability theory, which became
  fundamental to data analysis. The 19th century saw advancements such as the establishment of
  statistical societies and the use of systematic data collection, often for census purposes.
  Florence Nightingale pioneered the use of statistical graphics to advocate for healthcare
  reforms during this period.

2. Rise of computing and data management:

• Mid-20th Century: The Advent of Computers: The invention of computers in the mid-20th
  century, such as the ENIAC, revolutionized data processing, enabling faster and more efficient
  analysis of larger datasets.

• 1960s-1970s: The Birth of Databases: The 1960s saw the emergence of data storage systems
  like IBM's magnetic disk drive, allowing for more efficient data collection and storage. This
  period also witnessed the development of the first database management systems (DBMS),
  such as Charles Bachman's Integrated Data Store (IDS) and IBM's Information Management
  System (IMS).

• 1970s: Relational Databases and SQL: Edgar F. Codd's groundbreaking work on the relational
  model revolutionized database management, organizing data into tables and utilizing SQL
  (Structured Query Language) for querying and manipulation. This paved the way for modern
  database systems like Oracle and MySQL.

• 1980s-1990s: Data Warehousing and Data Mining: The increasing volume of data in the 1980s
  led to the concept of the data warehouse, a system optimized for reporting and analysis. The
  1990s saw the development of data mining, which uses computational processes to discover
  patterns in large datasets.

3. Big data era and the emergence of "Data Science":

• 2000s: The Big Data Revolution: The widespread adoption of the internet, social media, and
  mobile devices led to an explosion in data volume, velocity, and variety, ushering in the era of
  "Big Data". Technologies like Hadoop and MapReduce emerged to handle the distributed
  processing of these massive datasets. The term "data scientist" gained popularity to describe
  professionals equipped to handle this complexity.

• Present: The AI and Machine Learning Driven Era: Data science today is deeply intertwined
  with Artificial Intelligence (AI) and Machine Learning (ML).

    ◦ Deep Learning: Advanced neural networks and deep learning techniques have enabled
      breakthroughs in areas like image recognition, natural language processing, and
      autonomous systems.

    ◦ Cloud Computing: Cloud platforms such as AWS, Google Cloud, and Azure have
      democratized access to scalable storage and powerful computing resources, making
      advanced data analysis and model deployment accessible to a wider range of
      organizations. This is especially beneficial for small and medium businesses.

    ◦ AI in Everyday Life: AI-driven data science is now part of our daily lives, from virtual
      assistants like Siri and Alexa to recommendation engines on platforms like Netflix.

4. Current trends and the future of data science:

Data science continues to evolve rapidly, driven by emerging technologies and increasing demand for
data-driven insights. Some key trends and future directions include:

• Generative AI: Continued growth in the use of generative AI models like ChatGPT and
  advancements in areas like deepfakes and synthetic data.

• AutoML and MLOps: Automation of machine learning workflows and lifecycle management
  through AutoML platforms and MLOps practices, enabling data scientists to focus on
  higher-value tasks.

• Cloud-Native Solutions: Increased adoption of cloud-native analytics solutions and cloud
  migration for cost-efficiency, scalability, and flexibility.

• Edge Computing and TinyML: Processing data closer to the source (edge computing) and
  developing machine learning models for small, low-power devices (TinyML) will become
  increasingly important with the growth of the Internet of Things (IoT).

• Responsible AI and Data Regulation: Growing focus on ethical considerations, data privacy,
  and robust data governance frameworks to ensure fairness, transparency, and compliance with
  regulations like GDPR.

• Democratization of Data Science: The development of user-friendly tools and cloud platforms
  is making data science more accessible to non-technical users and smaller businesses.

• The Continued Importance of Human Expertise: While AI and automation will handle many
  tasks, human data scientists will remain crucial for tasks requiring domain expertise, ethical
  reasoning, and strategic interpretation of results, notes Binariks.

The journey of data science, fuelled by technological advancements and the relentless pursuit of
knowledge from data, continues to reshape industries, drive innovation, and offer promising solutions
to complex global challenges.

4. Explain the Process of Data Science (or) Life Cycle of Data Science.

Process of Data Science:

The field of data science involves a structured approach to extracting meaningful insights
and knowledge from data to solve real-world problems. It’s an iterative process that continuously
refines models and insights based on new information and changing requirements.

Fig: Process of Data Science

Key Stages involved:

1. Problem definition:

• Clearly define the business or research problem to be solved and its potential impact.

• Identify the project goals and objectives.

• Gather domain knowledge to understand the context of the problem.

• Define success metrics to measure the effectiveness of the project.

2. Data collection:

• Gather relevant data from various sources like databases, APIs, websites, sensors, and surveys.

• Ensure the collected data is of high quality, relevant, and suitable for the problem at hand.

• Choose appropriate data collection methods based on the type of data and project
  requirements.

3. Data preparation:

• Clean the raw data by identifying and handling missing values, outliers, and inconsistencies.

• Transform the data into a suitable format for machine learning algorithms (e.g., normalization,
  standardization, encoding categorical variables).

• Perform feature engineering by creating new variables or modifying existing ones to enhance
  the model's ability to learn from the data.
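The preparation steps above can be sketched in a few lines of plain Python. This is only an illustrative example on made-up records (the `ages` and `cities` lists are hypothetical); mean imputation is just one of several ways to handle missing values, and real projects would typically use Pandas or scikit-learn for these steps.

```python
from statistics import mean, pstdev

# Hypothetical raw records: None marks a missing age
ages = [25, None, 35, 45]
cities = ["Tirupati", "Chennai", "Tirupati", "Hyderabad"]

# 1. Cleaning: fill missing values with the mean of the known values
known = [a for a in ages if a is not None]
fill_value = mean(known)
ages_clean = [a if a is not None else fill_value for a in ages]

# 2. Normalization (min-max): rescale values to the range [0, 1]
lo, hi = min(ages_clean), max(ages_clean)
ages_norm = [(a - lo) / (hi - lo) for a in ages_clean]

# 3. Standardization (z-score): zero mean, unit standard deviation
mu, sigma = mean(ages_clean), pstdev(ages_clean)
ages_std = [(a - mu) / sigma for a in ages_clean]

# 4. Encoding: one-hot encode a categorical variable
categories = sorted(set(cities))
one_hot = [[1 if city == cat else 0 for cat in categories] for city in cities]

# 5. Feature engineering: derive a new variable (age decade) from an existing one
age_decade = [int(a // 10) * 10 for a in ages_clean]

print(ages_clean)
print(ages_norm)
print(categories, one_hot)
print(age_decade)
```

Normalization keeps values within a fixed range, while standardization centres them around zero; which one is appropriate depends on the algorithm that will consume the data.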

4. Exploratory data analysis (EDA):

• Explore the prepared data to uncover patterns, trends, relationships, and anomalies.

• Use summary statistics and visualizations (histograms, scatter plots, box plots, etc.) to gain
  insights into the data's characteristics.

• Formulate hypotheses about the data based on observations during EDA.
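A minimal sketch of the summary-statistics part of EDA, using only Python's standard `statistics` module. The `daily_orders` values are invented for illustration, and the 1.5 × IQR rule used to flag anomalies is one common convention (the same one box plots use), not the only option.

```python
import statistics as st

# Hypothetical data: number of orders received on seven days
daily_orders = [42, 55, 61, 61, 70, 78, 130]

summary = {
    "count": len(daily_orders),
    "mean": st.mean(daily_orders),
    "median": st.median(daily_orders),
    "mode": st.mode(daily_orders),
    "stdev": st.stdev(daily_orders),  # sample standard deviation
    "min": min(daily_orders),
    "max": max(daily_orders),
}

# Quartiles and interquartile range (IQR)
q1, q2, q3 = st.quantiles(daily_orders, n=4)
iqr = q3 - q1

# Flag values lying more than 1.5 * IQR outside the quartiles as possible anomalies
outliers = [v for v in daily_orders if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

for name, value in summary.items():
    print(f"{name}: {value}")
print("quartiles:", q1, q2, q3)
print("possible outliers:", outliers)
```

An observation such as the flagged value would then prompt a hypothesis (for example, a promotion on that day) to be checked against further data.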

5. Data modelling:

• Based on the data analysis and exploration, choose and build suitable machine learning or deep
  learning models.

• Train the models using the prepared training data.

• Select appropriate evaluation metrics to assess the model's performance (e.g., accuracy,
  precision, recall for classification problems, or Mean Squared Error for regression problems).

• Fine-tune hyperparameters to optimize model performance.
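The evaluation metrics named above can be computed directly from their definitions. The sketch below is for illustration only: the function names and the label lists are hypothetical, and libraries such as scikit-learn provide equivalent, well-tested functions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical classification results (1 = positive class, 0 = negative class)
actual_labels = [1, 0, 1, 1, 0, 1, 0, 0]
predicted_labels = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(actual_labels, predicted_labels)
print("accuracy:", accuracy, "precision:", precision, "recall:", recall)

# Hypothetical regression results
actual_values = [2.0, 4.0, 6.0]
predicted_values = [1.0, 4.0, 8.0]
mse = mean_squared_error(actual_values, predicted_values)
print("MSE:", mse)
```

Accuracy counts all correct predictions, precision asks how many predicted positives were right, and recall asks how many actual positives were found.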

6. Model Deployment:

• Deploy the trained and evaluated model into a real-world production environment.

• Integrate the model with applications or systems to make predictions or support ongoing
  analysis.

• Choose appropriate deployment strategies like real-time, batch, or streaming, depending on the
  application needs.

7. Model monitoring and maintenance:

• Continuously monitor the model's performance in the production environment to ensure its
  continued accuracy and effectiveness.

• Address any issues such as data drift or model bias by retraining or updating the model with
  new data as needed.
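One very simple way to watch for data drift is to compare recent production values of a feature with the values seen during training. The sketch below is an assumption-laden illustration: the `drift_score` function, the sample values, and the threshold of 2.0 are all hypothetical choices, and production systems usually rely on more robust statistical tests.

```python
from statistics import mean, stdev


def drift_score(train_values, live_values):
    """Shift of the live mean from the training mean, in training standard deviations."""
    return abs(mean(live_values) - mean(train_values)) / stdev(train_values)


# Hypothetical feature values seen during training and in production
train = [10, 12, 11, 13, 12, 11, 10, 12]
live_stable = [11, 12, 10, 13]
live_shifted = [18, 19, 17, 20]

THRESHOLD = 2.0  # arbitrary cut-off chosen for this illustration

for name, live in [("stable", live_stable), ("shifted", live_shifted)]:
    score = drift_score(train, live)
    needs_retraining = score > THRESHOLD
    print(f"{name}: score={score:.2f}, retrain={needs_retraining}")
```

When the score exceeds the chosen threshold, the model would be a candidate for retraining on newer data.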

8. Communicating results:

• Effectively communicate the findings and insights to stakeholders in a clear and concise
  manner, using reports, presentations, and data visualizations.

• Explain the methodology, key findings, and potential actions based on the analysis.

In essence, the data science process is a cycle of understanding the problem, acquiring and preparing
data, exploring and analyzing it, building and deploying models, and continuously monitoring and
improving their performance to solve real-world problems and drive data-driven decision-making.

5. Explain Business Intelligence and Data Science.

Business Intelligence:

• Business Intelligence (BI) is a set of technologies, applications, and processes that enable
  businesses to gather, analyze, and present historical and current data for informed
  decision-making.
• BI tools provide insights into past and present business performance, answering questions like
  "What happened?" and "Why did it happen?".
• Common BI tools include reporting software and dashboards that visualize key performance
  indicators (KPIs) and operational metrics, helping a company identify its strengths and
  weaknesses.
• The management team can then decide in which areas the company can improve its operating
  efficiency.

Supporting decision-making with data is not a new practice. However, dramatic improvements in BI
technology have brought significant gains in speed, efficiency, and effectiveness. Automation and
data visualization are two examples that are transforming the process of business intelligence.

Data Science:

• Data science is a multidisciplinary field. It combines scientific methods, processes, algorithms,
  and systems to extract knowledge and insights from both structured and unstructured data.
• The goal is to predict future outcomes and identify hidden patterns.
• It often addresses questions like "What will happen?" and "How can we make it happen?".
• Data science utilizes advanced techniques, including machine learning, statistical modeling,
  and algorithms, to analyze data and build predictive models.

Once the data has been gathered, it is analyzed using techniques such as text mining, regression,
and descriptive and predictive analytics. Analysis reveals the patterns behind the raw data, which
can then be used to forecast future trends.

| Aspect | Business Intelligence (BI) | Data Science |
|---|---|---|
| 1. Focus | Primarily descriptive analytics (what happened, what is happening). | Predictive and prescriptive analytics (what will happen, what should we do). |
| 2. Data Types | Primarily structured data (e.g., databases, spreadsheets). | Structured, semi-structured, and unstructured data (e.g., social media, text, images, sensor data). |
| 3. Tools | Reporting software, data warehousing, dashboards (e.g., Tableau, Power BI). | Programming languages (e.g., Python, R), machine learning platforms, statistical software (e.g., SAS). |
| 4. Skillset | Business knowledge, data visualization, reporting, data warehousing, SQL. | Programming, statistics, machine learning, advanced analytics, data mining. |
| 5. Complexity | Generally easier to implement and interpret insights. | Requires more advanced technical skills and expertise. |
| 6. Decision-making | Supports operational and tactical decisions based on historical performance. | Drives strategic decisions by predicting future trends and uncovering new opportunities. |

How BI and data science work together:

BI and data science are highly complementary, although distinct. BI can provide structured and
organized data. It also provides descriptive insights into current operations, which can lay the
groundwork for data science initiatives. Data science can then build upon these insights to develop
predictive models. It can also uncover deeper patterns and drive strategic decision-making and
innovation. For example, BI dashboards might reveal a decline in sales. This might prompt data
scientists to investigate the underlying causes and predict future sales trends using machine learning
models.

Conclusion:

Both Business Intelligence and Data Science are critical for organizations seeking to thrive in a
data-driven landscape. Understanding their differences and harnessing their combined power can unlock
valuable insights, drive innovation, and lead to more effective decision-making.

6. Explain prerequisites (Requirements) for a Data Scientist.

Data Scientist:

Data scientists are analytical experts who extract meaning from and interpret data to solve
complex problems. They use industry knowledge, contextual understanding, and skepticism of
existing assumptions to uncover solutions to business challenges.

A data scientist’s role combines computer science, statistics, and mathematics to collect and
organize data from many different data sources, translate results into actionable plans, and
communicate their findings to their organizations. Successful data scientists must be effective
communicators, leaders, team members, and high-level analytical thinkers.

Prerequisites for a Data Scientist:

To become a successful data scientist, you'll need a combination of education, technical skills,
workplace skills, and hands-on experience.

Here's a breakdown of the key prerequisites:

1. Education:

• Strong Educational Foundation: A bachelor's or master's degree in data science, computer
  science, statistics, mathematics, or a related field is highly recommended.

• Alternative Paths: Some successful data scientists have entered the field through self-directed
  learning, bootcamps, or a combination of education and experience.

2. Technical skills:

• Programming Languages: Proficiency in languages like Python and R is crucial for data
  manipulation, analysis, and building predictive models. SQL is essential for managing and
  querying large databases.

• Statistics and Mathematics: A strong understanding of statistical concepts (e.g., hypothesis
  testing, probability, regression) and foundational mathematics (e.g., linear algebra, calculus) is
  necessary for data analysis and machine learning.

• Machine Learning: Expertise in machine learning algorithms, including supervised and
  unsupervised learning, is key for building predictive systems and automating decision-making.

• Data Wrangling and Visualization: The ability to clean, structure, and transform messy data, as
  well as effectively present insights through visualizations, is crucial for turning raw data into
  actionable information.

• Big Data Technologies: Knowledge of big data technologies like Hadoop and Apache Spark
  for processing and analyzing large datasets is often a requirement.

• Cloud Computing: Familiarity with cloud platforms such as AWS, Google Cloud, or Microsoft
  Azure for data storage and processing can be beneficial.

3. Workplace skills:

• Analytical Thinking: The ability to analyze large datasets, identify patterns, and derive
  meaningful insights is fundamental.

• Problem-Solving: Data scientists are expected to use data-driven insights to solve complex
  problems and make informed decisions.

• Communication: Effective communication skills are essential to convey findings and
  recommendations to both technical and non-technical stakeholders.

• Collaboration: Data science projects often involve working with other teams, so teamwork and
  the ability to work effectively with others are vital.

• Curiosity and Continuous Learning: Data science is a constantly evolving field, requiring a
  commitment to continuous learning and staying updated with new technologies and
  methodologies.

4. Practical experience and portfolio development:

• Internships: Internships offer hands-on experience and networking opportunities, allowing you
  to apply your skills in a professional setting.

• Personal Projects: Building personal projects, like creating dashboards or predictive models,
  helps demonstrate your abilities and build a portfolio to showcase your skills to potential
  employers.

• Competitions: Participation in data science competitions, such as those on Kaggle, allows you
  to sharpen your skills and build experience.

Conclusion: A successful career in data science typically involves a combination of formal education,
mastering technical skills, developing crucial soft skills, and gaining practical experience through
projects and internships.

7. What are the tools and skills required for data science?

Tools Required for Data Science:

Data science involves a wide range of tasks, from collecting and cleaning data to building
models and deploying them. To effectively handle these tasks, data scientists utilize a diverse set of
tools and technologies.

1. Programming languages:

• Python: The most popular choice for data science due to its simplicity, versatility, and
  extensive libraries. It is used for tasks like data manipulation, machine learning, and
  visualization.

• R: Specialized in statistical analysis and visualization, especially valuable for tasks requiring
  advanced statistical modeling and complex data visualizations.

• SQL (Structured Query Language): Crucial for interacting with relational databases to query,
  retrieve, and manipulate structured data efficiently.

• Julia: A high-performance language gaining traction for numerical and scientific computing,
  particularly useful for computationally intensive tasks.

2. Data analysis and manipulation libraries:

• NumPy: A fundamental Python library for numerical computing, providing support for
  multi-dimensional arrays and mathematical functions.

• Pandas: A powerful Python library for data manipulation and analysis, offering data structures
  like Series and DataFrames to streamline data cleaning, transformation, and exploration.

3. Machine learning and deep learning frameworks:

• Scikit-learn: A Python library offering a wide array of supervised and unsupervised machine
  learning algorithms, including classification, regression, clustering, and preprocessing tools.

• TensorFlow: An open-source platform by Google, primarily focused on building and deploying
  large-scale deep learning models, particularly neural networks.

• PyTorch: A deep learning framework favored for its flexibility, dynamic computation graphs,
  and ease of debugging, commonly used in research and development.

• Keras: A high-level API designed to simplify building and training neural networks, often used
  in conjunction with TensorFlow.

4. Data visualization tools and libraries:

• Tableau: A powerful data visualization and business intelligence tool, enabling users to create
  interactive dashboards and communicate insights effectively.

• Power BI: Microsoft's business analytics service for visualizing and analyzing data, offering
  seamless integration with other Microsoft products.

• Matplotlib: A widely used Python library for creating static, animated, and interactive data
  visualizations.

• Seaborn: A Python data visualization library built on Matplotlib, offering attractive statistical
  visualizations and simplifying the creation of complex plots.

• Plotly: A Python library particularly well-suited for creating interactive and dynamic
  visualizations, ideal for web-based applications and dashboards.

• [Link]: A JavaScript library for creating interactive and dynamic data visualizations on the web,
  giving control over visual elements.

5. Big data and cloud platforms:

• Apache Hadoop: An open-source framework for processing and storing large datasets across
  computer clusters.

• Apache Spark: A fast and general-purpose cluster computing system for big data processing,
  supporting batch processing, streaming data, machine learning, and graph processing.

• AWS (Amazon Web Services): Offers a comprehensive suite of cloud services for data
  science, including storage (S3), big data processing (EMR), and machine learning
  (SageMaker).

• Google Cloud Platform (GCP): Provides cloud infrastructure and data analytics tools,
  including BigQuery for data warehousing and TensorFlow for deep learning.

• Microsoft Azure: A cloud platform with various data science services and tools, such as
  HDInsight for big data processing and Machine Learning for building and deploying models.

6. Development and collaboration tools:

 Jupyter Notebook: A web-based interactive computing environment for creating and sharing
live code, visualizations, and narrative text.

 RStudio: An integrated development environment (IDE) specifically designed for R, offering a
comprehensive environment for data analysis and visualization.

 Git & GitHub: Tools for version control, enabling data scientists to track changes in code,
collaborate with others, and manage projects effectively.

Skills Required for Data Science:

In today’s data-driven world, industries across the globe are increasingly relying on
professionals who can collect, analyze, and interpret complex data to make informed business
decisions. This demand has led to the rapid growth of the data science field, especially among
graduates who wish to enter a high-impact, future-ready profession.

1. Strong Foundation in Statistics and Mathematics:

Data science begins with numbers. Core concepts such as probability, statistical modeling,
linear algebra, and calculus are fundamental. These tools are used to draw meaningful insights from
raw data, build algorithms, and forecast outcomes. A sound understanding of statistics allows you to
validate your findings and support business decisions with confidence.

2. Proficiency in Programming (Python/R):

Programming is a crucial component of data science. Languages such as Python and R are
widely used for data manipulation, analysis, and machine learning. Python, in particular, is known for
its simplicity and an extensive library ecosystem, making it a preferred language for most data science
projects.

3. Data Wrangling and Preprocessing Skills:

Real-world data is rarely clean or complete. Data wrangling refers to the process of
cleaning, transforming, and preparing data for analysis. This includes handling missing values,
removing duplicates, and standardizing formats.

These skills ensure that your data is accurate and usable, an essential step before any meaningful
analysis or model-building begins.
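
The three wrangling tasks named above (standardizing formats, removing duplicates, and handling missing values) can be sketched in plain Python. The records, the field names, and the choice of median imputation are illustrative assumptions; in practice a library such as Pandas (for example its `drop_duplicates` and `fillna` methods) would usually do this work.

```python
from statistics import median

# Hypothetical raw records: inconsistent formatting, a missing age, and a duplicate row.
raw = [
    {"name": " Alice ", "city": "chennai", "age": 30},
    {"name": "BOB", "city": "Tirupati", "age": None},
    {"name": "alice", "city": "Chennai", "age": 30},
    {"name": "Carol", "city": "TIRUPATI", "age": 40},
]

# 1. Standardize formats: trim whitespace and use title case for text fields.
cleaned = [
    {"name": r["name"].strip().title(), "city": r["city"].strip().title(), "age": r["age"]}
    for r in raw
]

# 2. Remove duplicates while keeping the first occurrence of each record.
seen = set()
unique = []
for r in cleaned:
    key = (r["name"], r["city"], r["age"])
    if key not in seen:
        seen.add(key)
        unique.append(r)

# 3. Handle missing values: fill a missing age with the median of the known ages.
known_ages = [r["age"] for r in unique if r["age"] is not None]
fill_value = median(known_ages)
for r in unique:
    if r["age"] is None:
        r["age"] = fill_value
```

Formats are standardized first on purpose: " Alice " and "alice" are only recognized as the same person once both have been normalized.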

4. Knowledge of Machine Learning Algorithms:

Machine learning lies at the heart of predictive data science. Understanding supervised and
unsupervised learning techniques such as linear regression, decision trees, clustering, and
classification models is vital.

5. Experience with Data Visualization Tools:

Data scientists must not only analyze data but also communicate their findings clearly.
Visualization tools such as Tableau, Power BI, and Python libraries
like Matplotlib and Seaborn help create insightful dashboards and charts.

These visual tools are critical when presenting complex data to decision-makers who may not have
technical expertise.

6. SQL and Database Management:

Data is typically stored in relational databases. The ability to extract, manipulate, and organize
data using SQL (Structured Query Language) is considered a core skill in any data science role.

This includes writing queries to retrieve data, joining datasets, filtering records, and managing
databases efficiently.
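
The query operations listed above (retrieving, joining, and filtering) can be tried without installing a database server by using Python's built-in `sqlite3` module. The table names, column names, and sample rows below are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database; the schema and rows are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha', 'Tirupati'), (2, 'Ravi', 'Chennai');
    INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 40.0);
""")

# Join the two tables, keep only orders of at least a given amount,
# and total the remaining orders per customer.
query = """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE o.amount >= ?
    GROUP BY c.name
    ORDER BY total DESC
"""
rows = conn.execute(query, (50,)).fetchall()
conn.close()
```

The `?` placeholder passes the threshold as a parameter instead of pasting it into the SQL text, which is the usual way to avoid SQL injection.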

7. Critical Thinking and Problem-Solving:

Employers value data scientists who can think analytically and solve real-world problems
using data. This involves identifying patterns, asking the right questions, testing hypotheses, and
finding data-driven solutions that support business objectives.

8. Communication and Presentation Skills:

While technical expertise is important, being able to communicate your insights in a clear and
structured manner is equally essential. Whether it’s through reports, presentations, or meetings, data
scientists are expected to explain complex concepts to both technical and non-technical stakeholders.

9. Practical Experience Through Projects and Internships:

Internships: Internships offer hands-on experience and networking opportunities, allowing you to
apply your skills in a professional setting.
Personal Projects: Building personal projects, like creating dashboards or predictive models, helps
demonstrate your abilities and build a portfolio to showcase your skills to potential employers.
Competitions: Participation in data science competitions, such as those on Kaggle, allows you to
sharpen your skills and build experience.

8. Explain applications of data science used in various fields.

Applications of Data Science:

Data science is a field that uses scientific methods, processes, algorithms, and systems to extract
knowledge and insights from structured and unstructured data.

It finds applications across various industries, enhancing efficiency, optimizing operations, and
enabling data-driven decision-making.

1. Healthcare
2. Finance and Banking
3. Retail and E-Commerce
4. Transportation and Logistics
5. Manufacturing
6. Sports Analytics
7. Marketing and Advertising
8. Government and Public Sector
9. Fraud Detection and Prevention
10. Media and Entertainment
11. Customer Service
12. Education
13. Environmental Monitoring and Sustainability
14. Energy and Utilities
15. Telecommunications

1. Healthcare:

Data science helps in identifying and predicting diseases, personalizing healthcare
recommendations, and optimizing hospital operations. For example, Google developed LYNA, a tool
that uses machine learning to identify breast cancer tumors with 99% accuracy in trials.

2. Finance and Banking:

It's used for fraud detection, credit risk assessment, algorithmic trading, and customer
analytics. Mastercard uses AI to scan billions of transactions, increasing fraud detection accuracy and
reducing false positives.

3. Retail and E-commerce:

Data science plays a role in optimizing marketing campaigns, understanding customer
behavior, predicting sales trends, and managing inventory effectively. For instance, Amazon uses data
to personalize homepages based on each buyer's purchase history and preferences.

4. Transportation and Logistics:

Data science helps optimize shipping routes, predict maintenance needs for vehicles, manage
traffic patterns, and enhance supply chain management. UPS, for example, uses an integrated
navigation system powered by algorithms and AI to optimize routes and save fuel.

5. Manufacturing:

It's used for predictive maintenance, quality control, demand forecasting, supply chain
optimization, and process improvement. Siemens saved $25 million annually on maintenance by
applying data science techniques.

6. Sports Analytics:

Data science helps evaluate player performance, develop game strategies, predict injury risks,
and enhance fan engagement.

7. Marketing and Advertising:

It enables targeted advertising campaigns, customer segmentation, sentiment analysis, and
campaign performance tracking. Programmatic advertising platforms use real-time data and AI to
place ads in front of the most relevant audience segments.

8. Government and Public Sector:

Data science supports policy-making, improves planning, enhances cybersecurity defenses, and
aids in crime prevention strategies. The US Centers for Disease Control and Prevention, for instance,
uses real-time data to enable proactive suicide prevention responses.

9. Fraud Detection and Prevention:

Data science utilizes statistical modeling, anomaly detection, and machine learning to identify
and prevent fraudulent activities across various sectors, such as finance, healthcare, and advertising.

10. Media and Entertainment:

Data science is used for content recommendations, audience segmentation, targeted
advertising, and analyzing viewing patterns to enhance user engagement.

11. Customer Service:

AI-powered chatbots and virtual assistants, driven by data science and NLP, provide 24/7
support and personalize customer interactions.

12. Education:

It's used for personalized learning, student performance analysis, and identifying at-risk
students for timely intervention.

13. Environmental Monitoring and Sustainability: Data science helps track pollution, model
climate change impacts, monitor deforestation, and manage natural resources efficiently.

14. Energy and Utilities: Data science is used for smart grid management, forecasting energy
demand, optimizing asset maintenance, and integrating renewable energy sources.

15. Telecommunications: Data science helps in network optimization and customer retention by
analyzing network performance and customer usage patterns.

9. Explain Data Security Issues.

Data Security Issues:

Data science, with its reliance on large datasets, presents unique challenges for ensuring the security
and privacy of information. Protecting this data is paramount, as breaches can lead to significant
financial and reputational damage, as well as legal and compliance issues.

The following are some of the main data security issues:

1. Insider threats:

 Employees, contractors, or other individuals with authorized access can pose a significant risk
if they accidentally or intentionally misuse or leak sensitive data.

 Data democratization, a growing trend in data science, provides more employees with access to
critical information, increasing the potential for data breaches through carelessness or
malicious intent.

2. Data privacy:

 Ensuring the privacy of individuals whose data is used in data science projects is crucial,
especially with regulations like GDPR and CCPA.

 Data anonymization and minimization techniques are essential to protect sensitive personal
information (SPI) and personally identifiable information (PII) from being exposed or re-
identified.

 AI's ability to infer new information from seemingly unrelated data sources further complicates
privacy protection.
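
Data minimization and the protection of identifiers can be sketched as follows. This is a simplified illustration, not a compliance recipe: the record, the field names, and the secret key are hypothetical, and a keyed hash gives pseudonymization (the same person always maps to the same token), which is weaker than full anonymization.

```python
import hashlib
import hmac

# Hypothetical secret key; a real system would load it from secure storage, not from code.
SECRET_KEY = b"replace-with-a-secret-key"

def pseudonymize(value: str) -> str:
    """Replace an identifier with a keyed SHA-256 hash (pseudonymization)."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def minimize(record: dict, needed_fields: set) -> dict:
    """Keep only the fields the analysis actually needs (data minimization)."""
    return {k: v for k, v in record.items() if k in needed_fields}

# Made-up record containing PII (email, phone) next to analysis fields.
record = {"email": "user@example.com", "phone": "0000000000", "age": 34, "city": "Tirupati"}

# Drop the phone number entirely, then swap the email for a stable token.
safe = minimize(record, {"email", "age", "city"})
safe["user_key"] = pseudonymize(safe.pop("email"))
```

Note that the remaining fields (age and city) can still help re-identify a person when combined with other data, which is the inference risk described above.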

3. Data breaches and loss:

 Data breaches can result from various factors, including cyberattacks, insider threats, and
accidental data exposure.

 Losing or corrupting data can cripple organizations, highlighting the importance of data
security to ensure business continuity.

 According to IBM's Cost of a Data Breach Report 2024, the average total cost of a breach
reached a record high of $4.88 million, emphasizing the need for robust security measures.

4. Insecure data storage and access control:

 Cloud data storage, while offering flexibility, can be vulnerable if not properly secured.

 Weak or misconfigured access controls can be exploited by attackers to gain unauthorized
access, modify AI models, leak sensitive data, or disrupt the development process.

 Centralizing processes for creating data pipelines and implementing data governance
frameworks with data catalogs are crucial for ensuring secure data movements and
connections.

5. Vulnerabilities in AI systems:

 AI systems, particularly machine learning models, are susceptible to attacks like data poisoning
and adversarial attacks, which can compromise the integrity and reliability of the models.

 Insider threats also pose a risk to AI systems, as internal personnel can misuse their access
privileges to compromise the system, steal data, or manipulate models.

 Insufficient logging can hinder the detection of problems and enable malicious actors to exploit
weaknesses without being noticed.

6. Compliance and regulations:

 Data science operations must comply with various data protection laws like GDPR, CCPA,
HIPAA, and India's Digital Personal Data Protection Act (DPDP Act).

 Compliance with these regulations is essential to avoid hefty fines, legal repercussions, and
reputational damage.

7. Lack of security expertise:

 The increasing complexity of IT environments and the rise of new technologies create a
demand for skilled cybersecurity professionals, leading to a shortage in the field.

 Organizations may struggle to balance data security requirements with their limited resources
and budgets.

Addressing the challenges:

Effectively mitigating these risks requires a multi-layered approach that includes:

Implementing robust security measures: Encryption, access controls, data masking, and data loss
prevention (DLP) are vital.

Following security best practices: Use strong passwords and multi-factor authentication, keep software
updated, and secure URLs.

Conducting regular audits and assessments: This helps identify vulnerabilities and assess the
effectiveness of security measures.

Employee training and awareness: Educating employees on data security best practices is crucial to
prevent human error and insider threats.

Adopting Privacy-by-Design principles: Integrating privacy safeguards into every phase of the data
science lifecycle is essential for building trust and ensuring compliance.

AI security and governance: Develop ethical AI policies, conduct bias audits, and implement
mechanisms for transparency and accountability.

Leveraging security technologies: Tools like AI-powered security solutions, incident response
platforms, and data discovery and classification tools can enhance security posture.

Data Collection Strategies

10. Explain Data Collection Strategies in Data Science.

Data Collection Strategies:

Data collection is the process of collecting and evaluating information or data from multiple
sources to find answers to research problems, answer questions, evaluate outcomes, and forecast
trends and probabilities. It is an essential phase in all types of research, analysis, and decision-making,
including that done in the social sciences, business, and healthcare.

During data collection, researchers must identify the data types, the sources of data, and the
methods being used. We will soon see that there are many different data collection methods. Data
collection is relied on heavily in research, commercial, and government fields.

Before an analyst begins collecting data, they must answer three questions first:

 What’s the goal or purpose of this research?

 What kinds of data are they planning on gathering?

 What methods and procedures will be used to collect, store, and process the information?

Additionally, we can divide data into qualitative and quantitative types. Qualitative data covers
descriptions such as color, size, quality, and appearance. Unsurprisingly, quantitative data deals with
numbers, such as statistics, poll numbers, percentages, etc.

There are two methods of collecting data:

1. Primary Data Collection and

2. Secondary Data Collection

1. Primary Data Collection:

The first technique of data collection is primary data collection, which involves the collection
of original data directly from the source or through direct interaction with the respondents. This
method allows researchers to obtain firsthand information tailored to their research objectives. There
are various techniques for primary data collection, including:

a. Surveys and Questionnaires: Researchers design structured questionnaires or surveys to collect
data from individuals or groups. These can be conducted through face-to-face interviews, telephone
calls, mail, or online platforms.

b. Interviews: Interviews involve direct interaction between the researcher and the respondent. They
can be conducted in person, over the phone, or through video conferencing. Interviews can be
structured (with predefined questions), semi-structured (allowing flexibility), or unstructured (more
conversational).

c. Observations: Researchers observe and record behaviors, actions, or events in their natural setting.
This method is useful for gathering data on human behavior, interactions, or phenomena without direct
intervention.

d. Experiments: Experimental studies involve manipulating variables to observe their impact on the
outcome. Researchers control the conditions and collect data to draw conclusions about
cause-and-effect relationships.

e. Focus Groups: Focus groups bring together a small group of individuals who discuss specific
topics in a moderated setting. This method helps in understanding the opinions, perceptions, and
experiences shared by the participants.

2. Secondary Data Collection:

The second technique of data collection is secondary data collection, which involves using existing data
that someone else collected, often for a purpose different from the current research. Researchers analyze and
interpret this data to extract relevant information. Secondary data can be obtained from various
sources, including:

a. Published Sources: Researchers refer to books, academic journals, magazines, newspapers,
government reports, and other published materials that contain relevant data.

b. Online Databases: Numerous online databases provide access to a wide range of secondary data,
such as research articles, statistical information, economic data, and social surveys.

c. Government and Institutional Records: Government agencies, research institutions, and
organizations often maintain databases or records that can be used for research purposes.

d. Publicly Available Data: Data shared by individuals, organizations, or communities on public
platforms, websites, or social media can be accessed and utilized for research.

e. Past Research Studies: Previous research studies and their findings can serve as valuable
secondary data sources. Researchers can review and analyze the data to gain insights or build upon
existing knowledge.

Data Collection Tools:

 Word Association: The researcher gives the respondent a set of words and asks them what
comes to mind when they hear each word.

 Sentence Completion: Researchers use sentence completion to understand the respondent's
ideas. This tool involves giving an incomplete sentence and seeing how the interviewee
finishes it.

 Online/Web Surveys: These surveys are easy to accomplish, but some users may be unwilling
to answer truthfully, if at all.

 Mobile Surveys: These surveys take advantage of the increasing proliferation of mobile
technology. Mobile collection surveys rely on mobile devices like tablets or smartphones to
conduct surveys via SMS or mobile apps.

 Phone Surveys: No researcher can call thousands of people at once, so they need a third party
to handle the chore. However, many people have call screening and won’t answer.

11. Explain Data Pre-Processing Overview in Data Science.

Data Pre-Processing Overview:

Data preprocessing is a fundamental step in data science, ensuring that raw data is transformed into a
clean, consistent, and usable format suitable for analysis and model building. Raw data often contains
noise, inconsistencies, and missing values, all of which can hinder model performance and lead to
inaccurate results.

The Key Stages involved:

Fig: Stages of Data Pre-Processing

1. Data cleaning:

This stage involves addressing common data quality issues.

 Handling missing values can be done by deleting them or imputing them with appropriate
techniques.

 Dealing with outliers, which are data points that differ significantly from others, can be
achieved by detecting them with techniques like z-scores or visualization and then removing or
adjusting them.

 Reducing noise, or irrelevant information, can involve methods like binning or clustering.

 Removing duplicate records ensures data accuracy and consistency.
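
Two of the cleaning techniques listed above, z-score outlier detection and noise reduction by binning, can be sketched with the standard library. The readings, the z-score threshold of 2.0, and the bin size of 3 are assumptions chosen for this small sample; a threshold of 3 is more common on larger datasets.

```python
from statistics import mean, pstdev

# Hypothetical sensor readings; 95.0 is far from the rest.
values = [10.0, 12.0, 11.0, 13.0, 12.0, 95.0, 11.0, 12.0]

mu = mean(values)
sigma = pstdev(values)

# Flag points whose z-score magnitude exceeds the chosen threshold.
threshold = 2.0
outliers = [v for v in values if abs(v - mu) / sigma > threshold]
kept = [v for v in values if abs(v - mu) / sigma <= threshold]

# Smoothing by bin means: sort, split into equal-size bins,
# and replace every value with the mean of its bin.
bin_size = 3
ordered = sorted(kept)
smoothed = []
for i in range(0, len(ordered), bin_size):
    bin_values = ordered[i:i + bin_size]
    smoothed.extend([mean(bin_values)] * len(bin_values))
```

Whether a flagged point should actually be dropped depends on context: it may be a measurement error or a genuine extreme value.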

2. Data integration:

 Data integration combines data from multiple sources into a single dataset, resolving
inconsistencies in format and structure.

 Data integration plays a crucial role in improving data quality by identifying and resolving
issues like inaccuracies, inconsistencies, and redundancies during the transformation phase.

 Integrated data facilitates advanced analytical techniques, machine learning, and business
intelligence reporting by providing a reliable and consistent data foundation.

3. Data transformation:

This step converts data into a suitable format for analysis or modeling. This includes:

 Normalization or standardization to scale numerical features.

 Encoding categorical variables into numerical formats.

 Feature engineering to create new features that can improve model performance.

 Handling skewed data to normalize its distribution.
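
As a minimal sketch of the first two items above, the snippet below standardizes a numerical feature and one-hot encodes a categorical one. The feature names and values are invented for illustration, and one-hot encoding is only one of several possible encodings.

```python
from statistics import mean, pstdev

# Hypothetical records with one numerical and one categorical feature.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
colors = ["red", "green", "blue", "green", "red"]

# Standardization (z-score scaling): subtract the mean, divide by the standard deviation.
mu, sigma = mean(heights_cm), pstdev(heights_cm)
standardized = [(h - mu) / sigma for h in heights_cm]

# One-hot encoding: one 0/1 column per category, in sorted category order.
categories = sorted(set(colors))
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]
```

After standardization the feature has mean 0 and standard deviation 1, so features measured in different units become comparable.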

4. Data reduction:

 Data reduction simplifies the dataset by reducing the number of features or records while
retaining essential information.
 Techniques include feature selection, dimensionality reduction like PCA, numerosity
reduction, and data compression.

5. Data validation:

 The final step is to validate the preprocessed data to ensure it meets the requirements for the
intended analysis or modelling, checking for data types, ranges, and consistency.
 Effective data preprocessing is crucial for improving data quality and leading to more accurate
analysis and robust machine learning models.
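
A validation pass that checks data types and ranges might look like the sketch below. The `validate` helper, the schema layout, and the sample records are hypothetical; real projects often use a dedicated validation library instead.

```python
def validate(records, schema):
    """Return a list of readable problems; an empty list means the data passed."""
    problems = []
    for i, rec in enumerate(records):
        for field, (expected_type, low, high) in schema.items():
            value = rec.get(field)
            if not isinstance(value, expected_type):
                problems.append(f"row {i}: {field} has type {type(value).__name__}")
            elif not (low <= value <= high):
                problems.append(f"row {i}: {field}={value} outside [{low}, {high}]")
    return problems

# Hypothetical schema: field -> (type, minimum, maximum).
schema = {"age": (int, 0, 120), "score": (float, 0.0, 1.0)}
records = [
    {"age": 25, "score": 0.8},
    {"age": -3, "score": 0.5},     # range violation
    {"age": 40, "score": "high"},  # type violation
]
problems = validate(records, schema)
```

Collecting every problem instead of stopping at the first one makes it easier to judge how much of the dataset needs another cleaning pass.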

12. Explain Data Cleaning in detail.

Data Cleaning:

Data cleaning, also known as data cleansing or data scrubbing, is a crucial process in data
science that ensures the quality and reliability of datasets used for analysis, modeling, and decision-
making. It involves identifying and correcting or removing errors, inconsistencies, inaccuracies, and
other issues within the data.

Key aspects and steps of data cleaning:

 Understanding the data: Before diving into cleaning, it's essential to understand the data's
context, source, and potential problems. This includes reviewing metadata and exploring the
dataset's structure and contents.

 Handling missing data: Missing values are common in real-world datasets and can significantly
impact analysis. Strategies include imputation (filling in missing values with estimated ones) or
removing records with missing data.

 Removing duplicates: Duplicate entries can skew analysis and lead to inaccurate results.
Identifying and removing these redundant records is crucial.

 Fixing structural errors: Inconsistencies in data formats, naming conventions, or data types can
hinder analysis. Standardizing formats and correcting such errors ensures uniformity.

 Handling outliers: Outliers are data points that significantly deviate from the majority.
Determining whether they represent errors or genuine anomalies is crucial, and the chosen
approach (removing, capping, or transforming them) depends on the context.

 Standardizing and normalizing data: This involves transforming data to a consistent format or
scale to ensure comparability across features.

 Validating and quality assurance: After cleaning, it's important to validate the dataset to ensure
it meets quality standards, using methods like statistical checks and data visualizations to
ensure it's ready for use.
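
The "capping" approach to outliers mentioned above can be illustrated with the interquartile range (IQR) rule. The income figures are invented, and the 1.5 × IQR fences are a common convention rather than a fixed rule.

```python
from statistics import quantiles

# Hypothetical monthly incomes (in thousands); 400 is far from the rest.
incomes = [20, 22, 25, 27, 30, 31, 35, 400]

# Quartiles from the standard library; "inclusive" treats the data as the whole population.
q1, _, q3 = quantiles(incomes, n=4, method="inclusive")
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Capping: pull values outside the fences back to the nearest fence.
capped = [min(max(x, lower), upper) for x in incomes]
```

Capping keeps the record in the dataset, which is useful when the row carries other valuable fields, whereas removal discards it entirely.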

Importance of Data Cleaning:

 Improves accuracy and reliability: Clean data forms the foundation for reliable insights and
predictions.

 Better decision-making: Decisions based on clean, high-quality data are more likely to be
effective and aligned with business goals.

 Enhanced model performance: Machine learning models require clean data to learn patterns
effectively and make accurate predictions.

 Reduced costs: Cleaning data helps organizations eliminate unnecessary duplicates or
irrelevant data, which reduces storage and processing costs.

 Compliance and security: Clean data helps organizations comply with data protection
regulations by keeping data accurate and current.

Data cleaning tools:

While manual data cleaning is possible for smaller datasets using tools like Excel, specialized software
and programming libraries are crucial for larger and more complex datasets. These include:

 Programming Languages: Python (with libraries like Pandas, NumPy, and scikit-learn)
and R (with packages like dplyr and tidyr) are popular choices for data cleaning and
manipulation.

 Data Analysis Platforms: Tools like Tableau Prep, OpenRefine, Trifacta, Talend, and
specialized AI tools can streamline the cleaning process and offer features like duplicate
detection, standardization, and data profiling.

13. Explain Data Integration in detail.

Data Integration:

Data integration in data science: A critical enabler

Data integration in data science is the process of combining data from various, often disparate, sources
into a unified, consistent, and usable format for analysis, modeling, and insights generation. It's a
foundational step that breaks down data silos, enabling a holistic view of the data and laying the
groundwork for effective data science endeavors.

Aspects:

 Data Extraction: Retrieving data from diverse sources, which can include databases, APIs, flat
files, cloud services, and more. These sources often have varying formats, structures, and
protocols.

 Data Transformation: Converting the extracted data into a consistent and standardized format
suitable for analysis. This involves cleaning, mapping, and reconciling inconsistencies or
discrepancies between sources. Common transformations include data type conversions,
handling missing values, de-duplication, and schema mapping.

 Data Loading: Storing the transformed data in a target system, such as a data warehouse, data
lake, or data lakehouse, where it can be readily accessed for data science tasks.

 Ensuring Data Quality: Data integration plays a crucial role in improving data quality by
identifying and resolving issues like inaccuracies, inconsistencies, and redundancies during the
transformation phase.

 Enabling Unified Views: By consolidating data from multiple sources, data integration
provides a holistic view of information, allowing data scientists to gain more comprehensive
insights and build more accurate predictive models.

Key Techniques and Strategies:

Common techniques for data integration include Extract, Transform, Load (ETL), Extract, Load,
Transform (ELT), data streaming, and data virtualization. The choice of technique depends on factors
such as data volume, velocity, complexity, and specific analytical requirements.
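
A very small ETL flow can be sketched with the standard library: extract from two sources with different formats, transform them to a shared schema, and load the result into a target table. The source contents, field names, and table name are hypothetical, and an in-memory SQLite database stands in for a data warehouse.

```python
import csv
import io
import sqlite3

# Extract: two hypothetical sources (CSV text from a CRM, a list of dicts from billing).
crm_csv = "customer_id,full_name\n1,asha rao\n2,RAVI KUMAR\n"
billing = [{"cust": "1", "amount": "250.50"}, {"cust": "2", "amount": "99"}]
customers = list(csv.DictReader(io.StringIO(crm_csv)))

# Transform: unify key names and data types, and standardize name formatting.
names = {int(row["customer_id"]): row["full_name"].title() for row in customers}
rows = [(int(b["cust"]), names[int(b["cust"])], float(b["amount"])) for b in billing]

# Load: write the unified rows into the target table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_billing (customer_id INTEGER, name TEXT, amount REAL)")
conn.executemany("INSERT INTO customer_billing VALUES (?, ?, ?)", rows)
loaded = conn.execute("SELECT * FROM customer_billing ORDER BY customer_id").fetchall()
conn.close()
```

In an ELT flow the order of the last two steps is swapped: the raw rows are loaded first and transformed inside the target system.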

Challenges:

 Data quality issues: Inconsistencies, inaccuracies, and redundancies in data can hinder
analysis.
 Large data volumes and scalability: Handling increasing volumes of data requires scalable
integration solutions.
 Data security and compliance: Protecting sensitive data and adhering to regulations like GDPR
is critical.
 Technical complexity and resource constraints: Implementing and managing integrations can
be time-consuming and require specialized skills.

14. Explain Data Transformation in detail.

Data Transformation:

Data transformation in data science is the process of converting data from one format or structure into
another, making it suitable for analysis, modeling, or other downstream tasks. It involves cleaning,
validating, and restructuring data to ensure quality and usability.

This is a crucial step in data pipelines, especially in ETL (Extract, Transform, Load) or ELT (Extract,
Load, Transform) processes, and is essential for data integration, migration, and warehousing.

Importance of Data Transformation:

 Improving Data Quality: It cleanses the data by handling missing values, removing duplicates,
and correcting errors, ensuring its accuracy and reliability for downstream tasks.

 Enhancing Compatibility: It standardizes data formats and structures, enabling seamless
integration from diverse sources and platforms.

 Boosting Model Performance: Many machine learning algorithms require numerical input and
operate best with features on a similar scale. Transformation techniques like scaling,
normalization, and encoding categorical variables prepare the data to meet these requirements,
leading to better model accuracy and faster convergence.

 Supporting Data-Driven Decisions: By providing clean, structured, and easily accessible data,
transformation facilitates effective analytics, reporting, and visualization, enabling
organizations to make informed decisions.

Key elements of Data Transformation:

 Encoding Categorical Data: Converting categorical variables (text-based) into numerical
representations that machine learning models can understand.

 Aggregation: Summarizing data by calculating averages, sums, or counts within specific
groups or time periods.

 Feature Engineering: Creating new features from existing ones to improve model
performance.

 Dimensionality Reduction: Reducing the number of features while retaining important
information, such as using Principal Component Analysis (PCA).

 Time Series Decomposition: Breaking down time series data into trend, seasonality, and
residual components for analysis.

Example:

Imagine you have a dataset of customer purchases with different currencies and varying price
ranges. To analyze this data effectively, you would need to:

1. Clean the data: Remove any duplicate entries and correct any inaccurate price values.

2. Convert all prices to a single currency: This ensures consistency in the dataset.

3. Normalize the prices: Rescale the prices to a common range, like 0-1, to ensure no feature
dominates the analysis.

4. Create a new feature: You might calculate the total purchase value by combining multiple
transactions from the same customer, or you could categorize purchases into different price
ranges (e.g., low, medium, high).

By applying these transformation steps, you can analyze the data more effectively and gain valuable
insights.
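
The four steps of this example can be written out as a short script. Everything concrete here is an assumption made for illustration: the purchase records, the `order_id` field used to spot duplicates, the exchange rates, and the price-band limits.

```python
from collections import defaultdict

# Made-up exchange rates to a single currency (INR) and made-up purchases.
RATES_TO_INR = {"INR": 1.0, "USD": 80.0, "EUR": 90.0}
purchases = [
    {"order_id": 1, "customer": "A", "price": 10.0, "currency": "USD"},
    {"order_id": 1, "customer": "A", "price": 10.0, "currency": "USD"},  # duplicate
    {"order_id": 2, "customer": "B", "price": 1200.0, "currency": "INR"},
    {"order_id": 3, "customer": "A", "price": 20.0, "currency": "EUR"},
    {"order_id": 4, "customer": "C", "price": 5.0, "currency": "USD"},
]

# 1. Clean: keep the first record for each order_id.
seen, unique = set(), []
for p in purchases:
    if p["order_id"] not in seen:
        seen.add(p["order_id"])
        unique.append(p)

# 2. Convert every price to a single currency.
for p in unique:
    p["price_inr"] = p["price"] * RATES_TO_INR[p["currency"]]

# 3. Min-max normalize the converted prices to the range 0-1.
lo = min(p["price_inr"] for p in unique)
hi = max(p["price_inr"] for p in unique)
for p in unique:
    p["price_scaled"] = (p["price_inr"] - lo) / (hi - lo)

# 4. New features: a price band per purchase and a total per customer.
def price_band(amount):
    if amount < 500:
        return "low"
    if amount < 1500:
        return "medium"
    return "high"

totals = defaultdict(float)
for p in unique:
    p["band"] = price_band(p["price_inr"])
    totals[p["customer"]] += p["price_inr"]
```

The order of steps matters: normalizing before currency conversion would mix incomparable numbers, such as 10 USD and 1200 INR.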

Common data transformation techniques:

 Min-Max Scaling: Rescales numerical data to a fixed range, typically between 0 and 1.

 Label Encoding: Assigns a unique integer to each category in a categorical variable. It can be
used for ordinal categories where the order matters.

 Handling Missing Data: Techniques like imputation (mean, median, mode) or removal address
missing values in the dataset.

 Smoothing: Reduces noise and fluctuations in time series data using methods like moving
averages or exponential smoothing.

 Binning/Discretization: Groups continuous data into discrete categories, simplifying analysis,
particularly for algorithms that benefit from categorical input.
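
Smoothing and label encoding from the list above can be sketched in a few lines. The helper names, the sales figures, the window of 3, and the smoothing factor of 0.5 are illustrative assumptions.

```python
def moving_average(series, window):
    """Simple moving average; the output is shorter than the input by window - 1."""
    return [sum(series[i:i + window]) / window for i in range(len(series) - window + 1)]

def exponential_smoothing(series, alpha):
    """s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical daily sales figures.
sales = [10.0, 20.0, 30.0, 20.0, 10.0]
ma = moving_average(sales, window=3)
es = exponential_smoothing(sales, alpha=0.5)

# Label encoding for an ordinal variable: the integers follow the category order.
order = ["low", "medium", "high"]
codes = {label: i for i, label in enumerate(order)}
encoded = [codes[x] for x in ["medium", "low", "high", "medium"]]
```

A larger window or a smaller `alpha` gives a smoother curve but reacts more slowly to real changes in the data.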

15. Explain Data Reduction in detail.

Data Reduction:

Data reduction in data science refers to techniques used to reduce the size of a dataset while preserving
essential information. This process aims to make datasets more manageable for storage, processing,
and analysis, especially when dealing with large volumes of data. It involves transforming, encoding,
or otherwise converting data into a smaller, more efficient form, often through methods like
dimensionality reduction, data compression, or feature selection.

Goals of Data Reduction:

 Storage Optimization: Reducing the amount of space needed to store data.

 Improved Processing Speed: Making data easier and faster to process, especially in
computationally intensive tasks.

 Enhanced Analysis: Simplifying complex datasets for better visualization and interpretation.

 Mitigating the Curse of Dimensionality: Addressing the challenges of high-dimensional data where
algorithms can become less effective.

 Reducing Redundancy: Eliminating unnecessary or repetitive data to focus on core
information.

Data Reduction Techniques:

 Dimensionality Reduction: Reducing the number of attributes or features in a dataset, often by
selecting the most relevant ones or combining them.

 Data Compression: Using algorithms to reduce the size of data by encoding it more
efficiently. This can be lossless (fully reconstructible) or lossy (some information loss).

 Data Sampling: Selecting a subset of the original data that is representative of the whole
dataset.

 Data Aggregation: Summarizing data into smaller, more manageable units, such as grouping
similar data points or creating aggregated tables.

 Feature Selection: Choosing a subset of relevant features from a larger set, discarding
irrelevant or redundant ones.
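
Three of the techniques above (sampling, aggregation, and feature selection) can be illustrated with a minimal sketch in plain Python. The records, the seed value, and the choice of which column counts as "relevant" are assumptions made only for this illustration; real projects would typically use a library such as Pandas.

```python
import random
from collections import defaultdict

# Hypothetical sales records: (region, amount)
records = [("North", 120), ("South", 80), ("North", 200),
           ("East", 50), ("South", 130), ("East", 70)]

# Data sampling: keep a random subset (seeded so the run is repeatable).
rng = random.Random(42)
sample = rng.sample(records, k=3)

# Data aggregation: collapse individual rows into one total per region.
totals = defaultdict(int)
for region, amount in records:
    totals[region] += amount

# Feature selection: keep only the column judged relevant (here, the amount).
amounts_only = [amount for _, amount in records]

print(len(records), "rows sampled down to", len(sample))
print(dict(totals))
print(amounts_only)
```

Each step produces a smaller structure than the original list while keeping the information needed for a particular analysis.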

Benefits:

 Increased Efficiency: Reduced storage costs and faster processing times.

 Improved Analysis: Easier to visualize and interpret data, leading to better insights.

 Reduced Computational Costs: Less processing power and time required for analysis.

 Better Model Performance: Addressing the challenges of high-dimensional data, leading to
improved model accuracy and performance.

16. Explain Data Discretization in detail.

Data Discretization:

Discretization in data science is the process of transforming continuous numerical data into
discrete, categorical data. It involves grouping continuous values into a finite number of intervals or
bins, essentially converting a variable with a potentially infinite range of values into a finite set of
categories. This technique is crucial for simplifying data analysis, improving algorithm performance,
and revealing hidden patterns.

Need of Discretization:

 Simplifies data analysis: By grouping continuous data into categories, discretization makes it
easier to understand and interpret complex datasets. For example, instead of dealing with
individual ages, you can group them into age ranges like "child," "teen," "adult," and "senior".

 Improves algorithm performance: Some machine learning algorithms, particularly those
designed for categorical data, may perform better with discretized data. Discretization can
reduce the computational complexity and improve the speed of these algorithms.

 Reveals hidden patterns: Discretization can help uncover relationships and patterns that might
not be apparent when working with raw, continuous data.

 Data reduction: It reduces the number of unique values, essentially compressing the data and
making it more manageable for certain tasks.

Working of Discretization:

1. Identify continuous attributes: Determine which attributes in your dataset need to be
discretized.

2. Choose a discretization method: There are various methods, including:

 Equal width: Divides the data into intervals with equal ranges.

 Equal frequency: Divides the data into intervals containing roughly the same number of
data points.

 K-means clustering: Uses clustering algorithms to group similar data points into
intervals.

 Entropy-based methods: Uses information gain or entropy to determine optimal split
points.

3. Define intervals: Based on the chosen method, define the boundaries of the intervals.

4. Assign interval labels: Replace the original continuous values with labels corresponding to the
appropriate interval.

Example: Imagine a dataset of house prices. Instead of dealing with individual prices, you could
discretize the prices into categories like "low," "medium," and "high" based on price ranges. This
could make it easier to analyze factors that influence the sale of houses within these price ranges.
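
The equal-width and equal-frequency methods can be sketched in plain Python as follows. This is an illustrative sketch only: the function names, the sample prices, and the three labels are assumptions, not part of any library.

```python
def equal_width_bins(values, k):
    """Return a bin index (0..k-1) for each value using equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    bins = []
    for v in values:
        index = int((v - lo) / width) if width else 0
        bins.append(min(index, k - 1))  # the maximum value falls in the last bin
    return bins

def equal_frequency_bins(values, k):
    """Return a bin index for each value so that bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins

# Hypothetical house prices (in thousands)
prices = [120, 150, 155, 300, 310, 900]
labels = ["low", "medium", "high"]

width_labels = [labels[b] for b in equal_width_bins(prices, 3)]
freq_labels = [labels[b] for b in equal_frequency_bins(prices, 3)]
print(width_labels)
print(freq_labels)
```

With these sample prices, the single very expensive house stretches the equal-width intervals so that most houses land in the "low" bin, whereas equal-frequency binning places two houses in each category. This is one reason the choice of method matters.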

UNIT – II

Getting Started with Python: Introduction to Python, Python Keywords, Identifiers, Variables,
Comments, Data Types, Operators, Input and Output, Type Conversions, Debugging, Flow of Control,
Selection, Indentation, Repetition, Break and Continue Statement, Nested Loops. Strings – String
Operations, Traversing a String, String handling Functions.

1. Introduction to Python:

Python, designed by Guido van Rossum and first released in 1991, is a high-level, general-purpose
programming language. Its design philosophy emphasizes code readability, making it accessible for
both beginners and experienced developers. Python's versatility allows it to be used across various
domains, including web development, data science, machine learning, and automation.

Key features of Python:

 Easy to Learn and Read: Python's syntax is simple, intuitive, and similar to the English language,
making it easy to learn and understand.

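 Interpreted Language: Python code is run by an interpreter rather than compiled to machine code
ahead of time, so programs can be executed and tested statement by statement, which simplifies
debugging and speeds up development.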

 Dynamically Typed: Python dynamically determines variable types at runtime, eliminating the need
for explicit type declarations.

 High-Level Language: Python abstracts away low-level details like memory management, allowing
developers to focus on the application's logic.

 Object-Oriented Programming (OOP) Support: Python supports OOP principles like inheritance
and polymorphism, which promotes modular and reusable code.

 Cross-Platform Compatibility: Python code can run on various operating systems, including
Windows, macOS, and Linux, without modification.

 Extensive Standard Library: Python comes with a rich set of built-in modules and functions for
various tasks like file I/O, networking, and data processing.

 Third-Party Libraries and Frameworks: Python boasts a vast ecosystem of third-party libraries and
frameworks, available through the Python Package Index (PyPI), extending its capabilities in areas
like web development, data science, and machine learning.

 Strong Community Support: Python has a large and active community, providing extensive
documentation, tutorials, forums, and open-source contributions.

Applications of Python:

 Web Development: Frameworks like Django and Flask enable the development of dynamic and
scalable web applications. Major platforms such as Instagram, Pinterest, and Reddit utilize these
frameworks.

 Data Science and Machine Learning: Libraries like Pandas, NumPy, scikit-learn, and TensorFlow
make Python a preferred choice for data analysis and machine learning tasks. Python powers data
platforms at companies like Netflix, Spotify, and Airbnb.

 Automation and Scripting: Python's simplicity is ideal for automating repetitive tasks like file
management, web scraping, and system administration. Tech giants like Google and Facebook use
Python for automation in their operations.

 Game Development: Python is used for scripting in game engines and creating simple games.
Popular games like World of Tanks and Sims 4 have utilized Python.

 Desktop GUI Applications: Python supports the development of cross-platform desktop
applications using libraries like Tkinter, PyQt, and Kivy. Applications like Dropbox have desktop
components written partly in Python.

 Scientific Computing: Libraries like SciPy and SymPy provide tools for mathematical and symbolic
computations, making Python useful for scientific research and engineering simulations.
Organizations like NASA and CERN use Python for complex simulations and scientific data
processing.

Why is Python popular?

Python's popularity stems from a combination of its features:

 Ease of Learning and Use: Its simple syntax and clear structure make it accessible for beginners and
efficient for experienced developers.

 Versatility and Broad Applications: Python's ability to be used across diverse domains, from web
development to data science and automation, makes it valuable for various projects and industries.

 Strong Community Support and Ecosystem: The large and active community, along with the
extensive libraries and frameworks, provides abundant resources and support for developers.

 Corporate Support and Adoption: Major corporations like Google, Microsoft, and Meta use and
support Python, driving innovation and creating numerous job opportunities for Python developers.

2. Python Keywords:

Python keywords are reserved words that have special meanings within the Python language and
cannot be used as identifiers (e.g., variable names, function names, class names). They are
fundamental to the syntax and structure of Python programs.

Category                   Keywords

Value Keywords             True, False, None
Operator Keywords          and, or, not, is, in
Control Flow Keywords      if, else, elif, for, while, break, continue, pass, try, except,
                           finally, raise, assert
Function and Class         def, return, lambda, yield, class
Context Management         with, as
Import and Module          import, from
Scope and Namespace        global, nonlocal
Async Programming          async, await

Table: Python Keywords

Key characteristics:

Reserved:
They have predefined meanings and functionalities within Python.
Case-sensitive:
Most keywords are entirely lowercase, with the exceptions of True, False, and None.
Cannot be identifiers:
You cannot use a keyword as the name for a variable, function, class, or any other identifier in your
code. Using them for such purposes will result in a SyntaxError.
Essential for program structure:
Keywords define control flow (e.g., if, for, while), function and class definitions (def, class),
exception handling (try, except), and other core language features.

3. Identifiers:

In Python, an identifier is a name used to identify a variable, function, class, module, or other
object. Identifiers are fundamental for writing Python code as they allow for referencing and
manipulating various elements within a program.

Rules:

 Characters Allowed:
Identifiers can consist of letters (A-Z, a-z), digits (0-9), and underscores (_).
 Starting Character:
An identifier must begin with a letter (A-Z, a-z) or an underscore (_). It cannot start with a digit.
 Case-Sensitivity:
Python is case-sensitive, meaning myVariable, myvariable, and MYVARIABLE are treated as
distinct identifiers.
 Keywords:
Identifiers cannot be a reserved keyword in Python (e.g., if, else, while, for, class, def).
 Special Symbols:
Punctuation characters or other special symbols (e.g., @, $, %, !) are not permitted within
identifiers.
 Whitespace:
Whitespace characters are not allowed within identifiers.

Examples:
Valid Identifiers:
my_variable, calculateSum, _private_data, user_name123
Invalid Identifiers:
123name (starts with a digit), my-variable (contains a hyphen), for (is a keyword),
user name (contains whitespace)
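
These rules can be checked programmatically. The sketch below combines the built-in string method isidentifier() with the standard keyword module; the helper name is_valid_identifier is an assumption chosen for this example. Note that isidentifier() also accepts non-English Unicode letters, which Python 3 permits in identifiers.

```python
import keyword

def is_valid_identifier(name):
    """True if name follows the identifier rules and is not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["my_variable", "_private_data", "123name",
              "my-variable", "for", "user name"]
for candidate in candidates:
    print(f"{candidate!r}: {is_valid_identifier(candidate)}")

# The full list of reserved keywords for the running Python version:
print(keyword.kwlist)
```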

4. Variables:

In Python, variables are symbolic names that act as containers for storing data values in memory. They
allow for the storage and manipulation of various data types, such as numbers, strings, lists, and
objects.

Key characteristics:
 No explicit declaration: Unlike some other programming languages, Python does not require a
specific command to declare a variable. A variable is created the moment a value is assigned to it
using the assignment operator (=).

Ex:
x=5
y = 3.14
z = "Hi"

 Dynamic typing: Python is a dynamically typed language, meaning the type of a variable is
inferred at runtime based on the value assigned to it. The type can also change if a new value of a
different type is assigned.
Ex:
x=5 # x is an integer
x = "hello" # Now x is a string

 Multiple Assignments: Python allows multiple variables to be assigned values in a single line.

1. Assigning the Same Value


Python allows assigning the same value to multiple variables in a single line, which can be useful for
initializing variables with the same value.

Ex:
a = b = c = 100
print(a, b, c)

Output:

100 100 100

2. Assigning Different Values


We can assign different values to multiple variables simultaneously, making the code concise and
easier to read.

Ex:
x, y, z = 1, 2.5, "Python"
print(x, y, z)

Output:
1 2.5 Python

 Case-sensitive:
Variable names in Python are case-sensitive. my_variable, My_Variable,
and MY_VARIABLE are considered three distinct variables.
Rules for Variables:
 Must start with a letter or an underscore (_).
 Cannot start with a number.
 Can only contain alphanumeric characters and underscores.
 Cannot be a Python keyword (e.g., if, for, while).
 It is a common convention to use lowercase letters and underscores for multi-word variable
names (e.g., first_name).

Valid Variables:
name = "Clara"
total_score = 95
_color = "Green"

Invalid Variables:
1name = "Error" # Starts with a digit
class = 10 # 'class' is a reserved keyword
user-name = "Doe" # Contains a hyphen
Types of variables based on scope:

1. Local variables: Defined inside a function and accessible only within that function.
2. Global variables: Defined outside of any function and accessible throughout the entire program.
3. Instance variables: Associated with specific instances (objects) of a class.
4. Class variables (Static variables): Shared by all instances of a class.
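
The four kinds of variables can be seen together in one short sketch. The class name Student, the attribute names, and the values are hypothetical and chosen only for illustration.

```python
counter = 0  # global variable

class Student:
    school = "Example College"  # class variable, shared by every instance

    def __init__(self, name):
        self.name = name  # instance variable, one per object

def increment():
    global counter  # needed to rebind the global name inside a function
    step = 1        # local variable, exists only during this call
    counter += step

increment()
increment()
a = Student("Clara")
b = Student("Doe")

print(counter)
print(a.name, b.name)
print(a.school, b.school)
```

The global statement is required only when a function assigns to a global name; reading a global variable inside a function needs no declaration.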

5. Comments:

In Python, comments are used to add explanatory notes within the code, improving its
readability and understanding for both the programmer and others. The Python interpreter ignores
comments during execution, meaning they do not affect the program's functionality.

Types of comments:

Python provides two ways to add comments to your code:

1. Single-line comments: These start with a hash symbol (#) and extend to the end of the line. They
are useful for brief explanations or notes on a single line of code.

Ex:
# This is a single-line comment.
print("Hello, world!") # This prints a message.

2. Multi-line comments (Docstrings): While Python does not have a dedicated syntax for multi-line
comments like other languages, triple-quoted strings (''' or """) are widely used to achieve this effect.
When placed as the first statement within a module, function, class, or method definition, these triple-
quoted strings become Docstrings, serving as embedded documentation for that code entity.

Ex:
def function_name():
    """
    This is a multi-line comment,
    used as a Docstring to explain a function.
    """
    # Function implementation here
    pass

Purpose and Advantages of Comments:

 Enhanced readability: Comments act as helpful explanations, making the code easier to grasp for
the programmer and others who might review it.

 Improved maintainability: Well-documented code with clear comments simplifies future
modifications and updates.

 Facilitating debugging: Comments can be used to temporarily disable sections of code during
debugging, allowing you to isolate and test specific parts of your program.

 Better collaboration: Well-structured and commented code facilitates smooth collaboration within a
development team.

 Self-documenting code: Although comments are crucial, strive for code that is as self-explanatory
as possible by using clear variable and function names.

6. Data Types:

Python data types classify data items. A data type represents the kind of value a data item
holds and determines what operations can be performed on it. Since everything is an object in
Python programming, Python data types are classes and variables refer to instances (objects) of
these classes. The following are the standard or built-in data types in Python:

 Numeric - int, float, complex

 Sequence Type - string, list, tuple

 Mapping Type - dict

 Boolean - bool

 Set Type - set, frozenset

 Binary Types - bytes, bytearray, memoryview

1. Numeric Data Types:

The numeric data type in Python represents the data that has a numeric value. A numeric value can be
an integer, a floating number, or even a complex number.

 Integers - This value is represented by int class. It contains positive or negative whole
numbers (without fractions or decimals). In Python, there is no limit to how long an integer
value can be.

 Float - This value is represented by the float class. It is a real number with a floating-point
representation. It is specified by a decimal point. Optionally, the character e or E followed by a
positive or negative integer may be appended to specify scientific notation.

 Complex Numbers - A complex number is represented by the complex class. It is specified
as (real part) + (imaginary part)j. For example - 2+3j

Ex:

a=5
print(type(a))
b = 5.0
print(type(b))
c = 2 + 4j
print(type(c))

Output:
<class 'int'>
<class 'float'>
<class 'complex'>

2. Sequence Data Types:

A sequence data type is an ordered collection of items of the same or different Python data types.
Sequences allow multiple values to be stored in an organized and efficient fashion. Python has
several sequence data types:
 String
 List
 Tuple

a) String Data Type:

Strings are immutable sequences of Unicode characters. Python has no separate character data type;
a character is simply a string of length one. Strings are represented by the str class.
Strings in Python can be created using single quotes, double quotes or even triple quotes. We can
access individual characters of a string using an index.

Ex:
s = 'Welcome to the World'
print(s)

# check data type
print(type(s))

# access string with index
print(s[1])
print(s[2])
print(s[-1])

Output:
Welcome to the World
<class 'str'>
e
l
d

b) List Data Type:

Lists are similar to arrays in other languages: a list is an ordered, mutable collection of data. It is
very flexible, as the items in a list do not need to be of the same type. A list is created by placing
comma-separated items inside square brackets [ ].

Ex:
# Empty list
a=[]

# list with int values
a = [1, 2, 3]
print(a)

# list with mixed types
b = ["Hello", "User!", 4, 4.5]
print(b)

Output:
[1, 2, 3]
['Hello', 'User!', 4, 4.5]

c) Tuple Data Type:
Just like a list, a tuple is an ordered collection of Python objects. The main difference
between a tuple and a list is that tuples are immutable: a tuple cannot be modified after it is created.
It is created by placing a sequence of values separated by commas, with or without the use of
parentheses for grouping the data sequence.

Tuples can contain any number of elements and of any datatype (like strings, integers, lists, etc.).

Ex:
# initiate empty tuple
tup1 = ()

tup2 = ('Hello ', 'User!')
print("Tuple with the use of String:", tup2)

Output:
Tuple with the use of String: ('Hello ', 'User!')
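
The difference in mutability between lists and tuples can be demonstrated with a short sketch. The variable names are illustrative only.

```python
items_list = [1, 2, 3]
items_tuple = (1, 2, 3)

items_list[0] = 99           # lists can be changed in place

try:
    items_tuple[0] = 99      # tuples cannot
except TypeError as error:
    message = str(error)

single = (5,)                # the trailing comma makes this a tuple
not_a_tuple = (5)            # without it, this is just the integer 5

print(items_list)
print(message)
print(type(single).__name__, type(not_a_tuple).__name__)
```

Attempting to assign to a tuple element raises a TypeError, and a one-element tuple needs a trailing comma because parentheses alone only group an expression.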

3. Boolean Data Type:


Boolean data type has two built in values, namely True and False. However non-Boolean
objects can be evaluated in a Boolean context as well and determined to be true or false. It is denoted
by the class bool.

Example: The first two lines print the type of the Boolean values True and False, which is <class
'bool'>. The third line raises a NameError because true (lowercase) is not defined. Python is
case-sensitive, and the Boolean literals are True and False.

Ex:
print(type(True))
print(type(False))
print(type(true))

Output:
<class 'bool'>
<class 'bool'>

Traceback (most recent call last):
  File "/home/[Link]", line 8, in <module>
    print(type(true))
NameError: name 'true' is not defined

4. Set Data Type:

A set is an unordered collection of items that is iterable, mutable, and has no duplicate
elements. Because a set is unordered, its elements have no index and may be displayed in any order.

Sets can be created by passing an iterable object to the built-in set( ) function, or by placing
comma-separated elements inside curly braces. The elements of a set need not be of the same type,
but each element must be hashable (for example, a number, string, or tuple), so a list cannot be a
set element.

Ex:
# initializing empty set
s1 = set()

s1 = set("GeeksForGeeks")
print("Set with the use of String: ", s1)

s2 = set(["Geeks", "For", "Geeks"])
print("Set with the use of List: ", s2)

Output (the order of elements may vary):
Set with the use of String: {'s', 'o', 'F', 'G', 'e', 'k', 'r'}
Set with the use of List: {'Geeks', 'For'}

5. Dictionary Data Type:


A dictionary in Python is a collection that stores data as key: value pairs, like a map.
Unlike other Python data types, which hold a single value per element, each dictionary element
pairs a key with a value, which allows a value to be looked up quickly by its key. Within each pair,
the key is separated from its value by a colon (:), and the pairs are separated from one another by
commas.

The values in a dictionary can be of any datatype and can be duplicated, whereas keys can’t be
repeated and must be immutable. The dictionary can also be created by the built-in function dict( ).

Ex:
# initialize empty dictionary
d={}

d = {1: 'one', 2: 'two', 3: 'three'}
print(d)

# creating dictionary using dict() constructor
d1 = dict({1: 'one', 2: 'two', 3: 'three'})
print(d1)

Output:
{1: 'one', 2: 'two', 3: 'three'}
{1: 'one', 2: 'two', 3: 'three'}

7. Python Operators:

Python operators are special symbols used to perform specific operations on one or more
operands. The variables, values, or expressions can be used as operands.

The different types of operators are:


1. Arithmetic Operators
2. Relational Operators
3. Logical Operators
4. Assignment Operators
5. Bitwise Operators
6. Ternary Operators
7. Membership Operators and
8. Identity Operators

1. Arithmetic Operators:
These operators perform fundamental mathematical operations.

Operator   Description                          Example

+          Addition                             sum = a + b
-          Subtraction                          difference = a - b
*          Multiplication                       product = a * b
/          Division (float)                     quotient = b / a
%          Modulus (remainder)                  remainder = a % b
**         Exponentiation (power)               power = a ** b
//         Floor division (integer division)    floor_quotient = a // b
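
The difference between true division, floor division, and modulus is easiest to see with concrete values. The operands below are arbitrary examples.

```python
a, b = 7, 2

print(a / b)     # 3.5: true division always returns a float
print(a // b)    # 3: floor division rounds down
print(a % b)     # 1: remainder
print(a ** b)    # 49: 7 squared

# With a negative operand, // rounds toward negative infinity,
# and % takes the sign of the divisor.
print(-7 // 2)   # -4
print(-7 % 2)    # 1

quotient, remainder = divmod(a, b)  # both results in one call
print(quotient, remainder)          # 3 1
```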

2. Relational Operators:

These operators compare two values and return a boolean result (True or False).

Operator   Description                  Example

==         Equal to                     flag = a == b
!=         Not equal to                 flag = a != b
>          Greater than                 flag = a > b
<          Less than                    flag = a < b
>=         Greater than or equal to     flag = a >= b
<=         Less than or equal to        flag = a <= b

3. Logical Operators:

These operators combine or manipulate boolean values (True/False).

Operator   Description                                                      Example

and        Logical AND: Returns True if both operands are True,             flag = exp1 and exp2
           otherwise False.
or         Logical OR: Returns True if at least one operand is True,        flag = exp1 or exp2
           otherwise False.
not        Logical NOT: Returns the inverse of the operand's value.         flag = not(True)

4. Assignment Operators:
These operators assign values to variables, providing a concise way to update variables based
on computations.
Operator   Description                   Example

=          Simple Assignment             a = 10
+=         Addition Assignment           a += 5 (equivalent to a = a + 5)
-=         Subtraction Assignment        a -= 5 (equivalent to a = a - 5)
*=         Multiplication Assignment     a *= 5 (equivalent to a = a * 5)
/=         Division Assignment           a /= 5 (equivalent to a = a / 5)
%=         Modulus Assignment            a %= 5 (equivalent to a = a % 5)
**=        Exponentiation Assignment     a **= 2 (equivalent to a = a ** 2)
//=        Floor Division Assignment     a //= 3 (equivalent to a = a // 3)

5. Bitwise Operators:

These operators work on the individual bits of integers (their binary representations).

Operator   Description                        Example

&          Bitwise AND                        x = 10 & 7
|          Bitwise OR                         x = 10 | 7
^          Bitwise XOR                        x = 10 ^ 7
~          Bitwise NOT (One's Complement)     x = ~10
<<         Left Shift                         x = 10 << 1
>>         Right Shift                        x = 10 >> 1
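
The expressions in the table can be evaluated directly. The sketch below prints each result in decimal and binary so that the bit patterns are visible; the dictionary name results is an assumption for this example.

```python
x, y = 10, 7   # 0b1010 and 0b0111

results = {
    "x & y": x & y,    # 0b0010
    "x | y": x | y,    # 0b1111
    "x ^ y": x ^ y,    # 0b1101
    "~x": ~x,          # equals -(x + 1)
    "x << 1": x << 1,  # doubles x
    "x >> 1": x >> 1,  # halves x, discarding the remainder
}
for expression, value in results.items():
    print(f"{expression:7} = {value:4}  ({bin(value)})")
```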

6. Ternary Operators:

The ternary operator allows us to perform conditional checks and assign values or perform
operations on a single line. It is also known as a conditional expression because it evaluates a
condition and returns one value if the condition is True and another if it is False.

Ex:
n=5
res = "Even" if n % 2 == 0 else "Odd"
print(res)

Output:
Odd

7. Membership Operators:
These operators test whether a value is present in a sequence (like strings, lists, or tuples).

Operator Description

in Returns True if the value is found in the sequence, otherwise False.

not in Returns True if the value is not found in the sequence, otherwise False.

8. Identity operators:
These operators compare the memory locations of two objects to determine if they are the same
object.

Operator Description

is Returns True if both variables point to the same object in memory, otherwise False.

is not Returns True if both variables do not point to the same object in memory, otherwise False.

Note: Identity operators (is, is not) are different from equality operators (==, !=). Identity operators
check if two objects are the same instance in memory, while equality operators check if two objects
have the same value.
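
A short sketch of membership and identity tests follows. The variable names are illustrative only.

```python
first = [1, 2, 3]
second = [1, 2, 3]   # equal contents, but a separate object
alias = first        # another name for the same object

print(first == second)   # True: same values
print(first is second)   # False: different objects
print(first is alias)    # True: same object

print(2 in first)              # True
print("py" in "python")        # True: substring test for strings
print("x" not in {"a": 1})     # True: dictionaries test their keys
```

In practice, is should be reserved for comparisons with None (for example, value is None), and == should be used to compare values.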

Operator Precedence and Associativity:

In expressions with multiple operators, operator precedence determines the order of evaluation.
For example, multiplication and division have higher precedence than addition and
subtraction, much like the PEMDAS rule in arithmetic.
Parentheses can be used to override the default precedence and explicitly specify the order of
evaluation within an expression.
Associativity dictates the order of evaluation when operators have the same precedence. Most
Python operators (including arithmetic, bitwise, and the logical operators and, or) group from left
to right. The exceptions are the exponentiation operator (**), which groups from right to left, and
assignment, where the right-hand side is evaluated first (a = b = c assigns the value of c to both
names). Comparison operators can be chained: a < b < c means a < b and b < c.
Understanding Python's operators is crucial for writing effective, readable, and logically correct code.
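
A few one-line expressions make the precedence and associativity rules concrete. The values are arbitrary examples.

```python
print(2 + 3 * 4)         # 14: * before +
print((2 + 3) * 4)       # 20: parentheses override precedence
print(2 ** 3 ** 2)       # 512: evaluated as 2 ** (3 ** 2)
print(-2 ** 2)           # -4: ** binds tighter than unary minus
print(10 - 4 - 3)        # 3: evaluated as (10 - 4) - 3
print(1 < 2 < 3)         # True: chained comparison
print(not True or True)  # True: evaluated as (not True) or True
```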

Ex:
a = 10
b = 3
c = 5
print("Addition:", a + b)
print("10 power 3 is", a ** b)
print("10 != 3", a != b)
print("10>3 and 3>5 is", a > b and b > c)
a /= b
print("a is", a)
a = 10
a //= b
print("Floor Division a is", a)

Output:
Addition: 13
10 power 3 is 1000
10 != 3 True
10>3 and 3>5 is False
a is 3.3333333333333335
Floor Division a is 3

8. Input and Output (I/O) in Python:

In Python, input and output operations are fundamental for interacting with users and displaying
results.

 The input() function is used to gather input from the user and
 The print() function is used to display output.

Input Operations:

Python’s input() function allows us to get data from the user. By default, all input received via this
function is treated as a string. If we need a different data type, we can convert the input accordingly.

Syntax:
input(prompt)
 Parameter: prompt (optional) is the string that is displayed to the user to provide instructions or
information about what kind of input is expected.
 Returns: It returns the user input as string.

Example:

1. Input as the default string type:

a = input("Enter your name: ")
print(a)

Output:
Enter your name: Clara
Clara

2. Converting the input string to the int data type:

n = int(input("Enter a number: "))
res = n * n
print(res)

Output:
Enter a number: 4
16

3. Converting the input string to the float data type:

l = float(input("Enter the length of the rectangle: "))
w = float(input("Enter the width of the rectangle: "))
a = l * w
print(a)

Output:
Enter the length of the rectangle: 4
Enter the width of the rectangle: 8
32.0
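
If the user types something that is not a number, int( ) and float( ) raise a ValueError. A minimal sketch of guarding against this is shown below; the helper name parse_number is an assumption, and fixed strings stand in for values that would normally come from input().

```python
def parse_number(text):
    """Return text converted to float, or None if it is not a valid number."""
    try:
        return float(text)
    except ValueError:
        return None

# Strings standing in for what a user might type at an input() prompt.
for raw in ["4", " 8.5 ", "abc", ""]:
    print(repr(raw), "->", parse_number(raw))
```

Returning None for invalid text lets the calling code decide whether to ask again or stop.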

Output Operations:
The print( ) function is used to display output to the user. It can be used in various ways to
print values, including strings, variables and expressions.

Ex:
print("Hello, World!")

Output:
Hello, World!

Syntax:
print(object(s), sep=' ', end='\n', file=sys.stdout, flush=False)

Parameters:

Parameter          Description
object(s)          The value(s) to be printed. You can pass multiple values
                   separated by commas.
sep (Optional)     Separator between values. Default is a space ' '.
end (Optional)     What to print at the end. Default is a newline '\n'.
file (Optional)    Where to send the output. Default is sys.stdout (console).
flush (Optional)   Whether to forcibly flush the output stream. Default is False.

1. You can also print multiple variables by separating them with commas:
Ex:
id = 21
name = "Gautham"
city = "Hyderabad"
print(id, name, city)

Output:
21 Gautham Hyderabad

2. F-strings allow for concise and readable string formatting by embedding expressions inside string
literals using the f prefix.

Ex:
name = "Alice"
age = 30
print(f"Name: {name}, Age: {age}")

Output:
Name: Alice, Age: 30

3. The string modulo operator % is a traditional method for formatting strings, often referred to as
printf-style formatting. It uses format specifiers like %s for strings and %d for integers.

Ex:
name = "Alice"
age = 30
print("Name: %s, Age: %d" % (name, age))

Output:
Name: Alice, Age: 30

4. Multiple values are printed with " , " used as the separator instead of the default space. This is
useful for formatting output with custom characters between values.

Ex:
print("Python", "Java", "C++", sep=", ")

Output:
Python, Java, C++

5. The end parameter is set to "..." to avoid the default newline and print the next statement on the
same line. This creates a smooth, continuous output message.

Ex:
print("Loading", end="... ")
print("Please wait")

Output:
Loading... Please wait
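
F-strings also accept format specifiers after a colon, and the file parameter of print( ) can redirect output. The sketch below is illustrative: the values are arbitrary, and an in-memory io.StringIO buffer stands in for a real file.

```python
import io

price = 1234.5
ratio = 0.256

line1 = f"{price:.2f}"         # '1234.50': two decimal places
line2 = f"{price:,.2f}"        # '1,234.50': thousands separator
line3 = f"{ratio:.1%}"         # '25.6%': percentage
line4 = f"{'id':<6}|{42:>5}"   # left- and right-aligned columns

# Send output to a buffer instead of the console.
buffer = io.StringIO()
print("a", "b", "c", sep="-", end="!\n", file=buffer)

print(line1, line2, line3, line4, sep="\n")
print(repr(buffer.getvalue()))
```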

9. Type Conversions in Python:

Type conversion refers to the process of changing the data type of a variable from one type to
another.
There are 2 types of type conversion:

1. Implicit Type Conversion and
2. Explicit Type Conversion

1. Implicit Type Conversion:

Implicit type conversion is automatically performed by the Python interpreter, where the interpreter
converts one data type to another without any user involvement.

For example, when adding an integer and a float, Python automatically converts the integer to a float
to avoid data loss.

Ex:

a=5
b = 5.5
sum = a + b
print (sum)
print (type (sum))

Output:

10.5

<class 'float'>

In the above example, we took two variables of the integer and float data types and added
them. We then stored the result of the addition in another variable named ‘sum’. When we checked
the data type of the sum variable, we saw that the result had automatically been given the float
data type by the Python interpreter. This is called implicit type conversion.

2. Explicit Type Conversion:

Explicit type conversion, also known as type casting, requires the user to manually convert
the data type using built-in functions such as int( ), float( ), str( ), etc. For instance, converting a
string to an integer can be done using the int( ) function.

However, explicit conversion may lead to data loss if the object is forced to conform to a particular
data type.

Some common functions used for type conversion in Python include int( ), float( ), str( ), and others
like tuple( ), set( ), list( ), etc., which are used to convert data into different structures.

Additionally, functions like ord() and chr() are used to convert characters to their corresponding
Unicode code points (which match the ASCII values for ASCII characters) and vice versa.
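
The conversion functions mentioned above can be tried one by one. The values are arbitrary examples.

```python
print(int("42") + 1)        # 43
print(int(3.9))             # 3: the fractional part is discarded, not rounded
print(float("2.5"))         # 2.5
print(str(10) + "0")        # 100: string concatenation, not addition
print(list("abc"))          # ['a', 'b', 'c']
print(tuple([1, 2]))        # (1, 2)
print(set([1, 1, 2]))       # {1, 2}: duplicates removed
print(ord("A"), chr(97))    # 65 a
print(bool(""), bool("0"))  # False True: only the empty string is falsy
```

The int(3.9) line shows the kind of data loss that explicit conversion can cause.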

Ex:

a = 100
b = "200"
result1 = a + b
print(result1)

Output:

    result1 = a + b
              ~~^~~
TypeError: unsupported operand type(s) for +: 'int' and 'str'

In the above example, variable a is of the integer data type, and variable b is of the string data
type. When we try to add these two values and store the result in a variable named result1, a
TypeError occurs, as shown in the output.

So, in order to perform this operation, we have to use explicit type casting.

Ex 1:

a = 100
b = "200"
b = int(b) # Explicit Type casting
result2 = a + b
print(result2)

Output:

300

Ex 2: Converting the input string to the int data type

n = int(input("Enter a number: "))
res = n * n
print(res)

Output:
Enter a number: 4
16

10. Debugging:

Debugging is the process of identifying and resolving errors or "bugs" in your Python code. These
errors can be categorized as:

 Syntax Errors: Errors that violate Python's grammatical rules, like typos or incorrect
punctuation.

 Runtime Errors: Errors that occur during program execution, often due to unexpected input or
conditions (e.g., dividing by zero).

 Logic Errors: Errors where the program runs without crashing but produces incorrect output
due to faulty logic or an incorrect algorithm.
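
The three categories can be contrasted in one short sketch. The function average and its deliberate bug are hypothetical, written only to illustrate a logic error; compile( ) is used so that the faulty source can be examined without stopping the script.

```python
# 1. Syntax error: detected before any code runs.
try:
    compile("print('hello'", "<example>", "exec")   # missing closing parenthesis
except SyntaxError as error:
    syntax_problem = type(error).__name__

# 2. Runtime error: the code is valid but fails while executing.
try:
    result = 10 / 0
except ZeroDivisionError as error:
    runtime_problem = type(error).__name__

# 3. Logic error: runs without complaint but gives the wrong answer.
def average(values):
    return sum(values) / (len(values) + 1)   # bug: should divide by len(values)

print(syntax_problem)
print(runtime_problem)
print(average([2, 4, 6]))   # prints 3.0, although the true average is 4.0
```

Only the first two kinds produce an error message. A logic error must be found by comparing the output with the expected result.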

1. Tracebacks: Your First Clue

When Python encounters an error during execution, it displays a "traceback", which is a
detailed message indicating the location and type of the error. Learning to interpret tracebacks is
essential for efficient debugging.

Common types of traceback errors in Python include:

 IndexError: Occurs when a sequence is referenced with an out-of-range index.

 NameError: Occurs when a variable is used that has not been defined.

 KeyError: Occurs when a dictionary key is accessed that does not exist.

 TypeError: Occurs when an operation is performed on an object of an inappropriate type.

 ValueError: Occurs when a function receives an argument of the correct type but an invalid
value.

 AttributeError: Occurs when an attribute reference or assignment fails.

 ImportError: Occurs when an import statement fails to find a module or cannot import a
specific name from a module.
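
Each of the errors listed above can be triggered deliberately and caught with try/except. In this sketch the dictionary name triggers, the undefined name, and the module name no_such_module_xyz are all made up for illustration.

```python
import importlib

# One small action per exception type; each is wrapped in a lambda so that
# nothing fails until it is called inside the try block.
triggers = {
    "IndexError": lambda: [1, 2, 3][5],
    "NameError": lambda: undefined_name,            # deliberately undefined
    "KeyError": lambda: {"a": 1}["b"],
    "TypeError": lambda: 1 + "2",
    "ValueError": lambda: int("abc"),
    "AttributeError": lambda: "text".missing_attribute,
    "ImportError": lambda: importlib.import_module("no_such_module_xyz"),
}

raised = {}
for expected, action in triggers.items():
    try:
        action()
    except Exception as error:
        raised[expected] = type(error).__name__
        print(f"{expected}: {type(error).__name__}: {error}")
```

A missing module is reported as ModuleNotFoundError, which is a subclass of ImportError, so an except ImportError clause still catches it.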

a) Ex: TypeError

a = 100
b = "200"
print(a+b)

Output:

Traceback (most recent call last):

File "C:\Users\mbraj\PyCharmMiscProject\[Link]", line 3, in <module>

print(a+b)

~^~

TypeError: unsupported operand type(s) for +: 'int' and 'str'

b) Ex: NameError

a = 100
b = 200
print(a+b+c)

Output:

Traceback (most recent call last):

File "C:\Users\mbraj\PyCharmMiscProject\[Link]", line 3, in <module>

print(a+b+c)

NameError: name 'c' is not defined
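
Some of the other exception types listed earlier can be triggered in the same way. The following is a small illustrative sketch (the variable names are made up for this example) that catches each error with try/except so the program keeps running and the message can be inspected.

```python
caught = []   # names of the exceptions we manage to trigger

try:
    numbers = [1, 2, 3]
    numbers[5]                      # index 5 does not exist
except IndexError as err:
    caught.append(type(err).__name__)
    print("IndexError:", err)

try:
    marks = {"maths": 90}
    marks["physics"]                # key is not in the dictionary
except KeyError as err:
    caught.append(type(err).__name__)
    print("KeyError:", err)

try:
    int("abc")                      # correct type (str) but invalid value
except ValueError as err:
    caught.append(type(err).__name__)
    print("ValueError:", err)
```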

2. Print statements:

One of the simplest ways to debug is by strategically placing print() statements in your code to display
variable values and track the program's flow. This allows you to see the program's state at specific
points and identify where things go wrong.

Note that in Python 3, print is a function and must be called with parentheses. For example, the
following statement:

print "Hello, World!"

results in SyntaxError: Missing parentheses in call to 'print'. To fix this, use
parentheses:

print("Hello, World!")
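
As a minimal sketch of print-based debugging, the hypothetical function below prints its intermediate state on every loop iteration, so the point where a value first goes wrong can be spotted in the output.

```python
def average(values):
    total = 0
    for v in values:
        total += v
        print("DEBUG: v =", v, "total =", total)   # trace the running total
    return total / len(values)

result = average([10, 20, 30])
print("average =", result)   # average = 20.0
```

Once the bug is found, such temporary print( ) calls are normally removed again.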

3. Python Debugger (Pdb):

Pdb, Python's built-in debugger, offers more control and features than print statements. It provides an
interactive debugging environment where you can:

 Set breakpoints: Pause code execution at specific lines using pdb.set_trace().

 Step through code: Execute code line by line with commands like n (next) and s (step).

 Inspect variables: Examine variable values using the p (print) command.

 View the call stack: See the sequence of function calls using the w (where) command.

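Example using Pdb:

import pdb

def buggy_function(x, y):
    pdb.set_trace()   # Breakpoint: execution pauses here
    result = x + y    # Adding an int and a str raises TypeError
    return result

buggy_function(5, "abc")

Output (abridged; on recent Python versions n may need to be entered twice before the failing
line runs):

(Pdb) p x
5
(Pdb) p y
'abc'
(Pdb) n
TypeError: unsupported operand type(s) for +: 'int' and 'str'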

4. IDE debuggers:

Integrated Development Environments (IDEs) like PyCharm and Visual Studio Code (VS Code)
provide graphical debuggers that offer a more user-friendly interface compared to Pdb. These
debuggers allow you to:

 Set and manage breakpoints visually.

 Step through code with dedicated buttons (step over, step into, step out).

 Inspect variables in a dedicated panel or by hovering over them.

 View the call stack and manage exceptions in a dedicated window.

5. Logging:

Python's built-in logging module offers a more structured and robust approach to tracking events and
errors than simple print statements, particularly in larger applications. Logging allows you to:

 Define different log levels: Categorize messages based on severity (e.g., DEBUG, INFO,
WARNING, ERROR, CRITICAL).

 Configure output destinations: Send logs to the console, a file, or other custom handlers.

 Customize log message formats: Include information like timestamps, log levels, and module
names.
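
The following sketch shows these three ideas together. It is only an illustration: the logger name "demo" and the message texts are made up, and the records are written to an in-memory buffer so that the example is self-contained.

```python
import io
import logging

# Output destination: an in-memory text buffer instead of the console or a file
stream = io.StringIO()
handler = logging.StreamHandler(stream)

# Message format: level name, logger name, then the message
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("demo")   # "demo" is a hypothetical logger name
logger.setLevel(logging.INFO)        # messages below INFO are ignored
logger.addHandler(handler)
logger.propagate = False             # do not pass records on to the root logger

logger.debug("not shown")            # filtered out: DEBUG is below INFO
logger.info("program started")
logger.warning("disk space is low")

log_text = stream.getvalue()
print(log_text)
```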

6. Testing:

Writing unit tests for your code can help you catch errors early and prevent bugs before they occur.
Tests provide controlled inputs and verify that your functions behave as expected, enabling you to
identify and fix issues systematically.
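
A minimal sketch using Python's built-in unittest module is given below. The function square and the test names are hypothetical; the tests are run through a test runner object so that the script continues after the tests finish.

```python
import unittest

def square(n):
    return n * n

class TestSquare(unittest.TestCase):
    def test_positive(self):
        self.assertEqual(square(4), 16)

    def test_negative(self):
        self.assertEqual(square(-3), 9)

# Collect the tests from the class and run them
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSquare)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print("All tests passed:", outcome.wasSuccessful())
```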

11. Flow of Control in Python:

Flow of Control:

Flow of control refers to the order in which the statements in a program are
executed. By default, Python executes statements sequentially from top to bottom. However, control
flow statements allow for altering this default order, enabling decision-making, repetition, and
handling of various program states.

The primary types of control flow statements are:

a) Selection / Conditional statements
b) Repetition / Looping / Iterative statements
c) Jumping statements

a) Selection statements:

Selection statements, also known as decision-making or branching statements, are used to
control the flow of a program based on certain conditions. These statements allow the program to
execute different blocks of code depending on whether a condition is true or false.

Python provides several types of selection statements:

1. Simple if

2. if…else

3. Nested if

4. if…elif

1. Simple if : The if statement is the simplest form of a selection statement. It executes a block of code
if a given condition is true.

Syntax:
if (condition):
    # True statements

Ex:
x = 10
if (x > 5):
    print("x is greater than 5")

Output:
x is greater than 5

2. if…else: The if-else statement extends the if statement by providing an alternative block of code to
execute if the condition is false, i.e., if the condition is true, the true block is executed; otherwise, the
false block is executed.

Syntax:
if (condition):
    # True statements
else:
    # False statements

Ex:
x = 2
if (x > 5):
    print("x is greater than 5")
else:
    print("x is not greater than 5")

Output:
x is not greater than 5

3. Nested if: Nested if statements involve placing an if statement inside another if statement,
allowing for hierarchical decision-making. The inner condition (condition2) is checked only if the
outer condition (condition1) is true.

Syntax:
if (condition1):
    if (condition2):
        # stmt-1
    else:
        # stmt-2
else:
    # stmt-3

Ex:
age = int(input("Enter your age:"))
if age >= 18:
    if age >= 60:
        print("Eligible to vote and senior citizen")
    else:
        print("Eligible to vote but not senior citizen")
else:
    print("Not eligible to vote")

Output:
Enter your age:21
Eligible to vote but not senior citizen

4. if…elif: The elif (short for "else if") statement allows checking multiple conditions sequentially.
The first condition that evaluates to true will have its corresponding block of code executed. If all the
conditions are false, the else block is executed.

Syntax:
if (condition1):
    # stmt-1
elif (condition2):
    # stmt-2
else:
    # stmt-3

Ex:
a = 10
b = 10
if (a > b):
    print("a is greater!")
elif (b > a):
    print("b is greater!")
else:
    print("a and b are equal!")

Output:
a and b are equal!

b) Repetition Statements:
Repetition statements, also known as loops or iterative statements, are used to execute a block
of code multiple times. This is particularly useful for automating tasks that need to be performed
repeatedly.
 Looping constructs provide the facility to execute a set of statements in a program repetitively,
based on a condition.
 The statements in a loop are executed again and again as long as the specified logical condition
remains true.
 This condition is checked based on the value of a variable called the loop's control variable.
 When the condition becomes false, the loop terminates.

There are 2 types of loops:

1. While loop statements
2. For loop statements

1. While loop statements: The while statement executes a block of code repeatedly as long as the
control condition of the loop is true.

 The control condition of the while loop is evaluated before any statement inside the loop is
executed.
 After each iteration, the control condition is tested again and the loop continues as long as the
condition remains true.
 When this condition becomes false, the statements in the body of the loop are not executed and
control is transferred to the statement immediately following the body of the while loop.
 If the condition of the while loop is initially false, the statements in the block are not executed
even once.
 The statements within the body of the while loop must ensure that the condition eventually
becomes false; otherwise the loop will become an infinite loop.

Syntax:
while (condition):
    # block of code
    # update the loop variable

Ex:
n = int(input("Enter n value:"))
i = 1
while (i <= n):
    print(i)
    i = i + 1

Output:
Enter n value:5
1
2
3
4
5

2. For loop statements:
The for statement is used to iterate over a range of values or a sequence. The body of the for loop
is executed once for each item in the range or sequence. These items can be numeric values, or they
can be elements of a data type like a string, list or tuple.
 With every iteration of the loop, the control variable takes the next value in the range or
sequence.
 When all the items are exhausted, the loop ends and control is transferred to the statement
immediately following the for loop.
 When a for loop iterates over a range or sequence, the number of iterations is known in advance.

Syntax:
for control_variable in sequence/items in range:
    # block of code

Ex 1:
n = int(input("Enter n value:"))
for i in range(n):
    print(i)

Output:
Enter n value:5
0
1
2
3
4

Ex 2:
n = int(input("Enter n value:"))
for i in range(1, n+1):
    print(i)

Output:
Enter n value:5
1
2
3
4
5

Ex 3:
n = int(input("Enter n value:"))
for i in range(1, n+1, 3):
    print(i)

Output:
Enter n value:10
1
4
7
10

range( ) function:
range( ) is a built-in function used to generate a sequence of integers from the given start value
up to, but not including, the stop value. In Python 3 it returns a range object rather than a list;
use list(range(...)) to obtain a list. The parameters are start, stop and step.

 The start and step parameters are optional.
 If the start value is not specified, the sequence starts from 0 by default.
 If step is not specified, the value increases by 1 in each iteration by default.
 All parameters of the range( ) function must be integers.
 The step parameter can be a positive or a negative integer, but not 0.

Syntax:
range(start, stop, step)

Ex:
for i in range(1, 10, 3):
    print(i)

Output:
1
4
7
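
The rules above can be checked quickly by converting a few ranges to lists. This is only an illustrative sketch; the variable names are arbitrary.

```python
default_start = list(range(5))          # start defaults to 0, step to 1
with_step = list(range(1, 10, 3))       # stop value 10 is excluded
counting_down = list(range(10, 0, -2))  # negative step counts downwards
empty = list(range(5, 1))               # start > stop with positive step: no values

print(default_start)   # [0, 1, 2, 3, 4]
print(with_step)       # [1, 4, 7]
print(counting_down)   # [10, 8, 6, 4, 2]
print(empty)           # []
```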

c) Jumping statements:
Jump statements in Python are used to alter the normal flow of program execution, primarily within
loops and functions.

Jumping statements are:

1. Break statement
2. Continue statement
3. Return statement

1. Break statements:
The break statement immediately terminates the current loop (either for or while) and transfers
control to the statement immediately following the loop. It is commonly used when a specific
condition is met within the loop, and further iterations are no longer needed.

Syntax:
for control_variable in sequence/items in range:
    # block of code
    if (condition):
        break

Ex:
for i in range(5):
    if i == 3:
        break
    print(i)

Output:
0
1
2

2. Continue statement:
The continue statement skips the rest of the code within the current iteration of a loop and
proceeds to the next iteration. It is useful when you want to skip certain parts of a loop's execution
based on a condition but still want the loop to continue.

Syntax:
for control_variable in sequence/items in range:
    # block of code
    if (condition):
        continue

Ex:
for i in range(5):
    if i == 2:
        continue
    print(i)

Output:
0
1
3
4

3. Return statement:
The return statement is used within a function to exit the function and optionally return a value
to the caller. When return is encountered, the function's execution stops, and the specified value
(or None if no value is specified) is sent back to the point where the function was called.

Ex:
def calculate_sum(a, b):
    return a + b

result = calculate_sum(5, 7)
print(result)

Output:
12
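
The sketch below illustrates the two remaining points from the description above: return exits the function immediately, even from inside a loop, and a function that finishes without reaching a return statement gives back None. The function name find_first_even is hypothetical.

```python
def find_first_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            return n        # exits the function (and the loop) immediately
    # no return statement is reached here, so the function returns None

first = find_first_even([3, 7, 8, 10])
missing = find_first_even([1, 3, 5])
print(first)     # 8
print(missing)   # None
```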

12. Write about Nested Loops.

Nested Loop:
A nested loop is a loop placed inside another loop. This structure allows you to iterate through
items in multiple sequences or repeat actions within the outer loop's iteration. You can nest
both for and while loops.

Working of Nested loop:


 Outer loop: This loop contains the inner loop(s) and controls the primary iterations.

 Inner loop: These loops are inside the outer loop and execute fully for each iteration of the
outer loop.

 Execution flow: For each iteration of the outer loop, the inner loop completes all its iterations
before the outer loop proceeds to its next iteration.

Uses:
 Working with multidimensional data, such as lists of lists or matrices.

 Generating patterns. For example, printing stars or numbers in a specific pattern, like a triangle
or a square.

 Creating pairwise combinations. For example, generating all possible pairs from two lists.
 Matrix operations, such as matrix addition, subtraction, multiplication and transposition.

Syntax:
for outer_variable in range(stop):
    for inner_variable in range(stop):
        # block of code to perform

Ex:
n = int(input("Enter n to generate patterns: "))
for i in range(1, n+1):
    for j in range(1, i+1):
        print(j, end=" ")
    print()

Output:
Enter n to generate patterns: 5
1
1 2
1 2 3
1 2 3 4
1 2 3 4 5
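
Two of the uses listed above, pairwise combinations and matrix transposition, can be sketched with nested loops as follows. The lists and the matrix are made-up sample data.

```python
# Pairwise combinations: every colour is paired with every size
colors = ["red", "blue"]
sizes = ["S", "M", "L"]
pairs = []
for color in colors:            # outer loop
    for size in sizes:          # inner loop runs fully for each colour
        pairs.append((color, size))
print(pairs)

# Matrix transposition: rows become columns
matrix = [[1, 2, 3],
          [4, 5, 6]]
transposed = []
for col in range(len(matrix[0])):
    new_row = []
    for row in range(len(matrix)):
        new_row.append(matrix[row][col])
    transposed.append(new_row)
print(transposed)   # [[1, 4], [2, 5], [3, 6]]
```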

13. Write about indentation.

Indentation:

 Indentation in Python refers to the leading whitespace (spaces or tabs) at the beginning of a
line of code.
 Unlike many other programming languages that use curly braces or keywords to define code
blocks, Python uses indentation to denote the structure and scope of code.
 This makes indentation a fundamental part of Python's syntax, and incorrect indentation will
lead to IndentationError exceptions.

Key aspects of Python indentation:

Defining Code Blocks:

Indentation is used to define blocks of code associated with control structures
like if statements, for and while loops, function definitions, class definitions, and more. All
statements within a single block must have the same level of indentation.

Consistency:

While the number of spaces or the use of tabs versus spaces for indentation is flexible,
consistency within a single code block is crucial. Mixing tabs and spaces for indentation within the
same file can lead to unexpected errors. The widely accepted convention is to use four spaces per
indentation level.

Hierarchy:

Indentation levels create a visual hierarchy, clearly showing which statements belong to which
block. A deeper indentation indicates a nested block of code.

Readability:

Beyond its syntactic role, indentation significantly enhances the readability of Python
code by visually representing its logical structure.

Ex 1: if else indentation

age = 21
if (age >= 18):
    print("Eligible to vote!")                   # indented: belongs to the 'if' block
    if (age >= 60):
        print("You are now a senior citizen!")   # indented further: belongs to the nested 'if'
else:
    print("Not eligible to vote!")               # indented: belongs to the 'else' block

Output:

Eligible to vote!

Ex 2: Loop indentation

The two lines of code in the body of the while loop are both indented four spaces. This indentation
is required to indicate which block of code a statement belongs to.

i = 1
while (i <= 5):
    print(i)
    i = i + 1

Output:
1
2
3
4
5
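
To see what happens when indentation is missing, the sketch below passes a small piece of badly indented source code (held in a string) to the built-in compile( ) function and catches the resulting IndentationError. The exact wording of the error message depends on the Python version, so it is not shown here.

```python
# The body of the if statement is not indented, which is invalid Python
source = "if True:\nprint('hello')\n"

try:
    compile(source, "<example>", "exec")
    error_name = None
except IndentationError as err:
    error_name = type(err).__name__
    print(error_name, "-", err.msg)
```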

14. Python String and its Functions

String:
Python string is a data type used to represent a sequence of characters. These characters can
include letters, numbers, symbols, and whitespace. Strings are immutable, meaning that once a string
is created, its content cannot be changed. This immutability feature ensures that strings are consistent
and secure throughout the program execution.
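
A short sketch of what immutability means in practice: assigning to a character position raises a TypeError, so a modified string has to be built as a new object. The variable names are chosen only for this example.

```python
word = "Hello"
try:
    word[0] = "J"            # item assignment is not allowed on a string
    changed_in_place = True
except TypeError:
    changed_in_place = False

new_word = "J" + word[1:]    # build a new string instead
print(changed_in_place)      # False
print(word, new_word)        # Hello Jello
```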

String Creation:
Strings are enclosed within either single quotes (' '), double quotes (" "), or triple quotes
(''' ''' or """ """).
Ex:
# Single quotes
single_quote_string = 'Hello, World!'

# Double quotes
double_quote_string = "Hello, World!"

# Using triple single quotes
triple_single_quote_string = '''This is a multi-line
string that spans
several lines.'''

# Using triple double quotes
triple_double_quote_string = """This is another way
to create a multi-line string."""

Traversing a String:

 Traversing a string in Python refers to the process of accessing each character of a string one
by one, typically for processing or manipulation.
 Since strings in Python are ordered sequences of characters, each character can be visited in turn
(and, for example, printed) using looping constructs such as for and while.
 Another method involves using the range( ) function to access characters by their index.
 Additionally, the enumerate( ) function can be used to get both the index and the character
during iteration.
 This pattern of processing each element in a sequence is known as traversal and is a
fundamental concept in programming.

Method 1: for loop

Ex:
str1 = "Hello"
for char in str1:
    print(char)
Output:
H
e
l
l
o

Method 2: for loop with range( )

Ex:
str1 = "Hello"
for i in range(len(str1)):
    print(str1[i])
Output:
H
e
l
l
o

Method 3: while loop

Ex:
str1 = "Hello"
i = 0
while (i < len(str1)):
    print(str1[i])
    i += 1
Output:
H
e
l
l
o

Method 4: enumerate( )
Ex:
str1 = "Hello"
for i, c in enumerate(str1):
    print(i, c)
Output:
0 H
1 e
2 l
3 l
4 o

String Operations:
String operations refer to the various actions that can be performed on strings, such as
comparison, concatenation, slicing, and membership testing.
1. Comparison Operations
2. Concatenation
3. Slicing
4. Membership Testing

1. Comparison Operations
You can use standard comparison operators (==, !=, <, >, <=, >=) to compare strings. When
comparing two strings, the characters are compared one by one, from left to right, based on their
Unicode code points (which match the ASCII values for English letters, digits and common symbols).
Ex:
str1 = "Hello, world!"
str2 = "I like to code."
str3 = "Hello, world!"
print(str1 == str2)   # False
print(str1 == str3)   # True
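
The ordering operators follow the same character-by-character rule. The sketch below, using made-up sample words, shows that uppercase letters sort before lowercase letters and that a prefix is smaller than the longer string; ord( ) returns the numeric code of a character.

```python
alphabetical = "apple" < "banana"   # 'a' comes before 'b'
case_matters = "Zebra" < "apple"    # 'Z' (90) is less than 'a' (97)
prefix_first = "cat" < "catalog"    # a prefix is smaller than the longer string

print(alphabetical)          # True
print(case_matters)          # True
print(prefix_first)          # True
print(ord("Z"), ord("a"))    # 90 97
```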

2. Concatenation:
Strings can be joined (concatenated) using the + operator.
Ex:
greet = "Hello, "
name = "Jack"
result = greet + name
print(result) # Output: Hello, Jack

3. Slicing:
String slicing allows you to extract a portion of a string by specifying the start and end indices.
The syntax for slicing is string[start:end], where start is the starting index and end is the stopping
index (excluded)

Ex:
s = "Hello, world!"
print(s[1:4]) # Output: ell
print(s[:3]) # Output: Hel
print(s[7:]) # Output: world!

4. Membership Testing:
The in keyword checks if a particular substring is present in a string.
Ex:
s = "Hello user, welcome!"
print("user" in s) # True
print("Welcome" in s) # False
print("hello" not in s) # True

Built-in methods / String Handling functions:

 1. len( )           8. find( )          15. strip( )
 2. lower( )         9. startswith( )    16. lstrip( )
 3. upper( )        10. endswith( )      17. rstrip( )
 4. capitalize( )   11. split( )         18. isalpha( )
 5. title( )        12. replace( )       19. isdigit( )
 6. swapcase( )     13. islower( )       20. isalnum( )
 7. count( )        14. isupper( )

1. len( ): This function is used to return the number of characters present in the string.
Syntax:
var_name = len(string)
Ex:
str1 = "Hello user!"
length = len(str1)
print("Length =", length)   # Length = 11

2. lower( ): This function is used to convert the string into lower case.
Syntax:
var_name = string.lower( )

Ex:
str1 = "Hello user!"
str2 = str1.lower()
print(str2)   # hello user!

3. upper( ): This function is used to convert the string into upper case.
Syntax:
var_name = string.upper( )

Ex:
str1 = "Hello user!"
str2 = str1.upper()
print(str2)   # HELLO USER!

4. capitalize( ): This function is used to capitalize the first character of the string.
Syntax:
var_name = string.capitalize( )
Ex:
str1 = "welcome user!"
print(str1.capitalize())   # Welcome user!

5. title( ): This function is used to convert the string into title case, i.e., the first letter of
each word is capitalized.
Syntax:
var_name = string.title( )

Ex:
str1 = "welcome user!"
print(str1.title())   # Welcome User!

6. swapcase( ): This function is used to swap the case of all characters, i.e., uppercase to
lowercase and vice versa.
Syntax:
var_name = string.swapcase( )

Ex:
str1 = "Welcome User!"
print(str1.swapcase())   # wELCOME uSER!

7. count( ): This function returns the number of times a specified substring appears within the
string. It is case sensitive.
Syntax:
var_name = string.count(substring)

Ex:
str1 = "hello world, hello python, hello user"
print(str1.count("hello"))   # 3
print(str1.count("o"))       # 5

8. find( ): This function returns the starting index of the first occurrence of the specified
substring within the string. If the substring is not found, it returns -1.

Syntax:
var_name = string.find(substring)

Ex:
str1 = "Welcome User!"
print(str1.find('c'))   # 3

9. startswith( ): This function returns True if the string starts with the specified prefix.
Syntax:
var_name = string.startswith(prefix)

Ex:
str1 = "Hello python coder!"
print(str1.startswith("Hello"))   # True
print(str1.startswith("hello"))   # False

10. endswith( ): This function returns True if the string ends with the specified suffix.
Syntax:
var_name = string.endswith(suffix)

Ex:
str1 = "Hello python coder!"
print(str1.endswith("coder"))    # False
print(str1.endswith("coder!"))   # True

11. split( ): This function is used to split the string into a list of substrings based on the
delimiter.
Syntax:
list_name = string.split(delimiter)

Ex:
str1 = "Hello python coder!"
parts = str1.split(" ")
print(parts)

Output:
['Hello', 'python', 'coder!']

12. replace( ): This function is used to replace occurrences of a specified substring with another
substring within a string. It returns a new string with the replacements applied; because strings
are immutable in Python, the original string remains unchanged. The optional count argument limits
the number of replacements.
Syntax:
var_name = string.replace(old, new, count)

Ex:
text = "Hello world, hello python, hello coder!"
str1 = text.replace("hello", "Hi")
print(str1)
str2 = text.replace("hello", "Hi", 1)
print(str2)
str3 = text.replace("Welcome", "Hi")
print(str3)

Output:
Hello world, Hi python, Hi coder!
Hello world, Hi python, hello coder!
Hello world, hello python, hello coder!

13. islower( ): This function returns True if all the alphabetic characters in the string are in
lower case, otherwise False. Only alphabetic characters are checked.

Syntax:
var_name = string.islower( )

Ex:
str1 = "hello user!"
str2 = str1.islower()
print(str2)   # True

14. isupper( ): This function returns True if all the alphabetic characters in the string are in
upper case, otherwise False. Only alphabetic characters are checked.

Syntax:
var_name = string.isupper( )

Ex:
str1 = "Hello User!"
str2 = str1.isupper()
print(str2)   # False

15. strip( ): This function is used to remove leading (beginning) and trailing (ending) whitespace
from the string. You can also specify which characters should be removed.

Syntax:
var_name = string.strip(characters)

Ex:
str1 = "   Hello user!   "
str2 = "..@Hello user!!"
print(str1.strip())        # Hello user!
print(str2.strip(".@!"))   # Hello user

16. lstrip( ): This function is used to remove leading (beginning) whitespace (or specified
characters).
Syntax:
var_name = string.lstrip(characters)
Ex:
text = "   Hello coder!   "
print(text.lstrip(" "))   # 'Hello coder!   ' (only the leading spaces are removed)

17. rstrip( ): This function is used to remove trailing (ending) whitespace (or specified
characters).
Syntax:
var_name = string.rstrip(characters)

Ex:
text = "   Hello coder!   "
print(text.rstrip(" "))   # '   Hello coder!' (only the trailing spaces are removed)

18. isalpha( ): This function returns True if the string contains only alphabetic characters.
Syntax:
var_name = string.isalpha( )

Ex:
str1 = "Hello"
str2 = "Hello123"
print(str1.isalpha())   # True
print(str2.isalpha())   # False

19. isdigit( ): This function returns True if the string contains only digit characters.
Syntax:
var_name = string.isdigit( )

Ex:
str1 = "Hello"
str2 = "123"
print(str1.isdigit())   # False
print(str2.isdigit())   # True

20. isalnum( ): This function returns True if the string contains only alphanumeric characters.
Syntax:
var_name = string.isalnum( )

Ex:
str1 = "Hello123"
str2 = "Hello 123"
print(str1.isalnum())   # True
print(str2.isalnum())   # False
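
Because every string method returns a new value, calls can be chained one after another. The sketch below combines several of the methods described above on a made-up piece of text.

```python
raw = "  Data science, DATA analysis, data cleaning!  "

cleaned = raw.strip().lower()        # remove outer spaces, then convert to lower case
words = cleaned.replace(",", "").replace("!", "").split()
data_count = cleaned.count("data")

print(cleaned)                       # data science, data analysis, data cleaning!
print(words)                         # ['data', 'science', 'data', 'analysis', 'data', 'cleaning']
print(data_count)                    # 3
print(cleaned.startswith("data"))    # True
```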

UNIT – III

Functions: Functions, Built-in functions, User defined functions, recursive functions, scope of a
variable, Python and OOP: Defining classes, Defining and calling functions, passing arguments,
Inheritance, polymorphism, Modules – date time, math, Packages. Exception Handling – Exception in
python, Types of Exception, User-defined Exceptions.

1. Functions:

A Python function is a block of statements that performs a specific task. The idea is to put a
commonly or repeatedly performed task into a function so that, instead of writing the same
code again and again for different inputs, we can call the function and reuse the code it contains
over and over again.

Benefits of Using Functions

 Code reuse
 Reduced code length
 Increased readability of code

Types of functions:
1. Built – in functions and
2. User – defined functions

1. Built – in functions:

 Built-in functions are pre-defined functions that come with the Python programming language
and can be used directly without importing any external modules.
 These functions provide basic functionality for various tasks, such as manipulating data types,
handling input/output, and performing mathematical operations.
 Built-in functions are readily available for use in any Python program without requiring
additional specifications or definitions.

For example, functions like print( ), len( ), type( ), sum( ), max( ), min( ), and abs( ) are
commonly used built-in functions.

Ex:

str1 = "Hello"
a = (1, 2, 3, 4, 5)
print(len(str1))   # 5
print(sum(a))      # 15
print(min(a))      # 1
print(max(a))      # 5

2. User – Defined functions:

 User-defined functions in Python are functions created by the user to perform specific tasks.
 These functions are defined using the def keyword and can be called multiple times throughout
a program.
 They help in decomposing a large program into smaller segments, making the program easier
to understand, maintain, and debug.
 A user-defined function can have parameters, which are placeholders for values that the
function will use.
 These parameters can have default values, making them optional.

Syntax: for function definition

def function_name(parameters):
    # statements
    # return expression

 def: We use the keyword def to declare a function.

 function_name: We give the function a name so we can call and reuse it whenever we want in
our program.

 parameters: These are input variables or values that are passed into the function. Parameters
are optional, so a function may have none. They are enclosed in parentheses and separated by
commas.

 Function body: This part contains the actual code that runs when we call the function. It’s
indented under the def line.

 return: We use return to send a value back to where the function was called, so we can use
the result.

Syntax: for function call

function_name(arguments) # without return values

var_name = function_name(arguments) #with return values

Ex 1: With return values

# function definition
def add_numbers(x, y):
    total = x + y
    return total

# function call
result = add_numbers(2, 3)
print("Addition =", result)
print("Sum =", add_numbers(5, 6))

Output:

Addition = 5
Sum = 11

Ex 2: Without return values

def add_numbers(x, y):
    total = x + y
    print("Addition =", total)

add_numbers(5, 6)

Output:
Addition = 11

Types of User – Defined functions:

1. Parameterized Functions
2. Function with default arguments
3. Keyword argument functions
4. Variable length argument functions
5. Function with return values and
6. Lambda functions

1. Parameterized Functions: These functions accept parameters (arguments) to process and return
results dynamically. Parameters allow for flexibility, enabling the function to handle different inputs
each time it is called.

Ex:
def fun(name):
    print("Name is", name)   # Name is Guido Van Rossum

fun("Guido Van Rossum")

2. Function with default arguments: A function can have default values assigned to its parameters.
If no argument is provided when calling the function, the parameter takes the default value.

Ex:
def fun(x, y=50):
    print("x:", x)   # x: 10
    print("y:", y)   # y: 50

fun(10)

3. Keyword argument functions: Function arguments can be passed using keywords to improve
code readability. This ensures the correct mapping of values to parameters, regardless of their order.

Ex:
def fun(id, name):
    print("ID:", id)
    print("Name:", name)

fun(id=101, name="Guido")
fun(name="Clara", id=102)

Output:
ID: 101
Name: Guido
ID: 102
Name: Clara

4. Variable length argument functions: When the number of arguments is unknown, a function can
accept multiple arguments using *args (for non-keyword arguments) or **kwargs (for keyword
arguments).

Ex:
def fun(*args):
    for arg in args:
        print(arg)

fun("Python", "Java", "C++")

Output:
Python
Java
C++
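
The example above uses only *args. A minimal sketch of a function that also accepts **kwargs is given below; the function name describe and the keyword names level and year are hypothetical. Inside the function, args is a tuple of the positional arguments and kwargs is a dictionary of the keyword arguments.

```python
def describe(*args, **kwargs):
    print("positional:", args)      # args is a tuple
    print("keyword:", kwargs)       # kwargs is a dictionary
    return len(args), sorted(kwargs)

counts = describe("Python", "Java", level="beginner", year=3)
print(counts)   # (2, ['level', 'year'])
```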

5. Functions with return value: A function can return a value using the return statement. This allows
the function to send back a result for further computation.

Ex:
def fun(n):
    return n * n

res = fun(5)
print("Square =", res)

Output:
Square = 25

6. Lambda Functions:

Lambda functions are anonymous functions, i.e., functions without a name, defined in a single
line using the lambda keyword. They are used for short, simple operations where
defining a full function is unnecessary.
Just as the def keyword is used to define a normal function in Python, the lambda keyword is used
to define an anonymous function.

Ex 1:
res = lambda x: x * x
print("Square =", res(5))

Output:
Square = 25

Ex 2:
n = lambda x: "Even" if (x % 2 == 0) else "Odd"
print(n(4))   # Even
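
A common place where a full function is unnecessary is the key argument of the built-in sorted( ) function. The sketch below sorts made-up (name, marks) pairs by marks, using a lambda to pick the second item of each pair.

```python
students = [("Asha", 82), ("Ravi", 91), ("Meena", 76)]

# key tells sorted() which value to compare; here, the marks at index 1
by_marks = sorted(students, key=lambda s: s[1], reverse=True)
print(by_marks)   # [('Ravi', 91), ('Asha', 82), ('Meena', 76)]
```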

2. Recursive Function:

A recursive function is a function that calls itself during its execution. This technique is employed to
solve problems that can be broken down into smaller, self-similar subproblems.

Key components of a recursive function:

Base Case:
This is a condition that stops the recursion. When the base case is met, the function returns a
value without making further recursive calls, preventing infinite recursion.
Recursive Case:
In this part, the function calls itself with a modified input that moves closer to the base case.

Syntax:

def recursive_function(parameters):
    if (base_case_condition):
        return base_result
    else:
        return recursive_function(modified_parameters)

Example: Factorial Calculation

The factorial of a non-negative integer n (denoted as n!) is the product of all positive integers
less than or equal to n. For example, 5! = 5 * 4 * 3 * 2 * 1 = 120.

n = int(input("Enter n : "))

# Calculates the factorial of a non-negative integer using recursion.
def factorial(x):
    if (x == 0):   # Base case: factorial of 0 is 1
        return 1
    else:          # Recursive case: x * factorial(x - 1)
        return x * factorial(x - 1)

# Function call
result = factorial(n)
print(f"Factorial of {n} is", result)

Output:
Enter n : 5
Factorial of 5 is 120

Explanation :

Base Case: If x is 0, the function immediately returns 1. This is the stopping condition.
Recursive Case: If x is not 0, the function returns x multiplied by the result of calling factorial with
x-1. This process continues until x becomes 0, at which point the base case is hit and the results are
returned up the call stack.

Advantages:
 Can lead to cleaner and more readable code for problems with inherent recursive structures
(e.g., tree traversals, Fibonacci sequence).
 Simplifies the solution of complex problems by breaking them into smaller, manageable parts.

Disadvantages:
 Can be less efficient than iterative solutions due to the overhead of function calls and increased
memory usage (call stack).
 Risk of exceeding Python's recursion depth limit with large inputs; Python then raises a
RecursionError to protect against a stack overflow.
 Debugging can be more challenging compared to iterative solutions.
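
The sketch below contrasts the two approaches. The function names are chosen for this example only: the iterative version uses a loop and is not affected by the recursion depth limit, while the recursive version raises a RecursionError when asked to go deeper than the limit reported by sys.getrecursionlimit( ).

```python
import sys

def factorial_iterative(x):
    result = 1
    for k in range(2, x + 1):
        result *= k
    return result

def factorial_recursive(x):
    if x == 0:
        return 1
    return x * factorial_recursive(x - 1)

print(factorial_iterative(5))   # 120

# Ask for more nested calls than Python allows
too_deep = sys.getrecursionlimit() + 100
try:
    factorial_recursive(too_deep)
    hit_limit = False
except RecursionError:
    hit_limit = True
print(hit_limit)                # True
```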

3. Scope of Variable:

In Python, the scope of a variable determines where it is accessible and visible within a
program.

There are four main types of scope: local, enclosing, global, and built-in, which follow the LEGB rule.

1. Local Variable:
A local scope refers to variables defined within a function or a lambda expression. These
variables are only accessible within the function and are destroyed once the function execution is
completed.

Ex:
def number( ):
    a = 10   # Local Variable
    print(a)

number( )

For example, if a variable is defined inside a function, it cannot be accessed outside of that function.

2. Enclosing Variable:
An enclosing scope exists in nested functions. It contains variables defined in the outer
function, which are accessible to both the outer and inner functions.

Ex:

def out_fun( ):
    a = 10   # Enclosing Variable
    def in_fun( ):
        print(a)
    in_fun( )

out_fun( )

In this example, a is an enclosing variable because it is defined in the outer function
out_fun( ) and is accessible to the inner function in_fun( ).
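
Reading an enclosing variable works as shown above, but assigning to it from the inner function needs the nonlocal keyword, which plays the same role for enclosing variables that global plays for global variables. The following is a minimal sketch with hypothetical function names.

```python
def counter():
    count = 0                 # variable in the enclosing scope
    def increment():
        nonlocal count        # rebind the enclosing variable, not a new local one
        count += 1
    increment()
    increment()
    return count

total = counter()
print(total)   # 2
```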

3. Global Variable:
Global scope refers to variables defined at the top level of a script or outside of any function.
These variables can be accessed from any part of the program, including inside functions.
However, modifying a global variable inside a function requires the use of the global keyword to
explicitly declare it.

Ex 1:
x = 100   # Global Variable
def func():
    y = 200
    print(x, y)

func()
print(x)

Output:
100 200
100

If you only read the value of a global variable inside a function, it is treated as a global variable
without needing the global keyword.

Ex 2:
x = 100
def func():
    global x
    x = 500   # changing the global variable's value
    y = 200
    print(x, y)

func()
print(x)

Output:
500 200
500
We use the global keyword inside the function in order to change the value of the global variable.

4. Built-in variables:

 Built-in names are automatically available in every module without needing to import
anything.
 They are part of the Python interpreter's default environment and include functions,
types, and other objects that are always accessible.
 For example, print(), type(), and id() are built-in functions.
 Special names such as __name__ and __file__ are set automatically by the interpreter for each
module and provide information about the current module or script.
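The LEGB lookup order described above can be sketched in one short program. This is only an illustration: the function names outer, inner, and counter are made up, and the nonlocal keyword (which rebinds an enclosing variable in the same way global rebinds a global one) is not covered in the notes above.

```python
import builtins

x = "global x"                # global scope

def outer():
    x = "enclosing x"         # enclosing scope for inner()
    def inner():
        x = "local x"         # local scope, found first
        return x
    return inner(), x

def counter():
    count = 0                 # enclosing variable
    def increment():
        nonlocal count        # rebind the enclosing variable instead of creating a local
        count += 1
        return count
    return increment

local_value, enclosing_value = outer()
print(local_value)            # local x
print(enclosing_value)        # enclosing x
print(x)                      # global x
print(builtins.len is len)    # True: len is found in the built-in scope

step = counter()
print(step(), step())         # 1 2
```

Each assignment to x creates a separate variable in its own scope, which is why the global x is unchanged at the end.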

4. Python and OOP:

Introduction to OOPS:

Object-Oriented Programming (OOP) is a programming paradigm that structures code around
objects, which represent real-world entities and their interactions. Python, being an object-oriented
language, leverages OOP concepts to enhance code organization, reusability, and maintainability.

The 4 core principles of OOP are:

1. Encapsulation:
 Encapsulation involves bundling data (attributes) and methods (functions) within a class,
creating a self-contained unit.
 It protects data integrity by restricting direct access to an object's internal state, allowing
modifications only through defined methods.
 In Python, encapsulation is achieved through naming conventions using underscores. A single
leading underscore (_) marks an attribute as protected by convention (intended for use within the
class and its subclasses), while a double leading underscore (__) triggers name mangling, which
makes the attribute hard to reach from outside the class. Python does not strictly enforce either.

2. Inheritance:
 Inheritance allows creating new classes (child classes) based on existing ones (parent classes),
inheriting their attributes and methods.
 This promotes code reuse and a hierarchical structure among classes. Child classes can also
extend or override inherited functionalities to suit their specific requirements.
 Python supports various types of inheritance, including single, multiple, multilevel,
hierarchical, and hybrid inheritance.

3. Polymorphism:
 Polymorphism, meaning "many forms," allows objects of different classes to be treated as
instances of a common base class, as long as they share a common interface or behavior.
 This enables the use of a single method name to perform different actions depending on the
object type, increasing code flexibility and adaptability.
 Polymorphism in Python is typically achieved through method overriding, where subclasses
provide their specific implementation of a method defined in the parent class.

4. Abstraction:
 Abstraction focuses on simplifying complex systems by modeling only their essential features,
hiding unnecessary implementation details.
 It allows developers to concentrate on what an object does rather than how it does it, reducing
complexity and enforcing consistency across the codebase.
 In Python, abstraction is achieved using abstract classes and methods, often utilizing
the abc module (Abstract Base Classes).
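As a rough illustration of abstraction and encapsulation together, the sketch below defines an abstract base class with the abc module and a subclass that uses the underscore conventions. The class names Shape and Rectangle are hypothetical and chosen only for this example.

```python
from abc import ABC, abstractmethod

class Shape(ABC):
    """Abstract base class: states what a shape does, not how."""

    @abstractmethod
    def area(self):
        ...

class Rectangle(Shape):
    def __init__(self, width, height):
        self._width = width        # "protected" by convention only
        self.__height = height     # name-mangled to _Rectangle__height

    def area(self):
        return self._width * self.__height

rect = Rectangle(3, 4)
print(rect.area())                 # 12
print(rect._Rectangle__height)     # 4 (mangled name, shown only for illustration)

try:
    Shape()                        # abstract classes cannot be instantiated
except TypeError as err:
    print("TypeError:", err)
```

The mangled name is still reachable, which shows that Python's "private" attributes are a convention backed by renaming rather than a strict access rule.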

Classes and objects:
 A class acts as a blueprint or template for creating objects. It defines the structure and behavior
(attributes and methods) that objects of that class will possess.
 An object is an instance of a class, representing a concrete realization of the blueprint with its
unique data and state.
 Objects are created through instantiation, which involves calling the class name as if it were a
function, thereby invoking the class's constructor (the __init__ method) to initialize its attributes.

a) Defining Classes:
In Python, a class is like a blueprint or a template for creating objects. Think of a class as a
blueprint for a house – it outlines the structure and features, but it's not a house you can live in yet.
An object is a specific instance of that class, like a particular house built from that blueprint,
with unique details (like its color, number of floors, etc.).

Classes are used to define the characteristics (attributes) and actions (methods) that objects of that
class will possess.

For example, a "Car" class might define attributes like color, model, and year, and methods
like start(), accelerate(), and brake().

Syntax:

To define a class in Python, use the class keyword followed by the class name and a colon.

class MyClass:
    # Class attributes and methods go here
    pass

 class: This is the keyword to start a class definition.


 MyClass: This is the name of the class. It's recommended to use PascalCase for class names
(capitalizing the first letter of each word).
 : A colon marks the end of the class name and the start of the class body.
 pass: This is a placeholder statement that does nothing. You can use it when you haven't yet
defined any attributes or methods within your class body.

Attributes:

Attributes are variables associated with a class or its objects.

There are 2 types of attributes:

 Class Attributes: These are variables defined directly inside the class body, outside of any
methods. They are shared by all instances of the class. For example, in a Dog class,
a species attribute set to "Canis familiaris" would be a class attribute, meaning all dogs of that
class are the same species.

 Instance Attributes: These are variables defined inside the __init__() method (the constructor)
and belong to a specific instance of the class. They are unique to each object and represent its
particular state. For example, name and age attributes within a Dog class would be instance
attributes because each individual dog has its own name and age.
Ex:
class Dog:
    species = "Canis familiaris"  # Class attribute

    def __init__(self, name, age):
        self.name = name  # Instance attribute
        self.age = age    # Instance attribute

self Parameter: The first parameter of any instance method is conventionally named self. It refers to
the specific instance of the object on which the method is called. Python automatically passes the
instance to self when the method is invoked.
__init__( ) Method: This is a special method called the constructor, executed automatically
whenever a new object (instance) of the class is created. It's used to initialize the object's instance
attributes.
Instance Methods: These methods operate on the specific data of an object, often using or modifying
the instance attributes.
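The difference between class attributes and instance attributes can be seen by changing each one. The sketch below reuses the Dog class from these notes; the replacement species strings are arbitrary values used only for illustration.

```python
class Dog:
    species = "Canis familiaris"   # class attribute, shared by all instances

    def __init__(self, name, age):
        self.name = name           # instance attribute
        self.age = age             # instance attribute

buddy = Dog("Buddy", 3)
lucy = Dog("Lucy", 5)

print(buddy.species, "|", lucy.species)   # both read the class attribute
print(buddy.name, lucy.name)              # Buddy Lucy

# Rebinding on the class is visible through every instance
Dog.species = "Canis lupus familiaris"
print(lucy.species)                       # Canis lupus familiaris

# Assigning through an instance creates a new instance attribute
# that shadows the class attribute for that object only
buddy.species = "Unknown"
print(buddy.species, "|", lucy.species)   # Unknown | Canis lupus familiaris
```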

Creating objects (instantiation):
Once you've defined a class, you can create objects (instances) from it by calling the class
name like a function, passing any arguments required by the __init__() method.

dog1 = Dog("Buddy", 1)  # Creating an object of the Dog class
dog2 = Dog("Lucy", 5)   # Creating another object of the Dog class

Accessing attributes and methods:


You can access an object's attributes and methods using dot notation (.):
object_name.attribute_name
object_name.method_name( )

print("Dog 1 ")
print("Name: ", dog1.name)       # Accessing an instance attribute
print("Age : ", dog1.age)        # Accessing another instance attribute
dog1.bark()                      # Calling an instance method
print(dog1.is_puppy(dog1.age))   # Calling a static method

Output:
Dog 1
Name: Buddy
Age : 1
Buddy says woof!
True
By defining classes, you can structure your code in an organized and reusable way, modeling real-
world entities and their interactions.

b) Calling Methods in OOPs:

Methods:
In Python, there are primarily three ways to call functions associated with a class or an object:

1. Instance method
2. Class method and
3. Static method

Instance Method:
 These are the most common methods used in Python classes.
 They operate on individual objects of the class and can access and modify the object's state.
 Instance methods require the instance itself to be passed as the first argument, conventionally
named self.
 To call an instance method, you first create an instance of the class, then use the instance name
followed by a dot (.) and the method name, passing any required arguments within the
parentheses.

Ex:
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):
        print(f"{self.name} says bark bark!")

# Create an instance of the Dog class
my_dog = Dog("Buddy", 3)

# Call the instance method
my_dog.bark()

Output:
Buddy says bark bark!

Class Methods:
 Class methods operate on the class itself rather than a specific instance.
 They are defined using the @classmethod decorator and take the class as their first argument,
conventionally named cls.
 Class methods can access and modify class-level data or perform operations related to the class
as a whole.
 To call a class method, you can use the class name followed by a dot (.) and the method name,
or call it from an instance of the class.
Ex:
class Dog:
    species = "Canis familiaris"  # Class variable

    def __init__(self, name, age):
        self.name = name
        self.age = age

    @classmethod
    def get_species(cls):
        return cls.species

# Call the class method using the class name
print(Dog.get_species())  # Output: Canis familiaris

# Call the class method using an instance (also works)
my_dog = Dog("Buddy", 3)
print(my_dog.get_species())  # Output: Canis familiaris

Static Methods:
 Static methods are utility functions that belong to the class but do not operate on the instance or
the class itself.
 They are defined using the @staticmethod decorator and do not require self or cls as their first
argument.
 To call a static method, you can use the class name followed by a dot (.) and the method name,
similar to calling a regular function.

Ex:
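class MathUtils:
    @staticmethod
    def add(x, y):
        return x + y

# Call the static method using the class name
print(MathUtils.add(5, 3))  # Output: 8

All three kinds of methods can appear in the same class:

class Dog:
    species = "Canis familiaris"  # Class attribute

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):  # Instance method
        print(f"{self.name} says woof!")

    @classmethod
    def get_species(cls):  # Class method
        return cls.species

    @staticmethod
    def is_puppy(age):  # Static method
        return age < 2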

c) Passing Parameters in OOPs:

In Python's Object-Oriented Programming (OOP), parameters are passed to methods and the
constructor (__init__) of a class in a similar way to how they are passed to regular functions.
There are 2 types:

1. Passing Parameters to the Constructor and
2. Passing Parameters to Instance Methods

1. Passing Parameters to the Constructor (__init__):

The __init__ method is a special method called when a new instance (object) of a class is
created. It's used to initialize the object's attributes.
Ex:
class Dog:
    def __init__(self, name, age):  # 'self' is the instance, 'name' and 'age' are parameters
        self.name = name
        self.age = age

# Creating an instance and passing arguments to the __init__ method
my_dog = Dog("Buddy", 3)
print(my_dog.name)  # Output: Buddy
print(my_dog.age)   # Output: 3

In this example, "Buddy" and 3 are passed as arguments to the name and age parameters of
the __init__ method, respectively.

2. Passing Parameters to Instance Methods:

Instance methods operate on specific instances of a class and often require data to perform their
actions.

Ex:
class Calculator:
    def add(self, num1, num2):  # 'self' is the instance, 'num1' and 'num2' are parameters
        return num1 + num2

# Creating an instance
calc = Calculator()

# Calling the 'add' method and passing arguments
result = calc.add(10, 5)
print(result)  # Output: 15
Here, 10 and 5 are passed as arguments to the num1 and num2 parameters of the add method.

self parameter:
The first parameter in any instance method (including __init__) is conventionally
named self. It refers to the instance of the class itself and is automatically passed by Python when you
call a method on an object. You do not explicitly pass an argument for self.
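To see what "automatically passed" means, the same method can be called through the class with the instance supplied by hand. This is a small sketch using the Calculator class from the example above; the variable names implicit and explicit are only for illustration.

```python
class Calculator:
    def add(self, num1, num2):
        return num1 + num2

calc = Calculator()

# Usual form: Python supplies calc as self
implicit = calc.add(10, 5)

# Equivalent form: call through the class and pass the instance explicitly
explicit = Calculator.add(calc, 10, 5)

print(implicit, explicit)   # 15 15
```

Both calls run the same function; the first form is simply shorthand for the second.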

Arguments vs. Parameters:


In the function/method definition, the placeholders for values are called parameters. When you
call the function/method, the actual values you provide are called arguments.

Positional and Keyword Arguments:


You can pass arguments positionally (based on the order of parameters) or using keywords
(explicitly naming the parameter).

Ex:
# Positional arguments
my_dog = Dog("Buddy", 3)

# Keyword arguments
my_dog = Dog(age=3, name="Buddy")

5. Explain about Inheritance in python.

Inheritance:
Inheritance is a fundamental concept in object-oriented programming (OOP) that allows a class
(called a child or derived class) to inherit attributes and methods from another class (called a parent or
base class).

Syntax:
class ParentClass:
    # Parent class code here
    pass

class ChildClass(ParentClass):
    # Child class code here
    pass

Need of Inheritance:
 Promotes code reusability by sharing attributes and methods across classes.
 Models real-world hierarchies like Animal → Dog or Person → Employee.
 Simplifies maintenance through centralized updates in parent classes.
 Enables method overriding for customized subclass behavior.
 Supports scalable, extensible design using polymorphism.
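The points on code reuse and method overriding can be sketched with the Person → Employee hierarchy mentioned above. The attribute names and the company value "Acme" are made up for illustration, and super() (which calls the parent class's version of a method) is an addition not covered in the notes above.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is a person"

class Employee(Person):
    def __init__(self, name, company):
        super().__init__(name)      # reuse the parent's initialization
        self.company = company

    def describe(self):             # override the inherited method
        base = super().describe()   # extend the parent's behavior instead of replacing it
        return f"{base} who works at {self.company}"

emp = Employee("Clara", "Acme")
print(emp.describe())               # Clara is a person who works at Acme
print(isinstance(emp, Person))      # True
```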

Types of Inheritance:

1. Single Inheritance
2. Multiple Inheritance
3. Multilevel Inheritance
4. Hierarchical Inheritance
5. Hybrid Inheritance

1. Single Inheritance:

Single inheritance is a concept where a derived class inherits properties and methods from a
single parent class. This allows for code reusability and the ability to add new features to existing
code.

Syntax:
class ParentClass:
    # Parent class code here
    pass

class ChildClass(ParentClass):
    # Child class code here
    pass
Ex:
class One:
    id = 101

class Two(One):
    def __init__(self, name):
        self.name = name

    def display(self):
        print("ID : ", self.id)
        print("Name : ", self.name)

two = Two("Clara")
two.display()

Output:
ID : 101
Name : Clara

2. Multiple Inheritance:
Multiple inheritance allows a class to inherit attributes and methods from more than one parent
class. This feature enables the creation of a class that combines behaviors or attributes from multiple
other classes, which can be quite powerful. However, with power comes complexity, and multiple
inheritance should be used judiciously.

Syntax:
class ParentClass1:
    # Parent class1 code here
    pass

class ParentClass2:
    # Parent class2 code here
    pass

class ChildClass(ParentClass1, ParentClass2):
    # Child class code here
    pass


Ex:
class One:
    id = 101

class Two:
    name = "Clara"

class Three(One, Two):
    def __init__(self, fees):
        self.fees = fees

    def display(self):
        print("ID : ", self.id)
        print("Name : ", self.name)
        print("Fees : ", self.fees)

three = Three(25000.00)
three.display()

Output:
ID : 101
Name : Clara
Fees : 25000.0
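One source of the complexity mentioned above is deciding which parent wins when both define the same method. Python resolves this with the method resolution order (MRO), which follows the order in which the parents are listed. The sketch below is illustrative only; the method who() and the class Four are hypothetical names.

```python
class One:
    def who(self):
        return "One"

class Two:
    def who(self):
        return "Two"

class Three(One, Two):   # One is listed first, so it is searched first
    pass

class Four(Two, One):    # same parents, reversed order
    pass

print(Three().who())     # One
print(Four().who())      # Two
print([cls.__name__ for cls in Three.__mro__])   # ['Three', 'One', 'Two', 'object']
```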

3. Multilevel Inheritance:
Multilevel inheritance is a type of inheritance where a class inherits from another class, which
itself inherits from another class. This creates a chain similar to a family tree, allowing a class to
inherit properties and methods from its parent and from every ancestor above it.

For example, if we have three classes A, B, and C where A is the superclass (Parent), B is the subclass
(Child), and C is the subclass of B (Grandchild), then C inherits the features of both A and B.

Syntax:

class ParentClass1:
    # Parent class1 code here
    pass

class ChildClass(ParentClass1):
    # Child class code here
    pass

class GrandChildClass(ChildClass):
    # Grand Child class code here
    pass


Ex:
class One:
    id = 101

class Two(One):
    name = "Clara"

class Three(Two):
    def __init__(self, fees):
        self.fees = fees

    def display(self):
        print("ID : ", self.id)
        print("Name : ", self.name)
        print("Fees : ", self.fees)

three = Three(25000.00)
three.display()

Output:
ID : 101
Name : Clara
Fees : 25000.0

4. Hierarchical Inheritance:
Hierarchical inheritance is a type of inheritance where multiple child classes are derived from a
single parent class. This allows for a structured relationship among classes, where each child class can
inherit and extend the functionalities of the parent class.

Syntax:
class ParentClass:
    # Parent class code here
    pass

class ChildClass1(ParentClass):
    # Child class1 code here
    pass

class ChildClass2(ParentClass):
    # Child class2 code here
    pass


Ex:
class One:
    cname = "[Link]"

class Two(One):
    def __init__(self, name):
        self.name = name

    def display(self):
        print("Student Name : ", self.name)
        print("College Name : ", self.cname)

class Three(One):
    def __init__(self, name):
        self.name = name

    def display(self):
        print("Student Name : ", self.name)
        print("College Name : ", self.cname)

two = Two("Clara")
two.display()
three = Three("John")
three.display()

Output:
Student Name : Clara
College Name : [Link]
Student Name : John
College Name : [Link]

5. Hybrid Inheritance:
Hybrid inheritance is a combination of more than one type of inheritance, such as single,
multiple, multilevel, or hierarchical inheritance, within a single program. This allows for more
complex relationships between classes and enhances the flexibility of the code structure.

Syntax:
class ParentClass1:
    # Parent class1 code here
    pass

class ChildClass1(ParentClass1):
    # Child class1 code here
    pass

class ParentClass2:
    # Parent class2 code here
    pass

class ChildClass2(ChildClass1, ParentClass2):
    # Child class2 code here
    pass


Ex:
class One:
    id = 101

class Two(One):
    name = "Clara"

class Three:
    department = "[Link]"

class Four(Two, Three):
    def __init__(self, fees):
        self.fees = fees

    def display(self):
        print("ID : ", self.id)
        print("Name : ", self.name)
        print("Department : ", self.department)
        print("Fees : ", self.fees)

four = Four(25000.00)
four.display()

Output:
ID : 101
Name : Clara
Department : [Link]
Fees : 25000.0

6. Explain about Polymorphism.

Polymorphism: Polymorphism means "many forms". It refers to the ability of an entity (like a
function or object) to perform different actions based on the context.

Technically, in Python, polymorphism allows the same method, function, or operator to behave
differently depending on the object it is working with. This makes code more flexible and reusable.

Needs:
 Ensures consistent interfaces across different classes.
 Allows objects to respond differently to the same method call.
 Promotes loose coupling by relying on shared behavior, not specific types.
 Enables writing flexible, reusable code that works across types.
 Simplifies testing and future extension of code.

Example of Polymorphism:
A remote control can operate multiple devices such as a TV, an AC, or a music system. You press
the power button and each device responds differently: the TV turns on, the AC starts cooling, and
the music system plays music.
Polymorphism here means the same interface (power button) but different behavior depending on the
device (object).
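The remote-control analogy translates almost directly into code. In the sketch below the class names and the press_power function are hypothetical; the classes share no parent class, so the example relies on each object simply having a power() method (often called duck typing).

```python
class TV:
    def power(self):
        return "TV turns on"

class AC:
    def power(self):
        return "AC starts cooling"

class MusicSystem:
    def power(self):
        return "Music system plays music"

def press_power(device):
    # Works with any object that provides a power() method
    return device.power()

responses = [press_power(device) for device in (TV(), AC(), MusicSystem())]
for line in responses:
    print(line)
```

The caller never checks the device type; the same call produces different behavior depending on the object it receives.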

Types of Polymorphism:
There are 2 types of polymorphism:

1. Compile – Time Polymorphism and
2. Run – Time Polymorphism

1. Compile – Time Polymorphism: Compile-time polymorphism means deciding which method or
operation to run during compilation, usually through method or operator overloading.

Languages like Java or C++ support this, but Python does not: because it is dynamically typed, it
resolves method calls at runtime, not during compilation. So true method overloading isn't supported
in Python, though similar behavior can be achieved using default or variable arguments.

If you define multiple methods with the same name, only the latest definition will be used.

Ex:

class Product:
    def prod(self, a, b):
        p = a * b
        print(p)

    def prod(self, a, b, c):
        p = a * b * c
        print(p)

p1 = Product()
p1.prod(1, 2, 3)

Output:
6

Explanation:

 Python only recognizes the latest definition of prod( ).

 The earlier definition prod(a, b) gets overwritten.

 If you call prod(4, 5), it will raise a TypeError because the latest version expects 3 arguments.

Default arguments can be used to work around this limitation:

Ex:
class Product:
    def prod(self, a=None, b=None, c=None):
        if a is not None and b is not None and c is None:
            print(a * b)
        else:
            print(a * b * c)

p1 = Product()
p1.prod(1, 2)
p1.prod(1, 2, 3)

Output:
2
6
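Variable arguments, mentioned earlier as the other way to imitate overloading, avoid the chain of None checks. The following is a minimal sketch, not the approach used in the example above: the parameter name numbers is arbitrary, and this version returns the result instead of printing it.

```python
class Product:
    def prod(self, *numbers):
        # *numbers collects any number of positional arguments into a tuple
        if not numbers:
            raise ValueError("prod() needs at least one number")
        result = 1
        for value in numbers:
            result *= value
        return result

p1 = Product()
print(p1.prod(1, 2))         # 2
print(p1.prod(1, 2, 3))      # 6
print(p1.prod(2, 3, 4, 5))   # 120
```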

2. Run – Time Polymorphism:

Run-time polymorphism is typically achieved through method overriding, where a subclass
provides a specific implementation of a method that is already defined in its superclass. The method
that gets called is determined at runtime based on the object's actual type.

Ex:
class Animal:
    def sound(self):
        return "Some generic sound"

class Dog(Animal):
    def sound(self):
        return "Bark"

class Cat(Animal):
    def sound(self):
        return "Meow"

# Polymorphic behavior
animals = [Dog(), Cat(), Animal()]
for animal in animals:
    print(animal.sound())

Output:
Bark
Meow
Some generic sound

Explanation: Here, the sound method behaves differently depending on whether the object is a Dog,
Cat, or Animal, and this decision happens at runtime. This dynamic nature makes Python particularly
well suited to runtime polymorphism.

7. Explain about modules in python.

Modules:
A Python module is a file containing Python definitions and statements. A module can define
functions, classes, and variables. A module can also include runnable code.
Grouping related code into a module makes the code easier to understand and use. It also makes the
code logically organized.

2 Types of Modules:
1. Built – in modules and
2. User – defined modules

1. Built – in Modules:

 Python built-in modules are pre-written libraries of code that come bundled with the Python
programming language.
 These modules make developers' work easier and save the time otherwise spent writing code for
common tasks from scratch.

Some Built-in modules: sys, os, math, datetime, random, collections, json, re, socket, and threading.
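As a small illustration of how several of these modules work together, the sketch below counts words in a sentence using re, collections, and json. The sample text and variable names are invented for the example.

```python
import json
import re
from collections import Counter

text = "data science needs data, and data needs cleaning"

# re: extract the lowercase words from the text
words = re.findall(r"[a-z]+", text)

# collections: count how often each word occurs
counts = Counter(words)

# json: serialize the three most common words to a JSON string
top_three = json.dumps(dict(counts.most_common(3)))

print(counts["data"])   # 3
print(top_three)
```

None of these modules needs to be installed separately; an import statement is enough.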

Advantages of built-in modules:

1. Reduced Development Time: Built-in modules cover many common tasks, so developers can use them
without installing external modules or writing lengthy code for those tasks, which saves time.

2. Optimized Performance: Some Python built-in modules are optimized for performance, using low-
level code or native bindings to execute tasks efficiently.

3. Reliability: These modules have been rigorously tested and have been in use by the Python
community for years. As a result, they are typically stable and have fewer bugs compared to new or
less well-known third-party libraries.

4. Consistency: Python built-in modules provide a consistent way to solve problems. Developers
familiar with these modules can easily understand and collaborate on codebases across different
projects.

5. Standardization: As these modules are part of the Python Standard Library, they establish a standard
way of performing tasks.

6. Documentation: Python's official documentation is comprehensive and includes detailed
explanations and examples for Python built-in modules. This makes it easier to learn and utilize them.

7. Maintainability: Python built-in modules are maintained by the core Python team and community, so
regular updates, bug fixes, and improvements can be expected, ensuring long-term viability.

8. Reduced Risk: Third-party libraries can introduce risks such as discontinued support or security
vulnerabilities; relying on built-in modules reduces this exposure.

a) Date and Time Modules:

The datetime module in Python is a built-in library for working with dates and times. It
provides classes and functions for creating, manipulating, formatting, and parsing dates, times, and
time intervals.

To import the datetime module in Python,

Syntax:
import datetime

This statement imports the entire datetime module, allowing access to its classes and functions by
prefixing them with datetime.

Ex: To get the current date and time

import datetime
current_datetime = datetime.datetime.now()
print("Current Date and Time: ", current_datetime)

Output:
Current Date and Time: 2025-08-13 [Link].334014

Alternatively, specific classes from the datetime module can be imported directly, which can make the
code more concise:

from datetime import datetime, date, time, timedelta


Key Classes within the datetime module:
 date: Represents a date (year, month, day).
 time: Represents a time (hour, minute, second, microsecond, tzinfo).
 datetime: Combines both date and time (year, month, day, hour, minute, second, microsecond,
tzinfo).
 timedelta: Represents a duration or difference between two date, time, or datetime objects.
 tzinfo: An abstract base class for timezone information.
 timezone: A concrete class for fixed offset timezones (subclass of tzinfo).

Common Operations:
 Getting current date/time: Use datetime.now() or date.today().
 Creating objects: Instantiate date, time, or datetime objects by providing the respective
components (e.g., datetime(2025, 8, 13, 10, 30)).
 Arithmetic operations: Perform addition and subtraction with timedelta objects to calculate
future/past dates/times or the difference between two temporal objects.
 Formatting: Convert datetime objects to strings using strftime() with format codes (e.g.,
"%Y-%m-%d %H:%M:%S").

 Parsing: Convert date/time strings back into datetime objects using strptime(), providing the
string and its corresponding format.
 Accessing components: Retrieve individual components like year, month, day, hour, etc.,
using attributes of the date, time, or datetime objects.

Ex:

from datetime import datetime, date, timedelta

# Get current datetime
current_dt = datetime.now()
print(f"Current datetime: {current_dt}")

# Create a specific date
my_date = date(2025, 12, 25)
print(f"My date: {my_date}")

# Calculate a future date
future_date = current_dt + timedelta(days=7)
print(f"Date in 7 days: {future_date}")

# Format a datetime object
formatted_dt = current_dt.strftime("%Y/%m/%d %H:%M:%S")
print(f"Formatted datetime: {formatted_dt}")

# Parse a date string
date_string = "2024-01-15 [Link]"
parsed_dt = datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S")
print(f"Parsed datetime: {parsed_dt}")

Output:
Current datetime: 2025-08-13 [Link].335069
My date: 2025-12-25
Date in 7 days: 2025-08-20 [Link].335069
Formatted datetime: 2025/08/13 [Link]
Parsed datetime: 2024-01-15 [Link]
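Because the example above depends on the current time, its output changes on every run. The sketch below uses fixed dates instead, so the arithmetic can be checked by hand; the variable names are chosen only for illustration.

```python
from datetime import date, datetime, timedelta

start = date(2025, 8, 13)
end = date(2025, 12, 25)

gap = end - start                  # subtracting two dates gives a timedelta
print(gap.days)                    # 134

one_week_later = start + timedelta(weeks=1)
print(one_week_later)              # 2025-08-20

# strftime and strptime are inverses when the same format string is used
stamp = datetime(2025, 8, 13, 10, 30)
as_text = stamp.strftime("%Y-%m-%d %H:%M:%S")
print(as_text)                     # 2025-08-13 10:30:00
print(datetime.strptime(as_text, "%Y-%m-%d %H:%M:%S") == stamp)   # True
```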

b) Math Modules:
The math module is a built-in library that contains a collection of mathematical functions and
constants. It is commonly used to perform standard math operations such as rounding, trigonometry,
logarithms and more, all with precise and reliable results.

Needs of math module:


1. Provides built-in functions for complex math operations like square root, power and
trigonometry.
2. Offers constants like pi and e, useful in scientific and engineering calculations.
3. Improves accuracy and performance over manual calculations or custom functions.
4. Helps in performing logarithmic and exponential operations easily.
5. Supports real-world applications like physics, statistics, geometry and finance.

Importing math module:


To work with mathematical constants and functions in Python, you need to import the math
module:
Syntax:
import math
Once imported, you can access all the functions and constants that belong to the math module.

1. sqrt( ): This function is used to calculate the square root value.

Syntax:
var_name = math.sqrt(number)
Ex:
print(math.sqrt(16)) # 4.0

2. pow( ): This function is used to calculate the power value. It has 2 arguments, namely base and power.
Syntax:
var_name = math.pow(base, power)
Ex:
print(math.pow(2, 3)) # 8.0

3. ceil( ): This function returns the smallest integer greater than or equal to the number.
Syntax:
var_name = math.ceil(value)
Ex:
print(math.ceil(5.6787)) # 6

4. floor( ): This function returns the largest integer less than or equal to the number. If the number
has a fractional part, it is rounded down.
Syntax:
var_name = math.floor(value)

Ex:
print(math.floor(5.6787)) # 5
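The difference between ceil( ), floor( ), and plain truncation with int( ) shows up most clearly with negative numbers. A short sketch with arbitrarily chosen sample values:

```python
import math

samples = (5.6787, -5.6787, 7.0)
results = [(math.ceil(v), math.floor(v), int(v)) for v in samples]

for value, (up, down, truncated) in zip(samples, results):
    print(value, up, down, truncated)
# 5.6787 6 5 5
# -5.6787 -5 -6 -5
# 7.0 7 7 7
```

floor( ) always moves toward negative infinity, while int( ) simply drops the fractional part, so the two disagree for negative values.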

5. gcd( ): This function is used to find the greatest common divisor of the two numbers passed as
arguments.
Syntax:
var_name = math.gcd(value1, value2)

Ex:
print(math.gcd(10, 15)) # 5

6. factorial( ): This function is used to calculate the factorial of a non-negative integer.

Syntax:
var_name = math.factorial(value)

Ex:
print(math.factorial(5)) # 120

7. fabs( ): This function returns the absolute value of the number as a float. If the number is
negative, it is changed to positive.
Syntax:
var_name = math.fabs(value)

Ex:
print(math.fabs(-10)) # 10.0

8. sin( ): This function returns the sine of the value passed as the argument. The value is
interpreted in radians.

Syntax:
var_name = math.sin(value)

Ex:
print(math.sin(25)) # -0.13235175009777303

9. cos( ): This function returns the cosine of the value passed as the argument (in radians).

Syntax:
var_name = math.cos(value)

Ex:
print(math.cos(25)) # 0.9912028118634736

10. tan( ): This function returns the tangent of the value passed as the argument (in radians).
Syntax:
var_name = math.tan(value)

Ex:
print(math.tan(25)) # -0.13352640702153587
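The trigonometric functions take their argument in radians, so math.sin(25) is the sine of 25 radians, not 25 degrees. When the angle is given in degrees it can be converted first with math.radians( ). A brief sketch, with results rounded to hide tiny floating-point errors:

```python
import math

angle_deg = 30
angle_rad = math.radians(angle_deg)            # convert degrees to radians

print(round(math.sin(angle_rad), 4))           # 0.5
print(round(math.cos(angle_rad), 4))           # 0.866
print(round(math.tan(math.radians(45)), 4))    # 1.0
print(round(math.degrees(math.pi), 1))         # 180.0
```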

11. log( ): This function returns the logarithm of a value to the given base. If the base is omitted,
the natural logarithm (base e) is returned.
Syntax:
var_name = math.log(value, base)
Ex:
print(math.log(1000, 10)) # 2.9999999999999996

12. pi: The value of pi is often approximated as 22/7 or 3.14. math.pi provides a more precise value.
It is a constant.
Syntax:
var_name = math.pi
Ex:
print(math.pi) # 3.141592653589793

13. remainder( ): This function returns the remainder of a division. It has 2 parameters, namely
dividend and divisor.
Syntax:
var_name = math.remainder(dividend, divisor)
Ex:
print(math.remainder(10, 3)) # 1.0
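Two details of the functions above are worth a quick check. The sketch below compares math.log( ) with the dedicated math.log10( ) and math.log2( ) functions, and math.remainder( ) with the % operator; the input values are chosen only for illustration.

```python
import math

# The two-argument form can pick up a small floating-point rounding error
print(math.log(1000, 10))      # 2.9999999999999996
print(math.log10(1000))        # 3.0
print(math.log2(8))            # 3.0

# math.remainder rounds the quotient to the nearest integer,
# so its result can be negative even for positive operands
print(math.remainder(10, 3))   # 1.0   (10 - 3*3)
print(math.remainder(11, 3))   # -1.0  (11 - 3*4)
print(11 % 3)                  # 2
```

For everyday "what is left over" calculations the % operator is usually what is wanted; math.remainder( ) follows the IEEE 754 definition instead.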

2. User – Defined modules in Python:


User-defined python modules are the modules, which are created by the user to simplify their
project. These modules can contain functions, classes, variables, and other code that you can reuse
across multiple scripts.

Creation of user-defined module:


We will create a module calculator to perform basic mathematical operations.
# calculator.py

def add(a, b):
    return a + b

def sub(a, b):
    return a - b

def mul(a, b):
    return a * b

def div(a, b):
    return a / b

In the above code, we have implemented basic mathematical operations. After that, we will save the
above Python file as calculator.py.
Now, we will use the above user-defined Python module in another Python program.

# Program that uses the calculator module

import calculator

print("Addition of 5 and 4 is:", calculator.add(5, 4))
print("Subtraction of 7 and 2 is:", calculator.sub(7, 2))
print("Multiplication of 3 and 4 is:", calculator.mul(3, 4))
print("Division of 12 and 3 is:", calculator.div(12, 3))

Output:
Addition of 5 and 4 is: 9
Subtraction of 7 and 2 is: 5
Multiplication of 3 and 4 is: 12
Division of 12 and 3 is: 4.0
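The two-file workflow above can also be tried from a single script. The sketch below writes a small module into a temporary folder and imports it from there. The module name demo_calculator, the zero check in div( ), and the `if __name__ == "__main__":` guard are additions for illustration and are not part of the calculator example above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_SOURCE = '''
def add(a, b):
    return a + b

def div(a, b):
    if b == 0:
        raise ValueError("cannot divide by zero")
    return a / b

if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported
    print("self-test:", add(2, 3))
'''

with tempfile.TemporaryDirectory() as folder:
    # Save the module under a hypothetical name
    Path(folder, "demo_calculator.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, folder)            # make the folder importable
    importlib.invalidate_caches()
    try:
        calc = importlib.import_module("demo_calculator")
        from demo_calculator import add   # import a single name
    finally:
        sys.path.remove(folder)

print(calc.add(5, 4))      # 9
print(add(7, 2))           # 9
print(calc.div(12, 3))     # 4.0
print(calc.__name__)       # demo_calculator
```

When the module is imported, its __name__ is the module name, so the guarded self-test line does not run.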

8. Explain about Packages.

Package:
Python packages are a way to organize and structure code by grouping related modules into
directories. A package is essentially a folder that contains an __init__.py file and one or more Python
files (modules). This organization helps manage and reuse code effectively, especially in larger
projects. It also allows functionality to be easily shared and distributed across different applications.
Packages act like toolboxes, storing and organizing tools (functions and classes) for efficient access
and reuse.
Key Components of a Python Package
 Module: A single Python file containing reusable code (e.g., [Link]).
 Package: A directory containing modules and a special __init__.py file.
 Sub-Packages: Packages nested within other packages for deeper organization.

Create and access packages in python:


1. Create a Directory: Make a directory for your package. This will serve as the root folder.
2. Add Modules: Add Python files (modules) to the directory, each representing specific
functionality.
3. Include __init__.py: Add an __init__.py file (can be empty) to the directory to mark it as a
package.
4. Add Sub packages (Optional): Create subdirectories with their own __init__.py files for sub
packages.
5. Import Modules: Use dot notation to import, e.g., from mypackage.module1 import greet.

Example:
In this example, we are creating a Math Operation Package to organize Python code into a structured
package with two sub-packages: basic (for addition and subtraction) and advanced (for multiplication
and division). Each operation is implemented in separate modules, allowing for modular, reusable and
maintainable code.

math_operations/__init__.py:
This __init__.py file initializes the main package by importing and exposing the calculate
function and operations (add, subtract, multiply, divide) from the respective sub-packages for easier
access.

# Initialize the main package


from .calculate import calculate
from .basic import add, subtract
from .advanced import multiply, divide

math_operations/calculate.py:
This calculate module is a simple placeholder that prints "Performing calculation...", serving as a basic
demonstration or utility within the package.

def calculate():
    print("Performing calculation...")

math_operations/basic/__init__.py:
This __init__.py file initializes the basic sub-package by importing and exposing the add and
subtract functions from their respective modules (add.py and sub.py). This makes these functions
accessible when the basic sub-package is imported.

# Export functions from the basic sub-package


from .add import add
from .sub import subtract

math_operations/basic/add.py:
def add(a, b):
    return a + b

math_operations/basic/sub.py:
def subtract(a, b):
    return a - b

In the same way, we can create the advanced sub-package with multiply and divide
modules. Now, let's import the package into a program and use its functions:

Ex:
from math_operations import calculate, add, subtract
# Using the placeholder calculate function
calculate()
# Perform basic operations
print("Addition:", add(5, 3))
print("Subtraction:", subtract(10, 4))
Output:
Performing calculation...
Addition: 8
Subtraction: 6
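
The advanced sub-package is described in the text but its files are not listed. As a minimal sketch, the script below writes the whole math_operations package into a temporary directory and then imports it. The file names advanced/mul.py and advanced/div.py are assumptions made for illustration; the original does not name them.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source text for every file in the package.
# mul.py and div.py are assumed names for the advanced modules.
FILES = {
    "math_operations/__init__.py": (
        "from .calculate import calculate\n"
        "from .basic import add, subtract\n"
        "from .advanced import multiply, divide\n"
    ),
    "math_operations/calculate.py": (
        "def calculate():\n"
        "    print('Performing calculation...')\n"
    ),
    "math_operations/basic/__init__.py": (
        "from .add import add\n"
        "from .sub import subtract\n"
    ),
    "math_operations/basic/add.py": "def add(a, b):\n    return a + b\n",
    "math_operations/basic/sub.py": "def subtract(a, b):\n    return a - b\n",
    "math_operations/advanced/__init__.py": (
        "from .mul import multiply\n"
        "from .div import divide\n"
    ),
    "math_operations/advanced/mul.py": "def multiply(a, b):\n    return a * b\n",
    "math_operations/advanced/div.py": "def divide(a, b):\n    return a / b\n",
}

# Write each file under a temporary root directory.
root = Path(tempfile.mkdtemp())
for relative_path, source in FILES.items():
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)

# Make the temporary root importable, then import the package by name.
sys.path.insert(0, str(root))
math_operations = importlib.import_module("math_operations")

print("Multiplication:", math_operations.multiply(3, 4))
print("Division:", math_operations.divide(12, 3))
```

In a real project the files would simply be saved on disk in this layout; the temporary directory is used here only so that the sketch runs on its own.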

9. Explain about Exception Handling in python.

Exception Handling:

Exception handling deals with errors that occur during the execution of a program. It allows
the program to respond to an error instead of crashing. It enables you to
catch and manage errors, making your code more robust and user-friendly.

Ex:
n = 10
res = n / 0
print(res)

Output:
Traceback (most recent call last):
File "C:\Users\program\PyCharmMiscProject\[Link]", line 2, in <module>
res = n / 0
~~^~~
ZeroDivisionError: division by zero

In the above code, the error occurs because of division by zero. To handle such errors we use
exception handling, i.e., even when an error occurs, the program deals with it itself instead of
crashing.

Difference Between Exception and Error:
 Error: Errors are serious issues that a program should not try to handle. They are usually
problems in the code's logic or configuration and need to be fixed by the programmer.
Examples include syntax errors and memory errors.
 Exception: Exceptions are less severe than errors and can be handled by the program. They
occur due to situations like invalid input, missing files or network issues.

Types of Exceptions:
There are two types of exceptions, listed below:
1. Built – in Type Exception
2. User – Defined Exception

1. Built – in Type Exception:


Exceptions are events that can alter the flow of control in a program. These errors can arise
during program execution and need to be handled appropriately. Python provides a set of built-in
exceptions, each meant to signal a particular type of error.

List of default Python exceptions with descriptions:


1. AssertionError: raised when the assert statement fails.
2. EOFError: raised when the input() function meets the end-of-file condition.
3. AttributeError: raised when the attribute assignment or reference fails.
4. TabError: raised when the indentations consist of inconsistent tabs or spaces.
5. ImportError: raised when importing the module fails.
6. IndexError: occurs when the index of a sequence is out of range
7. KeyboardInterrupt: raised when the user inputs interrupt keys (Ctrl + C or Delete).
8. RuntimeError: occurs when an error does not fall into any category.
9. NameError: raised when a variable is not found in the local or global scope.
10. MemoryError: raised when programs run out of memory.
11. ValueError: occurs when the operation or function receives an argument with the right type but
the wrong value.
12. ZeroDivisionError: raised when you divide a value or variable by zero.
13. SyntaxError: raised by the parser when the Python syntax is wrong.
14. IndentationError: occurs when there is a wrong indentation.
15. SystemError: raised when the interpreter detects an internal error.
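
A few of the exceptions listed above can be triggered on purpose to see their names and messages. The sketch below is only an illustration; the dictionary name cases and the undefined name used for NameError are made up for this example.

```python
# Each lambda deliberately triggers one built-in exception when called.
cases = {
    "ZeroDivisionError": lambda: 10 / 0,
    "IndexError": lambda: [1, 2, 3][5],
    "ValueError": lambda: int("abc"),
    "AttributeError": lambda: (5).missing_attribute,
    "NameError": lambda: undefined_variable_xyz,
}

caught = {}
for expected_name, trigger in cases.items():
    try:
        trigger()
    except Exception as err:
        # type(err).__name__ gives the class name of the raised exception.
        caught[expected_name] = type(err).__name__
        print(f"{type(err).__name__}: {err}")
```

Each line of output shows the exception class followed by the message Python attaches to it.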

Exception Handling with try, except, else, and finally :


After learning about errors and exceptions, we will handle them by using try, except, else,
and finally blocks.
In normal circumstances, these errors will stop the code execution and display the error message. To
create stable systems, we need to anticipate these errors and come up with alternative solutions or
warning messages.

The try and except statement:

The simplest way of handling exceptions in Python is by using the try and except block.
1. Run the code under the try statement.
2. When an exception is raised, execute the code under the except statement.

Instead of stopping at error or exception, our code will move on to alternative solutions.

In the first example, we will try to print the undefined x variable. In normal circumstances, it should
throw the error and stop the execution, but with the try and except block, we can change the flow
behavior.
 The program will run the code under the try statement.
 Since x is not defined, a NameError is raised, so the program runs the except block and prints the
warning.

Ex 1:

try:
    print(x)
except:
    print("An exception has occurred!")

Output:
An exception has occurred!

Multiple except statement:


In the second example, we will be using multiple except statements to handle multiple types of
exceptions.
 If a ZeroDivisionError exception is raised, the program will print "You cannot divide a value
with zero."
 The rest of the exceptions will print “Something else went wrong.”
It allows us to write flexible code that can handle multiple exceptions at a time without breaking.

Ex 2:
try:
    print(1/0)
except ZeroDivisionError:
    print("You cannot divide a value with zero")
except:
    print("Something else went wrong")

Output:
You cannot divide a value with zero

The try with else clause:


When the try block does not raise an exception, execution continues in the else block. The else
block holds code that should run only if the try block succeeded. Keeping that code out of the try
block means that any errors it raises are not accidentally caught (hidden) by the except clauses.

Note: In the try-except block, you can use the else after all the except statements.

We are adding the else statement to the ZeroDivisionError example. As we can see, when there are no
exceptions, the print function under the else statement is executed, displaying the result.

Ex:
try:
    result = 1/3
except ZeroDivisionError as err:
    print(err)
else:
    print(f"Your answer is {result}")

Output:
Your answer is 0.3333333333333333

The finally keyword:
The finally block in a try statement is always executed, irrespective of whether there is
an exception or not. In simple words, the finally block runs last, after
the try, except and else blocks have finished. It is quite useful for cleaning up resources,
especially for closing files.

Ex:
try:
    n = 0
    res = 100 / n

except ZeroDivisionError:
    print("You can't divide by zero!")

except ValueError:
    print("Enter a valid number!")

else:
    print("Result is", res)

finally:
    print("Execution complete.")

Output:
You can't divide by zero!
Execution complete.
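
Since finally is mainly used for cleanup, the following sketch shows it closing a file even though reading the file fails. The temporary file and its contents are created only for this illustration.

```python
import os
import tempfile

# Create a small temporary file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("not-a-number\n")

f = open(path)
try:
    value = int(f.readline())   # raises ValueError for this content
except ValueError:
    value = None
    print("The file does not contain a valid integer.")
else:
    print("Value read:", value)
finally:
    f.close()                   # runs whether or not an error occurred
    os.remove(path)

print("File closed:", f.closed)
```

In everyday code a with statement is usually preferred for files, because it performs the same cleanup automatically.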

2. User – Defined Exception:


User-defined exceptions are custom error classes that you create to handle specific error
conditions in your code. They are derived from the built-in Exception class or any of its subclasses.

User-defined exceptions provide more precise control over error handling in your application –

 Clarity − They provide specific error messages that make it clear what went wrong.
 Granularity − They allow you to handle different error conditions separately.
 Maintainability − They centralize error handling logic, making your code easier to maintain.

Create a User-Defined Exception:

Step 1 − Define the Exception Class:


Create a new class that inherits from the built-in "Exception" class or any other appropriate
base class. This new class will serve as your custom exception.

class MyCustomError(Exception):
    pass

Explanation:
 Inheritance − By inheriting from "Exception", your custom exception will have the same
behaviour and attributes as the built-in exceptions.
 Class Definition − The class is defined using the standard Python class syntax. For simple
custom exceptions, you can define an empty class body using the "pass" statement.

Step 2 − Initialize the Exception:


Implement the "__init__" method to initialize any attributes or provide custom error messages.
This allows you to pass specific information about the error when raising the exception.

class InvalidAgeError(Exception):
    def __init__(self, age, message="Age must be between 18 and 100"):
        self.age = age
        self.message = message
        super().__init__(self.message)

Explanation:
 Attributes − Define attributes such as "age" and "message" to store information about the
error.
 Initialization − The "__init__" method initializes these attributes. The
"super().__init__(self.message)" call ensures that the base "Exception" class is properly
initialized with the error message.
 Default Message − A default message is provided, but you can override it when raising the
exception.

Step 3 − Optionally Override "__str__" or "__repr__":


Override the "__str__" or "__repr__" method to provide a custom string representation of
the exception. This is useful for printing or logging the exception.
Ex:
class InvalidAgeError(Exception):
    def __init__(self, age, message="Age must be between 18 and 100"):
        self.age = age
        self.message = message
        super().__init__(self.message)

    def __str__(self):
        return f"{self.message}. Provided age: {self.age}"

Explanation:
 __str__ Method − The "__str__" method returns a string representation of the exception. This
is what will be displayed when the exception is printed.
 Custom Message − Customize the message to include relevant information, such as the
provided age in this example.

Raising User-Defined Exceptions:
Once you have defined a custom exception, you can raise it in your code to signify specific
error conditions. Raising user-defined exceptions involves using the raise statement, which can be
done with or without custom messages and attributes.

Syntax: For raising an exception


raise ExceptionType(args)

Example: "set_age" function raises an "InvalidAgeError" if the age is outside the valid range

def set_age(age):
    if age < 18 or age > 100:
        raise InvalidAgeError(age)
    print(f"Age is set to {age}")

Handling User-Defined Exceptions:


Handling user-defined exceptions in Python refers to using "try-except" blocks to catch and
respond to the specific conditions that your custom exceptions represent. This allows your program to
handle errors gracefully and continue running or to take specific actions based on the type of exception
raised.

Syntax: Handling Exceptions


try:
    # Code that may raise an exception
except ExceptionType as e:
    # Code to handle the exception

Example:
In the below example, the "try" block calls "set_age" with an invalid age. The "except" block
catches the "InvalidAgeError" and prints the custom error message −

try:
    set_age(150)
except InvalidAgeError as e:
    print(f"Invalid age: {e.age}. {e.message}")

Complete Example:

Combining all the steps, here is a complete example of creating and using a user-defined
exception −

n = int(input("Enter Age: "))


class InvalidAgeError(Exception):
    def __init__(self, age, message="Age must be between 18 and 100"):
        self.age = age
        self.message = message
        super().__init__(self.message)

    def __str__(self):
        return f"{self.message}. Provided age: {self.age}"

def set_age(age):
    if age < 18 or age > 100:
        raise InvalidAgeError(age)
    print(f"Age is set to {age}")

try:
    set_age(n)
except InvalidAgeError as e:
    print(f"Invalid age: {e.age}. {e.message}")

Output 1:
Enter Age: 25
Age is set to 25

Output 2:
Enter Age: 150
Invalid age: 150. Age must be between 18 and 100
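
A user-defined exception may also derive from a more specific built-in class, and several custom exceptions can be handled separately (the "granularity" benefit mentioned earlier). The sketch below assumes hypothetical class names AgeError, AgeTooLowError and AgeTooHighError and a hypothetical function check_age; none of them appear in the original example.

```python
class AgeError(ValueError):
    """Base class for age-related errors (hypothetical name)."""


class AgeTooLowError(AgeError):
    pass


class AgeTooHighError(AgeError):
    pass


def check_age(age):
    # Raise a different exception for each kind of invalid age.
    if age < 18:
        raise AgeTooLowError(f"{age} is below the minimum of 18")
    if age > 100:
        raise AgeTooHighError(f"{age} is above the maximum of 100")
    return age


results = []
for candidate in (25, 10, 150):
    try:
        check_age(candidate)
    except AgeTooLowError as err:
        results.append("too low")
        print("Too low:", err)
    except AgeTooHighError as err:
        results.append("too high")
        print("Too high:", err)
    else:
        results.append("ok")
        print("Accepted:", candidate)
```

Because both classes inherit from AgeError, a single "except AgeError" clause could catch either of them when separate handling is not needed.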

UNIT – IV

List: Introduction to List, List Operations, Traversing a List, List Methods and Built – in Functions.
Tuples and Dictionaries: Introduction to Tuples, Tuple Operations, Tuple Methods and Built – in
Functions, Nested Tuples, Introductions to Dictionaries, Dictionaries are Mutable, Dictionary
Operations, Traversing a Dictionary, Dictionary Methods and Built – in functions.

1. Introduction to List:
A list in Python is a built-in, dynamic-sized array that serves as an ordered collection of
elements, allowing the storage of multiple items of different data types within a single variable. Lists
are defined by enclosing elements in square brackets [ ], separated by commas( , ).
Unlike arrays in some other languages, Python lists can contain heterogeneous data types, such
as integers, strings, booleans, and even other lists, making them highly flexible. Lists can also contain
duplicate items.
 Lists in Python are mutable. Hence, we can modify, replace or delete the items.
 Lists are ordered. They maintain the order of elements based on how they are added.
 Items in a list can be accessed directly using their position (index), starting from 0.

Creation of List:

a) Using Square Brackets … [ ]

Syntax:
list_name = [element1, element2, ..., elementN]

Ex:
# List of integers
a = [1, 2, 3, 4, 5]

# List of strings
b = ['apple', 'banana', 'cherry']

# Mixed data types


c = [1, 'hello', 3.14, True]

print(a)
print(b)
print(c)

Output:
[1, 2, 3, 4, 5]
['apple', 'banana', 'cherry']
[1, 'hello', 3.14, True]

b) Using list( ) constructor:
We can also create a list by passing an iterable (like a string, tuple or another list) to list() function.

Ex:
a = list((1, 2, 3, 'apple', 4.5))
print(a)

Output:
[1, 2, 3, 'apple', 4.5]

Accessing List Elements:


Elements in a list can be accessed using indexing. Python indexes start at 0, so a[0] accesses
the first element, while negative indexing allows us to access elements from the end of the list. For
example, index -1 represents the last element of the list.

Ex:
a = [10, 20, 30, 40, 50]

# Access first element


print(a[0])

# Access last element


print(a[-1])

Output:
10
50

Advantages:

 Lists are dynamic, allowing for easy addition and removal of elements, which provides a
flexible data structure that can be resized as needed.
 Lists support efficient element access through indexing, starting at 0, enabling quick retrieval
of specific items.
 Lists are mutable, meaning elements can be modified in place, which is suitable for scenarios
requiring frequent data updates.
 Lists can store elements of different data types, providing versatility in handling diverse
information within a single structure.

Disadvantages:

 Their dynamic nature means Python must allocate extra memory for potential future growth,
which can lead to higher memory usage compared to immutable structures like tuples.
 Access by index is fast, but inserting or deleting an element at the beginning or in the middle
of a list is slow, because all the elements after that position must be shifted.
 The memory overhead for storing the list object itself, including its internal structure and
metadata, can be significant, especially for large lists.
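
The memory comparison with tuples can be checked with sys.getsizeof. This is a rough sketch: the exact numbers depend on the Python version and platform (the comparison below assumes the standard CPython interpreter), so no output is shown.

```python
import sys

items = range(1000)
as_list = list(items)
as_tuple = tuple(items)

# getsizeof reports the size of the container itself,
# not the size of the elements it refers to.
list_size = sys.getsizeof(as_list)
tuple_size = sys.getsizeof(as_tuple)

print("list :", list_size, "bytes")
print("tuple:", tuple_size, "bytes")
```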

2. Traversing a List:

The most common way to traverse the elements of a list is with a for loop, which allows direct access
to each element in the list. This method is ideal when only reading the elements is required. The
syntax is similar to iterating over strings.

Different ways of traversing a list:

a) for loop
b) for loop using range( ) and len( )
c) while loop
d) enumerate
e) List Comprehension

a) Using for loop:

Syntax:
for i in list_name:
    # use the current element i

Ex:
a = [1, 2, 3, 4, 5]

# On each iteration i represents the current item/element


for i in a:
    print(i)

Output:
1
2
3
4
5

b) Using for Loop with range( ):


We can use the range( ) function with a for loop to traverse the list. This method allows us
to access elements by their index, which is useful if we need to know the position of an element or
modify the list in place.

Ex:
a = [1, 2, 3, 4, 5]
# Calculate the length of the list
n = len(a)

# Iterates over the indices from 0 to n-1 (i.e., 0 to 4)


for i in range(n):
    print(a[i])

Output:
1
2
3
4
5
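
Index-based traversal is what makes in-place modification possible. A small sketch, using a made-up list named prices:

```python
prices = [100, 250, 80]

# Looping over indices lets us assign new values back into the list.
for i in range(len(prices)):
    prices[i] = prices[i] * 2

print(prices)  # [200, 500, 160]
```

A plain "for p in prices" loop would only change the loop variable p, not the list itself.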

c) Using while Loop:


This method is similar to the above method. Here we use a while loop to iterate through a
list. We first find the length of the list using len(), then start at index 0 and access each item by its
index, incrementing the index by 1 after each iteration.

Ex:

a = [1, 2, 3, 4, 5]
# Start from the first index
i = 0
# The loop runs till the last index (i.e., 4)
while i < len(a):
    print(a[i])
    i += 1

Output:
1
2
3
4
5

d) Using enumerate( ):
We can also use the enumerate( ) function to iterate through the list. This method provides
both the index (i) and the value (val) of each element during the loop.

Ex:
a = [1, 2, 3, 4, 5]
# Here, i and val represent index and value respectively
for i, val in enumerate(a):
    print(i, val)

Output:
0 1
1 2
2 3
3 4
4 5

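e) Using List Comprehension: A list comprehension is a compact, single-line form of the for loop. It provides the shortest
syntax for looping through a list. Its main purpose is to build a new list, so when it is used only for
printing, as in this example, it also creates a throwaway list of None values.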
Ex:
a = [1, 2, 3, 4, 5]
# On each iteration val is passed to print function
# And printed in the console.
[print(val) for val in a]

Output:
1
2
3
4
5
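
A list comprehension is more commonly used to build a new list from an existing one. A brief sketch, with illustrative variable names squares and evens:

```python
a = [1, 2, 3, 4, 5]

# Build a new list by transforming each element.
squares = [val * val for val in a]

# Build a new list by keeping only the elements that pass a condition.
evens = [val for val in a if val % 2 == 0]

print(squares)  # [1, 4, 9, 16, 25]
print(evens)    # [2, 4]
```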

3. Operations – Methods – Built-in Function For List:

List:
A list in Python is a built-in, dynamic-sized array that serves as an ordered collection of
elements, allowing the storage of multiple items of different data types within a single variable. Lists
are defined by enclosing elements in square brackets [ ], separated by commas( , ).
Unlike arrays in some other languages, Python lists can contain heterogeneous data types, such
as integers, strings, booleans, and even other lists, making them highly flexible. Lists can also contain
duplicate items.
 Lists in Python are mutable. Hence, we can modify, replace or delete the items.
 Lists are ordered. They maintain the order of elements based on how they are added.
 Items in a list can be accessed directly using their position (index), starting from 0.

We can create a list by using [ ] and list( ) constructor:

Ex:
# List of integers
a = [1, 2, 3, 4, 5]
# List of strings
b = ['apple', 'banana', 'cherry']
# Mixed data types
c = [1, 'hello', 3.14, True]
print(a)
print(b)
print(c)

Output:
[1, 2, 3, 4, 5]
['apple', 'banana', 'cherry']
[1, 'hello', 3.14, True]

Operations of List:
The operations are performed using built-in methods and functions. Common operations
include creating a list using square brackets, accessing elements via zero-based indexing, and
modifying elements by assignment.

List operations are listed below:

1. Creating list
2. Slice
3. Concatenate
4. Multiply
5. del

1. Creating list: We can create a list by using [ ] or the list( ) constructor. Calling list( ) with no
arguments creates an empty list, to which elements can be added later.

Syntax:
list_name = [elements]
list_name = list( )

Ex:
# List of integers
a = [1, 2, 3, 4, 5]
b = list()
print(a)
print(type(b))
Output:
[1, 2, 3, 4, 5]
<class 'list'>

2. Slice: List slicing extracts a portion of a list (a range of elements). It supports both positive
and negative indexing.

Syntax:
list_name[start : end : step]

Ex:
a = [1,2,3,4,5,6,7,8,9,10]
print(a[0:5])
print(a[5:])
print(a[:2])
print(a[-4:-1])
print(a[-4:])
print(a[:-4])
print(a[:])
print(a[::])
print(a[::-1])

Output:
[1, 2, 3, 4, 5]
[6, 7, 8, 9, 10]
[1, 2]
[7, 8, 9]
[7, 8, 9, 10]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
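
The step part of the slice syntax, and the fact that a slice returns a new list, can be sketched as follows (variable names are illustrative):

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

every_second = a[::2]    # indices 0, 2, 4, 6, 8
middle_step = a[1:8:3]   # indices 1, 4, 7

# A full slice makes a copy, so changing the copy leaves a untouched.
copy_of_a = a[:]
copy_of_a[0] = 99

print(every_second)  # [1, 3, 5, 7, 9]
print(middle_step)   # [2, 5, 8]
print(a[0])          # 1
```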

3. Concatenate: The + operator creates a new list by concatenating list1 and list2. This operation does
not modify the original lists but returns a new combined list.
Syntax:
new_list = list1 + list2

Ex:
l1 = [1,2,3]
l2 = [4,5,6]
l3 = l1 + l2
print(l3)

Output:
[1, 2, 3, 4, 5, 6]

4. Multiply: Lists can be multiplied by an integer using the * operator to repeat the list elements.

Syntax:
new_list = list_name * value

Ex:
l1 = [1,2,3]
l2 = l1 * 2
print(l2)
l3 = l1 * 3
print(l3)

Output:
[1, 2, 3, 1, 2, 3]
[1, 2, 3, 1, 2, 3, 1, 2, 3]
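
One point to watch when repeating a list that contains other lists: the * operator repeats references to the same inner list rather than copying it. A short sketch with made-up names shared and independent:

```python
# Both rows refer to the SAME inner list.
shared = [[0] * 3] * 2
shared[0][0] = 1
print(shared)       # [[1, 0, 0], [1, 0, 0]]

# A list comprehension creates a separate inner list for each row.
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 1
print(independent)  # [[1, 0, 0], [0, 0, 0]]
```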

5. del : It is used to delete the list object itself.

Syntax:
del list_name

Ex:
a = [1,2,3]
print(a)
del a
print(a)

Output:
[1, 2, 3]
Traceback (most recent call last):
File "C:\Users\guido\PyCharmMiscProject\[Link]", line 4, in <module>
print(a)
^
NameError: name 'a' is not defined
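
Besides deleting the whole list, the del statement can also remove a single element or a slice. A minimal sketch:

```python
a = [10, 20, 30, 40, 50]

del a[0]      # remove the element at index 0
print(a)      # [20, 30, 40, 50]

del a[1:3]    # remove the elements at indices 1 and 2
print(a)      # [20, 50]
```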

List Methods:

List methods are predefined functions that belong to the built-in list type. They are always available
within the Python interpreter, providing essential functionality without requiring external libraries or
custom code. They are implemented as part of
the language's core, ensuring high performance and reliability.

A method is a function that is associated with an object or a class and is called using dot
notation on that specific object.

Methods are typically used to perform actions that interact with or manipulate the data (state)
of the object they belong to.
object_name.method_name( )
The list type includes a wide range of built-in methods, such as
1. append( ) 6. clear( )
2. extend( ) 7. index( )
3. insert( ) 8. count( )
4. remove( ) 9. sort( )
5. pop( ) 10. reverse( )

1. append( ): This method adds a single element to the end of the list.
Syntax:
list_name.append(element)
Ex:
a = [1, 2, 3]
# Add 4 to the end of the list
a.append(4)
print(a)

Output:
[1, 2, 3, 4]

2. extend( ): This method adds multiple elements from an iterable (like another list or tuple)
to the end of the current list.
Syntax:
list_name.extend(iterable)

Ex:
a = [1, 2, 3]
b = [4, 5]
# Using extend() to add elements of b to a
a.extend(b)
print(a)

Output:
[1, 2, 3, 4, 5]
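
The difference between append( ) and extend( ) is easiest to see side by side. A short sketch (the list names are illustrative):

```python
a = [1, 2, 3]
a.append([4, 5])   # the whole list becomes ONE new element
print(a)           # [1, 2, 3, [4, 5]]

b = [1, 2, 3]
b.extend([4, 5])   # each item is added separately
print(b)           # [1, 2, 3, 4, 5]

c = []
c.extend("hi")     # a string is an iterable of characters
print(c)           # ['h', 'i']
```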

3. insert( ): This method inserts an element at a specified position within the list.
Syntax:
list_name.insert(index, element)

Ex:
a = [1, 3]
# Insert 2 at index 1
a.insert(1, 2)
print(a)

Output:
[1, 2, 3]

4. remove( ): This method removes the first occurrence of a specified element from the list.

Syntax:
list_name.remove(element)

Ex:
a = [1, 2, 3]

# Remove the first occurrence of 2
a.remove(2)
print(a)

Output:
[1, 3]

5. pop( ): This method removes and returns the last element of the list.
Syntax:
list_name.pop( )

Ex:
a = [1, 2, 3]

# Remove and return the last element in the list
a.pop()
print(a)

Output:
[1, 2]
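
pop( ) also returns the removed element and accepts an optional index. A small sketch showing both, along with the error raised for an empty list:

```python
a = [10, 20, 30, 40]

last = a.pop()     # removes and returns the last element
first = a.pop(0)   # removes and returns the element at index 0

print(last, first)  # 40 10
print(a)            # [20, 30]

# Popping from an empty list raises IndexError.
try:
    [].pop()
except IndexError as err:
    message = str(err)
    print("Error:", message)
```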

6. clear( ): This method removes all the elements from the list.

Syntax:
list_name.clear( )

Ex:
a = [1, 2, 3]

# Remove all elements from the list
a.clear()
print(a)

Output:
[]

7. index( ): This method returns the index of the first occurrence of a specific element in the list.

Syntax:
list_name.index(element)

Ex:
a = [1, 2, 3]

# Find the index of 2 in the list
print(a.index(2))

Output:
1

8. count( ): This method counts the occurrences of a specific element in the list.

Syntax:
list_name.count(element)

Ex:
a = [1, 2, 3, 2]

# Count occurrences of 2 in the list
print(a.count(2))

Output:
2

9. sort( ): This method sorts the elements of the list in place, in ascending order by default.

Syntax:
list_name.sort(key=None, reverse=False)

Ex:
a = [3, 1, 2]

# Sort the list in ascending order
a.sort()
print(a)

Output:
[1, 2, 3]
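
The key and reverse parameters shown in the syntax can be sketched as follows. The built-in sorted( ) function is included for comparison because it returns a new list instead of changing the original.

```python
words = ["banana", "kiwi", "apple"]
words.sort(key=len)        # sort by the length of each string
print(words)               # ['kiwi', 'apple', 'banana']

nums = [3, 1, 2]
nums.sort(reverse=True)    # sort in descending order
print(nums)                # [3, 2, 1]

original = [3, 1, 2]
ordered = sorted(original) # new sorted list; original is unchanged
print(original, ordered)   # [3, 1, 2] [1, 2, 3]
```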

10. reverse( ): This method reverses the order of the elements in the list in place.

Syntax:
list_name.reverse( )

Ex:
a = [1, 2, 3]

# Reverse the list order
a.reverse()
print(a)

Output:
[3, 2, 1]

Built – in Functions:
A built-in function is a standalone function provided by Python that can be called directly by
its name and is not tied to any specific object or class. These functions can generally be applied to
various data types, such as print( ), len( ), or all( ).

Some of the built-in functions that work with lists are listed below:

1. len( ) 5. all( )
2. max( ) 6. any( )
3. min( ) 7. reversed( )
4. sum( )

1. len( ): This function returns the length of the list (the number of elements).

Syntax:
len(list_name)

Ex:
a = [3, 1, 2]
print(len(a))

Output:
3

2. max( ): This function is used to find the maximum value in the list.
Syntax:
max(list_name)

Ex:
a = [15,34,78,13,65]
print(max(a))

Output:
78

3. min( ): This function is used to find the minimum value in the list.
Syntax:
min(list_name)

Ex:
a = [15,34,78,13,65]
print(min(a))

Output:
13

4. sum( ): This function is used to calculate sum of all the values in the list.
Syntax:
sum(list_name)

Ex:
a = [15,34,78,13,65]
print(sum(a))

Output:
205

5. all( ): This function returns True if all items in an iterable are true, otherwise it returns False.

Syntax:
all(list_name)

Ex:
list1 = [1,2,3,True]
print(all(list1))
list2 = [1,2,3,False]
print(all(list2))

Output:
True
False

6. any( ): This function returns True if any item in an iterable is true, otherwise it returns False.

Syntax:
any(list_name)

Ex:
list1 = [0,1,False]
print(any(list1))
list2 = [0,False]
print(any(list2))

Output:
True
False
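
all( ) and any( ) are often combined with a condition on each element. A brief sketch using a made-up list of marks:

```python
marks = [72, 85, 40, 91]

# True only if every mark is at least 50.
all_passed = all(m >= 50 for m in marks)

# True if at least one mark is 90 or more.
any_distinction = any(m >= 90 for m in marks)

print(all_passed)       # False
print(any_distinction)  # True

# For an empty list, all() is True and any() is False.
print(all([]), any([]))  # True False
```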

7. reversed( ): This function returns a reverse iterator for a given sequence, such as a list,
allowing you to traverse the elements in reverse order without modifying the original list.
Syntax:
list(reversed(list_name))

Ex:
a = [1,2,3]
for i in list(reversed(a)):
    print(i)

Output:
3
2
1

4. Introduction to Tuples:

A tuple in Python is an immutable ordered collection of elements.


 Tuples are similar to lists, but unlike lists, they cannot be changed after their creation (i.e., they
are immutable).
 Tuples can hold elements of different data types.
 The main characteristics of tuples are being ordered, heterogeneous and immutable.

Creating a Tuple:
A tuple is created by placing all the items inside parentheses (), separated by commas. A tuple can
have any number of items and they can be of different data types.

Syntax:
tuple_name = (element_1, element_2, element_3, . . . , element_n)

Ex:

tup1 = ( )
print(tup1)
# Using String
tup2 = ('Hello', 'User!')
print(tup2)
# Mixed Data types
tup3 = (1, 2.345, 'Hello', True)
print(tuple(tup3))
# Using Built-in Function
tup = tuple('Guido')
print(tup)

Output:
()
('Hello', 'User!')
(1, 2.345, 'Hello', True)
('G', 'u', 'i', 'd', 'o')

Accessing Tuple Elements:

 Each item in a tuple is associated with a number, known as an index. The index always starts
from 0, meaning the first item of a tuple is at index 0, the second item is at index 1, and so on.
 We can also access the elements of a tuple by using slicing. Both positive (forward) and
negative (backward) indices can be used.
 Iterate Through a Tuple: We use the for loop to iterate over the items of a tuple.

Example:

tuple1 = (10,20,30,40,50)

# Access using index value


print(tuple1[0])
print(tuple1[1])

#Access using slice


print(tuple1[1:3])
print(tuple1[-3:-1])

Output:

10
20
(20, 30)
(30, 40)

Advantages:

 Tuples store items in the order in which we provide them. They preserve insertion order.
 Tuples allow any number of duplicate elements.
 Tuples protect data values from accidental change because they are immutable.
 Creating and iterating over a tuple can be slightly faster than a list, and a tuple uses less memory.
 Easy to display different values stored in the tuple.

Disadvantages:

 We cannot use a tuple when we want to add or delete a specific element. Hence, tuples have
limited use.

 A tuple with a single element cannot be created with parentheses alone. In such a case, we
need to include a trailing comma, e.g., (5,), which can make the code a little harder to read.

 In Python, a tuple is not flexible in nature: its elements are fixed when it is created, and it does not provide
any method to update the element values stored in the tuple.
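
The single-element rule and the effect of immutability can be checked with a short sketch (variable names are illustrative):

```python
not_a_tuple = (5)   # parentheses alone: this is just the integer 5
single = (5,)       # the trailing comma makes it a tuple

print(type(not_a_tuple).__name__)  # int
print(type(single).__name__)       # tuple

t = (1, 2, 3)
try:
    t[0] = 10       # item assignment is not allowed on a tuple
except TypeError as err:
    error_text = str(err)
    print("Error:", error_text)

# To "change" a value, build a new tuple instead.
updated = (10,) + t[1:]
print(updated)      # (10, 2, 3)
```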

5. Traversing Tuple:

a) Traversing a tuple in Python involves iterating over its elements, and several methods are available
for this purpose. The most common and straightforward approach is using a for loop, which allows
you to access each element sequentially.

Ex: using for loop

tuple1 = (10,20,30,40,50)

#Traverse all the element using for loop


for i in tuple1:
    print(i, end=" ")

Output:

10 20 30 40 50

b) A powerful feature is tuple unpacking, which allows you to extract individual elements from a tuple
directly within a loop. This is particularly useful when iterating over a list of tuples.

For instance, if you have a list of coordinate pairs coordinates = [(1, 2), (3, 4), (5, 6)], you can unpack
them using for x, y in coordinates: print(f"X: {x}, Y: {y}").

Ex:

coordinates = [(1, 2), (3, 4), (5, 6)]

# Unpacking each tuple in the list into x and y
for x, y in coordinates:
    print(f"X: {x}, Y: {y}")

Output:

X: 1, Y: 2

X: 3, Y: 4

X: 5, Y: 6
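
Unpacking also works outside loops. The sketch below shows plain unpacking, unpacking with a starred name that collects the remaining items, and swapping two variables (all names are illustrative):

```python
point = (3, 4)
x, y = point                       # one name per element
print(x, y)                        # 3 4

first, *rest = (10, 20, 30, 40)    # rest collects the remaining items as a list
print(first, rest)                 # 10 [20, 30, 40]

a, b = 1, 2
a, b = b, a                        # swap values using tuple packing and unpacking
print(a, b)                        # 2 1
```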

c) The enumerate( ) function provides both the index and the tuple value during iteration, which is
helpful when you need to track the position of each tuple in a list. For example, for idx, tup in
enumerate(x): print(f"Index: {idx}, Tuple: {tup}").

Example:

tuple1 = (10,20,30,40,50)

#By Enumerate
for idx, tup in enumerate(tuple1):
    print(f"Index: {idx}, Tuple: {tup}")

Output:

Index: 0, Tuple: 10

Index: 1, Tuple: 20

Index: 2, Tuple: 30

Index: 3, Tuple: 40

Index: 4, Tuple: 50

6. Nested Tuple:

A tuple is an ordered and immutable collection of elements, created using parentheses.
There are different ways to use a tuple within a tuple.

A tuple can contain elements of any data type, such as another tuple. When one tuple is placed inside
another tuple, it is called a nested tuple or a tuple within a tuple.

Syntax:

outer_tuple = (value1, value2, (nested_value1, nested_value2), value3)

Creation of nested tuple:

We will create a tuple with integer elements, and within that tuple, we will add an inner tuple (18, 19,
20).

Ex:

# Creating a Tuple
mytuple = (20, 40, (18, 19, 20), 60, 80, 100)

# Displaying the Tuple


print("Tuple = ",mytuple)

# Length of the Tuple


print("Tuple Length= ",len(mytuple))

Output:

Tuple = (20, 40, (18, 19, 20), 60, 80, 100)

Tuple Length = 6

Accessing Tuple:

We can access the inner tuple and its elements using indexing. Each level of nesting requires an
additional index.

For example:

mytuple = (20, 40, (18, 19, 20), 60, 80, 100)

Here, if you need to access element 40, use index 1

To access the nested tuple, first identify the index at which it is stored.

In the above example, the nested tuple is at index 2.

The nested tuple at index 2 contains three elements, whose own indices start from 0,

i.e., element 18 is accessed as mytuple[2][0]

Ex:

# Creating a Tuple
mytuple = (20, 40, (18, 19, 20), 60, 80, 100)

# Displaying the Tuple


print("Tuple = ",mytuple)

#Accessing nested tuple


print(mytuple[2][0])
print(mytuple[2][1])
print(mytuple[2][2])
print(mytuple[2][3])

Output:

Tuple = (20, 40, (18, 19, 20), 60, 80, 100)


18
19
20

Traceback (most recent call last):

File "C:\Users\guido\PyCharmMiscProject\[Link]", line 11, in <module>

print(mytuple[2][3])

~~~~~~~~~~^^^

IndexError: tuple index out of range

The last statement, mytuple[2][3], is out of range, because the nested tuple only has indices
0 to 2.
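
When a tuple contains both plain values and inner tuples, a loop can treat the two cases differently. The following sketch flattens the example tuple into a list; the name flat is made up for this illustration.

```python
mytuple = (20, 40, (18, 19, 20), 60, 80, 100)

flat = []
for item in mytuple:
    if isinstance(item, tuple):
        flat.extend(item)   # add each element of the inner tuple
    else:
        flat.append(item)   # add the plain value as it is

print(flat)  # [20, 40, 18, 19, 20, 60, 80, 100]
```

This handles only one level of nesting; deeper nesting would need a recursive approach.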

7. Operations – Methods – Built-in Function For Tuples:

Introduction to Tuples:

A tuple in Python is an immutable ordered collection of elements.


 Tuples are similar to lists, but unlike lists, they cannot be changed after their creation (i.e., they
are immutable).
 Tuples can hold elements of different data types.
 The main characteristics of tuples are being ordered, heterogeneous and immutable.

Creating a Tuple
A tuple is created by placing all the items inside parentheses (), separated by commas. A tuple can
have any number of items and they can be of different data types.

Syntax:
tuple_name = (element_1, element_2, element_3, . . . , element_n)

Ex:
tup1 = ()
print(tup1)

# Using String
tup2 = ('Hello', 'User!')
print(tup2)

# Mixed Data types


tup3 = (1, 2.345, 'Hello', True)
print(tuple(tup3))

# Using Built-in Function


tup = tuple('Guido')
print(tup)

Output:
()
('Hello', 'User!')
(1, 2.345, 'Hello', True)
('G', 'u', 'i', 'd', 'o')

Operations of Tuples:

Operations of Tuples are listed below

1. Concatenation

2. Repetition

3. Membership
4. Slicing

5. del

6. Unpacking

1. Concatenation: Tuples can be concatenated using the ‘+’ operator. This operation creates a new
tuple by combining the elements of two or more tuples.

Syntax:

new_tuple = tuple1 + tuple2

Ex:

tuple1 = (1, 2, 3)
tuple2 = ('a', 'b', 'c')

# Concatenating two tuples


con_tuple = tuple1 + tuple2
print(con_tuple)

Output:

(1, 2, 3, 'a', 'b', 'c')

2. Repetition: Tuples can be repeated using the ‘*’ operator. This operation creates a new tuple by
repeating the elements of a tuple a specified number of times.

Syntax:
new_tuple = tuple_name * value

Ex:
tuple1 = (1, 2, 3)
tuple2 = tuple1 * 2
print(tuple2)

Output:
(1, 2, 3, 1, 2, 3)

3. Membership Testing: Membership testing in a tuple is performed using the in and not in operators
to check whether a specific value exists within the tuple. The in operator returns True if the value is
found in the tuple and False otherwise. Conversely, the not in operator returns True if the value is not
present in the tuple.
Syntax:
value in tuple_name
value not in tuple_name

Ex:
tuple1 = (1,2,3,4,5,6,7)
print(2 in tuple1)
print(3 not in tuple1)

Output:
True
False

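4. Slicing: Slicing allows us to access a range of elements in a tuple. It is done by specifying the start
and end indices, separated by a colon. The element at the end index is not included. An optional step,
written after a second colon, selects every step-th element.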
Syntax:
tuple_name[start:end:step]

Ex:
tuple1 = (1,2,3,4,5,6,7)
print(tuple1[3:5])
print(tuple1[-3:])
print(tuple1[:3])
print(tuple1[0:5:2])

Output:
(4, 5)
(5, 6, 7)
(1, 2, 3)
(1, 3, 5)
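
The following sketch shows a few more slices of the same tuple, including a negative step, which is a common way to reverse a tuple. Slicing never changes the original tuple; each slice is a new tuple. The variable names are illustrative.

```python
tuple1 = (1, 2, 3, 4, 5, 6, 7)

every_second = tuple1[::2]      # start and end omitted, step 2
middle = tuple1[2:6]            # indices 2, 3, 4, 5 (end index 6 is excluded)
reversed_tuple = tuple1[::-1]   # a negative step walks backwards

print(every_second)
print(middle)
print(reversed_tuple)
print(tuple1)                   # the original tuple is unchanged
```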

5. Delete: Because tuples are immutable, individual elements cannot be deleted. However, we can use
the ‘del’ keyword to delete the entire tuple.

Syntax:

del tuple_name

Ex 1:
tuple1 = (1, 2, 3)
print(tuple1)
del tuple1
print(tuple1)

Output:

(1, 2, 3)

Traceback (most recent call last):

File "C:\Users\guido\PyCharmMiscProject\[Link]", line 4, in <module>

print(tuple1)

^^^^^^

NameError: name 'tuple1' is not defined

6. Unpacking: Tuple unpacking is a powerful feature that allows you to assign the values of a tuple to
multiple variables in a single line. This technique makes your code more readable and efficient.

Key Rules:

• The number of variables on the left must match the number of elements in the tuple.
• If they do not match, Python raises a ValueError.
• Example: a, b = (100, 200, 300) # ValueError: too many values to unpack

Syntax:

variable1, variable2, …, variableN = tuple_name

Example:

a, b, c = (100, 200, 300)


print(a)
print(b)
print(c)

Output:
100
200
300
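
As a short sketch of two common uses of unpacking: swapping two variables, and collecting the remaining elements with a starred variable. Starred unpacking is not covered in the text above and is shown here only as an illustration; the names x, y, first, rest and message are illustrative.

```python
# Swapping two variables by packing and unpacking a tuple
x, y = 10, 20
x, y = y, x
print(x, y)

# A starred variable collects the remaining elements into a list
first, *rest = (100, 200, 300, 400)
print(first)
print(rest)

# A mismatch without a starred variable raises ValueError
try:
    a, b = (100, 200, 300)
except ValueError as error:
    message = str(error)
    print("ValueError:", message)
```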

Tuple Methods and Built-in Functions:

Methods:
Python provides predefined methods and functions that are always available within the
interpreter, providing essential functionality without requiring external libraries or custom code.
They are implemented as part of the language's core, ensuring high performance and reliability.

A method is a function that is associated with an object or a class and is called using dot
notation on that specific object.
object_name.method_name( )

Built-in Functions:
A built-in function is a standalone function provided by Python that can be called directly by
its name and is not tied to any specific object or class. These functions can generally be applied to
various data types, such as print( ), len( ), or all( ).

The Python interpreter includes a wide range of built-in methods / functions, such as

1. count( )     6. sum( )
2. index( )     7. all( )
3. len( )       8. any( )
4. max( )       9. sorted( )
5. min( )       10. tuple( )

1. count( ): This method returns the number of occurrences of an element appearing in a tuple. It
returns 0 if the element is not present in the tuple.

Syntax:
tuple_name.count(x)

Ex:
mytuple = (1,4,3,6,1,3,6,3,6,5,1,3)
print(mytuple.count(3))
Output:
4

2. index( ): This method returns the index position of the first occurrence of the specified element in
the tuple. It raises a ValueError if the specified element is not found in the tuple.

Syntax:
tuple_name.index(x)

Ex:

mytuple = (1,4,3,6,1)
print(mytuple.index(3))
print(mytuple.index(7))

Output:

2

Traceback (most recent call last):

File "C:\Users\guido\PyCharmMiscProject\[Link]", line 3, in <module>

print(mytuple.index(7))

~~~~~~~~~~~~~^^^

ValueError: tuple.index(x): x not in tuple

3. len( ): This built-in function returns the number of elements present in the tuple.

Syntax:

len(tuple_name)

Ex:

mytuple = (1,4,3,6,1,3)
print(len(mytuple))

Output:

6

4. max( ): This built-in function returns the largest element of a tuple.

Syntax:

max(tuple_name)

Ex:

mytuple = (17,78,10,25,36)
print(max(mytuple))

Output:

78

5. min( ): This built-in function returns the smallest element of a tuple.

Syntax:

min(tuple_name)

Ex:
mytuple = (17,78,10,25,36)
print(min(mytuple))

Output:
10

6. sum( ): This built-in function returns the total of all element values of a tuple.

Syntax:
sum(tuple_name)

Ex:
mytuple = (17,78,10,25,36)
print(sum(mytuple))

Output:
166

7. all( ): This built-in function returns True if all items in an iterable are true; otherwise it returns False.
Syntax:
all(tuple_name)

Ex:
tuple1 = (1,2,3)
print(all(tuple1))
tuple2 = (1,2,3,False)
print(all(tuple2))

Output:
True
False

8. any( ): This built-in function returns True if any item in an iterable is true; otherwise it returns False.
Syntax:
any(tuple_name)

Ex:
tuple1 = (0,False,True,3)
print(any(tuple1))
tuple2 = (0,False)
print(any(tuple2))

Output:
True
False

9. sorted( ): This built-in function returns a new list containing the elements of the tuple in sorted order. The tuple itself is not changed.

Syntax:

sorted(tuple_name)

Ex:

mytuple = (17,78,10,25,36)
print(sorted(mytuple))
print(sorted(mytuple,reverse=True))

Output:

[10, 17, 25, 36, 78]

[78, 36, 25, 17, 10]
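
As the output shows square brackets, sorted( ) returns a list, not a tuple. A minimal sketch of converting the result back to a tuple when one is needed (the names sorted_list and sorted_tuple are illustrative):

```python
mytuple = (17, 78, 10, 25, 36)

sorted_list = sorted(mytuple)          # sorted() always returns a list
sorted_tuple = tuple(sorted(mytuple))  # convert back when a tuple is needed

print(type(sorted_list).__name__, sorted_list)
print(type(sorted_tuple).__name__, sorted_tuple)
print(mytuple)                         # the original tuple is unchanged
```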

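10. tuple( ): This built-in function creates a tuple. Called without arguments, it creates an empty tuple; called with a sequence (or any other iterable), it creates a tuple containing the items of that sequence.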

Syntax:

tuple_name = tuple(sequence)

Ex:

mytuple1 = tuple()
print(mytuple1)
a = [1,2,3]
#List converted to Tuple
mytuple2 = tuple(a)
print(mytuple2)

Output:

()

(1, 2, 3)

8. Introduction to Dictionary:

Dictionary:

Python Dictionaries are a powerful and versatile data structure in Python that stores key-value
pairs of data, allowing for seamless data retrieval. They are highly efficient for storing large amounts
of data, where you can easily access the relevant data without having to search through the entire data
structure.

Additionally, they are easy to use and manipulate, as you can easily add and delete elements. For
example, a Dictionary can be used to store a student's first and last name or to keep track of a student's
grades.

Key Features:

a) Key-Value Pairs: Each element in a Dictionary is a pair comprising a unique key and a value.

b) Mutable: You can change, add, or remove items after the dictionary is created.

c) Accessed by key: Items are accessed by key, not by position. Since Python 3.7, a dictionary keeps its items in the order in which they were inserted.

d) Dynamic: They can grow and shrink as needed.

Advantages:

• The main advantages of dictionaries are efficient data access and flexible data organization.
• They provide O(1) average time complexity for lookups, insertions, and deletions due to their underlying hash table implementation, making them significantly faster than lists for searching by key.
• This speed is crucial for tasks like mapping relationships, implementing caches, or storing configuration settings.
• Dictionaries are also highly flexible, allowing any hashable (immutable) type as a key and any type as a value, which supports complex data structures and enhances code readability by clearly associating related data.
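
The difference between searching a list and looking up a dictionary key can be sketched as follows. The student records and the helper function find_in_list are hypothetical and are used only for illustration.

```python
# Student records stored two ways (illustrative data)
records_list = [(501, 'Clara'), (502, 'David'), (503, 'Esha')]
records_dict = {501: 'Clara', 502: 'David', 503: 'Esha'}

def find_in_list(records, student_id):
    # A list must be scanned pair by pair until the ID is found
    for key, name in records:
        if key == student_id:
            return name
    return None

# A dictionary goes straight to the value through the key's hash
name_from_list = find_in_list(records_list, 503)
name_from_dict = records_dict[503]

print(name_from_list)
print(name_from_dict)
```

Both approaches give the same answer, but the list version does work proportional to the number of records, while the dictionary lookup takes roughly constant time on average.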

Disadvantages:

• Items cannot be accessed by position (index), which makes dictionaries unsuitable for tasks that rely on positional access to a sequence.
• Searching for data by value is inefficient, as dictionaries are optimized for key-based lookups; finding a specific value requires a full iteration through the dictionary.
• They also consume more memory than other data structures like lists due to the overhead of the hash table and the need to store keys.
• Additionally, dictionaries do not support slicing, which limits their use in certain operations that require accessing a range of elements.
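
The cost of searching by value can be seen in a small sketch. The helper find_key_by_value is a hypothetical function written for this illustration; Python has no built-in dictionary method for this.

```python
dict1 = {'ID': 501, 'Name': 'Clara', 'Department': 'Web Developer'}

def find_key_by_value(dictionary, target):
    # Every key-value pair may have to be checked
    for key, value in dictionary.items():
        if value == target:
            return key
    return None

found_key = find_key_by_value(dict1, 'Clara')
missing_key = find_key_by_value(dict1, 'Tester')
print(found_key)
print(missing_key)
```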

Creating a Dictionary:

Creating a Dictionary in Python is done by enclosing a sequence of elements in curly braces {}. The elements are separated by commas. In a Dictionary, each element is a pair consisting of a key followed by its corresponding value, separated by a colon. The values in a Dictionary can be of any data type and can be repeated. However, the keys must be unique and of an immutable type.

Syntax:

dict_name = {key1:value1,key2:value2,…,keyn:valuen}

Ex:

dict1 = {'ID':501,'Name':'Clara', 'Department':'Web Developer','Salary':65000.00}


print(dict1)

Output:

{'ID': 501, 'Name': 'Clara', 'Department': 'Web Developer', 'Salary': 65000.0}

Accessing elements of a Dictionary:

Accessing elements of a Dictionary can be done by a key and not by an index. This means that an
element in a Dictionary cannot be accessed using its position but with the Dictionary key.

Ex:

dict1 = {'ID':501,'Name':'Clara', 'Department':'Web Developer','Salary':65000.00}


print(dict1)
print("ID:",dict1['ID'])
print("Name:",dict1['Name'])
print("Department:",dict1['Department'])
print("Salary:",dict1['Salary'])

Output:

{'ID': 501, 'Name': 'Clara', 'Department': 'Web Developer', 'Salary': 65000.0}


ID: 501
Name: Clara
Department: Web Developer
Salary: 65000.0

9. Dictionaries are Mutable:

In Python, dictionaries are mutable. This means that you can modify the contents of a dictionary after
it has been created. You can add, update or remove key-value pairs in a dictionary which makes it a
flexible and dynamic data structure.

Being mutable means that the state of the dictionary can change over time. You can perform several
operations that modify the dictionary without needing to create a new one.

The most common operation that demonstrates the mutability of dictionaries in Python is:

Adding Key-Value Pairs:

You can add new key-value pairs to an existing dictionary:

Ex:

# Creating a dictionary (the name my_dict avoids hiding the built-in dict type)
my_dict = {'a': 1, 'b': 2}

# Adding a new key-value pair
my_dict['c'] = 3

print(my_dict)

Output:

{'a': 1, 'b': 2, 'c': 3}

In this case, we added a new key 'c' with the value 3 to the dictionary. Since a dictionary is mutable, we
can also perform other operations on it, such as updating and removing key-value pairs.
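
A minimal sketch of updating and removing entries is given below. The id( ) check is included only to illustrate that the same dictionary object is modified in place; the variable names are illustrative.

```python
my_dict = {'a': 1, 'b': 2, 'c': 3}
original_id = id(my_dict)

my_dict['a'] = 100      # update the value of an existing key
del my_dict['b']        # remove a key-value pair
my_dict['d'] = 4        # add a new key-value pair

print(my_dict)

# The same object was modified in place; no new dictionary was created
same_object = id(my_dict) == original_id
print(same_object)
```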

10. Traversing a Dictionary:

Traversing a dictionary can be accomplished using several methods, primarily through the use
of for loops and built-in dictionary methods.

a) The most common approach is to use a for loop directly on the dictionary, which iterates over its
keys by default. This method is straightforward and efficient for tasks that involve working with the
keys themselves.

Ex:

dict1 = {'ID':501,'Name':'Clara', 'Department':'Web Developer','Salary':65000.00}


for i in dict1:
    print(i,dict1[i],sep=':')

Output:

ID:501
Name:Clara
Department:Web Developer
Salary:65000.0

b) The .keys( ) method explicitly returns a view object of all the dictionary's keys, which can be
iterated over. While iterating directly over the dictionary achieves the same result, using .keys( ) can
make the code's intent clearer, explicitly indicating that only the keys are being processed.

Ex:

dict1 = {'ID':501,'Name':'Clara', 'Department':'Web Developer','Salary':65000.00}


print(dict1.keys())

Output:
dict_keys(['ID', 'Name', 'Department', 'Salary'])

c) To access only the values of a dictionary, the .values( ) method can be used. This method returns a
view object containing all the values, allowing for sequential access without explicitly referencing the
associated keys. This is particularly useful when the task involves processing or displaying values
directly.

Ex:

dict1 = {'ID':501,'Name':'Clara', 'Department':'Web Developer','Salary':65000.00}


print(dict1.values())

Output:
dict_values([501, 'Clara', 'Web Developer', 65000.0])

d) For scenarios requiring simultaneous access to both keys and their corresponding values, the
.items( ) method is the most effective. It returns a view object containing tuples of key-value pairs,
enabling the use of tuple unpacking in a for loop to extract both components.

This method is considered a concise and efficient way to navigate the entire dictionary structure,
especially when manipulating the data based on both keys and values.

Ex:

dict1 = {'ID':501,'Name':'Clara', 'Department':'Web Developer','Salary':65000.00}


print(dict1.items())

Output:
dict_items([('ID', 501), ('Name', 'Clara'), ('Department', 'Web Developer'), ('Salary', 65000.0)])
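
The tuple unpacking mentioned above can be sketched with a for loop over .items( ). The list named lines is illustrative and only collects the formatted text before printing.

```python
dict1 = {'ID': 501, 'Name': 'Clara', 'Department': 'Web Developer', 'Salary': 65000.00}

lines = []
# Each item is a (key, value) tuple, unpacked into two loop variables
for key, value in dict1.items():
    lines.append(f"{key}:{value}")

for line in lines:
    print(line)
```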

11. Operations – Methods – Built-in Functions For Dictionary:

Dictionary:

Python Dictionaries are a powerful and versatile data structure in Python that stores key-value
pairs of data, allowing for seamless data retrieval. They are highly efficient for storing large amounts
of data, where you can easily access the relevant data without having to search through the entire data
structure.

Additionally, they are easy to use and manipulate, as you can easily add and delete elements. For
example, a Dictionary can be used to store a student's first and last name or to keep track of a student's
grades.

Creating a Dictionary in Python:

Creating a Dictionary in Python is done by enclosing a sequence of elements in curly braces {}. The
elements are separated by a comma. Each pair of values consists of a Key, followed by its
corresponding value separated by a colon.

Syntax:

dict_name = {key1:value1,key2:value2,…,keyn:valuen}

Ex:

product = {'ID':101,'Name':'Keyboard','Price':350.00}
print(product)
product['Quantity']=5 # Added Element
product['Price']=400.00 # Updated element
print(product)

Output:

{'ID': 101, 'Name': 'Keyboard', 'Price': 350.0}


{'ID': 101, 'Name': 'Keyboard', 'Price': 400.0, 'Quantity': 5}

Operations of Dictionary:

In Python, dictionaries are a fundamental data structure used to store data as key-value pairs, allowing
for efficient retrieval of values using their corresponding keys.

The primary operations include:

1. Adding and modifying


2. Accessing
3. Removing
4. Retrieving and
5. Length

1. Adding and modifying: To add a new key-value pair to a dictionary, you can use the subscript
operator with a new key. If the key does not exist, it is added; if it already exists, its value is updated.

Note: Include the example of creating a dictionary given above.

2. Accessing: A value is accessed with dictionary_name[key] or with the get( ) method, which returns the value for the specified key.

3. Removing: We use pop( ), popitem( ), clear( ) and del.

4. Retrieving: Keys, values and key-value pairs are retrieved using keys( ), values( ) and items( ).

5. Length: len( ) returns the number of key-value pairs.

Methods and Built – in Function in Dictionary:

The Python interpreter includes a wide range of built-in methods / functions, such as

1. clear( )     7. pop( )        13. all( )
2. copy( )      8. popitem( )    14. any( )
3. get( )       9. update( )     15. fromkeys( )
4. items( )     10. del          16. setdefault( )
5. keys( )      11. sorted( )
6. values( )    12. len( )

1. clear( ): This method is used to remove all the elements (key-value pairs) from a dictionary. It
essentially empties the dictionary, leaving it with no key-value pairs.
Syntax:
dictionary_name.clear( )

Ex:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
print(product)
product.clear()
print(product)

Output:
{'ID': 101, 'Name': 'Keyboard', 'Price': 350.0}
{}

2. copy( ): This method in Python is used to return a shallow copy of a dictionary, creating a new
dictionary with the same key-value pairs as the original.

Syntax:
new_dictionary = dictionary_name.copy( )

Ex:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
new_product = product.copy()
print("Original: ",product)
print("New: ",new_product)

Output:
Original: {'ID': 101, 'Name': 'Keyboard', 'Price': 350.0}
New: {'ID': 101, 'Name': 'Keyboard', 'Price': 350.0}
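
What "shallow" means can be sketched with a dictionary that holds a list as one of its values. The 'Colours' key is a hypothetical addition used only for this illustration.

```python
product = {'ID': 101, 'Name': 'Keyboard', 'Colours': ['Black', 'White']}
new_product = product.copy()

# Top-level keys are independent: the original name is not affected
new_product['Name'] = 'Mouse'

# Nested mutable values are shared between the two dictionaries
new_product['Colours'].append('Red')

print(product['Name'])
print(product['Colours'])
```

Only the outer dictionary is copied; the list inside it is the same object in both dictionaries, so the appended colour appears in the original as well.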

3. get( ): This dictionary method returns the value linked to a particular key. It is a safe way to access dictionary values, because it returns None (or a default value, if one is given) instead of raising a KeyError when the key isn't present.

Syntax:
dictionary_name.get(key)
dictionary_name.get(key, default_value)

Ex:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
print(product.get('Name'))
print(product.get('Quantity'))

Output:
Keyboard
None
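
A short sketch of get( ) with a default value, compared with square-bracket access. The default 0 and the variable names are illustrative.

```python
product = {'ID': 101, 'Name': 'Keyboard', 'Price': 350.00}

quantity = product.get('Quantity', 0)   # key missing, so the default 0 is returned
price = product.get('Price', 0)         # key present, so the default is ignored

print(quantity)
print(price)

# Square-bracket access raises KeyError for a missing key
try:
    product['Quantity']
    raised = False
except KeyError:
    raised = True
print(raised)
```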

4. items( ): This method returns a view object containing (key, value) tuples, one for each key-value pair in the dictionary. It is a convenient way to access both the keys and values of a dictionary simultaneously.

Syntax:
dictionary_name.items( )

Example:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
print(product.items())
print(list(product.items())[0][0])
print(list(product.items())[0][1])

Output:
dict_items([('ID', 101), ('Name', 'Keyboard'), ('Price', 350.0)])
ID
101

5. keys( ): This method returns a view object with dictionary keys, allowing efficient access and
iteration.

Syntax:
dictionary_name.keys( )

Ex:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
print(product.keys())

Output:
dict_keys(['ID', 'Name', 'Price'])

6. values( ): This method returns a view object containing all dictionary values, which can be accessed
and iterated through efficiently.

Syntax:
dictionary_name.values( )

Ex:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
print(product.values())

Output:
dict_values([101, 'Keyboard', 350.0])

7. pop( ): This method removes the given key from a dictionary and returns its value. If the key is not present, a KeyError is raised unless an optional default value is supplied, in which case the default is returned.

Syntax:
dictionary_name.pop(key)
dictionary_name.pop(key, default_value)

Ex:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
product.pop('Price')
print(product)

Output:
{'ID': 101, 'Name': 'Keyboard'}
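
The value returned by pop( ) and the optional default can be sketched as follows. The default 'N/A' and the variable names are illustrative.

```python
product = {'ID': 101, 'Name': 'Keyboard', 'Price': 350.00}

removed_price = product.pop('Price')        # returns the value and removes the key
missing = product.pop('Quantity', 'N/A')    # key absent, so the default is returned

print(removed_price)
print(missing)
print(product)
```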

8. popitem( ): This method is used to remove and return the last inserted key-value pair as a tuple. If
the dictionary is empty then it raises a KeyError.

Syntax:
dictionary_name.popitem( )

Ex:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
product.popitem( )
print(product)

Output:
{'ID': 101, 'Name': 'Keyboard'}

9. update( ): This method updates the key-value pairs of a dictionary using elements from another dictionary or an iterable of key-value pairs. New keys are added, and the values of keys that already exist are overwritten.

Syntax:
dictionary_name1.update(dictionary_name2)

Example:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
detail = {'Company':'Logitech','Type':'Mechanical'}
product.update(detail)
print(product)
print(detail)

Output:
{'ID': 101, 'Name': 'Keyboard', 'Price': 350.0, 'Company': 'Logitech', 'Type': 'Mechanical'}
{'Company': 'Logitech', 'Type': 'Mechanical'}

10. del: This statement removes a specific key-value pair from a dictionary by specifying the key. When applied to a dictionary entry, del deletes both the key and its associated value. When applied to the dictionary name, it deletes the entire dictionary.

Syntax:
del dictionary_name[key] # deletes particular record
del dictionary_name # delete entire dictionary

Example:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
del product['Price']
print(product)
del product
print(product)

Output:
{'ID': 101, 'Name': 'Keyboard'}
Traceback (most recent call last):
File "C:\Users\guido\PyCharmMiscProject\[Link]", line 5, in <module>
print(product)
^^^^^^^
NameError: name 'product' is not defined

11. sorted( ): When sorted( ) is used on a dictionary, it returns a list of the keys in sorted order.

Syntax:
for i in sorted(dictionary_name):
    # print statements
Ex:
d = {2: 56, 1: 2, 3: 323}
print("Dictionary", d)
# Sorting and printing key-value pairs by the key
for i in sorted(d):
    print((i, d[i]), end=" ")

Output:
Dictionary {2: 56, 1: 2, 3: 323}
(1, 2) (2, 56) (3, 323)
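
As a hedged extension of the example above, the same dictionary can also be sorted by value or in descending key order. The key function (a lambda) is an illustration and is not part of the original notes.

```python
d = {2: 56, 1: 2, 3: 323}

# sorted() on the dictionary itself yields the keys in order
keys_sorted = sorted(d)

# Sorting the (key, value) pairs by value uses a key function
by_value = sorted(d.items(), key=lambda item: item[1])

# Sorting keys in descending order
keys_desc = sorted(d, reverse=True)

print(keys_sorted)
print(by_value)
print(keys_desc)
```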

12. len( ): This built-in function returns the number of key-value pairs in the dictionary.


Syntax:
len(dictionary_name)

Ex:
product = {'ID':101,'Name':'Keyboard','Price':350.00}
print("Length: ", len(product))

Output:
Length: 3

13. all( ): When applied to a dictionary, this built-in function checks the keys. It returns True if all keys are true; otherwise it returns False.
Syntax:
all(dictionary_name)
Ex:
d1 = {1:'one',2:'two',3:'three'}
print(all(d1))
d2 = {0:'zero',1:'one',2:'two'}
print(all(d2))

Output:
True
False

14. any( ): When applied to a dictionary, this built-in function checks the keys. It returns True if any key is true; otherwise it returns False.

Syntax:
any(dictionary_name)

Ex:
d1 = {1:'one',2:'two',3:'three'}
print(any(d1))
d2 = {0:'zero',1:'one',2:'two'}
print(any(d2))
d3 = {0:'zero',False:'false'}
print(any(d3))

Output:
True
True
False

15. fromkeys( ): This method creates a new dictionary whose keys are taken from the given sequence, with every key mapped to the same specified value.

Syntax:
dict.fromkeys(seq, val)

Ex:
seq = ('a','b','c')
print(dict.fromkeys(seq,1))

Output:
{'a': 1, 'b': 1, 'c': 1}

16. setdefault( ): This method returns the value for a specified key if it exists. If the key does not exist, it inserts the key with a specified default value and returns that value.
Syntax:
dictionary_name.setdefault(key, default_value)

Ex:
d = {'one':1, 'two':2, 'three':3}
print(d)
d.setdefault('four', 4)
d.setdefault('one','First') # 'one' is already in the dictionary, so its value is not changed
print(d)
Output:
{'one': 1, 'two': 2, 'three': 3}
{'one': 1, 'two': 2, 'three': 3, 'four': 4}
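
A common use of setdefault( ) is grouping items under a key. The sketch below groups words by their first letter; the word list and the name groups are illustrative.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

groups = {}
for word in words:
    # Insert an empty list the first time a letter is seen,
    # then append to whichever list is stored under that letter
    groups.setdefault(word[0], []).append(word)

print(groups)
```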

EXTRA:

12. Explain about sets in python.

Python Sets:
A set in Python is an unordered collection of unique elements. Its two main characteristics are
that it does not allow duplicate items and its items are not stored in any particular order. Sets are
created using curly braces {} or the set() constructor.

Python Creating Sets:


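To create a new set in Python, you can use curly braces ('{}') with elements inside them or the constructor function 'set()'. Empty curly braces create an empty dictionary, not a set, so an empty set must be created with set().

Syntax:
set_name = {element_1, element_2, . . . , element_n}
set_name = set(iterable) # by set( )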

Example:

# Using curly braces


my_set = {1, 2, 3, 4, 4}
print(my_set) # Outputs: {1, 2, 3, 4}

# Using set() function


another_set = set([1, 2, 2, 3, 4])
print(another_set) # Outputs: {1, 2, 3, 4}

Output:
{1, 2, 3, 4}
{1, 2, 3, 4}

Sets can store various types of data, such as strings, integers, or tuples. However, sets cannot contain
unhashable types like lists.
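
These rules can be checked with a short sketch. The variable names are illustrative.

```python
# Empty curly braces create a dictionary, not a set
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)
print(type(empty_set).__name__)

# Hashable items such as strings, integers and tuples are allowed
mixed_set = {'text', 42, (1, 2)}
print(len(mixed_set))

# A list is unhashable, so placing it in a set raises TypeError
try:
    bad_set = {[1, 2], 3}
    error_raised = False
except TypeError:
    error_raised = True
print(error_raised)
```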

Advantages:

• The most significant advantage is their efficiency in membership testing; checking if an element is present in a set (using x in s) is significantly faster than the same operation on a list, especially as the size of the data grows.
• This speed is crucial for operations like filtering or validating data. Sets are also highly effective for removing duplicates from a sequence, as converting a list or tuple to a set automatically eliminates any repeated elements.
• Furthermore, sets support powerful mathematical operations such as union, intersection, difference, and symmetric difference, which are invaluable for data analysis and comparison tasks.
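
Removing duplicates and testing membership can be sketched as follows. The marks list is illustrative data.

```python
marks = [78, 91, 78, 65, 91, 91, 54]

unique_marks = set(marks)            # duplicates are dropped
print(len(marks), len(unique_marks))

# Membership test with the in operator
has_65 = 65 in unique_marks
has_100 = 100 in unique_marks
print(has_65, has_100)

# A set does not keep the original order; sorted() gives a predictable list
ordered_unique = sorted(unique_marks)
print(ordered_unique)
```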

Disadvantages:

• The primary drawback is that sets are unordered, meaning the elements do not maintain any specific sequence, and you cannot access them by index.
• Additionally, sets are mutable, which means they can be changed after creation, but this also means they are not hashable and cannot be used as keys in dictionaries or as elements within other sets.
• The performance of set operations can also be affected by the nature of the elements and the hash function used, and the initial cost of constructing a set can be higher than that of a list.

Traversing of Set:
There are several ways to access and process each element of a set in Python, but the sequence
may change with each execution.
Some of the traversing methods are listed below,
1. for loop
2. __iter__( )
3. iter( )
4. enumerate( )

1. for loop: When you use a for loop, Python automatically calls the __iter__() method behind the
scenes, which handles the iteration for you. This makes it simple and efficient to go through all the
elements, without the need for things like indices

Ex:
set1 = {1,2,3}
for i in set1:
    print(i)

Output:
1
2
3

2. __iter__( ): This method directly accesses the internal iterator of a set. It does the same thing a for
loop does behind the scenes, but in a more manual way.

Ex:
set1 = {1,2,3}
# using __iter__( )
for i in set1.__iter__():
    print(i)

Output:
1
2
3

3. iter ( ): This function returns an iterator for a given iterable, like a set. It’s essentially a cleaner and
more readable way to call __iter__() directly.

Ex:
set1 = {1,2,3}

for i in iter(set1):
    print(i)

Output:
1
2
3

4. enumerate( ): This function adds a counter (index) to an iterable. It returns each item as a tuple
containing the counter and the corresponding value.

Ex:
set1 = {1,2,3}

for i,j in enumerate(set1):
    print(i,j)

Output:
0 1
1 2
2 3

Operations of Set:
Sets are a fundamental data structure that stores unique elements. Python provides built-in
operators and methods for performing set operations such as union, intersection, difference and
symmetric difference. These operations are explained one by one below.

1. Union: The union of two sets combines all unique elements from both sets.

Syntax:
set1 | set2 # Using the '|' operator
set1.union(set2) # Using the union() method

Ex:
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
# Using '|' operator
res1 = A | B
print("using '|':", res1)
# Using union() method
res2 = A.union(B)
print("using union():",res2)

Output:
using '|': {1, 2, 3, 4, 5, 6}
using union(): {1, 2, 3, 4, 5, 6}

Explanation: | operator and union() method both return a new set containing all unique elements
from both sets .

2. Intersection: The intersection of two sets includes only the common elements present in both sets.

Syntax:
set1 & set2 # Using the '&' operator
set1.intersection(set2) # Using the intersection() method

Ex:
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

# Using '&' operator


res1 = A & B
print("using '&':",res1)
# Using intersection() method
res2 = A.intersection(B)
print("using intersection():",res2)
Output:
using '&': {3, 4}
using intersection(): {3, 4}

Explanation: & operator and intersection( ) method return a new set containing only elements that
appear in both sets.

3. Difference: The difference between two sets includes elements present in the first set but not in the
second.

Syntax:
set1 - set2 # Using the '-' operator
set1.difference(set2) # Using the difference( ) method

Ex:
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
# Using '-' operator
res1 = A - B
print("using '-':", res1)
# Using difference() method
res2 = A.difference(B)
print("using difference():", res2)

Output:
using '-': {1, 2}
using difference(): {1, 2}
Explanation: - operator and difference( ) method return a new set containing elements of A that are
not in B.

4. Symmetric Difference: The symmetric difference of two sets includes elements that are in either set
but not in both.

Syntax:
set1 ^ set2 # Using the '^' operator
set1.symmetric_difference(set2) # Using the symmetric_difference() method

Ex:
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
# Using '^' operator
res1 = A ^ B
print("using '^':", res1)
# Using symmetric_difference() method
res2 = A.symmetric_difference(B)
print("using symmetric_difference():", res2)
Output:
using '^': {1, 2, 5, 6}
using symmetric_difference(): {1, 2, 5, 6}

Explanation: ^ operator and symmetric_difference() method return a new set containing elements
that are in either A or B but not in both.
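
The four operations can be sketched together on a small, hypothetical example of students enrolled in two courses. The sketch also shows one practical difference between the two forms: the method form accepts any iterable (such as a list), while the operator form requires both operands to be sets. All names and data are illustrative.

```python
python_students = {'Asha', 'Bala', 'Chitra', 'Dev'}
stats_students = {'Chitra', 'Dev', 'Eswar'}

either = python_students | stats_students        # union
both = python_students & stats_students          # intersection
only_python = python_students - stats_students   # difference
exactly_one = python_students ^ stats_students   # symmetric difference

# sorted() is used only to print the elements in a predictable order
print(sorted(either))
print(sorted(both))
print(sorted(only_python))
print(sorted(exactly_one))

# The method form accepts any iterable, such as a list
with_list = python_students.union(['Farah'])
print(sorted(with_list))

# The operator form requires both operands to be sets
try:
    python_students | ['Farah']
    operator_failed = False
except TypeError:
    operator_failed = True
print(operator_failed)
```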

Methods and Built-in Functions of Sets:

1. add( )       5. remove( )     9. max( )      13. sorted( )
2. clear( )     6. discard( )    10. min( )     14. union( )
3. copy( )      7. len( )        11. any( )     15. intersection( )
4. pop( )       8. sum( )        12. all( )     16. difference( )

1. add(): This method adds a given element to a set. If the element is already present, the set is left unchanged.

Syntax:
set_name.add(elem)

Ex:
prime_numbers = {2, 3, 5, 7}

# add 11 to prime_numbers
prime_numbers.add(11)
print(prime_numbers)

Output: {2, 3, 5, 7, 11}

2. clear( ): This method removes all elements from the set.

Syntax:
set_name.clear()

Ex:
test_set = {1, 2, 3, 4}
test_set.clear()
print(test_set)

Output:
set( )

3. copy( ): This method returns a shallow copy of the set in python.

Syntax:
new_set = old_set.copy( )

Ex:
# Python3 program to demonstrate the use
# of the copy() method

set1 = {1, 2, 3, 4}

# method call to copy the set
set2 = set1.copy()

# prints the copied set
print(set2)

Output:
{1, 2, 3, 4}

4. pop( ): This method removes an arbitrary element from the set and returns the removed element. The element cannot be chosen, because a set is an unordered collection of unique elements.
Syntax: set_obj.pop()

Ex:
s1 = {9, 1, 0}
s1.pop()
print(s1)

Output (the removed element is arbitrary, so the result may differ):
{9, 1}

5. remove( ): This method removes the element from the set only if the element is present in the set. If the element is absent from the set, a KeyError is raised.
Syntax:
set_name.remove(element)

Ex:
set1 = {1, 2, 3, 4}
set1.remove(4)
print(set1)
set1.remove(5)
print(set1)

Output:
{1, 2, 3}
Traceback (most recent call last):
File "C:\Users\guido\PyCharmMiscProject\[Link]", line 4, in <module>
set1.remove(5)
~~~~~~~~~~~^^^
KeyError: 5

6. discard( ): This method removes the element from the set only if the element is present in the set. If the element is absent from the set, no error or exception is raised and the set remains unchanged.
Syntax:
set_name.discard(element)

Ex:
set1 = {1, 2, 3, 4}
set1.discard(4)
print(set1)
set1.discard(5)
print(set1)

Output:
{1, 2, 3}
{1, 2, 3}
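
The difference between discard( ) and remove( ) for a missing element can be sketched side by side. The try/except block is one way to handle the KeyError; the names after_discard and outcome are illustrative.

```python
set1 = {1, 2, 3, 4}

set1.discard(5)          # 5 is absent: nothing happens
after_discard = set(set1)

try:
    set1.remove(5)       # 5 is absent: KeyError is raised
    outcome = "removed"
except KeyError:
    outcome = "KeyError handled"

print(after_discard)
print(outcome)
```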

7. len( ): This built-in function returns the number of elements in the set.


Syntax:
len(set_name)

Ex:
set1 = {1, 2, 3, 4, 5}
print("Length: ",len(set1))

Output:
Length:  5

8. sum( ): This built-in function returns the total of all element values in the set.
Syntax:
sum(set_name)

Ex:
set1 = {1, 2, 3, 4, 5}
print("Sum = ",sum(set1))

Output:
Sum = 15

9. max( ): This built-in function returns the largest element in the set.


Syntax:
max(set_name)

Ex:
set1 = {1, 2, 3, 4, 5}
print("Max : ",max(set1))
Output:
Max : 5

10. min( ): This built-in function returns the smallest element in the set.
Syntax:
min(set_name)

Ex:
set1 = {1, 2, 3, 4, 5}
print("Min : ",min(set1))
Output:
Min :  1

11. all( ): This built-in function returns True if all items in an iterable are true; otherwise it returns False.
Syntax:
all(set_name)

Ex:
set1 = {1,2,3}
print(all(set1))
set2 = {0,1,2,3}
print(all(set2))

Output:
True
False

12. any( ): This built-in function returns True if any item in an iterable is true; otherwise it returns False.
Syntax:
any(set_name)

Ex:
set1 = {0,True,False,3}
print(any(set1))
set2 = {0,False}
print(any(set2))
Output:
True
False

13. sorted( ): This built-in function returns a new list containing the elements of the set in sorted order.
Syntax:
sorted(set_name) # Ascending Order
sorted(set_name,reverse=True) # Descending Order

Ex:
set1 = {8,3,7,10,4}
print("Sorted in Ascending Order : ",sorted(set1))
print("Sorted in Descending Order : ",sorted(set1,reverse=True))
Output:
Sorted in Ascending Order : [3, 4, 7, 8, 10]
Sorted in Descending Order : [10, 8, 7, 4, 3]

13. Write difference between list, tuple, dictionary and set.

Difference between List, Tuple, Dictionary and Set:

In Python, List, Tuple, Dictionary, and Set are four fundamental data structures used for
organizing and storing data. The key difference is that lists and tuples are both ordered collections,
but lists are mutable while tuples are immutable. Sets are unordered collections of unique
elements, while dictionaries store key-value pairs for efficient data retrieval.

Parameter-wise comparison of List, Tuple, Set and Dictionary:

1. Basics
List: A list is similar to an array in other languages (like ArrayList in Java or vector in C++).
Tuple: Tuples are collections of Python objects separated by commas.
Set: Sets are mutable, iterable collections of unique elements.
Dictionary: A dictionary is a collection used for storing key: value pairs.

2. Homogeneity
List: A list is a non-homogeneous data structure; its elements can be of different types.
Tuple: A tuple is a non-homogeneous data structure; its elements can be of different types.
Set: A set is a non-homogeneous data structure; its elements can be of different types, provided they are hashable.
Dictionary: A dictionary is a non-homogeneous data structure that stores key-value pairs.

3. Representation
List: A List is represented by [ ]
Tuple: A Tuple is represented by ( )
Set: A Set is represented by { }
Dictionary: A Dictionary is represented by { }

4. Duplicate elements
List: It permits duplicate elements.
Tuple: It permits duplicate elements.
Set: It does not permit duplicate elements.
Dictionary: It does not permit duplicate keys.

5. Nested Among All
List: It can be nested in a List.
Tuple: It can be nested in a Tuple.
Set: A set cannot be an element of another set because sets are unhashable; a frozenset can be nested instead.
Dictionary: It can be nested in a Dictionary (as a value).

6. Example
List: list1 = [1, 2, 3]
Tuple: tuple1 = (10, 20, 30)
Set: set1 = {100, 200, 300}
Dictionary: dict1 = {1: 'one', 2: 'two', 3: 'three'}

7. Function for Creation
List: A list can be created using the list() function.
Tuple: A tuple can be created using the tuple() function.
Set: A set can be created using the set() function.
Dictionary: A dictionary can be created using the dict() function.

8. Mutation
List: It is mutable, allowing modifications.
Tuple: It is immutable, not allowing modifications.
Set: It is mutable, allowing modifications.
Dictionary: It is mutable, but the keys cannot be duplicated.

9. Order
List: It maintains order.
Tuple: It maintains order.
Set: It does not maintain order.
Dictionary: It maintains insertion order (Python 3.7 and later).

10. Empty Elements
List: An empty list can be created using: l=[]
Tuple: An empty tuple can be created using: t=()
Set: An empty set can be created using: s=set()
Dictionary: An empty dictionary can be created using: d={}
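
The following is a minimal sketch of a few of the differences listed above (mutability, duplicates, and key uniqueness). The variable names and values are illustrative only.

```python
# List: mutable and allows duplicates
list1 = [1, 2, 2, 3]
list1.append(4)

# Tuple: allows duplicates but cannot be modified
tuple1 = (1, 2, 2, 3)
tuple_is_immutable = False
try:
    tuple1[0] = 99
except TypeError:
    tuple_is_immutable = True

# Set: duplicates are dropped automatically
set1 = {1, 2, 2, 3}

# Dictionary: assigning to an existing key updates it instead of duplicating it
dict1 = {1: 'one', 2: 'two'}
dict1[2] = 'TWO'
dict1[3] = 'three'

print(list1)               # [1, 2, 2, 3, 4]
print(tuple_is_immutable)  # True
print(len(set1))           # 3
print(dict1)               # {1: 'one', 2: 'TWO', 3: 'three'}
```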

UNIT – V

Introduction to NumPy: Array, NumPy Array, Indexing and Slicing, Operations on Arrays,
Concatenation Arrays, Reshaping Arrays, Splitting Arrays, Statistical Operations on Arrays.

Data Handling using Pandas: Introduction to Python Libraries, Series, DataFrame, Importing and
Exporting Data between CSV Files and DataFrames.

Plotting Data Using Matplotlib: Introduction, Plotting using Matplotlib – Line chart, Bar chart,
Histogram, Scatter chart, Pie chart.

1. Array:

• We have learnt about various data types like list, tuple, and dictionary. In this chapter we will
discuss another data type, the 'Array'.
• An array is a data type used to store multiple values using a single identifier (variable name).
An array contains an ordered collection of data elements where each element is of the same
type and can be referenced by its index (position).

The important characteristics of an array are:

• Each element of the array is of the same data type, though the values stored in them may be
different.
• The entire array is stored contiguously in memory. This makes operations on arrays fast.
• Each element of the array is identified or referred to using the name of the array along with the
index of that element, which is unique for each element.
• The index of an element is an integral value associated with the element, based on the
element's position in the array.
• For example, consider an array with 5 numbers: [ 10, 9, 99, 71, 90 ]

Here, the 1st value in the array is 10 and has the index value [0] associated with it; the 2nd value in the
array is 9 and has the index value [1] associated with it, and so on.

The last value (in this case the 5th value) in this array has an index [4]. This is called zero based
indexing. This is very similar to the indexing of lists in Python.

The idea of arrays is so important that almost all programming languages support it in one form or
another.

2. Introduction to NumPy:

NumPy, short for Numerical Python, is a fundamental open-source library in Python for
scientific computing. It provides support for large, multi-dimensional arrays and matrices, along with a
collection of high-level mathematical functions to operate on these arrays efficiently.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.
NumPy is a Python library and is written partially in Python, but most of the parts that require fast
computation are written in C or C++.

Key Features of NumPy:

• ndarray object: The core of NumPy is the ndarray (n-dimensional array) object, which is a
powerful and efficient data structure for storing and manipulating large datasets. Unlike
Python's built-in lists, NumPy arrays are homogeneous (all elements must be of the same data
type) and have a fixed size after creation.

• Performance: NumPy operations are significantly faster than equivalent operations performed
using standard Python lists. This efficiency stems from its underlying implementation, which is
largely written in C and leverages optimized C-based functions. NumPy arrays also utilize
contiguous blocks of memory, allowing for efficient caching and processing by the CPU.

• Mathematical Functions: NumPy offers a comprehensive suite of mathematical functions for
array operations, including linear algebra, Fourier transforms, random number generation, and
basic statistical operations.

• Integration: NumPy serves as a foundational library for many other prominent data science
and machine learning libraries in Python, such as Pandas, SciPy, Scikit-learn, TensorFlow,
Keras, and PyTorch.
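
A brief sketch of the first two features, homogeneity and whole-array operations, is given here. The names prices and mixed are illustrative, and it assumes NumPy is already installed.

```python
import numpy as np

prices = [10, 20, 30, 40]
arr = np.array(prices)

# With a list, each element is processed one by one in Python
doubled_list = [p * 2 for p in prices]

# With an ndarray, one expression operates on the whole array
doubled_arr = arr * 2

# Homogeneity: mixing int and float values gives a single float data type
mixed = np.array([1, 2.5, 3])

print(doubled_list)           # [20, 40, 60, 80]
print(doubled_arr.tolist())   # [20, 40, 60, 80]
print(mixed.dtype)            # float64
```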

NumPy Array:

NumPy arrays are used to store lists of numerical data, vectors and matrices. The NumPy
library has a large set of routines (built-in functions) for creating, manipulating, and transforming
NumPy arrays. Python language also has an array data structure, but it is not as versatile, efficient and
useful as the NumPy array.

The NumPy array is officially called ndarray but is commonly known as array. In the rest of the
chapter, "array" refers to a NumPy array. A few differences between a list and an array are
listed later in this section.

Installing NumPy:

To install NumPy, the recommended method involves using pip, the Python package installer.
In the terminal or command prompt, type the following command and press Enter:
pip install numpy
This command instructs pip to download and install the latest stable version of NumPy from the
Python Package Index (PyPI). Verifying the installation afterwards is optional but recommended; one
way is to import NumPy in Python and print its version.
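
One common way to check the installation is sketched here. It only assumes that the pip command above completed successfully; the version number printed depends on the system.

```python
import numpy as np

# If NumPy is installed, the import succeeds and a version string is printed
print(np.__version__)
```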

Difference Between List and Array:

Feature-wise comparison (Array vs List):

Homogeneity
Array: All elements must be of the same type. Ex: arr1 = array('i', [1, 2, 3, 4, 5])
List: It can contain elements of different types. Ex: list1 = [1, 2.34, "Hello"]

Size Flexibility
Array: Fixed-size once created.
List: Dynamic size; can grow or shrink as needed.

Memory Efficiency
Array: More memory efficient due to homogeneity.
List: Less memory-efficient due to potential type variability.

Built-in Methods
Array: Fewer built-in methods.
List: Rich set of built-in methods for various operations.

Module Requirement
Array: Requires the array module to be imported.
List: Built-in data structure; no import needed.

Type Specification
Array: Requires a type code during creation (e.g., 'i' for int).
List: No type specification is needed.

Usage
Array: Suitable for numerical operations and when interfacing with C libraries.
List: General purpose; suitable for a wide range of tasks.

Data Access
Array: Direct access using index.
List: Direct access using index.

Nesting
Array: It can contain elements of the specified type only.
List: It can contain any type, including other lists (nesting).

Duplication
Array: It can have duplicate elements.
List: It can have duplicate elements.

Representation
Array: Uses the array module.
List: Represented using square brackets [ ].

Performance
Array: It might offer better performance in specific scenarios due to contiguous memory storage.
List: Versatile but might be less performant than arrays for certain operations due to type checking.
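
The type-code and homogeneity rows can be illustrated with Python's standard array module, which is assumed here to be the "array" the table describes. The variable names are illustrative.

```python
from array import array

# An array needs a type code ('i' = signed int) and accepts only that type
arr1 = array('i', [1, 2, 3, 4, 5])

# A list needs no type code and can mix types freely
list1 = [1, 2.34, "Hello"]

# Trying to build an int array from mixed values raises TypeError
try:
    array('i', [1, 2.34])
    mixed_allowed = True
except TypeError:
    mixed_allowed = False

print(arr1.typecode)        # i
print(arr1[0], list1[0])    # 1 1  (both support direct access by index)
print(mixed_allowed)        # False
```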

Creation of NumPy Arrays:

An array allows us to store a collection of multiple values in a single data structure.

The NumPy array is similar to a list, but with added benefits such as being faster and more memory
efficient.

Importing NumPy:

To import the NumPy library in Python, the standard and most common practice is to use
the import statement with an alias:

import numpy as np

The NumPy library provides various methods to work with data. To leverage all those features, we first
need to create NumPy arrays.

There are multiple techniques to generate arrays in NumPy, and we will explore each of them below.

NumPy 1-D Array Creation

1. Creating Array Using List:

We can create a NumPy array using a list.

Ex:
import numpy as np
# create a list named list1
list1 = [2, 4, 6, 8]
# create numpy array using list1
array1 = np.array(list1)
print(array1)
Output:
[2 4 6 8]

2. Creating array using np.array( ):

array1 = np.array(list1)

Here, we have created an array by passing list1 as an argument to the np.array( ) function.
Instead of creating a list and using the list variable with the np.array() function, we can directly pass
list elements as an argument. For example,
Ex:
import numpy as np
# create numpy array using a list
array1 = np.array([2, 4, 6, 8])
print(array1)
Output:
[2 4 6 8]

3. Create an Array Using np.zeros( ):

The np.zeros() function allows us to create an array filled with all zeros.

Ex:
import numpy as np
# create an array with 4 elements filled with zeros
array1 = np.zeros(4)
print(array1)

Output:

[0. 0. 0. 0.]

Here, we have created an array named array1 with 4 elements all initialized to 0 using
the np.zeros(4) function.

Note: Similarly we can use np.ones( ) to create an array filled with the value 1.
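
A short sketch of the function mentioned in the note. The dtype=int argument in the second call is an addition for illustration; by default the values are floats.

```python
import numpy as np

# create an array with 4 elements filled with ones (floats by default)
array1 = np.ones(4)
print(array1)    # [1. 1. 1. 1.]

# the same, but asking for integers instead of floats
array2 = np.ones(4, dtype=int)
print(array2)    # [1 1 1 1]
```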

4. Create an Array With np.arange( ):

The np.arange( ) function returns an array with values within a specified interval.

Ex:
import numpy as np
# create an array with values from 0 to 4
array1 = np.arange(5)
print("Using np.arange(5):", array1)
# create an array with values from 1 to 8 with a step of 2
array2 = np.arange(1, 9, 2)
print("Using np.arange(1, 9, 2):",array2)

Output:

Using np.arange(5): [0 1 2 3 4]
Using np.arange(1, 9, 2): [1 3 5 7]

In the above example, we have created arrays using the np.arange() function.

• np.arange(5) - creates an array with 5 elements, where the values range from 0 to 4

• np.arange(1, 9, 2) - creates an array with 4 elements, where the values range from 1 to 8 with a
step of 2. The stop value 9 is not included.

5. Create an Array With np.random.rand( ):

The np.random.rand() function is used to create an array of random numbers drawn from the interval [0, 1).

Ex: To create an array of 3 random numbers,

import numpy as np
# generate an array of 3 random numbers
array1 = np.random.rand(3)
print(array1)

Output:

[0.08455648 0.56379034 0.66463204]

In the above example, we have used the np.random.rand() function to create an
array array1 with 3 random numbers.

This code generates a different output each time we run it.

6. Create an Empty NumPy Array:

To create an empty NumPy array, we use the np.empty() function. For example,

Ex:
import numpy as np
# create an empty array of length 3
array1 = np.empty(3)
print(array1)

Output:

[0.08455648 0.56379034 0.66463204]

Here, we have created an empty array of length 3 using the np.empty() function.

If we look at the output of the code, we can see that the empty array is not actually empty; it has some
values in it.

This is because np.empty() only reserves memory for the array and does not initialize it. The values
shown are whatever happened to be in that memory, so they are arbitrary, have no significance, and can
differ from run to run.

NumPy N – Dimensional Array

NumPy N-D Array Creation:

NumPy is not restricted to 1-D arrays; it can have arrays of multiple dimensions, also known as
N-dimensional arrays or ndarrays.

In an N-dimensional array, N is the number of dimensions (axes) along which the array is organized.

An array can have any number of dimensions and each dimension can have any number of elements.

For example, a 2D array represents a table with rows and columns, while a 3D array represents a cube
with width, height, and depth.

There are multiple techniques to create N-d arrays in NumPy, and we will explore each of them below.

1. N-D Array Creation From List of Lists:

To create an N-dimensional NumPy array from a Python List, we can use the np.array() function and
pass the list as an argument.

a) Create a 2-D NumPy Array

Let's create a 2D NumPy array with 2 rows and 4 columns using lists.

Ex:

import numpy as np
# create a 2D array with 2 rows and 4 columns
array1 = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8]])
print(array1)

Output:
[[1 2 3 4]
 [5 6 7 8]]

In the above example, we first created a 2D list (list of lists) [[1, 2, 3, 4], [5, 6, 7, 8]] with 2 rows
and 4 columns. We then passed the list to the np.array() function to create a 2D array.

b) Create a 3-D NumPy Array:

Let's say we want to create a 3-D NumPy array consisting of two "slices" where each slice has 3 rows
and 4 columns.

Here's how we create our desired 3-D array,

Ex:

import numpy as np
# create a 3D array with 2 "slices", each of 3 rows and 4 columns
array1 = np.array([[[1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12]],

                   [[13, 14, 15, 16],
                    [17, 18, 19, 20],
                    [21, 22, 23, 24]]])

print(array1)

Output:
[[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]]

 [[13 14 15 16]
  [17 18 19 20]
  [21 22 23 24]]]

Here, we created a 3D list [list of lists of lists] and passed it to the np.array() function. This creates the
3-D array named array1.

In the 3D list,

• The outermost list contains two elements, which are lists representing the two "slices" of the
array. Each slice is a 2-D array with 3 rows and 4 columns.

• The innermost lists represent the individual rows of the 2-D arrays.

Note: In the context of an N-D array, a slice is like a subset of the array that we can take out by
selecting a specific range of rows, columns.
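
The number of dimensions and the number of elements along each dimension can be inspected with the ndim, shape and size attributes of an array. A minimal sketch with illustrative array names:

```python
import numpy as np

array_2d = np.array([[1, 2, 3, 4],
                     [5, 6, 7, 8]])

array_3d = np.array([[[1, 2], [3, 4]],
                     [[5, 6], [7, 8]]])

# ndim = number of dimensions, shape = elements per dimension, size = total elements
print(array_2d.ndim, array_2d.shape, array_2d.size)   # 2 (2, 4) 8
print(array_3d.ndim, array_3d.shape, array_3d.size)   # 3 (2, 2, 2) 8
```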

Creating N-d Arrays From Scratch:

We saw how to create N-d NumPy arrays from Python lists. Now we'll see how we can create them
from scratch.

To create multidimensional arrays from scratch we use functions such as

• np.zeros()

• np.full()

• np.random.rand()

2. Create N-D Arrays using np.zeros( ):

The np.zeros() function allows us to create N-D arrays filled with all zeros. For example,

Ex:

import numpy as np
# create 2D array with 2 rows and 3 columns filled with zeros
array1 = np.zeros((2, 3))
print("2-D Array: ")
print(array1)

# create 3D array with dimensions 2x3x4 filled with zeros
array2 = np.zeros((2, 3, 4))
print("\n3-D Array: ")
print(array2)

Output:

2-D Array:
[[0. 0. 0.]
 [0. 0. 0.]]

3-D Array:
[[[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]

 [[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]

In the above example, we have used the np.zeros() function to create a 2-D array and 3-D array filled
with zeros respectively.

• np.zeros((2, 3)) - returns a zero filled 2-D array with 2 rows and 3 columns

• np.zeros((2, 3, 4)) - returns a zero filled 3-D array with 2 slices, each slice having 3 rows
and 4 columns.

Note: Similarly we can use np.ones() to create an array filled with the value 1.

3. Create N-D Array with a Specified Value:

In NumPy, we can use the np.full( ) function to create a multidimensional array with a specified value.

For example, to create a 2-D array with the value 5, we can do the following:

Ex:
import numpy as np
# Create a 2-D array with elements initialized to 5
numpy_array = np.full((2, 2), 5)
print("2-D Array:", numpy_array)

# Create a 3-D array with elements initialized to 5
numpy_array = np.full((2, 2, 2), 5)
print("3-D Array:", numpy_array)

Output:

2-D Array: [[5 5]
 [5 5]]
3-D Array: [[[5 5]
  [5 5]]

 [[5 5]
  [5 5]]]

Here, we have used the np.full() function to create a 2-D array and a 3-D array, where all elements are
initialized to 5.

4. Creating Arrays With np.random.rand( )

The np.random.rand() function is used to create an array of random numbers.

Let's see an example that creates a 2-D array and a 3-D array of random numbers,

Ex:
import numpy as np
# create a 2D array of 2 rows and 2 columns of random numbers
array1 = np.random.rand(2, 2)
print("2-D Array: ")
print(array1)

# create a 3D array of shape (2, 2, 2) of random numbers
array2 = np.random.rand(2, 2, 2)
print("\n3-D Array: ")
print(array2)

Output:

2-D Array:
[[0.13198621 0.54730421]
 [0.36570987 0.16233836]]

3-D Array:
[[[0.15666007 0.4580507 ]
  [0.84769856 0.76699589]]

 [[0.45395202 0.39944328]
  [0.62999479 0.39629496]]]

Here,

• np.random.rand(2, 2) - creates a 2D array of 2 rows and 2 columns of random numbers.

• np.random.rand(2, 2, 2) - creates a 3D array with 2 slices, each slice having 2 rows
and 2 columns of random numbers.

5. Create Empty N-D NumPy Array:

To create an empty N-D NumPy array, we use the np.empty() function.

Ex:
import numpy as np
# create an empty 2D array with 2 rows and 2 columns
array1 = np.empty((2, 2))
print("2-D Array: ")
print(array1)

# create an empty 3D array of shape (2, 2, 2)
array2 = np.empty((2, 2, 2))
print("\n3-D Array: ")
print(array2)

Output:

2-D Array:
[[8.86495615e-317 0.00000000e+000]
 [2.21149159e-316 1.76125651e-312]]

3-D Array:
[[[1.0749539e-316 0.0000000e+000]
  [0.0000000e+000 0.0000000e+000]]

 [[0.0000000e+000 0.0000000e+000]
  [0.0000000e+000 0.0000000e+000]]]

In the above example, we used the np.empty( ) function to create an empty 2-D array and a 3-D array
respectively.

If we look at the output of the code, we can see that the empty arrays are not actually empty; they have
some values in them.

This is because np.empty() only reserves memory for the array and does not initialize it. The values
stored in the array are whatever happened to be in that memory, so they are arbitrary and have no
significance.

3. Indexing and Slicing NumPy:

a) Indexing NumPy:

Array indexing in NumPy refers to the method of accessing specific elements or subsets of data
within an array. This feature allows us to retrieve, modify and manipulate data at specific positions or
ranges, which makes it easier to work with large datasets. In this section, we'll see the different ways
to index and slice NumPy arrays, which help us work with our data more effectively.

1. Accessing Elements in 1D Arrays:

A 1D NumPy array is a sequence of values with positions called indices, which start at 0. We access
elements by using these indices in square brackets like arr[0] for the first element. Negative indices
count from the end so arr[-1] gives the last element.

Ex:
import numpy as np
arr = np.array([10, 20, 30, 40, 50])
print(arr[0])

Output:

10
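
Negative indices and modification through an index, both described in the text, can be sketched as follows with the same five-element array. The new value 25 is an arbitrary illustration.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# negative indices count from the end of the array
print(arr[-1])    # 50
print(arr[-2])    # 40

# an index can also be used to change an element in place
arr[1] = 25
print(arr.tolist())   # [10, 25, 30, 40, 50]
```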

2. Accessing Elements in Multidimensional Arrays:

In this section we will see how to access elements in both 2D and 3D arrays using specific indices.

2D Arrays: We can access elements by specifying both row and column indices like matrix[row,
column].

Ex:
import numpy as np
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(matrix[1, 2])

Output:

6

Here matrix[1, 2] accesses the element in the second row (index 1) and third column (index 2), which
is 6.

3D Arrays: A 3D array can be visualized as a stack of 2D arrays, so we need three indices:

1. Depth: Specifies the 2D slice.

2. Row: Specifies the row within the slice.

3. Column: Specifies the column within the row.

We can access elements by specifying depth, row and column indices like cube[depth, row,
column].

Ex:

import numpy as np

cube = np.array([[[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]],

                 [[10, 11, 12],
                  [13, 14, 15],
                  [16, 17, 18]]])

print(cube[1, 2, 0])

Output:

16

b) Slicing NumPy:

NumPy array slicing allows for extracting a portion or subset of a NumPy array. This
technique is crucial for efficient data manipulation and analysis in Python, especially when working
with large datasets.

Syntax:
array[start:stop:step]
where,

• start: The inclusive index where the slice begins. If omitted, it defaults to the beginning of the
array (index 0).

• stop: The exclusive index where the slice ends. The element at this index is not included in the
slice. If omitted, it defaults to the end of the array.

• step: The interval between elements to be selected. If omitted, it defaults to 1. A negative step
can be used to slice in reverse order.

Basic Slicing : 1D Array

Ex 1:

import numpy as np
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
slice_arr = arr[2:7] # Elements from index 2 up to 6 (index 7 is excluded)
print(slice_arr)

Output:
[2 3 4 5 6]

Ex 2 : Slicing with step

import numpy as np
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
stepped_slice = arr[1:9:2] # Elements from index 1 up to (not including) 9, taking every second element
print(stepped_slice)

Output:
[1 3 5 7]

Ex 3: Negative Slicing

import numpy as np
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
negative_slice = arr[-5:-2]
# Elements from the 5th last to the 2nd last (exclusive): [5, 6, 7]
print(negative_slice)

Output:
[5 6 7]

Multi-dimensional Slicing: For 2D arrays (and higher), slicing applies to each dimension separately:

Syntax:

array[row_slice, column_slice].

Ex:

import numpy as np
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
sub_matrix = matrix[0:2, 1:3] # Rows 0 and 1, columns 1 and 2
print(sub_matrix)

Output:
[[2 3]
[5 6]]

For 3D arrays slicing: NumPy provides powerful and flexible slicing capabilities for 3D arrays,
similar to how it handles 1D and 2D arrays, but extended to three dimensions.

Syntax:

array[dim1_slice, dim2_slice, dim3_slice].

Understanding the Dimensions:

Consider a 3D array as a stack of 2D matrices.

• The first dimension (axis 0) represents the "depth" or the different 2D matrices in the stack.

• The second dimension (axis 1) represents the "rows" within each 2D matrix.

• The third dimension (axis 2) represents the "columns" within each 2D matrix.

Examples of 3D Slicing:
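
A minimal sketch of 3D slicing, assuming the same 2 x 3 x 3 cube used in the indexing example earlier. The result names are illustrative.

```python
import numpy as np

cube = np.array([[[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]],

                 [[10, 11, 12],
                  [13, 14, 15],
                  [16, 17, 18]]])

# the whole first 2D matrix (depth index 0)
first_matrix = cube[0]

# the first row of every matrix: all depths, row 0, all columns
first_rows = cube[:, 0, :]

# inside the second matrix: rows 0 and 1, columns 1 and 2
sub_block = cube[1, 0:2, 1:3]

# the last column of every matrix
last_columns = cube[:, :, -1]

print(first_matrix.tolist())   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(first_rows.tolist())     # [[1, 2, 3], [10, 11, 12]]
print(sub_block.tolist())      # [[11, 12], [14, 15]]
print(last_columns.tolist())   # [[3, 6, 9], [12, 15, 18]]
```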

4. Operations on Arrays:

Once arrays are declared, we can access their elements or perform operations on them. In the last
section, we learnt about accessing elements.

This section describes multiple operations that can be applied on arrays.

a) Arithmetic Operations: Arithmetic operations on NumPy arrays are fast and simple. When we
perform a basic arithmetic operation like addition, subtraction, multiplication, division etc. on two
arrays, the operation is done on each corresponding pair of elements.

For instance, adding two arrays adds the first element of the first array to the first
element of the second array, and so on. The @ operator is different: it performs matrix
multiplication rather than an element-wise product.

Ex:

import numpy as np
a = np.array([[5,7],[3,1]])
b = np.array([[2,4],[6,3]])
print("Addition:\n ",a+b) # Element-wise addition
print("Subtraction:\n ",a-b) # Element-wise subtraction
print("Multiplication:\n ",a*b) # Element-wise multiplication
print("Matrix Multiplication:\n ",a@b) # Matrix multiplication
print("Exponentiation:\n ",a**2) # Element-wise exponentiation
print("Division:\n ",a/b) # Element-wise division
print("Floor Division:\n ",a//b) # Element-wise floor division
print("Modulo:\n ",a%b) # Element-wise modulo

Output:

Addition:
[[ 7 11]
 [ 9  4]]
Subtraction:
[[ 3  3]
 [-3 -2]]
Multiplication:
[[10 28]
 [18  3]]
Matrix Multiplication:
[[52 41]
 [12 15]]
Exponentiation:
[[25 49]
 [ 9  1]]
Division:
[[2.5        1.75      ]
 [0.5        0.33333333]]
Floor Division:
[[2 1]
 [0 0]]
Modulo:
[[1 3]
 [3 1]]

b) Transpose: Transposing an array turns its rows into columns and columns into rows just like
matrices in mathematics.

Ex:

import numpy as np
a = np.array([[5,7,13],[3,1,-34]])
print("Before Transpose:\n ",a)
print("After Transpose:\n ",a.transpose())

Output:

Before Transpose:
[[  5   7  13]
 [  3   1 -34]]
After Transpose:
[[  5   3]
 [  7   1]
 [ 13 -34]]

c) Sorting:

Sorting arranges the elements of an array in a definite order, either ascending or descending. By
default, NumPy sorts in ascending order.

Ex:

import numpy as np
a = np.array([5,7,-3,15,-18])
a.sort()
print(a)

Output:

[-18  -3   5   7  15]

In a 2-D array, sorting can be done along either of the axes, i.e., row-wise or column-wise. By default,
sorting is done row-wise (i.e., on axis = 1), which means the elements in each row are arranged in
ascending order. When axis=0, sorting is done column-wise, which means each column is sorted in
ascending order. In the example below, the column-wise sort is applied to the array that has already
been sorted row-wise.

Ex:

import numpy as np
a = np.array([[5,7,8],[81,-6,10],[56,12,33]])
a.sort()
print(a)
a.sort(axis=0)
print(a)

Output:

[[ 5  7  8]
 [-6 10 81]
 [12 33 56]]
[[-6  7  8]
 [ 5 10 56]
 [12 33 81]]
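
The examples above sort in place and in ascending order. The sketch below shows two related idioms that are not part of the notes: np.sort( ), which returns a sorted copy and leaves the original array untouched, and reversing that copy with a [::-1] slice to get descending order.

```python
import numpy as np

a = np.array([5, 7, -3, 15, -18])

# np.sort returns a new sorted array; a itself is not modified
ascending = np.sort(a)

# reversing the sorted copy gives descending order
descending = np.sort(a)[::-1]

print(a.tolist())            # [5, 7, -3, 15, -18]
print(ascending.tolist())    # [-18, -3, 5, 7, 15]
print(descending.tolist())   # [15, 7, 5, -3, -18]
```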

5. Concatenation of array:

The NumPy `concatenate( )` function is an array operation used to join two or more arrays
along a specified axis. It is useful for combining datasets, restructuring arrays, and performing data
manipulation tasks efficiently.

It requires the arrays to be of the same shape except for the dimension specified by the axis.

Syntax:

np.concatenate((array1, array2, ...), axis=0, out=None, dtype=None)

Parameters:

• arrays: A sequence of input arrays to be concatenated. These arrays must have the same shape
along all axes except the one specified by axis.

• axis: The axis along which the arrays will be joined. Default is 0 (the first axis).

• out: If provided, the result will be placed in this array.

• dtype: It overrides the data type of the output array.

1. Basic Concatenation:

Ex:
import numpy as np
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
result = np.concatenate((array1, array2))
print(result)

Output:
[1 2 3 4 5 6]
This code concatenates two one-dimensional arrays into a single array with the six elements 1 to 6.

2. Concatenation Along Rows:

Ex:
import numpy as np
array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6]])
result = np.concatenate((array1, array2), axis=0)
print(result)

Output:
[[1 2]
 [3 4]
 [5 6]]

In this example, `array2` has a shape of `(1, 2)`, making it compatible for concatenation along axis 0
with `array1`, resulting in a 3x2 matrix.

3. Concatenation Along Columns:

Ex:
import numpy as np
array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])
result = np.concatenate((array1, array2), axis=1)
print(result)

Output:
[[1 2 5 6]
 [3 4 7 8]]

This example concatenates `array2` along the columns of `array1`, producing a 2x4 matrix.

4. Concatenation of 3D Arrays:

Ex:
import numpy as np
array1 = np.array([[[1, 2], [3, 4]]])
array2 = np.array([[[5, 6], [7, 8]]])
result = np.concatenate((array1, array2), axis=0)
print(result)

Output:
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
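
The shape requirement stated in the parameter list can be checked with a small sketch. The arrays here are illustrative; the first call is expected to fail because the column counts differ.

```python
import numpy as np

array1 = np.array([[1, 2], [3, 4]])   # shape (2, 2)
array2 = np.array([[5, 6, 7]])        # shape (1, 3): column count differs
array3 = np.array([[5, 6]])           # shape (1, 2): compatible along axis 0

# mismatched shapes along the non-joined axis raise a ValueError
error_seen = False
try:
    np.concatenate((array1, array2), axis=0)
except ValueError as err:
    error_seen = True
    print("Cannot concatenate:", err)

# compatible shapes join without error
result = np.concatenate((array1, array3), axis=0)
print(result.shape)   # (3, 2)
```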

6. Reshaping Arrays:

The reshape function in NumPy allows you to give a new shape to an array without changing
its data. It returns an array with the same data but a different shape (where possible, a view of the
original data rather than a copy). The new shape must hold the same total number of elements as the
original array. This functionality is particularly useful when working with different dimensions of
data, like transforming a 1D array into a 2D array or reshaping a 3D array into a 2D array.

Syntax:

np.reshape(array, newshape, order = 'C')

Here,

• array - input array that needs to be reshaped,

• newshape - desired new shape of the array.

• order (optional) - specifies the order in which the elements of the array should be arranged. By
default it is set to 'C'

Note: The order argument can take one of three values: 'C', 'F', or 'A'.
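
The effect of the order argument can be seen in a short sketch comparing 'C' (row by row) with 'F' (column by column). The array values are illustrative.

```python
import numpy as np

array1 = np.array([1, 2, 3, 4, 5, 6])

# 'C' order fills the new shape row by row (the default)
c_order = np.reshape(array1, (2, 3), order='C')

# 'F' order fills the new shape column by column
f_order = np.reshape(array1, (2, 3), order='F')

print(c_order.tolist())   # [[1, 2, 3], [4, 5, 6]]
print(f_order.tolist())   # [[1, 3, 5], [2, 4, 6]]
```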

Reshape 1D Array to 2D Array in NumPy:

We use the reshape() function to reshape a 1D array into a 2D array.

Ex:
import numpy as np
array1 = np.array([1, 3, 5, 7, 2, 4, 6, 8])
# reshape a 1D array into a 2D array
# with 2 rows and 4 columns
result1 = np.reshape(array1, (2, 4))
print(result1)
# reshape a 1D array into a 3D array
result2 = np.reshape(array1,(2,2,2))
print(result2)

Output:

[[1 3 5 7]
 [2 4 6 8]]
[[[1 3]
  [5 7]]

 [[2 4]
  [6 8]]]

np.reshape(array1, (2, 4)), for 2D array

np.reshape(array1,(2,2,2)), for 3D array

Here, reshape() takes two parameters, the array and its new shape:

• array1 - array to be reshaped

• (2, 4) - new shape of array1 specified as a tuple with 2 rows and 4 columns.

• (2,2,2) - new shape of array1 specified as a tuple with 2 layers (slices), each having 2 rows and 2 columns.

Flatten N-d Array to 1-D Array Using reshape( ):

Flattening an array simply means converting a multidimensional array into a 1D array.

To flatten an N-d array to a 1-D array we can use reshape() and pass "-1" as an argument.

Ex:
import numpy as np

# flatten 2D array to 1D
array1 = np.array([[1, 3], [5, 7], [9, 11]])
result1 = np.reshape(array1, -1)
print("Flattened 2D array:", result1)

# flatten 3D array to 1D
array2 = np.array([[[1, 3], [5, 7]], [[2, 4], [6, 8]]])
result2 = np.reshape(array2, -1)
print("Flattened 3D array:", result2)

Output:

Flattened 2D array: [ 1  3  5  7  9 11]
Flattened 3D array: [1 3 5 7 2 4 6 8]

Here, reshape(array1, -1) and reshape(array2, -1) convert the 2-D array and the 3-D array into 1-D
arrays by collapsing all dimensions.

7. Splitting Arrays:

The split( ) function in the NumPy library is a versatile tool for dividing an array into multiple
sub-arrays. Whether working with large datasets or performing parallel computations, this function
allows for efficient data manipulation by segmenting arrays into a given number of sections or at
given indices.

NumPy split() is particularly useful when handling large volumes of data that need to be processed in
manageable parts.

There are many ways to split a NumPy array in Python using different functions. Some of them are
mentioned below:

• Split numpy array using numpy.split()

• Split numpy array using numpy.array_split()

• Splitting NumPy 2D Arrays

• Split numpy array using numpy.vsplit()

• Split numpy array using numpy.hsplit()

• Split numpy array using numpy.dsplit()

Key concepts and terminology:

• Axis: The dimension along which the array is split (e.g., rows, columns, depth).

• Sub-arrays: The smaller arrays resulting from the split.

• Splitting methods: Different functions in NumPy for splitting arrays (e.g., np.split(),
np.hsplit(), np.vsplit(), etc.).

• Equal vs. Unequal splits: Whether the sub-arrays have the same size or not.

1. Equal Split: Divides an array into equal parts along a specified axis.
Method: np.split( )

Ex:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
result = np.split(arr, 3)
print(result)

Output:
[array([1, 2]), array([3, 4]), array([5, 6])]

2. Unequal split: It allows for uneven splitting of arrays. This is useful when the array cannot be
evenly divided by the specified number of splits.

Method: np.array_split( )

Ex:
import numpy as np
arr = np.array([1,2,3,4,5])
array1 = np.array_split(arr, 2)
print(array1)

Output:
[array([1, 2, 3]), array([4, 5])]

3. hsplit( ) and vsplit( ):

hsplit( ):

• hsplit is for splitting arrays column-wise (horizontally).

• It is equivalent to np.split(arr, indices_or_sections, axis=1). It requires that the array can
be equally divided into the specified number of sections.

vsplit( ):

• vsplit is for splitting arrays row-wise (vertically).

• It is equivalent to np.split(arr, indices_or_sections, axis=0). Similar to hsplit, it requires
that the array can be equally divided into the specified number of sections.

Ex:
import numpy as np
arr = np.array([[1, 2, 3, 4],[5, 6, 7, 8]])

# Split into 2 equal parts horizontally
h_parts = np.hsplit(arr, 2)
print(h_parts)

# Split into 2 equal parts vertically
v_parts = np.vsplit(arr, 2)
print(v_parts)

Output:
[array([[1, 2],
       [5, 6]]), array([[3, 4],
       [7, 8]])]
[array([[1, 2, 3, 4]]), array([[5, 6, 7, 8]])]
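
The indices_or_sections argument can also be a list of indices, in which case the array is cut at those positions instead of into equal sections. A minimal sketch with illustrative values, which also shows np.split( ) refusing an uneven division:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60])

# split at indices 2 and 5: arr[:2], arr[2:5] and arr[5:]
parts = np.split(arr, [2, 5])
print([p.tolist() for p in parts])   # [[10, 20], [30, 40, 50], [60]]

# np.split with a section count needs an even division, otherwise it raises ValueError
unequal_failed = False
try:
    np.split(np.array([1, 2, 3, 4, 5]), 2)
except ValueError:
    unequal_failed = True
print(unequal_failed)   # True
```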

8. Statistical Operations on Arrays:

NumPy provides a comprehensive set of functions for performing various statistical operations on
arrays. These functions are highly optimized for numerical computations, making them efficient for
analyzing large datasets.

Common Statistical Operations:

1. Measures of Central Tendency:
• np.mean( ): Calculates the arithmetic mean (average) of array elements.
• np.median( ): Calculates the median (middle value) of array elements.
• np.average( ): Calculates the weighted average of array elements.
2. Measures of Dispersion:
• np.std( ): Calculates the standard deviation, a measure of data spread.
• np.var( ): Calculates the variance, the square of the standard deviation.
• np.ptp( ): Calculates the peak-to-peak range (maximum - minimum).
3. Extremes:
• np.min( ) or np.amin( ): Returns the minimum value in the array.
• np.max( ) or np.amax( ): Returns the maximum value in the array.
4. Sums and Products:
• np.sum( ): Calculates the sum of all elements in the array.
• np.prod( ): Calculates the product of all elements in the array.
5. Quantiles:
• np.percentile(): Calculates the nth percentile of the data along a specified axis.

Ex 1: 1D Array

import numpy as np
array1 = np.array([12,54,10,18,21])
print("Min: ",np.min(array1))
print("Max: ",np.max(array1))
print("Mean: ",np.mean(array1))
print("Median: ",np.median(array1))
print("Average: ",np.average(array1))
print("Sum: ",np.sum(array1))
print("Product: ",np.prod(array1))
print("Standard Deviation: ",np.std(array1))
print("Variance: ",np.var(array1))
print("Peak-To-Peak range: ",np.ptp(array1))
print("Percentile: ",np.percentile(array1,25))

Output:
Min: 10
Max: 54
Mean: 23.0
Median: 18.0
Average: 23.0
Sum: 115
Product: 2449440
Standard Deviation: 16.0
Variance: 256.0
Peak-To-Peak range: 44
Percentile: 12.0
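
Without weights, np.average( ) gives the same result as np.mean( ), as in Ex 1. The sketch below uses hypothetical marks and weights to show the weighted case through the weights parameter.

```python
import numpy as np

marks = np.array([80, 90, 70])
weights = np.array([1, 2, 1])    # the second mark counts twice as much

plain = np.average(marks)                       # (80 + 90 + 70) / 3
weighted = np.average(marks, weights=weights)   # (80*1 + 90*2 + 70*1) / (1 + 2 + 1)

print(plain)      # 80.0
print(weighted)   # 82.5
```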

Ex 2: 2D Array ( column – axis = 0 ; row – axis = 1)

import numpy as np
array2 = np.array([[3,8],[2,7]])
print("Column - Min: ", np.min(array2,axis=0))
print("Row - Min: ", np.min(array2,axis=1))
print("Column - Max: ", np.max(array2,axis=0))
print("Row - Max: ", np.max(array2,axis=1))
print("Column - Mean: ", np.mean(array2,axis=0))
print("Row - Mean: ", np.mean(array2,axis=1))

Output:
Column - Min: [2 7]
Row - Min: [3 2]
Column - Max: [3 8]
Row - Max: [8 7]
Column - Mean: [2.5 7.5]
Row - Mean: [5.5 4.5]

Data Handling Using Pandas

9. Introduction: Data Handling Using Pandas

 Pandas is an open-source Python library used for data manipulation and analysis. It
consists of data structures and functions that perform efficient operations on data.
 It is well-suited for working with tabular data such as spreadsheets or SQL tables.
 It is widely used in data science because it works well with other important libraries.
 It is built on top of the NumPy library, which makes numerical data easier to manipulate and analyze.

Pandas is commonly used together with other libraries such as:

 Matplotlib for plotting graphs
 SciPy for statistical analysis
 Scikit-learn for machine learning algorithms

Pandas itself uses many functionalities provided by the NumPy library.

Uses of Pandas:
 Data Cleaning, Merging and Joining: Clean and combine data from multiple sources, handling
inconsistencies and duplicates.
 Handling Missing Data: Manage missing values (NaN) in both floating and non-floating point
data.
 Column Insertion and Deletion: Easily add, remove or modify columns in a DataFrame.
 Group By Operations: Use "split-apply-combine" to group and analyze data.
 Data Visualization: Create visualizations with Matplotlib and Seaborn, integrated with Pandas.
 Statistical Analysis: Calculating descriptive statistics (Ex: mean, median, standard deviation)
 Time Series Functionality: Handling and analyzing time-indexed data.
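
Two of the uses listed above, handling missing data and "split-apply-combine" grouping, can be sketched as follows. This is only an illustrative sketch: the department names and salary figures are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing salary (NaN)
df = pd.DataFrame({
    "Department": ["Dev", "Dev", "HR", "HR"],
    "Salary": [50000.0, np.nan, 40000.0, 42000.0],
})

# Handling missing data: count the NaN values, then fill them with the column mean
missing_count = int(df["Salary"].isna().sum())
filled = df.fillna({"Salary": df["Salary"].mean()})

# Group By: split rows by department, apply mean(), combine into one result
dept_mean = filled.groupby("Department")["Salary"].mean()

print("Missing values:", missing_count)
print(dept_mean)
```

Filling with the column mean is just one possible strategy; dropping the rows with dropna( ) is another common choice.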

Getting Started with Pandas: Working with Pandas Library Files

Installing Pandas:
The first step in working with Pandas is to check whether it is installed on the system. If it is not,
we need to install it using the pip command.

pip install pandas

Importing Pandas: After Pandas has been installed on the system, we need to import the library.

import pandas as pd

Note: pd is just an alias for Pandas. It is not required, but using it makes the code shorter when
calling methods or properties.

Data Structures in Pandas Library:

Pandas provides two data structures for manipulating data, which are as follows:

1. Pandas Series and
2. Pandas DataFrame

1. Pandas Series:

A Pandas Series is a one-dimensional labelled array capable of holding data of any type (integer,
string, float, Python objects etc.). The axis labels are collectively called the index.

In practice, a Pandas Series is often obtained by loading data from existing storage, which can be a
SQL database, a CSV file or an Excel file. It can also be created from lists, dictionaries, scalar values, etc.

Key Features:
 Supports integer-based and label-based indexing.
 Can hold values of any data type (mixed values are stored with the object dtype).
 Offers a variety of built-in methods for data manipulation and analysis.

Creating a Pandas Series:

A Pandas Series can be created from different data structures such as lists, NumPy arrays,
dictionaries or a scalar value.

Ex:
import pandas as pd
data = [1, 2, 3, 4]
ser = pd.Series(data)
print(ser)

Output:
0 1
1 2
2 3
3 4
dtype: int64

The numbers on the left (0, 1, 2, 3) represent the index, which is automatically assigned by Pandas.
The values (1, 2, 3, 4) represent the data stored in the Series.
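
Besides a list, a Series can also be built from a dictionary, a NumPy array or a scalar value. The following is a minimal sketch of those three cases; the keys, values and labels are hypothetical.

```python
import numpy as np
import pandas as pd

# From a dictionary: the keys become the index labels
from_dict = pd.Series({"a": 10, "b": 20, "c": 30})

# From a NumPy array: a default index 0, 1, 2 is assigned
from_array = pd.Series(np.array([1.5, 2.5, 3.5]))

# From a scalar: the value is repeated for every label in the given index
from_scalar = pd.Series(5, index=["x", "y", "z"])

print(from_dict)
print(from_array)
print(from_scalar)
```

When a scalar is used, the index argument decides how many times the value is repeated.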

Accessing Elements of a Series:
We can access the elements of a Series in two ways:
1. Position-based Indexing and
2. Label-based Indexing

1. Position-based Indexing:

To access an element of a Series by position, refer to its index number using the index operator [ ].
The index must be an integer. With the default index (0, 1, 2, …) the position and the label coincide;
the iloc[ ] indexer always selects by position.
To access multiple elements from a Series, we use the slice operation.

Ex:
import pandas as pd
import numpy as np
data = np.array(['C','o','d','e','r','!'])
ser = pd.Series(data)
print(ser)
print("By Index: ",ser[2])
print("By Slice:\n",ser[0:4])

Output:
0 C
1 o
2 d
3 e
4 r
5 !
dtype: object
By Index: d
By Slice:
0 C
1 o
2 d
3 e
dtype: object

2. Label-based Indexing:

To access an element of a Series by label, pass its index label to the index operator [ ]. A Series is
like a fixed-size dictionary in that we can get and set values by index label.

Ex:
import pandas as pd
import numpy as np
data = np.array(['C','o','d','e','r','!'])
ser = pd.Series(data,index=[11,12,13,14,15,16])
print(ser)
print("Label value for 13 is: ",ser[13])

Output:
11 C
12 o
13 d
14 e
15 r
16 !
dtype: object
Label value for 13 is: d
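
When the index labels are themselves integers, as in the example above, the plain [ ] operator looks them up as labels. The iloc[ ] and loc[ ] indexers make the intent explicit. The following is a minimal sketch using the same data:

```python
import pandas as pd

ser = pd.Series(['C', 'o', 'd', 'e', 'r', '!'], index=[11, 12, 13, 14, 15, 16])

# iloc: position-based (0 is the first element, whatever its label is)
by_position = ser.iloc[2]

# loc: label-based (13 is an index label here, not a position)
by_label = ser.loc[13]

# Position slices exclude the end position; label slices include the end label
position_slice = ser.iloc[0:4]
label_slice = ser.loc[11:14]

print(by_position, by_label)
print(position_slice)
print(label_slice)
```

Both slices select the same four elements here, but for different reasons: iloc[0:4] stops before position 4, while loc[11:14] includes the label 14.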

2. Pandas DataFrame:

 A Pandas DataFrame is a two-dimensional table-like structure where data is arranged in rows
and columns.
 It’s one of the most commonly used tools for handling data and makes it easy to organize,
analyze and manipulate data.
 It can store different types of data such as numbers, text and dates across its columns.
 It is created by loading the datasets from existing storage which can be a SQL database, a CSV
file or an Excel file.
 It can be created from lists, dictionaries, a list of dictionaries etc.

Key Features:

 Tabular Structure
 Heterogeneous Data Types
 Integration with External Data Sources
 Powerful Data Manipulation Capabilities
 Handling Missing Data
 Mutable size and values

The three main parts of a DataFrame are:

1. Data: Actual values in the table.
2. Rows: Labels that identify each row.
3. Columns: Labels that define each data category.
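
The three parts can be inspected directly on a DataFrame object. The following is a small sketch; the names, ages and row labels are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Tom", "Nick"], "Age": [20, 21]},
    index=["r1", "r2"],          # hypothetical row labels
)

print(df.to_numpy())   # Data: the actual values as a NumPy array
print(df.index)        # Rows: the row labels
print(df.columns)      # Columns: the column labels
print(df.shape)        # (number of rows, number of columns)
```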

Creating a Pandas DataFrame:

Pandas allows us to create a DataFrame from many data sources. We can create DataFrames
directly from Python objects like lists and dictionaries or by reading data from external files like CSV,
Excel or SQL databases.

Here are some ways to create a DataFrame:

1. Creating DataFrame using a List:

If we have a simple list of data, we can easily create a DataFrame by passing that list to
the pd.DataFrame() function.

Ex:

import pandas as pd
lst = ['Hello', 'Python', 'Coder!']
df = pd.DataFrame(lst)
print(df)

Output:
0
0 Hello
1 Python
2 Coder!

2. Creating DataFrame from dict of ndarray/lists:

We can create a DataFrame from a dictionary where the keys are column names and the values
are lists or arrays.
 All arrays/lists must have the same length.
 If an index is provided, it must match the length of the arrays.
 If no index is provided, Pandas will use a default range index (0, 1, 2, …).

Ex:
import pandas as pd
data = {'Name': ['Tom', 'Nick', 'Krish', 'Jack'],
'Age': [20, 21, 19, 18],
'Qualification': ['[Link]', '[Link]', '[Link]', 'B.C.A']}
df = pd.DataFrame(data)
print(df)

Output:
Name Age Qualification
0 Tom 20 [Link]
1 Nick 21 [Link]
2 Krish 19 [Link]
3 Jack 18 B.C.A
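
A DataFrame can also be created from a list of dictionaries, and a custom index can be supplied instead of the default range index. The following is a minimal sketch with hypothetical records and row labels:

```python
import pandas as pd

# List of dictionaries: each dictionary is one row.
# A key missing from a row (Age for Krish) becomes NaN.
records = [
    {"Name": "Tom", "Age": 20},
    {"Name": "Nick", "Age": 21},
    {"Name": "Krish"},
]
df_records = pd.DataFrame(records)

# Dictionary of lists with a custom index; its length must match the lists
df_indexed = pd.DataFrame(
    {"Name": ["Tom", "Nick", "Krish"], "Age": [20, 21, 19]},
    index=["s1", "s2", "s3"],
)

print(df_records)
print(df_indexed)
```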

3. Working with Rows and Columns in DataFrame:


We can perform basic operations on rows and columns such as selecting, deleting, adding and
renaming.

a) Column Selection: We can access columns by referring to their column names.

Using the above example, we select the columns Name and Qualification:

Ex:
print(df[['Name','Qualification']])

Output:
Name Qualification
0 Tom [Link]
1 Nick [Link]
2 Krish [Link]
3 Jack B.C.A

b) Row Selection: Pandas provides dedicated indexers for selecting rows from a DataFrame.
The df.loc[ ] indexer is used for label-based selection.

Using the above example, we select the row with index label 2 (the third row) of the DataFrame:

Ex:
row = df.loc[2]
print(row)

Output:
Name Krish
Age 19
Qualification [Link]
Name: 2, dtype: object
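
Only selection is shown above. The remaining operations (adding, renaming and deleting) can be sketched as follows. The City column and its values are hypothetical additions used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Tom", "Nick", "Krish", "Jack"],
                   "Age": [20, 21, 19, 18]})

# Adding a column: assign a list with one value per row
df["City"] = ["Tirupati", "Chennai", "Nellore", "Kadapa"]

# Renaming a column: map the old name to the new name
df = df.rename(columns={"Age": "Age_Years"})

# Deleting a column and a row: drop() returns a new DataFrame
df = df.drop(columns=["City"])
df = df.drop(index=3)

print(df)
```

rename( ) and drop( ) return new DataFrames by default, so the result is assigned back to df.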

10. Importing and Exporting Data between CSV Files and DataFrames.

CSV File: Comma Separated Values

A CSV file appears as plain text in a text editor, with data organized in a tabular format
where each row represents a data record and each column represents a field. The values within each
row are separated by a delimiter, most commonly a comma, which is why the format is called Comma
Separated Values. The first line typically contains the column headers, which label the data in each
column.

Ex: Filename as '[Link]'

ID,Name,Department
101,Clara,Developer
102,John,Designer
103,Den,Architect
104,Smith,HR
105,James,Developer
106,Steve,Tester

a) Importing Data between CSV Files and DataFrames:

The process of moving data between CSV files and DataFrames is straightforward, primarily
facilitated by the Pandas library in Python.

Importing data from CSV to a DataFrame:

The pd.read_csv( ) function is the primary tool for reading data from CSV files and loading it
into a DataFrame.

Syntax:
DataFrame_name = pd.read_csv('[Link]')

Ex:

import pandas as pd

# Importing CSV file to DataFrame
df = pd.read_csv('E:/[Link]')

# Displaying the first 5 rows
print(df.head())

Output:
ID Name Department
0 101 Clara Developer
1 102 John Designer
2 103 Den Architect
3 104 Smith HR
4 105 James Developer

Common pd.read_csv() parameters:

The pd.read_csv() function offers numerous parameters for customizing the data import
process, including:
 filepath_or_buffer: Path or URL of the CSV file.
 sep: Delimiter used in the CSV file (default is comma ,).
 header: Row number(s) to use as column names (default is 0, the first row).
 names: List of column names to assign to the DataFrame.
 index_col: Column(s) to set as the DataFrame index.
 usecols: List of columns to read and include.
 skiprows: Rows to skip during reading.
 nrows: Number of rows to read from the file.
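
Some of the parameters listed above can be tried without a file on disk by reading the CSV text from memory. The following is a minimal sketch: io.StringIO stands in for a real file path, and the replacement column names (emp_id, emp_name, dept) are hypothetical.

```python
import io
import pandas as pd

csv_text = """ID,Name,Department
101,Clara,Developer
102,John,Designer
103,Den,Architect
104,Smith,HR
105,James,Developer
106,Steve,Tester
"""

# usecols + index_col + nrows: read two columns, use ID as the index, first 3 rows only
df_subset = pd.read_csv(io.StringIO(csv_text),
                        usecols=["ID", "Name"], index_col="ID", nrows=3)

# header=0 + names: replace the header row of the file with our own column names
df_renamed = pd.read_csv(io.StringIO(csv_text),
                         header=0, names=["emp_id", "emp_name", "dept"])

# skiprows: skip file lines 1 and 2 (line 0 is the header)
df_skipped = pd.read_csv(io.StringIO(csv_text), skiprows=[1, 2])

print(df_subset)
print(df_renamed.head(2))
print(df_skipped)
```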

head( ): This method returns the first 5 rows by default.

To display a specific number of rows, pass that number as an argument. For example, to display the
first 10 rows of the DataFrame:

df.head(10)

to_string( ): This method returns the entire DataFrame as a string, so printing it shows all rows.

max_rows: When a DataFrame is printed directly, the maximum number of rows displayed is defined in
the Pandas option settings.

You can check the current maximum with the pd.options.display.max_rows statement.
You can change the maximum number of rows with the same statement.

pd.options.display.max_rows = 9999

Ex:
import pandas as pd
pd.options.display.max_rows = 9999
df = pd.read_csv('E:/[Link]')
print(df.head(3)) # Display First 3 Elements
print("---------------------------")
print(df.to_string())
max_rows = pd.options.display.max_rows
print("Max rows: ", max_rows)

Output:
ID Name Department
0 101 Clara Developer
1 102 John Designer
2 103 Den Architect
---------------------------
ID Name Department
0 101 Clara Developer
1 102 John Designer
2 103 Den Architect
3 104 Smith HR
4 105 James Developer
5 106 Steve Tester
Max rows: 9999

b) Exporting Data between CSV Files and DataFrames:

When working on a Data Science project one of the key tasks is data
management which includes data collection, cleaning and storage. Once our data is cleaned and
processed it’s essential to save it in a structured format for further analysis or sharing.

A CSV (Comma-Separated Values) file is a widely used format for storing tabular data.
Pandas provides an easy-to-use function:
to_csv( ) - to export a DataFrame into a CSV file

mode: {‘w’, ‘x’, ‘a’}, default ‘w’

The mode is forwarded to either open(mode=) or fsspec.open(mode=) to control how the file is opened.
Typical values include:
 ‘w’, truncate the file first.
 ‘x’, exclusive creation, failing if the file already exists.
 ‘a’, append to the end of file if it exists.
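
The three modes can be tried out as in the sketch below. It writes to a temporary folder so that no existing file is touched; the file name scores_demo.csv is a hypothetical choice.

```python
import os
import tempfile
import pandas as pd

# Hypothetical file inside a fresh temporary folder
path = os.path.join(tempfile.mkdtemp(), "scores_demo.csv")

first = pd.DataFrame({"Name": ["a", "b"], "Score": [90, 80]})
second = pd.DataFrame({"Name": ["c"], "Score": [95]})

# mode='w' (default): create the file, or overwrite it if it exists
first.to_csv(path, index=False)

# mode='a': append rows to the end; header=False avoids a second header line
second.to_csv(path, mode="a", header=False, index=False)

# mode='x': exclusive creation fails because the file already exists
try:
    first.to_csv(path, mode="x", index=False)
    created = True
except FileExistsError:
    created = False

combined = pd.read_csv(path)
print(combined)
print("Created with mode='x':", created)
```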

The to_csv() method offers several parameters for customization:

 path_or_buf: The file path where the CSV will be saved (e.g., 'data/my_exported_data.csv').
 sep: The delimiter to use between values (default is comma ,). Other common options include
tab \t, semicolon ;, or pipe |.
 header: Set to False to exclude column names from the first row of the CSV.
 columns: A list of column names to export, if only a subset of columns is desired.
 na_rep: A string representation for missing values (NaN). By default, missing values are
written as empty fields.
 encoding: Specifies the encoding to use for writing the file (e.g., 'utf-8').

Exporting DataFrame to CSV:

1. Basic Export:
The simplest way to export a DataFrame to a CSV file is by using the to_csv() function without
any additional parameters. This method creates a CSV file where the DataFrame's contents are written
as-is.

Ex:
import pandas as pd
scores = {'Name': ['a', 'b', 'c', 'd'],
'Score': [90, 80, 95, 20]}
df = pd.DataFrame(scores)
df.to_csv("[Link]")
print(df)

Output:
  Name  Score
0    a     90
1    b     80
2    c     95
3    d     20

The CSV file is saved in the current working directory.

2. Remove Index Column:


By default, to_csv() also exports the index, which holds the row labels of the DataFrame. If
we do not want this extra column in our CSV file, we can remove it by setting index=False.

Ex:
df.to_csv("[Link]", index = False)

Output (contents of the saved file):
Name,Score
a,90
b,80
c,95
d,20

3. Export only selected columns:
In some cases we may not want to export all columns from our DataFrame.
The columns parameter in to_csv() allows us to specify which columns should be included in the
output file.

Ex:
df.to_csv("[Link]", columns = ['Name'])

Output (contents of the saved file):
,Name
0,a
1,b
2,c
3,d

4. Exclude Header Row:


By default, the to_csv() function includes the column names as the first row of the CSV file.
However, if we need a headerless file (e.g., for certain machine learning models or integration with
other systems), we can set header=False.

Ex:
df.to_csv("[Link]", header = False)

Output (contents of the saved file):
0,a,90
1,b,80
2,c,95
3,d,20

5. Handling Missing Values:


DataFrames often contain missing values (NaN) which can cause issues in downstream
analysis. By default Pandas writes NaN as an empty field but we can customize this behavior using
the na_rep parameter.

Ex:
df.to_csv("[Link]", na_rep = 'nothing')

6. Change Column Separator:


CSV files use commas (,) by default as delimiters to separate values. However, in some cases
other delimiters may be required, such as tabs (\t), semicolons (;) or pipes (|). Using a different
delimiter can make the file more readable or compatible with specific systems.

Ex:
df.to_csv("[Link]", sep ='\t')

Output (contents of the saved file, with values separated by tab characters):
	Name	Score
0	a	90
1	b	80
2	c	95
3	d	20

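The export options described in this section can be compared side by side without creating any files: when no file path is given, to_csv() returns the CSV text as a string. The following is a minimal sketch; the small DataFrame with one missing score is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value
df = pd.DataFrame({"Name": ["a", "b", "c"], "Score": [90.0, np.nan, 95.0]})

# With no file path, to_csv() returns the CSV text instead of writing a file.
no_index = df.to_csv(index=False)                        # drop the index column
names_only = df.to_csv(index=False, columns=["Name"])    # export selected columns
no_header = df.to_csv(index=False, header=False)         # drop the header row
with_na_rep = df.to_csv(index=False, na_rep="nothing")   # text for missing values
tab_separated = df.to_csv(index=False, sep="\t")         # tab as the delimiter

print(no_index)
print(with_na_rep)
```

Printing each string shows exactly what would be written to the file with the same arguments.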
Plotting Data Using Matplotlib

11. Introduction:

 Matplotlib is an open-source visualization library for the Python programming language,
widely used for creating static, animated and interactive plots.
 It provides an object-oriented API for embedding plots into applications using general-purpose
GUI toolkits like Tkinter, Qt, GTK and wxPython.
 It offers a variety of plotting functionalities, including line plots, bar charts, histograms, scatter
plots and 3D visualizations.
 Created by John D. Hunter in 2003, Matplotlib has become a fundamental tool for data
visualization in Python, extensively used by data scientists, researchers and engineers
worldwide.

Setting Up Matplotlib:

Before using Matplotlib, ensure you have it installed. We can install it using pip:

pip install matplotlib

Once installed, we can import it into our Python script:

import matplotlib.pyplot as plt

Components or Parts of Matplotlib Figure:

Anatomy of a Matplotlib Plot: The key components of a Matplotlib plot include figures, axes, titles
and legends, all of which are essential for effective data visualization.

The parts of a Matplotlib figure include (as shown in the figure above):

 Figure: The overarching container that holds all plot elements, acting as the canvas for
visualizations.

 Axes: The areas within the figure where data is plotted; each figure can contain multiple axes.

 Axis: Represents the x-axis and y-axis, defining limits, tick locations, and labels for data
interpretation.

 Lines and Markers: Lines connect data points to show trends, while markers denote
individual data points in plots like scatter plots.

 Title and Labels: The title provides context for the plot, while axis labels describe what data is
being represented on each axis.
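
The relationship between these parts can be sketched in code: one Figure that holds two Axes, each with its own lines, markers, title and axis labels. The data values and titles are hypothetical.

```python
import matplotlib.pyplot as plt

# One Figure (the canvas) containing two Axes (the plotting areas)
fig, axes = plt.subplots(1, 2, figsize=(8, 3))
left, right = axes

# Lines and markers are drawn on an Axes
left.plot([0, 1, 2], [0, 1, 4], marker="o")
left.set_title("Left Axes")       # title of this Axes
left.set_xlabel("x")              # label of its x-axis
left.set_ylabel("y")              # label of its y-axis

right.plot([0, 1, 2], [4, 1, 0], marker="s")
right.set_title("Right Axes")
right.set_xlim(0, 2)              # axis limits belong to the Axis

fig.suptitle("One Figure, two Axes")
```

Calling plt.show() afterwards displays the figure in a window or notebook.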

Matplotlib Pyplot:

Pyplot is a module within Matplotlib that provides a MATLAB-like interface for making plots.
It simplifies the process of adding plot elements such as lines, images, and text to the axes of the
current figure.

Steps to Use Pyplot:

 Import Matplotlib: Start by importing matplotlib.pyplot as plt.

 Create Data: Prepare your data in the form of lists or arrays.

 Plot Data: Use plt.plot() to create the plot.

 Customize Plot: Add titles, labels, and other elements using functions like plt.title(),
plt.xlabel(), and plt.ylabel().

 Display Plot: Use plt.show() to display the plot.

Ex:

import matplotlib.pyplot as plt

x = [0, 2, 4, 6, 8]
y = [0, 4, 16, 36, 64]

fig, ax = plt.subplots()
ax.plot(x, y, marker='o', label="Data Points")

ax.set_title("Basic Components of Matplotlib Figure")
ax.set_xlabel("X-Axis")
ax.set_ylabel("Y-Axis")

plt.show()

Output:

Key Features of Matplotlib:

 Versatile Plotting: Create a wide variety of visualizations, including line plots, scatter plots, bar
charts, and histograms.

 Extensive Customization: Control every aspect of your plots, from colors and markers to labels
and annotations.

 Seamless Integration with NumPy: Effortlessly plot data arrays directly, enhancing data
manipulation capabilities.

 High-Quality Graphics: Generate publication-ready plots with precise control over aesthetics.

 Cross-Platform Compatibility: Use Matplotlib on Windows, macOS, and Linux without issues.

 Interactive Visualizations: Engage with your data dynamically through interactive plotting
features.

Different Types of Plots in Matplotlib

Matplotlib offers a wide range of plot types to suit various data visualization needs. Here are some of
the most commonly used types of plots in Matplotlib:

1. Line Chart

2. Bar Chart

3. Histogram

4. Scatter Chart and

5. Pie Chart

1. Line Chart:

A line chart displays the evolution of one or several numeric variables. It is often used to
represent time series.

It is also used to show the relationship between two variables, X and Y, plotted on different axes.

When to use:

Line charts are best used to display trends and changes in data over a continuous interval, like
time. They are particularly effective for visualizing how a variable changes in relation to another, often
showing patterns of growth, decline, or fluctuations.

Creation of Line Chart:

1. Import Matplotlib: Begin by importing the pyplot module, typically aliased as plt.

import matplotlib.pyplot as plt

2. Prepare Data: Define your x-axis and y-axis data, usually as lists or NumPy arrays. The number of
elements in both lists/arrays should be equal, as each x-value corresponds to a y-value.

month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'] # X axis value
sales = [120, 150, 125, 160, 139, 230] # Y axis value

3. Plot the Data: Use plt.plot() to create the line chart, passing your x-values and y-values as
arguments.

plt.plot(x_values, y_values)

4. Add Labels and Title (Optional but Recommended): Enhance readability by adding labels for the
x and y axes and a title for the plot.

plt.xlabel("X-Axis Label")
plt.ylabel("Y-Axis Label")
plt.title("My Line Chart")

5. Display the Plot: Finally, use plt.show() to render and display the generated line chart.

plt.show()

Example:

import matplotlib.pyplot as plt

# Sample data
month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [120, 150, 125, 160, 139, 230]

# Create Line Chart
plt.plot(month, sales)

# Add Labels and Title
plt.xlabel("Months")
plt.ylabel("[Link] (Mobiles)")
plt.title("Monthly Sales Data")

# Display the plot
plt.show()

Output:

Customization Options:

Matplotlib offers extensive customization for line charts, including:

 Line Style: Change the line style (e.g., 'dotted', 'dashed', '-.') using the linestyle or ls argument
in plt.plot().

 Line Color: Set the line color using the color or c argument (e.g., 'r' for red, 'blue', or
hexadecimal color codes).

 Markers: Add markers at data points using the marker argument (e.g., 'o' for circles, 's' for
squares).

 Multiple Lines: Plot multiple lines on the same chart by calling plt.plot() multiple times with
different datasets before calling plt.show().

 Legend: Add a legend to differentiate multiple lines using plt.legend().

 Gridlines: Include gridlines for easier data interpretation using plt.grid(True).
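
The customization options above can be combined in one chart. The following is a minimal sketch; the second data set (Store B), the labels and the title are hypothetical.

```python
import matplotlib.pyplot as plt

month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
store_a = [120, 150, 125, 160, 139, 230]
store_b = [100, 130, 140, 150, 170, 190]   # hypothetical second data set

# Two calls to plt.plot() draw two lines on the same chart
plt.plot(month, store_a, linestyle='dashed', color='r', marker='o', label='Store A')
plt.plot(month, store_b, ls='-.', c='blue', marker='s', label='Store B')

plt.xlabel("Months")
plt.ylabel("Units sold")
plt.title("Monthly Sales Comparison")
plt.legend()      # uses the label given to each line
plt.grid(True)    # adds gridlines
```

Calling plt.show() afterwards displays the chart.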

2. Bar Chart:

A bar plot or bar chart is a graph that represents categories of data with rectangular bars whose
lengths or heights are proportional to the values they represent.

Bar plots can be drawn horizontally or vertically. A bar chart shows comparisons between
discrete categories. It can be created using the bar( ) method.

When to use:

Bar charts are best used when you need to compare categories or groups of data, or to show
how values change over time for discrete categories. They are particularly effective when comparing
multiple categories or subcategories, analyzing proportions within categories, and highlighting
rankings.

Creation of Bar Chart:

1. Import Matplotlib: Begin by importing the pyplot module, typically aliased as plt.

import matplotlib.pyplot as plt

2. Prepare Data: Define your x-axis and y-axis data, usually as lists or NumPy arrays. The number of
elements in both lists/arrays should be equal, as each x-value corresponds to a y-value.

month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'] # X axis value
sales = [120, 150, 125, 160, 139, 230] # Y axis value (Height)

3. Plot the Data: Use plt.bar() to create the bar chart, passing your x-values and y-values as
arguments.

plt.bar(x_values, y_values)

The plt.bar() function is used to create vertical bar charts.

It typically takes 2 main arguments:

 x: The positions or labels for the bars on the x-axis (categories). This can be a list of strings or
numerical values.
 height: The heights of the bars, corresponding to the values for each category. This should be a
list or array of numerical values.

4. Add Labels and Title (Optional but Recommended): Enhance readability by adding labels for the
x and y axes and a title for the plot.

plt.xlabel("X-Axis Label")
plt.ylabel("Y-Axis Label")
plt.title("My Bar Chart")

5. Display the Plot: Finally, use plt.show() to render and display the generated bar chart.

plt.show()

Example:

import matplotlib.pyplot as plt

# Sample data
month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [120, 150, 125, 160, 139, 230]

# Create Bar Chart
plt.bar(month, sales)

# Add Labels and Title
plt.xlabel("Months")
plt.ylabel("[Link] (Mobiles)")
plt.title("Monthly Sales Data")

# Display the plot
plt.show()

Output:

Creating Horizontal Bar Charts:

To create horizontal bar charts, the plt.barh() function is used. The arguments are
similar, but the first argument (y) now gives the positions/labels on the y-axis and width gives the
length of the bars.
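
A horizontal version of the monthly sales chart can be sketched as follows. The bar colors, labels and title are hypothetical choices.

```python
import matplotlib.pyplot as plt

month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [120, 150, 125, 160, 139, 230]

# First argument: labels on the y-axis; second argument: bar lengths (width)
bars = plt.barh(month, sales, color='teal', edgecolor='black')

# The value axis is now the x-axis, so the axis labels swap as well
plt.xlabel("Sales")
plt.ylabel("Months")
plt.title("Monthly Sales Data (Horizontal)")
```

Calling plt.show() afterwards displays the chart.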

Customization Options:

Matplotlib's bar() and barh() functions offer various parameters for customization, including:

 color: Sets the color of the bars. Can be a single color string or a list of colors for individual
bars.

 width: Controls the width of the bars (for vertical charts).

 align: Specifies how the bars are aligned relative to their x-axis positions (e.g., 'center', 'edge').

 label: Assigns a label to the bars for use in a legend.

 edgecolor, linewidth: Customize the appearance of the bar borders.

Adding Labels, Titles, and Legends:

 plt.xlabel(), plt.ylabel(): Set labels for the x and y axes.

 plt.title(): Sets the title of the chart.

 plt.legend(): Displays a legend if labels have been assigned to the bars.

Stacked and Grouped Bar Charts:

Matplotlib also supports creating more complex bar charts like stacked bar charts (where segments of
bars represent different subcategories within a main category) and grouped bar charts (where multiple
bars are displayed side-by-side for each category). These often involve using multiple calls
to bar() with adjusted positions or utilizing libraries like Pandas for easier data handling and plotting.
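
Both variants can be sketched with plain bar() calls. In this sketch the stacked chart uses the bottom argument, and the grouped chart shifts the bar positions by half a bar width. The online and offline sales figures are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

month = ['Jan', 'Feb', 'Mar']
online = np.array([70, 90, 65])    # hypothetical subcategory values
offline = np.array([50, 60, 60])

fig, (ax_stacked, ax_grouped) = plt.subplots(1, 2, figsize=(9, 3))

# Stacked: the second set of bars starts where the first set ends
ax_stacked.bar(month, online, label='Online')
ax_stacked.bar(month, offline, bottom=online, label='Offline')
ax_stacked.set_title("Stacked")
ax_stacked.legend()

# Grouped: numeric positions, shifted left and right by half a bar width
x = np.arange(len(month))
width = 0.4
ax_grouped.bar(x - width / 2, online, width=width, label='Online')
ax_grouped.bar(x + width / 2, offline, width=width, label='Offline')
ax_grouped.set_xticks(x)
ax_grouped.set_xticklabels(month)
ax_grouped.set_title("Grouped")
ax_grouped.legend()
```

Calling plt.show() afterwards displays both charts side by side.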

3. Histogram:

A histogram represents numerical data grouped into ranges (bins). It is an accurate method for
the graphical representation of numerical data distribution. It is a type of bar plot where the X-axis
represents the bin ranges while the Y-axis gives information about frequency.

When to use:

Histogram is best used when you need to understand the distribution of numerical data, especially
when dealing with continuous data or large datasets. It helps visualize the frequency of data points
falling within specific ranges or intervals. Histograms are useful for identifying patterns, outliers, and
the overall shape of a dataset's distribution.

Creation of Histogram:

1. Import Matplotlib: Begin by importing the pyplot module, typically aliased as plt.

import matplotlib.pyplot as plt

2. Prepare Data: For the histogram, we generate 1000 random values using NumPy's built-in random
functions, for example np.random.randn(1000).

import numpy as np
# Generate some sample data
data = np.random.randn(1000)

3. Plot the Data: Use plt.hist() to create the histogram. It takes an array or sequence of numerical data
as input.

# Create a histogram
plt.hist(data)

4. Bins: Bins are consecutive, non-overlapping intervals that divide the range of the data. You can
specify the number of bins using the bins argument (e.g., bins=20) or provide a list of bin edges
(e.g., bins=[0, 1, 2, 3]).

plt.hist(data, bins=30, edgecolor='black') # 30 bins with black edges

Customization:

The plt.hist() function offers various parameters for customization, including:

 color: Sets the fill color of the bars.

 edgecolor: Sets the color of the bar edges.

 density: If True, the histogram displays probability density instead of counts.

 cumulative: If True, the histogram displays the cumulative frequency.

 log: If True, uses a logarithmic scale for the y-axis.

5. Labels and Title:

You can add labels to the x and y axes using plt.xlabel() and plt.ylabel(), and a title to the plot
using plt.title().

plt.xlabel("X-Axis Label")
plt.ylabel("Y-Axis Label")
plt.title("My Histogram")

6. Display the Plot: Finally, use plt.show() to render and display the generated histogram.
plt.show()

Example:
import matplotlib.pyplot as plt
import numpy as np
# Generate some sample data
data = np.random.randn(1000)
# Create a basic histogram
plt.hist(data, bins='auto', color='skyblue', edgecolor='navy')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Distribution of Data')
plt.show()

Output:
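
The effect of the bins and density parameters can be checked numerically as well as visually. The sketch below assumes normally distributed sample data from a seeded random generator (the seed value 0 is arbitrary) so that repeated runs give the same result.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so the sample is the same on every run
rng = np.random.default_rng(0)
data = rng.normal(size=1000)

# plt.hist() returns the bar heights, the bin edges and the drawn bars
counts, edges, patches = plt.hist(data, bins=20, density=True,
                                  color='skyblue', edgecolor='navy')

# With density=True the total area of the bars is 1
area = np.sum(counts * np.diff(edges))

# np.histogram() computes the plain counts without drawing anything
raw_counts, raw_edges = np.histogram(data, bins=20)

print("Number of bins:", len(counts))
print("Area under the density histogram:", area)
print("Total count:", raw_counts.sum())
```

20 bins always have 21 edges, because every bin needs a left and a right boundary.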

4. Scatter Chart / Plot:

A scatter plot displays individual data points, representing the relationship between two
numerical variables. Each point on the plot corresponds to a pair of values, with one variable plotted
on the horizontal (x) axis and the other on the vertical (y) axis.

When to use:

 To identify correlations or relationships.
 When dealing with paired numerical data.
 To detect patterns, clusters, and outliers.
 For data that is not continuous or where the relationship is unknown.

Creation of Scatter Chart:

1. Import Matplotlib: Begin by importing the pyplot module, typically aliased as plt.

import matplotlib.pyplot as plt

2. Prepare Data: Define your x-axis and y-axis data, usually as Python lists or NumPy arrays. The
number of elements in both lists/arrays should be equal, as each x-value corresponds to a y-value.

month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'] # X axis value
sales = [120, 150, 125, 160, 139, 230] # Y axis value

3. Plot the Data: Use plt.scatter() to create the scatter chart, passing your x-values and y-values as
arguments.

# Create Scatter Plot
plt.scatter(x_values, y_values)

4. Add Labels and Title (Optional but Recommended): Enhance readability by adding labels for the
x and y axes and a title for the plot.

plt.xlabel("X-Axis Label")
plt.ylabel("Y-Axis Label")
plt.title("My Scatter Chart")

5. Display the Plot: Finally, use plt.show() to render and display the generated scatter chart.

plt.show()

Example:

import matplotlib.pyplot as plt

# Sample data
month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [120, 150, 125, 160, 139, 230]

# Create Scatter Plot
plt.scatter(month, sales)

# Add Labels and Title
plt.xlabel("Months")
plt.ylabel("[Link] (Mobiles)")
plt.title("Monthly Sales Data")

# Display the plot
plt.show()

Output:

Customization Options:
The plt.scatter() function offers various parameters for customizing the appearance of the scatter plot:

 s: Controls the size of the markers (points). Can be a single value for all points or an array for
varying sizes.

 c: Sets the color of the markers. Can be a single color, a list of colors, or an array of values to
be mapped to a colormap.

 marker: Specifies the marker style (e.g., 'o' for circle, 's' for square, '^' for triangle).

 alpha: Adjusts the transparency of the markers (value between 0 and 1).

 edgecolors: Sets the color of the marker edges.

 cmap: Defines the colormap to use when c is an array of values.
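
These parameters can be combined as in the sketch below. The data is hypothetical: 50 random points from a seeded generator, with marker size growing with x and color mapped from y.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical paired data: y grows roughly with x, plus some noise
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 2 * x + rng.normal(scale=2, size=50)

sizes = 20 + 10 * x      # s: one marker size per point

points = plt.scatter(x, y,
                     s=sizes,              # varying marker sizes
                     c=y,                  # values mapped to colors
                     cmap='viridis',       # colormap used for c
                     marker='o',
                     alpha=0.7,            # slight transparency
                     edgecolors='black')
plt.colorbar(points, label='y value')      # legend for the color scale
plt.xlabel("x")
plt.ylabel("y")
plt.title("Customized Scatter Plot")
```

Calling plt.show() afterwards displays the plot.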

5. Pie Chart:

A pie chart is a circular statistical plot that can display only one series of data. The whole
circle represents 100% of the given data, and the area of each slice represents that part's percentage
of the total. The slices of the pie are called wedges. The area of a wedge is determined by the
length of the arc of the wedge.
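
The share of each wedge is simple arithmetic: each value is divided by the total of all values. The following is a small worked sketch with hypothetical sales figures, computing the percentage and the angle of each wedge.

```python
month = ['Jan', 'Feb', 'Mar', 'Apr']
sales = [300, 250, 230, 220]     # hypothetical values

total = sum(sales)               # 1000

# Percentage of the whole pie taken by each wedge
percentages = [value * 100 / total for value in sales]

# Angle of each wedge in degrees (the full circle is 360)
angles = [value * 360 / total for value in sales]

for name, pct, angle in zip(month, percentages, angles):
    print(f"{name}: {pct:.1f}% of the pie, {angle:.1f} degrees")
```

The percentages add up to 100 and the angles add up to 360.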

Creation of Pie Chart:

1. Import Matplotlib: Begin by importing the pyplot module, typically aliased as plt.

import matplotlib.pyplot as plt

2. Prepare Data: Define your data, usually as lists or NumPy arrays.

month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [120, 150, 125, 160, 139, 230]

3. Plot the Data: To create a pie chart in Matplotlib, the plt.pie() function is used. This function
takes an array of numerical data representing the size of each slice and can be customized with various
parameters.

# Create Pie Chart
plt.pie(numerical_data)

4. Add Title (Optional but Recommended): Enhance readability by adding a title for the plot.

plt.title("My Pie Chart")

5. Display the Plot: Finally, use plt.show() to render and display the generated pie chart.

plt.show()

Example:

import matplotlib.pyplot as plt

# Sample data
month = ['Jan', 'Feb', 'Mar', 'Apr']
sales = [300, 250, 230, 220]

# Create the pie chart
plt.pie(sales,labels=month,autopct='%1.1f%%')
plt.title("Monthly Sales for Mobiles")

# Display the plot
plt.show()

Output:

Adding Labels and Customizations:

The pie() function has several parameters for customization:

 labels: An array of strings providing labels for each slice.

 colors: A list of colors for the slices.

 explode: An array of values indicating how far each slice should be "exploded" or separated
from the center.

 autopct: A format string (e.g., '%1.1f%%') to display the percentage value on each slice.

 shadow: A boolean value to add a shadow effect.

 startangle: An angle in degrees to specify the starting point of the first slice.

 wedgeprops: A dictionary to customize the appearance of the wedges (e.g., {'linewidth': 1,
'edgecolor': 'black'}).

Ex:

import matplotlib.pyplot as plt

# Sample data
month = ['Jan', 'Feb', 'Mar', 'Apr']
sales = [300, 250, 230, 220]

mycolors = ["blue", "green", "red", "orange"]
myexplode = [0.1, 0, 0, 0] # Explode the first slice ('Jan')

# Create the pie chart with customizations
plt.pie(sales,
        labels=month,
        colors=mycolors,
        explode=myexplode,
        autopct='%1.1f%%',
        shadow=True,
        startangle=90)

# Ensure the pie chart is circular
plt.axis('equal')

# Add a title
plt.title("Monthly Sales for Mobiles")

# Display the chart
plt.show()

Output:

Python Interview Programs

Python Programs on Numbers:

1. Write a program to reverse an integer in Python.
2. Write a program in Python to check whether an integer is an Armstrong number or not.
3. Write a program in Python to check whether a given number is prime or not.
4. Write a program in Python to print the Fibonacci series using the iterative method.
5. Write a program in Python to print the Fibonacci series using the recursive method.
6. Write a program in Python to check whether a number is a palindrome or not using the
iterative method.
7. Write a program in Python to check whether a number is a palindrome or not using the
recursive method.
8. Write a program in Python to find the greatest among three integers.
9. Write a program in Python to check if a number is binary.
10. Write a program in Python to find the sum of the digits of a number using recursion.
11. Write a program in Python to swap two numbers without using a third variable.
12. Write a program in Python to swap two numbers using a third variable.
13. Write a program in Python to find the prime factors of a given integer.
14. Write a program in Python to add two integers without using an arithmetic operator.
15. Write a program in Python to check whether a given number is perfect or not.
16. Python Program to find the Average of numbers with explanations.
17. Python Program to calculate factorial using iterative method.
18. Python Program to calculate factorial using recursion.
19. Python Program to check a given number is even or odd.
20. Python program to print the first n prime numbers with explanation.
21. Python Program to print Prime Number in a given range.
22. Python Program to find Smallest number among three.
23. Python program to calculate the power using the pow() function.
24. Python Program to calculate the power without using the pow() function (using a for loop).
25. Python Program to calculate the power without using the pow() function (using a while
loop).
26. Python Program to calculate the square of a given number.
27. Python Program to calculate the cube of a given number.
28. Python Program to calculate the square root of a given number.
29. Python program to calculate LCM of given two numbers.
30. Python Program to find GCD or HCF of two numbers.
31. Python Program to find GCD of two numbers using recursion.
32. Python Program to Convert Decimal Number into Binary.
33. Python Program to convert Decimal number to Octal number.
34. Python Program to check the given year is a leap year or not.
35. Python Program to convert Celsius to Fahrenheit.
36. Python Program to convert Fahrenheit to Celsius.
37. Python program to calculate Simple Interest with explanation.
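
As a starting point for the exercises above, here is a minimal sketch of three of them: reversing an integer, the Armstrong check, and the primality check. The function names are illustrative assumptions rather than anything prescribed by the course, and other approaches are equally valid.

```python
def reverse_integer(n):
    """Reverse the decimal digits of n, keeping its sign."""
    sign = -1 if n < 0 else 1
    n = abs(n)
    reversed_n = 0
    while n > 0:
        reversed_n = reversed_n * 10 + n % 10  # append the last digit
        n //= 10                               # drop the last digit
    return sign * reversed_n


def is_armstrong(n):
    """True if n equals the sum of its digits, each raised to the digit count."""
    if n < 0:
        return False
    digits = str(n)
    return n == sum(int(d) ** len(digits) for d in digits)


def is_prime(n):
    """True if n is prime, using trial division up to the square root of n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


print(reverse_integer(1234))  # 4321
print(is_armstrong(153))      # True, since 1**3 + 5**3 + 3**3 == 153
print(is_prime(97))           # True
```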

Python Programs on String:

1. Python program to remove a given character from a string.
2. Python Program to count occurrences of a given character in a string.
3. Python Program to check if two Strings are Anagram.
4. Python program to check a String is palindrome or not.
5. Python program to check given character is vowel or consonant.
6. Python program to check given character is digit or not.
7. Python program to check given character is digit or not using isdigit() method.
8. Python program to replace the string space with a given character.
9. Python program to replace the string space with a given character using replace()
method.
10. Python program to convert lowercase char to uppercase of string.
11. Python program to convert lowercase vowel to uppercase in string.
12. Python program to delete vowels in a given string.
13. Python program to count the Occurrence Of Vowels & Consonants in a String.
14. Python program to print the highest frequency character in a String.
15. Python program to Replace First Occurrence Of Vowel With '-' in String.
16. Python program to count alphabets, digits and special characters.
17. Python program to separate characters in a given string.
18. Python program to remove blank space from string.
19. Python program to concatenate two strings using join() method.
20. Python program to concatenate two strings without using join() method.
21. Python program to remove repeated character from string.
22. Python program to calculate sum of integers in string.
23. Python program to print all non-repeating characters in a string.
24. Python program to copy one string to another string.
25. Python Program to sort characters of string in ascending order.
26. Python Program to sort character of string in descending order.
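
A few of the string exercises above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, comparisons ignore letter case, and the anagram check also ignores spaces.

```python
from collections import Counter


def is_anagram(first, second):
    """True if both strings use the same letters the same number of times."""
    def normalize(text):
        return Counter(text.replace(" ", "").lower())
    return normalize(first) == normalize(second)


def is_palindrome(text):
    """True if the string reads the same forwards and backwards."""
    lowered = text.lower()
    return lowered == lowered[::-1]


def count_vowels_consonants(text):
    """Return (vowel count, consonant count), ignoring non-letters."""
    vowels = consonants = 0
    for ch in text.lower():
        if ch in "aeiou":
            vowels += 1
        elif ch.isalpha():
            consonants += 1
    return vowels, consonants


print(is_anagram("Listen", "Silent"))           # True
print(is_palindrome("Madam"))                   # True
print(count_vowels_consonants("Data Science"))  # (5, 6)
```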

Python Programs on Array:

1. Write a program in Python to find the missing number in an array that stores the
numbers 1-100 with one number missing.
2. Write a program in Python to find the duplicate numbers in an array of the numbers
1-100 that contains multiple duplicates.
3. Write a program in Python to find all pairs in an array of integers whose sum is
equal to a given number.
4. Write a program in Python to check whether two arrays are equal in size or not.
5. Write a program in Python to find the largest and smallest numbers in an array.
6. Write a program in Python to find the second highest number in an integer array.
7. Write a program in Python to find the top two maximum numbers in an array.
8. Write a program in Python to remove duplicate elements from an array.
9. Python program to find the top two maximum numbers in an array.
10. Python program to print array in reverse Order.
11. Python program to reverse an Array in two ways.
12. Python Program to calculate length of an array.
13. Python program to insert an element at end of an Array.
14. Python program to insert element at a given location in Array.
15. Python Program to delete element at end of Array.
16. Python Program to delete given element from Array.
17. Python Program to delete element from array at given index.
18. Python Program to find sum of array elements.
19. Python Program to print all even numbers in array.
20. Python Program to print all odd numbers in array.
21. Python program to perform left rotation of array elements by two positions.
22. Python program to perform right rotation in array by 2 positions.
23. Python Program to merge two arrays.
24. Python Program to find highest frequency element in array.
25. Python Program to add two numbers using recursion.
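
The sketch below illustrates three of the array exercises above, using Python lists to stand in for arrays. The function names are hypothetical, and the missing-number approach assumes exactly one value from 1 to n is absent.

```python
def find_missing(numbers, n=100):
    """Return the one number from 1..n that is absent from numbers."""
    expected_total = n * (n + 1) // 2  # sum of 1..n
    return expected_total - sum(numbers)


def pairs_with_sum(values, target):
    """Return pairs (a, b) with a + b == target, where a appears before b."""
    seen = set()
    pairs = []
    for value in values:
        if target - value in seen:
            pairs.append((target - value, value))
        seen.add(value)
    return pairs


def rotate_left(values, positions=2):
    """Return a new list rotated left by the given number of positions."""
    if not values:
        return []
    positions %= len(values)
    return values[positions:] + values[:positions]


numbers = [x for x in range(1, 101) if x != 37]
print(find_missing(numbers))                     # 37
print(pairs_with_sum([2, 4, 3, 5, 7, 8, 9], 7))  # [(4, 3), (2, 5)]
print(rotate_left([1, 2, 3, 4, 5]))              # [3, 4, 5, 1, 2]
```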

Python Programs on List:

1. Python Program to Find Largest Number in a List
2. Python Program to Find Second Largest Number in a List
3. Python Program to Print Largest Even and Largest Odd Number in a List
4. Python Program to Split Even and Odd Elements into Two Lists
5. Python Program to Find Average of a List
6. Python Program to Print Sum of Negative Numbers, Positive Even & Odd Numbers
in a List
7. Python Program to Count Occurrences of Element in List
8. Python Program to Find the Sum of Elements in a List using Recursion
9. Python Program to Find the Length of a List using Recursion
10. Python Program to Merge Two Lists and Sort it
11. Python Program to Remove Duplicates from a List
12. Python Program to Swap the First and Last Element in a List
13. Python Program to Sort a List According to the Second Element in Sublist
14. Python Program to Return the Length of the Longest Word from the List of Words
15. Python Program to Find the Number Occurring Odd Number of Times in a List
16. Python Program to Generate Random Numbers from 1 to 20 and Append Them to the
List
17. Python Program to Remove the ith Occurrence of the Given Word in a List
18. Python Program to Find the Cumulative Sum of a List
19. Python Program to Find the Union of Two Lists
20. Python Program to Find the Intersection of Two Lists
21. Python Program to Flatten a List without using Recursion
22. Python Program to Find the Total Sum of a Nested List Using Recursion
23. Python Program to Flatten a Nested List using Recursion
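
For the recursive list exercises above, a minimal sketch might look like this. The function names are hypothetical, and only nested lists (not tuples or other containers) are treated as nesting.

```python
def flatten(nested):
    """Return a flat list with the elements of an arbitrarily nested list."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(flatten(item))  # recurse into the sublist
        else:
            flat.append(item)
    return flat


def nested_sum(nested):
    """Return the total of all numbers in an arbitrarily nested list."""
    total = 0
    for item in nested:
        total += nested_sum(item) if isinstance(item, list) else item
    return total


def second_largest(values):
    """Return the second largest distinct value in the list."""
    distinct = sorted(set(values))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct values")
    return distinct[-2]


data = [1, [2, [3, 4]], 5]
print(flatten(data))                 # [1, 2, 3, 4, 5]
print(nested_sum(data))              # 15
print(second_largest([4, 9, 9, 2]))  # 4
```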

Python Programs on Tuples:

1. Write a program to input n numbers from the user. Store these numbers in a tuple.
Print the maximum and minimum number from this tuple.
2. Python Program to Create a List of Tuples with the First Element as the Number and
Second Element as the Square of the Number
3. Python Program to Remove All Tuples in a List Outside the Given Range
4. Python Program to Sort a List of Tuples in Increasing Order by the Last Element in
Each Tuple
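
Two of the tuple exercises above can be sketched in a few lines. The function names are hypothetical and the sample values are chosen only for illustration.

```python
def number_square_pairs(start, end):
    """Return a list of (number, square) tuples for start..end inclusive."""
    return [(n, n ** 2) for n in range(start, end + 1)]


def sort_by_last(tuples):
    """Return the tuples sorted in increasing order of their last element."""
    return sorted(tuples, key=lambda item: item[-1])


print(number_square_pairs(1, 4))               # [(1, 1), (2, 4), (3, 9), (4, 16)]
print(sort_by_last([(1, 3), (4, 1), (2, 2)]))  # [(4, 1), (2, 2), (1, 3)]
```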

Python Programs on Dictionary:

1. Python Program to Create Dictionary from an Object
2. Python Program to Check if a Key Exists in a Dictionary or Not
3. Python Program to Add a Key-Value Pair to the Dictionary
4. Python Program to Find the Sum of All the Items in a Dictionary
5. Python Program to Multiply All the Items in a Dictionary
6. Python Program to Remove a Key from a Dictionary
7. Python Program to Concatenate Two Dictionaries
8. Python Program to Map Two Lists into a Dictionary
9. Python Program to Create a Dictionary with Key as First Character and Value as
Words Starting with that Character
10. Python Program to Create Dictionary that Contains Number
11. Python Program to Count the Frequency of Each Word in a String using Dictionary
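
As an illustration of the dictionary exercises above, the following minimal sketch counts word frequencies and maps two lists into a dictionary. The function names are hypothetical, and words are assumed to be separated by whitespace.

```python
def word_frequency(text):
    """Return a dictionary mapping each word to the number of times it occurs."""
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def map_lists(keys, values):
    """Pair the items of two lists position by position into a dictionary."""
    return dict(zip(keys, values))


print(word_frequency("the cat and the hat"))
# {'the': 2, 'cat': 1, 'and': 1, 'hat': 1}
print(map_lists(['Jan', 'Feb'], [300, 250]))
# {'Jan': 300, 'Feb': 250}
```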

Python Programs on Sets:

1. Find the size of a Set in Python
2. Iterate over a set in Python
3. Python - Maximum and Minimum in a Set
4. Python - Remove items from Set
5. Python - Check if two lists have at least one element in common
6. Python program to find common elements in three lists using sets
7. Python - Find missing and additional values in two lists
8. Python program to find the difference between two lists
9. Python Set difference to find lost element from a duplicated array
10. Python program to count number of vowels using sets in given string
11. Concatenated string with uncommon characters in Python
12. Python - Program to accept the strings which contain all vowels
13. Python - Check if a given string is binary string or not
14. Python set to check if string is pangram
15. Python Set - Pairs of complete strings in two sets
16. Python program to check whether a given string is Heterogram or not
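
Several of the set exercises above reduce to simple set operations. The sketch below is one possible approach, with hypothetical function names, and it considers only the 26 lowercase English letters.

```python
import string


def is_pangram(text):
    """True if the text contains every letter of the English alphabet."""
    return set(string.ascii_lowercase) <= set(text.lower())


def is_heterogram(text):
    """True if no letter occurs more than once in the text."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    return len(letters) == len(set(letters))


def common_in_three(first, second, third):
    """Return the sorted elements that appear in all three lists."""
    return sorted(set(first) & set(second) & set(third))


print(is_pangram("The quick brown fox jumps over the lazy dog"))  # True
print(is_heterogram("python"))                                    # True
print(common_in_three([1, 2, 3], [2, 3, 4], [3, 2, 5]))           # [2, 3]
```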

Python Programs on Classes & Objects:

1. Python Program to Create a Class which Performs Basic Calculator Operations
2. Python Program to Append, Delete and Display Elements of a List using Classes
3. Python Program to Find the Area of a Rectangle using Classes
4. Python Program to Find the Area and Perimeter of the Circle using Class
5. Python Program to Create a Class in which One Method Accepts a String from the
User and Another Prints it
6. Python Program to Create a Class and Get All Possible Distinct Subsets from a Set

Python Programs on File Handling:

1. Python Program to Read the Contents of the File
2. Python Program to Copy One File to Another File
3. Python Program to Count the Number of Lines in Text File
4. Python Program to Count the Number of Blank Spaces in a Text File
5. Python Program to Count the Occurrences of a Word in a Text File
6. Python Program to Count the Number of Words in a Text File
7. Python Program to Capitalize First Letter of Each Word in a File
8. Python Program to Count the Number of Times a Letter Appears in the Text File
9. Python Program to Extract Numbers from Text File
10. Python Program to Print the Contents of File in Reverse Order
11. Python Program to Append the Content of One File to the End of Another File
12. Python Program to Read a String from the User and Append it into a File
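
The line-counting and word-counting exercises above can be sketched as follows. This is a minimal illustration: the function name file_stats is hypothetical, and the demonstration writes its own temporary file so that it does not depend on any existing file.

```python
import os
import tempfile


def file_stats(path):
    """Return (number of lines, number of words) for a text file."""
    lines = words = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            lines += 1
            words += len(line.split())  # words are separated by whitespace
    return lines, words


# Demonstration on a temporary file with two lines of sample text
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("data science with python\nnumpy and pandas\n")
    sample_path = tmp.name

print(file_stats(sample_path))  # (2, 7)
os.remove(sample_path)          # clean up the temporary file
```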

Python Programs on Different Types of Patterns:


