
Python for Data Analysis Overview

Python is a versatile, high-level programming language known for its simplicity and readability, making it a popular choice for data analysis and development across various platforms. It supports multiple programming paradigms and has a rich ecosystem of libraries that facilitate data manipulation, visualization, and machine learning. The document also covers Python's unique features, data types, and basic programming concepts, including identifiers, keywords, and operators.

Uploaded by

piyushbohra579
Copyright
© © All Rights Reserved
PYTHON FOR DATA ANALYSIS

UNIT: I

Introduction: - Python is a free, general-purpose programming language with a clean, readable syntax. It is
available across many platforms, including Windows, Linux and Mac OS. Because it is easy to
learn and offers object-oriented features, Python is used to develop and demonstrate
applications quickly. Developers spend more time reading code than
writing it, so Python's readability can speed up software development. Hosting solutions for Python applications
are also inexpensive. A versatile language like Python can be used not only to write simple scripts for
handling file operations but also to develop heavily trafficked websites for corporate IT
organizations.

What is python: - Python is a high-level, interpreted programming language known for its simplicity,
readability, and versatility. Created by Guido van Rossum and first released in 1991, Python emphasizes
code clarity and allows developers to express concepts in fewer lines of code compared to many other
languages. It supports multiple programming paradigms, including procedural, object-oriented, and
functional programming, making it suitable for a wide range of applications—from web development
and data analysis to artificial intelligence and automation. With a vast ecosystem of libraries and
frameworks, Python has become one of the most popular and widely used languages in both academia
and industry. Its user-friendly syntax and active community make it an ideal choice for beginners and
experienced programmers alike.

Why python is first choice for data scientists: - Python is the data scientist's language of choice because
it provides a special blend of simplicity, flexibility, and high-level power that makes data work both
productive and easy to understand. Its simple, readable syntax enables users, even those without
formal programming skills, to learn quickly and concentrate on solving problems instead of
struggling with complicated code. Python's enormous universe of libraries and frameworks is another
key strength: libraries like NumPy and Pandas make data manipulation easy, Matplotlib and Seaborn
make data visualization simple, and machine learning libraries like Scikit-learn, TensorFlow, and
PyTorch allow sophisticated analytics and predictive modeling. Python also easily fits with other
technologies and platforms, supports automation, and is supported by a wide global community that
keeps adding new tools, tutorials, and support. This makes it not only an efficient option for data
science work but also a future-proof one since it develops along with the field itself. Whether you're
scrubbing dirty datasets, developing advanced models, or implementing AI solutions, Python has a
single, graceful environment that enables data scientists to transform raw data into actionable
knowledge.

History of python: - Python is a widely used programming language today. From first-semester
students and final-year projects to industry professionals, Python is used to develop
programs for graphics applications, text processing, and data analysis, among others. Python was
developed by Guido van Rossum, a Dutch programmer, and released in 1991. The name is inspired by
the BBC comedy show Monty Python’s Flying Circus. Python is a successor of the ABC programming
language. At the time of this writing, Python 3 is the latest major release of Python (on your computer,
you may notice the program python3). For the last 20 years, Python has been among the Top 10 most
popular programming languages. At the time of this writing (July 2022), Python is the most popular
language, surpassing C and Java (according to the TIOBE index).
Identifiers: - An identifier is a name given to a variable, function, class or module. Identifiers may be
one or more characters long and must follow these rules:
• Identifiers can be a combination of lowercase letters (a to z), uppercase letters (A to Z), digits (0 to 9)
and the underscore (_). Names like myCountry, other_1 and good_morning are all valid examples.
• An identifier must begin with a letter (A to Z or a to z) or an underscore (_). It cannot start with a
digit, although digits are allowed everywhere else: 1plus is invalid, but plus1 is perfectly fine.
• Keywords cannot be used as identifiers.
• Spaces and special symbols like !, @, #, $, % etc. cannot be used in identifiers.
• An identifier can be of any length.
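
The rules above can be checked programmatically. The following is a minimal sketch, assuming the built-in str.isidentifier() method and the standard-library keyword module; the helper name is_valid_identifier is hypothetical and is introduced only for illustration.

```python
import keyword

def is_valid_identifier(name):
    """Return True if name obeys the identifier rules and is not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Candidate names based on the rules listed above
candidates = ["myCountry", "other_1", "good_morning", "plus1",
              "1plus", "for", "my name", "price$"]

for name in candidates:
    print(name, "->", is_valid_identifier(name))
```

The first four names should be reported as valid; the rest fail because of a leading digit, a keyword, a space, or a special symbol.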

Keywords: - Keywords are a list of reserved words that have predefined meaning. Keywords are special
vocabulary and cannot be used by programmers as identifiers for variables, functions, constants or with
any identifier name. Attempting to use a keyword as an identifier name will cause an error. The
following TABLE 2.1 shows the Python keywords
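
The keyword list can also be inspected from within Python itself. This is a small sketch, assuming the standard-library keyword module; the exact list depends on the Python version that runs the code, and the variable name error_raised is only illustrative.

```python
import keyword

# keyword.kwlist holds every reserved word of the running interpreter
print(len(keyword.kwlist))
print(keyword.kwlist)

# Using a keyword as a variable name is rejected with a SyntaxError
error_raised = False
try:
    compile("class = 5", "<example>", "exec")
except SyntaxError as err:
    error_raised = True
    print("SyntaxError:", err.msg)
```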

Unique features of python:- Python is a dynamic, high-level, free and open-source, interpreted
programming language. It supports object-oriented programming as well as procedural
programming. In Python, we don't need to declare the type of a variable because it is a dynamically typed
language. For example, after x = 10 the name x refers to an int, but it can later be assigned a value of
any other type, such as a string. The characteristics that describe the Python programming language are listed below.
1. Free and Open Source - Python is freely available from the official website.
Since it is open source, the source code is also available to the public, so you
can download it, use it and share it.
2. Easy to code - Python is a high-level programming language and is easier to learn
than languages like C, C#, JavaScript or Java. It is easy to write code in
Python, and the basics can be learned in a few hours or days. It is also a
developer-friendly language.
3. Easy to Read - Python's syntax is straightforward. A code block is defined by its
indentation rather than by semicolons or braces.
4. Object-Oriented Language - One of the key features of Python is object-oriented programming.
Python supports classes, objects, encapsulation and other object-oriented concepts.
5. High-Level Language - Python is a high-level language. When we write programs in Python, we do
not need to remember the system architecture, nor do we need to manage memory.
6. Large Community Support - Python has gained popularity over the years, and questions are
regularly answered by the large Stack Overflow community. Such sites have already
provided answers to many questions about Python, so Python users can consult them as needed.
7. Portable Language - Python is a portable language. For example, if we
have Python code written on Windows and want to run it on other platforms such as Linux,
Unix or Mac, we usually do not need to change it.
8. Integrated Language - Python is also an integrated language because we can easily
integrate Python with other languages like C, C++, etc.

# Dynamic typing example
x = 5        # integer
x = "Hello"  # now it is a string
print(x)

Output:
Hello

Installation and Environment setup:


Installing Python on Windows involves the following steps:
Step 1): Go to the official Python website at [Link]
Step 2): Choose the latest Python release for Windows.
Step 3): After choosing the release, click Download Python.
Step 4): Click Install Now; you can also add [Link] path.
Step 5): After the installation completes, Python is installed on your system.
After a successful installation of Python, IDLE (Integrated Development and Learning Environment) is
also installed on the local computer, along with some packages. For simple programs, we can
use IDLE.
Python programs can also be written in Notepad and run from the command prompt. To do this, follow
these steps:
• Open Notepad.
• Write the code in it.
• Save the file with the .py extension.
• Open the terminal/command prompt.
• Type the following command: py [Link]
• The output will be displayed.
Example: We have created a file named [Link] with the content print("Hello world").

Writing your first Python program

Method 1: Using IDLE
print("Welcome to Python Programming!")

Output:
Welcome to Python Programming!

Method 2: Using Command Line
print("Hello python world!")

Run using:
python [Link]
Output:
Hello python world!

Data types:-
Data types in Python are a way to classify data items. They represent the kind of value, which
determines what operations can be performed on that data. Since everything is an object in Python
programming, Python data types are classes and variables are instances (objects) of these classes.
The following are standard or built-in data types in Python:
• Numeric: int, float, complex
• Sequence Type: str, list, tuple
• Mapping Type: dict
• Boolean: bool
• Set Type: set, frozenset
• Binary Types: bytes, bytearray, memoryview

The code below assigns the variable 'x' values of several Python data types: int, float, string, list and tuple.
Each assignment replaces the previous value, so 'x' takes on the data type and value of the most recent
assignment.
x = 50 # int
x = 60.5 # float
x = "Hello World" # string
x = ["geeks", "for", "geeks"] # list
x = ("geeks", "for", "geeks") # tuple
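
To see which class a value belongs to, the built-in type() function can be used. The sketch below is illustrative only; the sample values and the names samples, first_type and second_type are assumptions, not part of the original notes.

```python
# Each value carries its own type; type() reports the class of the object
samples = [50, 60.5, 3 + 4j, "Hello World", ["a", "b"], ("a", "b"),
           {"k": 1}, True, {1, 2}, frozenset({1, 2}), b"raw"]

for value in samples:
    print(repr(value), "->", type(value).__name__)

# A variable can be rebound to a value of a different type
x = 50
first_type = type(x)
x = "Hello World"
second_type = type(x)
print(first_type, second_type)
```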

List Data Type


Lists are similar to arrays found in other languages. They are ordered, mutable collections of
items. A list is very flexible because its items do not need to be of the same type.

Creating a List in Python

Lists in Python can be created by placing a comma-separated sequence of items inside square brackets [ ].
a = []
a = [1, 2, 3]
print(a)
b = ["Geeks", "For", "Geeks", 4, 5]
print(b)

Output
[1, 2, 3]
['Geeks', 'For', 'Geeks', 4, 5]
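
Because lists are mutable, they can be changed after creation. A short sketch of what that means in practice, using only built-in list methods (the variable names are illustrative):

```python
numbers = [1, 2, 3]

numbers.append(4)    # add an item at the end
numbers[0] = 10      # replace the first item in place
numbers.remove(2)    # delete the first occurrence of the value 2
print(numbers)       # [10, 3, 4]

mixed = ["Geeks", 4, 5.5, [1, 2]]   # items may have different types
print(len(mixed), mixed[-1])        # 4 [1, 2]
```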

Tuple Data Type


A tuple is an ordered collection of Python objects. The main difference between a tuple and a list is that
tuples are immutable: a tuple cannot be modified after it is created.

Creating a Tuple in Python

In Python, tuples are created by placing a sequence of values separated by commas, with or without
parentheses to group the sequence. Tuples can contain any number of elements of
any datatype (like strings, integers, lists, etc.). Tuples can also be created with a single element, but it
is a bit tricky: having one element in the parentheses is not sufficient; there must be a trailing comma
to make it a tuple.

tup1 = ()

tup2 = ('Geeks', 'For')


print("\nTuple with the use of String: ", tup2)

Output
Tuple with the use of String: ('Geeks', 'For')
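
The single-element rule and immutability can be demonstrated with a few lines. This is a minimal sketch; the variable names are chosen only for illustration.

```python
not_a_tuple = ("Geeks")    # parentheses alone: this is just a string
single = ("Geeks",)        # the trailing comma makes a one-element tuple
no_parens = 1, 2, 3        # parentheses are optional

print(type(not_a_tuple).__name__)   # str
print(type(single).__name__)        # tuple
print(no_parens)                    # (1, 2, 3)

# Tuples are immutable: item assignment raises TypeError
try:
    single[0] = "For"
except TypeError as err:
    print("TypeError:", err)
```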

Access Tuple Items

In order to access tuple items refer to the index number. Use the index operator [ ] to access an item
in a tuple.
tup1 = (1, 2, 3, 4, 5)
print(tup1[0])
print(tup1[-1])
print(tup1[-3])

Output
1
5
3

Set Data Type


In Python, a set is an unordered collection of items that is iterable, mutable, and has no
duplicate elements. The order of elements in a set is undefined, and a set may contain elements of
different types.
Create a Set in Python
Sets can be created by passing an iterable object to the built-in set() function, or by placing
a sequence of items inside curly braces, separated by commas. The elements of a set need not be of
the same type; values of mixed data types can be stored in the same set.
s1 = set()

s1 = set("GeeksForGeeks")
print("Set with the use of String: ", s1)

s2 = set(["Geeks", "For", "Geeks"])


print("Set with the use of List: ", s2)

Output
Set with the use of String: {'s', 'o', 'F', 'G', 'e', 'k', 'r'}
Set with the use of List: {'Geeks', 'For'}

Access Set Items


Set items cannot be accessed by index: since sets are unordered, the items have no index.
But we can loop through the set items using a for loop, or ask whether a specified value is present in a set by
using the keyword in.
set1 = set(["Geeks", "For", "Geeks"])
print(set1)
for i in set1:
    print(i, end=" ")
print("Geeks" in set1)
Output (the order of the set items may vary)
{'For', 'Geeks'}
For Geeks True

Dictionary Data Type


A dictionary in Python is a collection that stores data like a map. Unlike other
Python data types, a dictionary holds key: value pairs. Values are looked up by key, which makes
retrieval efficient. Within each pair the key and value are separated by a colon (:), and the pairs are
separated by commas.

Create a Dictionary in Python


Values in a dictionary can be of any datatype and can be duplicated, whereas keys can’t be repeated
and must be immutable. The dictionary can also be created by the built-in function dict().

Note - Dictionary keys are case sensitive: keys with the same name but different letter case are treated
as distinct keys.
d = {}

d = {1: 'Geeks', 2: 'For', 3: 'Geeks'}


print(d)
d1 = dict({1: 'Geeks', 2: 'For', 3: 'Geeks'})
print(d1)
Output
{1: 'Geeks', 2: 'For', 3: 'Geeks'}
{1: 'Geeks', 2: 'For', 3: 'Geeks'}

Accessing Key-value in Dictionary

To access an item of a dictionary, refer to its key inside square brackets.
Dictionary elements can also be accessed with the get() method.
d = {1: 'Geeks', 'name': 'For', 3: 'Geeks'}
print(d['name'])
print(d.get(3))
Output
For
Geeks
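
The difference between square brackets and get() shows up when a key is missing. The sketch below also illustrates the case-sensitivity note; the dictionary contents are made up for the example.

```python
d = {1: 'Geeks', 'name': 'For', 'Name': 'Other'}

# Keys are case sensitive: 'name' and 'Name' are two separate keys
print(d['name'], d['Name'])

# get() returns None (or a supplied default) when the key is missing
print(d.get('missing'))
print(d.get('missing', 'not found'))

# Square brackets raise KeyError for a missing key
try:
    d['missing']
except KeyError as err:
    print("KeyError:", err)
```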

| List | Tuple | Set | Dictionary |
|------|-------|-----|------------|
| A list is a non-homogeneous data structure that stores the elements in columns of a single row or multiple rows. | A tuple is a non-homogeneous data structure that stores elements in columns of a single row or multiple rows. | The set data structure is non-homogeneous but stores the elements in a single row. | A dictionary is also a non-homogeneous data structure that stores key-value pairs. |
| The list can be represented by [ ] | A tuple can be represented by ( ) | The set can be represented by { } | The dictionary can be represented by { } |
| The list allows duplicate elements | Tuple allows duplicate elements | The set will not allow duplicate elements | The dictionary doesn't allow duplicate keys. |
| The list can be nested among all | A tuple can be nested among all | The set can be nested among all | The dictionary can be nested among all |
| Example: [1, 2, 3, 4, 5] | Example: (1, 2, 3, 4, 5) | Example: {1, 2, 3, 4, 5} | Example: {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"} |
| A list can be created using the list() function | Tuple can be created using the tuple() function. | A set can be created using the set() function | A dictionary can be created using the dict() function. |
| A list is mutable i.e. we can make any changes in the list. | A tuple is immutable i.e. we cannot make any changes in the tuple. | A set is mutable i.e. we can make any changes in the set; its elements are not duplicated. | A dictionary is mutable; its keys are not duplicated. |
| List is ordered | Tuple is ordered | Set is unordered | Dictionary is ordered (Python 3.7 and above) |
| Creating an empty list: l=[] | Creating an empty tuple: t=() | Creating a set: a=set(), b=set(a) | Creating an empty dictionary: d={} |

Python Operators
In Python programming, operators are used to perform operations on values and variables.
• Operators: Special symbols like -, +, *, /, etc.
• Operands: Values on which the operator is applied.
Types of Operators in Python

Arithmetic Operators
Python arithmetic operators are used to perform basic mathematical operations like addition,
subtraction, multiplication and division.
In Python 3.x the result of division (/) is always a floating-point number, while in Python 2.x division of
two integers gave an integer. To obtain an integer result in Python 3.x, floor division (//) is used.
a = 15
b=4
print("Addition:", a + b)
print("Subtraction:", a - b)
print("Multiplication:", a * b)
print("Division:", a / b)
print("Floor Division:", a // b)
print("Modulus:", a % b)
print("Exponentiation:", a ** b)
Output
Addition: 19
Subtraction: 11
Multiplication: 60
Division: 3.75
Floor Division: 3
Modulus: 3
Exponentiation: 50625
Comparison Operators
In Python, comparison (or relational) operators compare values. Each comparison returns either True or False
according to the condition.
a = 13
b = 33

print(a > b)
print(a < b)
print(a == b)
print(a != b)
print(a >= b)
print(a <= b)
Output
False
True
False
True
False
True

Logical Operators
Python logical operators perform logical AND, logical OR and logical NOT operations. They are used to
combine conditional statements.
The precedence of logical operators in Python, from highest to lowest, is as follows:
• Logical not
• Logical and
• Logical or
a = True
b = False
print(a and b)
print(a or b)
print(not a)
Output
False
True
False
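
The precedence order of not, and, or matters when the operators are mixed in one expression. A small sketch (the variable names a, b and c are chosen for illustration only):

```python
a, b, c = True, False, False

# 'and' binds tighter than 'or': parsed as a or (b and c)
print(a or b and c)       # True
print((a or b) and c)     # False

# 'not' binds tighter than 'and': parsed as (not a) and b
print(not a and b)        # False
print(not (a and b))      # True
```

Parentheses make the intended grouping explicit and are usually worth adding in longer conditions.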

Bitwise Operators
Python Bitwise operators act on bits and perform bit-by-bit operations. These are used to operate on
binary numbers.
Bitwise operators in Python, listed from highest to lowest precedence, are as follows:
1. Bitwise NOT
2. Bitwise Shift
3. Bitwise AND
4. Bitwise XOR
5. Bitwise OR
a = 10
b=4
print(a & b)
print(a | b)
print(~a)
print(a ^ b)
print(a >> 2)
print(a << 2)
Output
0
14
-11
14
2
40
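
The results above are easier to follow when the operands are viewed in binary. The sketch below uses the built-in format() function with a binary format specifier; the 4-digit width is an arbitrary choice for this example.

```python
a = 10   # 0b1010
b = 4    # 0b0100

# format(value, "04b") shows the value as a zero-padded binary string
print(format(a, "04b"), format(b, "04b"))   # 1010 0100
print(format(a & b, "04b"))    # 0000 -> no bit is set in both operands
print(format(a | b, "04b"))    # 1110 -> 14
print(format(a ^ b, "04b"))    # 1110 -> 14
print(format(a >> 2, "04b"))   # 0010 -> 2
print(format(a << 2, "b"))     # 101000 -> 40
print(~a == -(a + 1))          # True: ~a is -11
```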

Assignment Operators
Python Assignment operators are used to assign values to the variables. This operator is used to assign
the value of the right side of the expression to the left side operand.
Example
a = 10
b=a
print(b)
b += a
print(b)
b -= a
print(b)
b *= a
print(b)
b <<= a
print(b)
Output
10
20
10
100
102400

Iterators in Python
An iterator in Python is an object used to traverse all the elements of a collection (like lists,
tuples or dictionaries) one element at a time. It follows the iterator protocol, which involves two key
methods:
• __iter__(): Returns the iterator object itself.
• __next__(): Returns the next value from the sequence. Raises StopIteration when the sequence ends.
Why do we need iterators?
Here are some key benefits:
• Lazy Evaluation: Processes items only when needed, saving memory.
• Generator Integration: Pairs well with generators and functional tools.
• Stateful Traversal: Keeps track of where it left off.
• Uniform Looping: Same for loop works for lists, strings and more.
• Composable Logic: Easily build complex pipelines using tools like itertools.
Built-in Iterator Example
Let’s start with a simple example using a string. We will convert it into an iterator and fetch characters
one by one:
s = "GFG"
it = iter(s)

print(next(it))
print(next(it))
print(next(it))
Output
G
F
G
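
The iterator protocol can also be implemented by hand. The following is a minimal sketch of a user-defined iterator; the class name Countdown is hypothetical and does not come from the original notes.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # signals that the sequence has ended
        value = self.current
        self.current -= 1
        return value

for n in Countdown(3):
    print(n)                      # prints 3, 2, 1 on separate lines

it = iter("GFG")
print(next(it), next(it), next(it))
print(next(it, "done"))           # a default avoids StopIteration on exhaustion
```

The for loop calls __iter__() once and then __next__() repeatedly until StopIteration is raised.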

Generators in Python
A generator function is a special type of function that returns an iterator object. Instead of using return
to send back a single value, generator functions use yield to produce a series of results over time. This
allows the function to generate values and pause its execution after each yield, maintaining its state
between iterations.
Example:
def fun(max):
    cnt = 1
    while cnt <= max:
        yield cnt
        cnt += 1

ctr = fun(5)
for n in ctr:
    print(n)
Output
1
2
3
4
5

Explanation: This generator function fun yields numbers from 1 up to a specified max. Each call to next()
on the generator object resumes execution right after the yield statement, where it last left off.
Why Do We Need Generators?
• Memory Efficient: Handle large or infinite data without loading everything into memory.
• No List Overhead: Yield items one by one, avoiding full list creation.
• Lazy Evaluation: Compute values only when needed, improving performance.
• Support Infinite Sequences: Ideal for generating unbounded data like Fibonacci series.
• Pipeline Processing: Chain generators to process data in stages efficiently.
Applications of Generators in Python
Suppose we need to create a stream of Fibonacci numbers. Using a generator makes this easy: you just
call next() to get the next number without worrying about the stream ending.
Generators are especially useful for processing large data files, like logs, because:
• They handle data in small parts, saving memory
• They don’t load the entire file at once
While iterators can do similar tasks, generators are quicker to write since you don’t need to define
__next__ and __iter__ methods manually.
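
The Fibonacci stream described above could look like the following sketch. The function name fibonacci and the use of itertools.islice to take a bounded slice are choices made for this example, not part of the original notes.

```python
from itertools import islice

def fibonacci():
    """Yield Fibonacci numbers indefinitely, starting from 0."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

stream = fibonacci()
print(next(stream), next(stream), next(stream))   # 0 1 1

# islice takes a bounded slice of the unbounded stream
first_ten = list(islice(fibonacci(), 10))
print(first_ten)   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Because values are produced one at a time, the infinite loop inside the generator never exhausts memory; it only advances when next() is called.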

Python range() function


The Python range() function returns a sequence of numbers in a given range. Its most common use
is to iterate over a sequence of numbers with a Python loop.
Example
In the given example, we are printing the numbers from 0 to 4.
for i in range(5):
    print(i, end=" ")
print()
Output:
0 1 2 3 4

Syntax of Python range() function


Syntax: range(start, stop, step)

Parameters:
• start: [optional] start value of the sequence (default 0)
• stop: the value at which the sequence stops; stop itself is not included
• step: [optional] integer value denoting the difference between any two consecutive numbers in the sequence (default 1)
Return: Returns an object that represents a sequence of numbers

What is the use of the range function in Python


In simple terms, range() allows the user to generate a series of numbers within a given range.
Depending on how many arguments the user passes to the function, the user can decide where
that series of numbers will begin and end, as well as how big the difference will be between one
number and the next. The Python range() function can be called in 3 ways:
• range(stop) takes one argument.
• range(start, stop) takes two arguments.
• range(start, stop, step) takes three arguments.
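
The three call forms can be compared side by side. In this sketch, list() is used only to make the generated numbers visible; the argument values are arbitrary examples.

```python
print(list(range(5)))          # [0, 1, 2, 3, 4]
print(list(range(2, 6)))       # [2, 3, 4, 5]
print(list(range(0, 10, 3)))   # [0, 3, 6, 9]

# A negative step counts downwards; stop is still excluded
print(list(range(5, 0, -1)))   # [5, 4, 3, 2, 1]
```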

Python If Else Statements


In Python, if-else is a fundamental conditional statement used for decision-making in programming.
The if...else statement allows the execution of specific blocks of code depending on whether a condition is True or
False.
if...else Statement
The if...else statement is a control statement that helps in decision-making based on specific conditions.
If the condition in the if statement is True, the if block is executed; otherwise the else block is
executed.

Simple if-else
i = 20
if i > 0:
    print("i is positive")
else:
    print("i is 0 or Negative")

If Else in One-line
If we need to execute a single statement inside the if or else block then one-line shorthand can be
used.
a = -2

res = "Positive" if a >= 0 else "Negative"


print(res)
Output
Negative

Nested If Else Statement


Nested if...else statement occurs when if...else structure is placed inside another if or else block.
Nested If..else allows the execution of specific code blocks based on a series of conditional checks.
Example of Nested If Else Statement:
i = 10
if i == 10:
    # First if statement
    if i < 15:
        print("i is smaller than 15")

    # Nested if...else statement
    if i < 12:
        print("i is smaller than 12 too")
    else:
        print("i is 12 or greater")
else:
    print("i is not equal to 10")

Output:
i is smaller than 15
i is smaller than 12 too

if…elif…else Statement
if-elif-else statement in Python is used for multi-way decision-making. This allows us to check multiple
conditions sequentially and execute a specific block of code when a condition is True. If none of the
conditions are true, the else block is executed.
Example:
i = 25
if i == 10:
    print("i is 10")
elif i == 15:
    print("i is 15")
elif i == 20:
    print("i is 20")
else:
    print("i is not present")

Output:
i is not present

Loops in Python - For, While and Nested Loops


Loops in Python are used to repeat actions efficiently. The main types are for loops (iterating over
items) and while loops (based on conditions).
For Loop
A for loop is used to iterate over a sequence such as a list, tuple, string or range. It executes a
block of code repeatedly, once for each item in the sequence.
n = 4
for i in range(0, n):
    print(i)
Output
0
1
2
3

Explanation: This code prints the numbers from 0 to 3 (inclusive) using a for loop that iterates over a
range from 0 to n-1 (where n = 4).
Example: Iterating Over List, Tuple, String, Dictionary and Set Using for Loops in Python
li = ["geeks", "for", "geeks"]
for x in li:
    print(x)

tup = ("geeks", "for", "geeks")
for x in tup:
    print(x)

s = "abc"
for x in s:
    print(x)

d = dict({'x': 123, 'y': 354})
for x in d:
    print("%s %d" % (x, d[x]))

set1 = {10, 30, 20}
for x in set1:
    print(x)
Output
geeks
for
geeks
geeks
for
geeks
a
b
c
x 123
y 354
10
20
30

While Loop
In Python, a while loop is used to execute a block of statements repeatedly as long as a given condition
remains true. When the condition becomes false, the line immediately after the loop in the program is
executed.
In the code below, the loop runs as long as the condition cnt < 3 is true. It increments the counter by 1 on
each iteration and prints "Hello Geek" three times.
cnt = 0
while cnt < 3:
    cnt = cnt + 1
    print("Hello Geek")
Output
Hello Geek
Hello Geek
Hello Geek

Infinite While Loop


If we want a block of code to execute an infinite number of times, we can use a while loop in Python.
The code below uses a while loop with the condition True, which means that the loop will run
indefinitely until we break out of it using the break keyword or some other logic.
while True:
    print("Hello Geek")
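
In practice a while True loop is usually paired with an exit condition. A minimal sketch, assuming we want to stop after three iterations (the counter name count is illustrative):

```python
count = 0
while True:
    count += 1
    print("Hello Geek", count)
    if count == 3:
        break          # leave the loop once the exit condition is met

print("Loop ended after", count, "iterations")
```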

Loop Control Statements


Loop control statements change execution from its normal sequence. A loop does not create a new
scope in Python, so the loop variable remains available after the loop ends. Python supports the
following control statements.

Continue Statement
The continue statement in Python returns the control to the beginning of the loop.
for letter in 'geeksforgeeks':
    if letter == 'e' or letter == 's':
        continue
    print('Current Letter :', letter)
Output:
Current Letter : g
Current Letter : k
Current Letter : f
Current Letter : o
Current Letter : r
Current Letter : g
Current Letter : k
Explanation: The continue statement is used to skip the current iteration of a loop and move to the
next iteration. It is useful when we want to bypass certain conditions without terminating the loop.

Break Statement
The break statement in Python brings control out of the loop.
for letter in 'geeksforgeeks':
    if letter == 'e' or letter == 's':
        break

print('Current Letter :', letter)

Output
Current Letter : e

Explanation: break statement is used to exit the loop prematurely when a specified condition is met.
In this example, the loop breaks when the letter is either 'e' or 's', stopping further iteration.

Pass Statement
We use pass statement in Python to write empty loops. Pass is also used for empty control statements,
functions and classes.
for letter in 'geeksforgeeks':
    pass
print('Last Letter :', letter)
Output
Last Letter : s

Explanation: In this example, the loop iterates over each letter in 'geeksforgeeks' but doesn't perform
any operation, and after the loop finishes, the last letter ('s') is printed.
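
As mentioned above, pass also serves as a placeholder in functions, classes and conditional branches. A short sketch; the names not_implemented_yet and EmptyRecord are hypothetical and used only for illustration.

```python
def not_implemented_yet():
    pass              # placeholder body; calling the function returns None

class EmptyRecord:
    pass              # placeholder class with no attributes or methods

value = 7
if value > 5:
    pass              # nothing to do yet for this branch
else:
    print("value is 5 or less")

print(not_implemented_yet())   # None
print(EmptyRecord())
```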
