FILE HANDLING
What is a File?
● A file is a collection of bytes stored on a storage device such as a hard disk,
thumb drive, etc.
OR
● A file is a document or data stored on a permanent storage device, which
can be read, written or rewritten as required.
Why are files needed?
OR
What are the advantages of a file?
● Data in files is persistent
● Python allows us to read data from external files and to write data to them
permanently on secondary storage.
● It helps in preserving the data or information generated after running the
program.
● Large storage capacity: Using files, you need not worry about the problem of
storing data in bulk.
What is a File Object/File Handle?
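● A File Object/File Handle serves as a link between the program and a file
residing on the hard disk of your computer. It is returned by open(), and all
reading and writing of the file is done through it.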
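The idea of a file handle can be seen in a short sketch. The file name demo.txt and the temporary folder are assumptions made only for illustration.

```python
import os
import tempfile

# Hypothetical file inside a throwaway folder, so no real file is touched.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")

f = open(path, "w")      # f is the file object (file handle)
f.write("hello")         # all work on the file goes through f
name_seen = f.name       # the handle remembers which file it is linked to
mode_seen = f.mode       # and the mode it was opened in
f.close()                # breaks the link between f and the file
is_closed = f.closed

print(name_seen, mode_seen, is_closed)
```

After close() the handle still exists as a Python object, but it can no longer be used to read or write the file.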
What are the types of files?
1) Text File
• It consists of a sequence of lines, and each line is a sequence of characters.
• A text file stores information as ASCII or Unicode characters on permanent
storage media.
• In a text file, each line is terminated by an EOL (End of Line) character, i.e. the
line delimiter.
• Internal translations take place when the EOL character is read or written. In
Python the EOL character is '\n' (newline); on Windows it is stored in the file
as carriage return + newline ('\r\n').
• Text files are stored in human-readable form and can be created using any
text editor.
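A small sketch of the points above: each line written to a text file ends with '\n', and reading the file gives the lines back as strings. The file name notes.txt is a hypothetical example.

```python
import os
import tempfile

# Hypothetical text file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, "w") as f:
    f.write("first line\n")      # '\n' marks the end of each line
    f.write("second line\n")

with open(path, "r") as f:
    lines = f.readlines()        # one string per line, EOL included

print(lines)
```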
2) Binary File
• A binary file holds information in the same format as it is held in
memory.
• It is used to store images, video files, audio files, etc.
• A binary file stores information in binary form.
• Binary files don’t have a delimiter for a line.
• The file content is raw, i.e. it is not in human-readable form.
• No character translations are carried out in a binary file. As a result, binary
files are easier and faster than text files for carrying out reading
and writing operations on data.
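The "no translation" point can be checked with a minimal sketch. The byte values and the file name raw.bin are chosen only for illustration.

```python
import os
import tempfile

# Hypothetical binary file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "raw.bin")

# Bytes 10 and 13 are newline and carriage return; binary mode leaves them alone.
data = bytes([0, 255, 10, 13, 65])

with open(path, "wb") as f:      # "wb": write in binary mode
    f.write(data)

with open(path, "rb") as f:      # "rb": read in binary mode
    raw = f.read()               # comes back as a bytes object, not str

print(raw == data, type(raw).__name__)
```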
Differentiate between text and binary files. – Frame it from the above lines
Differentiate between text and csv files
● A CSV file is specifically designed for tabular data. Each line represents a row
in the table, and commas separate the values (fields) in each row, whereas a
text file contains data in plain text format with no strict structure like that of
CSV files.
● CSV file: The characters in a CSV file follow a predictable pattern based on
delimiters (like commas) and line breaks, making the file structured and
machine-readable for tabular data. Text file: The characters in a text file are
stored in a free-flowing, unstructured way, with no specific rules for how
they are organized.
● A CSV file has a delimiter for fields, which is absent in text files.
● Can write about the read/write methods.
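The difference can be illustrated with Python's csv module. This is only a sketch: the file name marks.csv and the sample rows are made up for the example.

```python
import csv
import os
import tempfile

# Hypothetical CSV file and sample data.
path = os.path.join(tempfile.mkdtemp(), "marks.csv")
rows = [["name", "marks"], ["Asha", "91"], ["Ravi", "78"]]

with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)        # one row per line, fields joined by commas

with open(path, "r", newline="") as f:
    as_table = list(csv.reader(f))       # read as CSV: list of rows, each a list of fields

with open(path, "r") as f:
    as_text = f.read()                   # read as plain text: one unstructured string

print(as_table)
print(repr(as_text))
```

The same file gives structured rows when read with csv.reader() and a single string when read as an ordinary text file.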
To open a file stored in another path, other than Python’s default folder
OR
To write Absolute path in open()
There are 2 ways to write the path
1) Double backslashes (as \ is a special character that starts an escape
sequence, it must itself be escaped)
eg) f = open("e:\\exam\\[Link]", "r")
eg) f = open("e:\\exam\\[Link]", "w")
2) Give a raw path by prefixing the string with r
eg) f = open(r"e:\exam\[Link]", "r")
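A quick check that both ways of writing the path give the same string, and of what goes wrong with single backslashes. The file names marks.txt and table.txt are hypothetical, and nothing is actually opened, so the sketch also runs on a computer without an e: drive.

```python
# Two spellings of the same Windows path (hypothetical file name).
escaped = "e:\\exam\\marks.txt"     # each \\ stands for one backslash
raw = r"e:\exam\marks.txt"          # raw string: backslashes are kept as typed
same = (escaped == raw)

# Pitfall: in a normal string, \n and \t are escape sequences.
pitfall = "e:\new\table.txt"        # contains a newline and a tab, not a valid path
safe = r"e:\new\table.txt"

print(same)
print(repr(pitfall))
print(repr(safe))
```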
Absolute and Relative Paths
• The path of a file is the sequence of directories that must be followed to
reach that file on a computer.
Absolute Path is the complete path starting from the root directory.
Example of an absolute path is:
C:\Users\user\Documents\[Link]
This shows the full path starting from the C: drive.
Relative Path is a path that specifies the location of a file or directory in relation
to the current directory you're in.
Example:
Documents\[Link]
If you're in the C:\Users\user directory, this points to
C:\Users\user\Documents\[Link].
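The relation between the two kinds of path can be sketched with the os.path module. The file name report.txt is hypothetical, and the result depends on the current directory of the machine running the code.

```python
import os

# A relative path: meaningful only together with the current directory.
relative = os.path.join("Documents", "report.txt")   # hypothetical file name

# abspath() joins the relative path onto the current working directory.
absolute = os.path.abspath(relative)

relative_is_abs = os.path.isabs(relative)
absolute_is_abs = os.path.isabs(absolute)

print("Current directory:", os.getcwd())
print("Relative path    :", relative)
print("Absolute path    :", absolute)
```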
Access Specifiers in Files/Modes of Files
Each entry below gives the access mode for text files / the access mode for
binary files, the initial file pointer position, and a description.
r / rb (File pointer position: Beginning of File)
• Read mode. Opens a file for reading.
• If the file does not exist, open() raises a FileNotFoundError.
r+ / rb+ (File pointer position: Beginning of File)
• Opens the file for both reading and writing.
• If the file does not exist, open() raises a FileNotFoundError.
w / wb (File pointer position: Beginning of File)
• Write mode. Opens the file for writing only.
• If the file exists, the old content of the file gets truncated and the new
content is written from the beginning of the file.
• If the file does not exist, it is created.
w+ / wb+ (File pointer position: Beginning of File)
• Opens the file for both writing and reading.
• Like w, if the file exists, the old content of the file gets truncated and the
new content is written from the beginning of the file.
• If the file does not exist, it is created.
a / ab (File pointer position: End of File)
• Append mode. Opens the file for appending.
• If the file exists, the old content of the file does not get truncated and the
new content is written from the end of the file.
• If the file does not exist, it is created.
a+ / ab+ (File pointer position: End of File)
• Opens the file for both appending and reading.
• If the file exists, the new content is added after the existing content, at the
end of the file.
• If the file does not exist, it is created.
The default mode for opening a file is "r" (read mode). If no mode is specified
when opening a file, it is automatically opened in read mode.
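The behaviour of the common modes can be tried out with a short sketch. The file names modes.txt and missing.txt and the temporary folder are assumptions for the demo.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                 # throwaway folder for the demo
path = os.path.join(folder, "modes.txt")    # hypothetical file name

with open(path, "w") as f:      # "w": file does not exist yet, so it is created
    f.write("one\n")
with open(path, "w") as f:      # "w" again: old content is truncated
    f.write("two\n")
with open(path, "a") as f:      # "a": new content goes after the old content
    f.write("three\n")

with open(path) as f:           # no mode given, so the default "r" is used
    default_mode = f.mode
    content = f.read()

with open(path, "r+") as f:     # "r+": pointer starts at the beginning
    f.write("TWO")              # overwrites the first three characters
with open(path) as f:
    after_rplus = f.read()

try:
    open(os.path.join(folder, "missing.txt"), "r")
    missing_raises = False
except FileNotFoundError:       # "r" never creates a file
    missing_raises = True

print(default_mode, repr(content), repr(after_rplus), missing_raises)
```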
Random Access Methods
1) seek()
• The seek() method is used to change the position of the file pointer (cursor).
Syntax: fileobject.seek(offset, referencepoint)
where offset is the number of positions (bytes) to move the file pointer,
referencepoint is the position from which the offset is counted
• Returns: the new position of the file pointer, counted from the beginning
of the file
• The reference point (from what position) accepts 3 values
0: beginning of the file
1: current file position
2: end of the file
• By default, the referencepoint argument is set to 0.
Note: In text mode, the reference point can be set to the current position (1) or
the end of file (2) only when offset is equal to 0.
2) tell()
The tell() method returns the current position of the file pointer (cursor) in a
file stream.
Syntax: fileobject.tell()
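The following sketch uses seek() and tell() together. The file is opened in binary mode so that all three reference points can be used with a non-zero offset; the file name letters.bin and its content are made up for the example.

```python
import os
import tempfile

# Hypothetical 10-byte file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "letters.bin")
with open(path, "wb") as f:
    f.write(b"ABCDEFGHIJ")

with open(path, "rb") as f:
    start = f.tell()          # 0: pointer is at the beginning after opening
    f.seek(3)                 # reference point defaults to 0 (beginning)
    after_seek = f.tell()     # 3
    two = f.read(2)           # reads b"DE", pointer moves to 5
    f.seek(2, 1)              # 2 bytes forward from the current position
    from_current = f.tell()   # 7
    f.seek(-3, 2)             # 3 bytes back from the end of the file
    tail = f.read()           # b"HIJ"
    end = f.tell()            # 10: pointer is at the end of the file

print(start, after_seek, two, from_current, tail, end)
```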
flush() – This method forces the data held in the buffer to be written to the file
residing on the disk immediately, without waiting for the file to be closed.
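The effect of flush() can be observed by checking the file size before and after the call. This is a sketch with a hypothetical file name buffered.txt; the size before flushing depends on buffering, so only the size after flushing is certain.

```python
import os
import tempfile

# Hypothetical file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "buffered.txt")

with open(path, "w") as f:
    f.write("pending")                     # 7 characters go into the buffer
    size_before = os.path.getsize(path)    # usually still 0: nothing written out yet
    f.flush()                              # push the buffered data to the file now
    size_after = os.path.getsize(path)     # 7: the data has reached the file

print(size_before, size_after)
```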
Binary files:
Pickle module: The pickle module is used with binary files. Its dump( ) function
writes a Python object into a binary file and its load( ) function reads it back.
Pickling/Serialization: It is the process of converting a Python object into a byte
stream. Pickling is done at the time of writing into a binary file.
Unpickling/Deserialization: It is the process of converting a byte stream back into
a Python object. Unpickling is done at the time of reading from a binary file.
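Pickling and unpickling can be sketched as follows. The record and the file name student.dat are hypothetical sample data.

```python
import os
import pickle
import tempfile

# Hypothetical binary file and sample record.
path = os.path.join(tempfile.mkdtemp(), "student.dat")
record = {"roll": 1, "name": "Asha", "marks": [91, 85]}

with open(path, "wb") as f:          # binary write mode is required
    pickle.dump(record, f)           # pickling: object -> byte stream -> file

with open(path, "rb") as f:          # binary read mode is required
    loaded = pickle.load(f)          # unpickling: byte stream -> object

print(loaded)
```

The object read back is equal to the original but is a new copy built from the byte stream.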
Examples of Binary Files
1) Image Files:
● JPEG (.jpg, .jpeg), PNG (.png), GIF (.gif), BMP (.bmp), TIFF (.tiff)
2) Audio Files:
● MP3 (.mp3), WAV (.wav), FLAC (.flac), AAC (.aac)
3) Video Files:
● MP4 (.mp4), AVI (.avi), MKV (.mkv), MOV (.mov)
4) Compressed Files:
● ZIP (.zip), GZ (.gz), RAR (.rar), Tar (.tar)
5) Executable Files:
● Windows Executables (.exe), macOS Applications (.app)
Advantage of with open() statement
• The primary advantage of using with open() is that it automatically closes
the file once the block of code inside the with statement is finished
executing, even if an exception occurs.
• with open() ensures that resources are efficiently released after they are
no longer needed.
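Both points can be checked through the closed attribute of the file object. In this sketch the file name saved.txt and the simulated error are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "saved.txt")

with open(path, "w") as f:
    f.write("saved")
closed_after_block = f.closed        # True: closed without calling f.close()

try:
    with open(path, "r") as g:
        raise ValueError("simulated problem")   # error inside the with block
except ValueError:
    closed_after_error = g.closed    # True: closed even though an exception occurred

print(closed_after_block, closed_after_error)
```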