Python Tarfile
Python works with many file formats, such as CSV, XLSX, and TXT, through its various modules and methods. The tar format combines several files into a single archive file. To perform operations on tar files, Python provides the “tarfile” module.
This article is a detailed guide to the Python “tarfile” module, with examples, and covers the following topics:
- What is the Tar File in Python?
- Creating Tarfile in Python Using “tarfile” Module
- Determining Whether the File is Tar or Not
- Reading and Printing the Files That Exist in the Tar File Using “tarfile” Module
- Appending Files to Tarfile Using the “tarfile” Module
- Extract a Particular File from the Specified Tar File
What is the Tar File in Python?
The “tar” archive is a file format used to combine multiple files into a single file, often compressed with gzip or bzip2. In Python, the “tarfile” module provides functionality for working with “tar” archives. It can create tar archives, extract files from existing archives, append files to archives, and more.
The following file modes can be used to open a tar file in Python:
Modes | Explanation |
---|---|
r or r:* | This mode opens a tar file for reading with transparent compression detection (the default). |
w or w: | This mode opens a tar archive for uncompressed writing. |
a or a: | This mode opens an uncompressed tar file for appending; the file is created if it does not exist. |
r:gz | This mode is used to open the gzip compressed tar archive for reading. |
w:gz | This mode is used to open the gzip compressed tar archive for writing. |
r:bz2 | This mode is used to open the bzip2 compressed tar archive for reading. |
w:bz2 | This mode is used to open the bzip2 compressed tar archive for writing. |
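As a minimal sketch of how the compressed modes in the table behave, the following code writes a gzip-compressed archive with “w:gz” and reads it back with “r:gz”. The function name “roundtrip_gzip” and the file names are hypothetical, and a temporary directory is used so that nothing is left behind:

```python
import os
import tarfile
import tempfile


def roundtrip_gzip(text):
    """Write text into a gzip-compressed tar archive, then read it back."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "sample.txt")  # hypothetical file name
        archive = os.path.join(workdir, "new.tar.gz")
        with open(source, "w", encoding="utf-8") as handle:
            handle.write(text)

        # "w:gz" creates a gzip-compressed archive for writing
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(source, arcname="sample.txt")

        # "r:gz" opens the same archive for reading
        with tarfile.open(archive, "r:gz") as tar:
            names = tar.getnames()
            content = tar.extractfile("sample.txt").read().decode("utf-8")
    return names, content


names, content = roundtrip_gzip("hello tar")
print(names, content)
```

The same pattern should apply to “w:bz2” and “r:bz2”; only the mode string changes.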
Example 1: Creating Tarfile in Python Using “tarfile” Module
In this example, the “tarfile” module is used to create a tar file from three existing text files: “sample.txt”, “sample1.txt”, and “sample2.txt”.
Take the following code as an example:
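import tarfile
# Open (or create) the archive in "w" mode for uncompressed writing
file1 = tarfile.open(r"C:\Users\p\Documents\program\new.tar", "w")
# Add each file; the names are resolved against the current working directory
file1.add("sample.txt")
file1.add("sample1.txt")
file1.add("sample2.txt")
# Close the archive so that all data is written to disk
file1.close()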
Here,
- Firstly, the “tarfile” module is imported.
- Next, the “tarfile.open()” method of the tarfile module is used to open the tar file in “w” mode.
- After opening, the “add()” method is called once per file to add each file to the tar archive.
- Finally, the “close()” method is called to close the archive.
Output
The tar file has been created successfully.
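The same steps can also be sketched with a “with” statement, which closes the archive automatically even if “add()” raises an error. The helper name “create_archive” is hypothetical, and the sample files are created in a temporary directory purely for illustration:

```python
import os
import tarfile
import tempfile


def create_archive(archive_path, file_paths):
    """Bundle the given files into an uncompressed tar archive."""
    # The with statement closes the archive even if add() raises an error
    with tarfile.open(archive_path, "w") as tar:
        for path in file_paths:
            # arcname stores only the file name, not the full source path
            tar.add(path, arcname=os.path.basename(path))


# Usage with throwaway files in a temporary directory
with tempfile.TemporaryDirectory() as workdir:
    paths = []
    for name in ("sample.txt", "sample1.txt", "sample2.txt"):
        path = os.path.join(workdir, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("demo content\n")
        paths.append(path)

    archive = os.path.join(workdir, "new.tar")
    create_archive(archive, paths)
    with tarfile.open(archive, "r") as tar:
        created_names = tar.getnames()

print(created_names)
```

Passing “arcname” keeps the directory layout of the source machine out of the archive, which is usually what is wanted when the files come from absolute paths.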
We can also list the files in a directory using the “os.listdir()” method of the “os” module and add each one to the tar file. Here is an example that adds all the listed files to a tar file:
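import os
import tarfile
folder = r"C:\Users\p\Documents\program"
file1 = tarfile.open("new.tar", "w")
for i in os.listdir(folder):
    # Build the full path so the loop works from any working directory
    file1.add(os.path.join(folder, i), arcname=i)
file1.close()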
The above code listed the files and created the tar file successfully.
Example 2: Determining Whether the File is Tar or Not
The “tarfile.is_tarfile()” function is used in Python to determine whether a given file is a tar archive. It takes the file as an argument and returns the boolean value “True” or “False”, as in the code below:
import tarfile
file1 = r"C:\Users\p\Documents\program\new.tar"
print("File is Tar: ", tarfile.is_tarfile(file1))
The output verifies that the specified file is a tar archive file.
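To see both possible results, the following sketch checks a real tar archive and an ordinary text file. The file names are hypothetical and both files are created in a temporary directory:

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    text_path = os.path.join(workdir, "sample.txt")
    tar_path = os.path.join(workdir, "new.tar")

    # Create a plain text file and an archive that contains it
    with open(text_path, "w", encoding="utf-8") as handle:
        handle.write("just text\n")
    with tarfile.open(tar_path, "w") as tar:
        tar.add(text_path, arcname="sample.txt")

    # is_tarfile() inspects the file content, not the file extension
    tar_result = tarfile.is_tarfile(tar_path)
    text_result = tarfile.is_tarfile(text_path)

print("new.tar is Tar:", tar_result)
print("sample.txt is Tar:", text_result)
```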
Example 3: Reading and Printing the Files That Exist in the Tar File Using “tarfile” Module
In the code below, the “r” read mode is passed to the “tarfile.open()” method to open the archive for reading. After that, a “for” loop is used with the “getnames()” method to iterate over the file names in the tar archive. Each name is printed using the “print()” function, and the archive is closed via the “close()” method:
import tarfile
file = tarfile.open("new.tar", "r")
# getnames() returns the member names as a list of strings
for i in file.getnames():
    print(i)
file.close()
The output displays the file names that exist in the tar archive file.
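When more than the names is needed, the “getmembers()” method returns “TarInfo” objects that also carry metadata such as the size. The following is a small sketch; the helper name “describe_archive” and the sample file are hypothetical:

```python
import os
import tarfile
import tempfile


def describe_archive(archive_path):
    """Return (name, size in bytes) for every member of a tar archive."""
    with tarfile.open(archive_path, "r") as tar:
        return [(member.name, member.size) for member in tar.getmembers()]


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "sample.txt")
    # Write exactly five bytes so the reported size is predictable
    with open(source, "wb") as handle:
        handle.write(b"hello")

    archive = os.path.join(workdir, "new.tar")
    with tarfile.open(archive, "w") as tar:
        tar.add(source, arcname="sample.txt")

    listing = describe_archive(archive)

for name, size in listing:
    print(name, size)
```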
Example 4: Appending Files to Tarfile Using the “tarfile” Module
To append files to a tar file, pass the “a” append mode to the “tarfile.open()” method. The “for” loop and the “getnames()” method are used twice in the program to list the archive contents before and after appending. The new files are appended using the “add()” method between the two loops:
import tarfile
file = tarfile.open("new.tar", "a")
print('Files Before Appending New Files: ')
for i in file.getnames():
    print(i)
# Append two more files to the end of the archive
file.add('example.csv')
file.add('example1.csv')
print('\nFiles After Appending New Files: ')
for x in file.getnames():
    print(x)
file.close()
The output shows the files in the tar archive before and after appending.
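The append step can also be wrapped in a small helper, sketched below with hypothetical names (“append_file”, “example.csv”). Note that append mode only works on uncompressed archives; the “tarfile” module has no “a:gz” or “a:bz2” mode:

```python
import os
import tarfile
import tempfile


def append_file(archive_path, file_path):
    """Append one file to an existing uncompressed tar archive."""
    with tarfile.open(archive_path, "a") as tar:
        tar.add(file_path, arcname=os.path.basename(file_path))


with tempfile.TemporaryDirectory() as workdir:
    first = os.path.join(workdir, "sample.txt")
    extra = os.path.join(workdir, "example.csv")
    with open(first, "w", encoding="utf-8") as handle:
        handle.write("demo content\n")
    with open(extra, "w", encoding="utf-8") as handle:
        handle.write("id,value\n1,2\n")

    # Start with an archive that holds a single file
    archive = os.path.join(workdir, "new.tar")
    with tarfile.open(archive, "w") as tar:
        tar.add(first, arcname="sample.txt")

    with tarfile.open(archive, "r") as tar:
        before = tar.getnames()

    append_file(archive, extra)

    with tarfile.open(archive, "r") as tar:
        after = tar.getnames()

print("Before:", before)
print("After:", after)
```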
Example 5: Extract a Particular File from the Specified Tar File
To extract a particular file from the specified tar file, the “extractfile()” method is used in Python. This method returns a file-like object for the member rather than writing the file to disk. In this example, the “sample.txt” file is stored inside the “new.tar” archive file.
To extract the “sample.txt” file from the tar file, the following code is used. In this code, the “tarfile.open()” method opens the tar file in “r” read mode, and the “extractfile()” method retrieves the member:
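import tarfile
file = tarfile.open('new.tar', "r")
# extractfile() returns a file-like object; it does not write to disk
file1 = file.extractfile("sample.txt")
print(file1.read())
file.close()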
The “sample.txt” file has been extracted from the archive successfully.
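Because “extractfile()” hands back a file-like object, a common pattern is to read the bytes directly. The sketch below assumes a hypothetical helper named “read_member” and handles the case where the member is not a regular file, for which “extractfile()” returns “None”:

```python
import os
import tarfile
import tempfile


def read_member(archive_path, member_name):
    """Return the bytes of one archive member without writing it to disk."""
    with tarfile.open(archive_path, "r") as tar:
        # Raises KeyError if member_name is not in the archive
        member = tar.extractfile(member_name)
        if member is None:
            # Directories and other special members have no file content
            raise ValueError(f"{member_name} is not a regular file")
        return member.read()


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "sample.txt")
    with open(source, "wb") as handle:
        handle.write(b"hello from sample.txt")

    archive = os.path.join(workdir, "new.tar")
    with tarfile.open(archive, "w") as tar:
        tar.add(source, arcname="sample.txt")

    data = read_member(archive, "sample.txt")

print(data.decode("utf-8"))
```

To write a single member to disk instead of reading it into memory, the related “extract()” method can be used.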
We can also extract all the files from the tar file using the “extractall()” method. In the code below, the “extractall()” method takes the target directory name as an argument and extracts every file in the tar archive into it:
import tarfile
file = tarfile.open('new.tar', "r")
# Extract every member into the "program" directory
file.extractall("program")
file.close()
The “new.tar” archive has been extracted successfully.
Note: We did not specify the complete path because the tar file is in the current working directory. If the tar file is in another location, use the complete path to avoid errors.
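The following is a hedged sketch of “extractall()” wrapped in a hypothetical helper named “extract_everything”. It assumes that recent Python releases expose extraction filters, and checks for “tarfile.data_filter” before passing “filter="data"” so that the code also runs on older versions:

```python
import os
import tarfile
import tempfile


def extract_everything(archive_path, target_dir):
    """Extract all members of a tar archive into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, "r") as tar:
        if hasattr(tarfile, "data_filter"):
            # Extraction filters are available; "data" refuses members
            # that would land outside the target directory
            tar.extractall(target_dir, filter="data")
        else:
            # Older Python versions without extraction filters
            tar.extractall(target_dir)
    return sorted(os.listdir(target_dir))


with tempfile.TemporaryDirectory() as workdir:
    archive = os.path.join(workdir, "new.tar")
    with tarfile.open(archive, "w") as tar:
        for name in ("sample.txt", "sample1.txt"):
            path = os.path.join(workdir, name)
            with open(path, "w", encoding="utf-8") as handle:
                handle.write("demo content\n")
            tar.add(path, arcname=name)

    extracted = extract_everything(archive, os.path.join(workdir, "program"))

print(extracted)
```

Archives from untrusted sources are best inspected before extraction, since member names inside an archive are not guaranteed to be safe paths.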
Conclusion
The Python “tarfile” module is used to perform various operations on tar files, such as creating and reading them in various modes, appending files to them, and extracting their contents. The “tarfile.is_tarfile()” function checks whether a specified file is a tar file. The “extractfile()” and “extractall()” methods are used to extract a single file or all files from a tar archive. This guide covered Python tar files using various examples.
Source: linuxhint.com