Pandas Drop All Columns Except
Python offers several libraries for data analysis and manipulation. “Pandas” is a popular one, built around the DataFrame. A common task in Pandas is to select a subset of columns from a DataFrame and drop the rest. Dropping all columns except the specified ones keeps the data clean, organized, and ready to use.
This blog presents three ways to drop all columns except the specified ones: the “df.loc[]” indexer, double square brackets, and the “df.drop()” method.
Method 1: Using “df.loc[]” to Drop All Columns Except the Specific Columns
In Pandas, “df.loc[]” is a label-based indexer that accesses or selects DataFrame rows or columns by their labels. Because it is an indexer rather than a method, it is written with square brackets, not parentheses. Selecting only the wanted columns with it has the effect of dropping all the others.
Let’s explore this method via the following examples:
Example 1: Dropping All Columns Except the Single Column
This example drops all the columns of the Pandas DataFrame except the selected column:
import pandas

# Sample DataFrame with four columns
df = pandas.DataFrame({'name': ['Alex', 'Henry', 'Joseph', 'Anna'],
                       'age': [18, 12, 29, 34],
                       'marks': [55, 77, 87, 99],
                       'grade': ['C-', 'B+', 'A', 'A+']})
print(df, '\n')

# Keep all rows (:) and only the "name" column
df = df.loc[:, ['name']]
print(df)
In the above code, we import the “pandas” module and create a DataFrame with multiple columns. After that, “df.loc[]” selects every row and only the “name” column, and the result is assigned back to “df”. This effectively drops all the other columns.
Output
All columns except the specified ones have been dropped.
Example 2: Dropping All Columns Except the Selected Multiple Columns
To drop all columns except multiple specific columns, use the code below:
import pandas

# Sample DataFrame with four columns
df = pandas.DataFrame({'name': ['Alex', 'Henry', 'Joseph', 'Anna'],
                       'age': [18, 12, 29, 34],
                       'marks': [55, 77, 87, 99],
                       'grade': ['C-', 'B+', 'A', 'A+']})
print(df, '\n')

# Keep all rows (:) and only the "name" and "grade" columns
df = df.loc[:, ['name', 'grade']]
print(df)
Here, “df.loc[]” keeps only the “name” and “grade” columns, which drops all the other columns of the Pandas DataFrame.
Output
All columns except the specified multiple columns have been dropped.
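As a side note, the type of the result depends on how the column label is passed to “df.loc[]”. The following is a minimal sketch that reuses the sample data from above; the variable names “as_frame” and “as_series” are only illustrative:

```python
import pandas

df = pandas.DataFrame({'name': ['Alex', 'Henry', 'Joseph', 'Anna'],
                       'age': [18, 12, 29, 34],
                       'marks': [55, 77, 87, 99],
                       'grade': ['C-', 'B+', 'A', 'A+']})

# A list of labels keeps the result as a DataFrame, even for one column
as_frame = df.loc[:, ['name']]

# A bare label returns a Series instead of a DataFrame
as_series = df.loc[:, 'name']

print(type(as_frame).__name__)
print(type(as_series).__name__)
```

If the goal is a DataFrame that still has a column header, passing the label inside a list, as the examples above do, is usually the safer choice.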
Method 2: Using Double Square Brackets to Drop All Columns Except the Specific Columns
Double square brackets can also be used to drop all the DataFrame columns except the specified single or multiple columns. The outer brackets index the DataFrame, and the inner brackets form the list of column labels to keep. To understand this, let’s have a look at the examples below:
Example 1: Dropping All Columns Except the Single Column
Consider the following code:
import pandas

# Sample DataFrame with four columns
df = pandas.DataFrame({'name': ['Alex', 'Henry', 'Joseph', 'Anna'],
                       'age': [18, 12, 29, 34],
                       'marks': [55, 77, 87, 99],
                       'grade': ['C-', 'B+', 'A', 'A+']})
print(df, '\n')

# Keep only the "name" column
df = df[['name']]
print(df)
In this code, the “df[['column_name']]” syntax is used to drop all the columns except the column label “name” that is passed inside the double brackets.
Output
Only the “name” column remains in the DataFrame.
Example 2: Dropping All Columns Except the Multiple Columns
The following example drops all columns except the specified multiple columns:
import pandas

# Sample DataFrame with four columns
df = pandas.DataFrame({'name': ['Alex', 'Henry', 'Joseph', 'Anna'],
                       'age': [18, 12, 29, 34],
                       'marks': [55, 77, 87, 99],
                       'grade': ['C-', 'B+', 'A', 'A+']})
print(df, '\n')

# Keep only the "name" and "age" columns
df = df[['name', 'age']]
print(df)
Here, the “df[['column_1', 'column_2']]” syntax drops all columns except the specified multiple columns: “name” and “age”.
Output
Only the “name” and “age” columns remain in the DataFrame.
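Two details of the double-bracket syntax are worth a quick check: the order of the labels in the list sets the order of the resulting columns, and a label that does not exist in the DataFrame raises a “KeyError”. The sketch below assumes the same sample data; the names “keep” and “subset” and the missing label “salary” are hypothetical and used only for illustration:

```python
import pandas

df = pandas.DataFrame({'name': ['Alex', 'Henry', 'Joseph', 'Anna'],
                       'age': [18, 12, 29, 34],
                       'marks': [55, 77, 87, 99],
                       'grade': ['C-', 'B+', 'A', 'A+']})

# Hypothetical list of columns to keep; its order sets the column order
keep = ['grade', 'name']
subset = df[keep]
print(list(subset.columns))

# "salary" is not a column of df, so this selection raises a KeyError
try:
    df[['name', 'salary']]
except KeyError as error:
    print('KeyError:', error)
```

Storing the labels in a variable such as “keep” can make it easier to reuse the same column list in several places.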
Method 3: Using “df.drop()” Method to Drop All Columns Except Specific Columns
The “df.drop()” method drops the specified rows or columns from a Pandas DataFrame based on the provided labels. Combined with the “df.columns.difference()” method, it can drop all the columns except the specified ones. Use the following examples to gain an in-depth understanding:
Example 1: Dropping All Columns Except the Single Column
Use the code below to drop all columns except a single column:
import pandas

# Sample DataFrame with four columns
df = pandas.DataFrame({'name': ['Alex', 'Henry', 'Joseph', 'Anna'],
                       'age': [18, 12, 29, 34],
                       'marks': [55, 77, 87, 99],
                       'grade': ['C-', 'B+', 'A', 'A+']})
print(df, '\n')

# Drop every column whose label is not "name", modifying df in place
df.drop(df.columns.difference(['name']), axis=1, inplace=True)
print(df)
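This code creates a Pandas DataFrame with four columns and drops all columns except the “name” column using the “drop()” method. The “df.columns.difference()” method returns the column labels that are not in the passed list, that is, the complement of the columns to keep. Passing that result to “drop()” with “axis=1” removes those columns, so only the “name” column is left. Because “inplace=True” is set, “df” itself is modified and no reassignment is needed.
Output
Only the “name” column remains in the DataFrame.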
Example 2: Dropping All Columns Except the Multiple Columns
To drop all columns except multiple columns, use the example code below:
import pandas

# Sample DataFrame with four columns
df = pandas.DataFrame({'name': ['Alex', 'Henry', 'Joseph', 'Anna'],
                       'age': [18, 12, 29, 34],
                       'marks': [55, 77, 87, 99],
                       'grade': ['C-', 'B+', 'A', 'A+']})
print(df, '\n')

# Drop every column whose label is not "name" or "grade", in place
df.drop(df.columns.difference(['name', 'grade']), axis=1, inplace=True)
print(df)
In this code, the complement returned by the “df.columns.difference()” method (the labels “age” and “marks”) is passed to the “df.drop()” method to drop all the DataFrame columns except the specified ones.
Output
All DataFrame columns except the “name” and “grade” have been dropped.
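The same idea also works without modifying the original DataFrame. The sketch below assumes the same sample data and uses the “columns” keyword of “drop()” instead of “axis=1”; the names “to_drop” and “trimmed” are illustrative only:

```python
import pandas

df = pandas.DataFrame({'name': ['Alex', 'Henry', 'Joseph', 'Anna'],
                       'age': [18, 12, 29, 34],
                       'marks': [55, 77, 87, 99],
                       'grade': ['C-', 'B+', 'A', 'A+']})

# Labels present in df.columns but not in the list of columns to keep
to_drop = df.columns.difference(['name', 'grade'])
print(list(to_drop))

# Without inplace=True, drop() returns a new DataFrame and leaves df unchanged
trimmed = df.drop(columns=to_drop)
print(list(trimmed.columns))
print(list(df.columns))
```

Returning a new DataFrame keeps the original data available, which can be handy when several different subsets are needed from the same source.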
Conclusion
In Python, the “df.loc[]” indexer, double square brackets, and the “df.drop()” method can all be used to drop all columns except the specified ones. Each approach can keep single or multiple columns and remove all others. This guide demonstrated all three with examples for keeping a single column and multiple columns.
Source: linuxhint.com