by Arround The Web

Pandas Not In

The popular Python library “Pandas” is used to create and manipulate DataFrames. DataFrames are tabular structures that store data in rows and columns. While working with DataFrames, we often need to select only those rows that do not contain specific values in one or more columns. To achieve this, we combine the “~” operator with the “isin()” method to build a “NOT IN” filter.

How to Use Pandas “NOT IN” in Python?

Pandas does not have a dedicated “NOT IN” operator. Instead, the element-wise NOT operator “~” is applied to the result of the “isin()” method. The “isin()” method checks whether each value is present in a given list and returns a Boolean mask; “~” inverts that mask, so the filter keeps only the rows whose values are not in the list.

Syntax

df[~df['col_name'].isin(values_list)]

 

In this syntax, “col_name” is the name of the column to check, and “values_list” is the list of values to exclude. Rows whose “col_name” value appears in “values_list” are dropped from the result.

Example 1: Using Pandas “NOT IN” Filter to Filter Rows by a Single Column

This example filters rows based on the values of a single DataFrame column using the “~” operator and the “isin()” method:

import pandas
# Build a sample DataFrame
df = pandas.DataFrame({'Name': ["Jason", "Joseph", "Lily", "Anna", "Scarlet"],
                       'Age': [22, 25, 23, 24, 26],
                       'Salary': [2000, 3000, 4000, 5000, 10000],
                       'Team': ['IT', 'QA', 'Technical', 'IT', 'Video Editing']})
print(df, '\n')
# Keep only the rows whose Name is NOT "Jason" or "Lily"
df2 = df[~df['Name'].isin(["Jason", "Lily"])]
print(df2)

In the above code:

  • We imported the “Pandas” module and created a DataFrame with multiple columns.
  • Next, the “~” operator is used along with the “isin()” method to retrieve all rows except those whose “Name” value is “Jason” or “Lily”.
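A natural question is why “~” is used rather than Python's “not” keyword. The sketch below illustrates the difference on the same names as the example; the variable names “names”, “mask”, “inverted”, and “kept” are assumptions for illustration:

```python
import pandas

names = pandas.Series(["Jason", "Joseph", "Lily", "Anna", "Scarlet"])
mask = names.isin(["Jason", "Lily"])

# Python's `not` asks for one truth value for the whole object, which a
# multi-element Series cannot provide, so pandas raises ValueError
try:
    inverted = not mask
except ValueError:
    inverted = None
    print("`not` cannot negate a Series")

# `~` negates element by element and returns a new Boolean Series
kept = names[~mask]
print(kept.tolist())  # ['Joseph', 'Anna', 'Scarlet']
```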

Output

The output shows the original DataFrame followed by the filtered DataFrame, which keeps only the rows for “Joseph”, “Anna”, and “Scarlet”.

Example 2: Using Pandas “NOT IN” Filter to Filter Rows by Multiple Columns

The code below filters the Pandas DataFrame rows based on the values of multiple columns:

import pandas
# Build a sample DataFrame with two team columns
df = pandas.DataFrame({'Name': ["Jason", "Joseph", "Lily", "Anna", "Scarlet"],
                       'Age': [22, 25, 23, 24, 26],
                       'Salary': [2000, 3000, 4000, 5000, 10000],
                       'Team-1': ['IT', 'QA', 'Technical', 'IT', 'Video Editing'],
                       'Team-2': ['Author', 'IT', 'QA', 'IT', 'Graphics']})
print(df, '\n')
# Drop every row in which either team column contains 'QA' or 'Graphics'
df1 = df[~df[['Team-1', 'Team-2']].isin(['QA', 'Graphics']).any(axis=1)]
print(df1)

Here:

  • We imported the Pandas module and created a DataFrame with multiple columns. (The “Team-1” and “Team-2” columns share some values.)
  • Next, the “isin()” method checks both columns for the specified values, and “any(axis=1)” marks every row in which at least one of the two columns matches. The “~” operator inverts that result, so the rows containing “QA” or “Graphics” in either column are eliminated.
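The choice of “any(axis=1)” matters. As a hedged illustration, the sketch below compares it with “all(axis=1)” on the same team columns. The exclusion list ['IT', 'QA'] and the names “hits”, “drop_if_any”, and “drop_if_all” are assumptions picked so that the two results differ; they are not part of the example above:

```python
import pandas

df = pandas.DataFrame({'Name': ["Jason", "Joseph", "Lily", "Anna", "Scarlet"],
                       'Team-1': ['IT', 'QA', 'Technical', 'IT', 'Video Editing'],
                       'Team-2': ['Author', 'IT', 'QA', 'IT', 'Graphics']})

# Element-wise membership test over both columns (a Boolean DataFrame)
hits = df[['Team-1', 'Team-2']].isin(['IT', 'QA'])

# any(axis=1): drop a row if at least one of the two columns matches
drop_if_any = df[~hits.any(axis=1)]
print(drop_if_any['Name'].tolist())  # ['Scarlet']

# all(axis=1): drop a row only if both columns match
drop_if_all = df[~hits.all(axis=1)]
print(drop_if_all['Name'].tolist())  # ['Jason', 'Lily', 'Scarlet']
```

Using “any” gives the stricter filter, which is usually what a “NOT IN” check across several columns is meant to express.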

Output

Only the rows for “Jason” and “Anna” remain, because neither of their team columns contains “QA” or “Graphics”.

Example 3: Using NumPy With “NOT IN” Filter

This code uses the “numpy.isin()” function with the “~” operator to filter the DataFrame rows:

import numpy
import pandas
# Build a sample DataFrame
df = pandas.DataFrame({'Name': ["Jason", "Joseph", "Lily", "Anna", "Scarlet"],
                       'Age': [22, 25, 23, 24, 26],
                       'Salary': [2000, 3000, 4000, 5000, 10000],
                       'Team': ['IT', 'QA', 'Technical', 'IT', 'Video Editing']})
print(df, '\n')
# numpy.isin() builds the Boolean mask; ~ inverts it before indexing
df1 = df[~numpy.isin(df['Name'], ["Jason", "Lily"])]
print(df1)

According to the above code:

  • The “~” operator is used along with the “numpy.isin()” function to filter rows.
  • First, “numpy.isin()” checks each value of the “Name” column against the given list and returns a Boolean array that is True wherever the value is found.
  • Then, the “~” operator inverts that array, so indexing the DataFrame with it returns every row other than the ones found.
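The steps above can be checked directly. The following sketch, which uses illustrative variable names (“np_mask”, “pd_mask”, “same”) that are not part of the example, compares the mask from “numpy.isin()” with the one from the pandas “isin()” method:

```python
import numpy
import pandas

names = pandas.Series(["Jason", "Joseph", "Lily", "Anna", "Scarlet"], name='Name')
excluded = ["Jason", "Lily"]

np_mask = numpy.isin(names, excluded)  # plain NumPy Boolean array
pd_mask = names.isin(excluded)         # Boolean Series that keeps the index

print(type(np_mask).__name__)  # ndarray
print(np_mask.tolist())        # [True, False, True, False, False]

# Both masks mark the same positions, so ~ gives the same filter either way
same = bool((np_mask == pd_mask.to_numpy()).all())
print(same)                    # True
```

The practical difference is the return type: the NumPy version yields a bare array without an index, while the pandas version yields a Series aligned to the original rows.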

Output

The rows for “Jason” and “Lily” have been filtered out, just as in Example 1.

Conclusion

Pandas does not provide a separate “NOT IN” operator. Instead, the “~” operator is combined with the “isin()” method to filter DataFrame rows based on the values of one or more columns. The “numpy.isin()” function can be paired with “~” in the same way. This tutorial covered the Pandas “NOT IN” filter using several examples.

Source: linuxhint.com