by Arround The Web

Pandas Select Row by Value

In data analysis, we often need to extract specific information from large or complex datasets. The “Pandas” library in Python supports various methods for selecting specific rows from a DataFrame based on the values in its columns.

This tutorial will demonstrate how to use various methods to select rows by value with several examples.

How to Select Pandas Row by Value in Python?

To select Pandas rows by a specified column value, any of the methods described in the following sections can be used in Python.

Method 1: Select Pandas Row by Value Using “Comparison Operators”

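Comparison operators such as “==”, “!=” and “>=” produce a boolean mask that selects the matching rows. Several masks can be combined with the element-wise boolean operators “&” (and), “|” (or) and “~” (not). Python's “and”, “or” and “not” keywords do not work on a Pandas Series, and each condition must be wrapped in parentheses when masks are combined.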

Example 1: Select Rows by Specific Value Using “==” Operator

In the code below, a DataFrame with multiple columns is created using the “pandas.DataFrame()” constructor. Next, the “==” operator is used to select the rows whose “Name” column matches the given value:

import pandas
df = pandas.DataFrame({'Name':["Joseph","Anna","Jason","Jordan","Lily"],
                   'Age' :[22,25,23,24,26],'Sex':['M','F','M', 'M', 'F'],
                   'Height':[5.5, 5.7, 4.7, 5.9, 4.6]})
print(df, '\n')
# Keep only the rows whose Name equals 'Jordan'
df1 = df[df["Name"] == 'Jordan']
print(df1)

The output contains only the row at index 3, which holds the name “Jordan”.

Example 2: Select Rows by Specific Value Using “!=” Operator

We can also use the “!=” operator to select all the rows that do not match the specified value. In the code below, the row containing the name “Jordan” is excluded and all other rows are selected:

import pandas
df = pandas.DataFrame({'Name':["Joseph","Anna","Jason","Jordan","Lily"],
                   'Age' :[22,25,23,24,26],'Sex':['M','F','M', 'M', 'F'],
                   'Height':[5.5, 5.7, 4.7, 5.9, 4.6]})
print(df, '\n')
# Keep every row whose Name is not 'Jordan'
df1 = df[df["Name"] != 'Jordan']
print(df1)

The output contains the rows at index 0, 1, 2 and 4, that is, every row except “Jordan”.

Example 3: Select Rows by Specific Value Using Multiple Conditions With “&” Operator

We can also select rows based on multiple conditions using the “&” operator. In the code below, the “df.loc[]” indexer is used along with the “&” operator to select the rows that have an age greater than or equal to “24” and a height greater than or equal to “5.5”:

import pandas
df = pandas.DataFrame({'Name':["Joseph","Anna","Jason","Jordan","Lily"],
                   'Age' :[22,25,23,24,26],'Sex':['M','F','M', 'M', 'F'],
                   'Height':[5.5, 5.7, 4.7, 5.9, 4.6]})
print(df, '\n')
# Both conditions must hold; each one needs its own parentheses
df1 = df.loc[(df['Age'] >= 24) & (df['Height'] >= 5.5)]
print(df1)

Only “Anna” and “Jordan” (index 1 and 3) satisfy both conditions, so only those two rows are retrieved.
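The remaining boolean operators follow the same pattern. The snippet below is a minimal sketch that reuses the DataFrame from the examples above; the variable names “either” and “not_male” are illustrative only and do not come from the original examples:

```python
import pandas

df = pandas.DataFrame({'Name': ["Joseph", "Anna", "Jason", "Jordan", "Lily"],
                       'Age': [22, 25, 23, 24, 26],
                       'Sex': ['M', 'F', 'M', 'M', 'F'],
                       'Height': [5.5, 5.7, 4.7, 5.9, 4.6]})

# "|" (or): rows where Age is below 23 or Height is below 5.0
either = df[(df['Age'] < 23) | (df['Height'] < 5.0)]

# "~" (not): rows where Sex is not 'M'
not_male = df[~(df['Sex'] == 'M')]

print(either, '\n')
print(not_male)
```

Here “either” should hold Joseph, Jason and Lily, while “not_male” should hold Anna and Lily.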

Method 2: Select Pandas Row by Value Using “isin()” Method

The “isin()” method can also be used to select Pandas rows by a single value or by multiple values. In the code below, we pass a list of values from the “Name” column to the “isin()” method to select the matching DataFrame rows:

import pandas
df = pandas.DataFrame({'Name':["Joseph","Anna","Jason","Jordan","Lily"],
                   'Age' :[22,25,23,24,26],'Sex':['M','F','M', 'M', 'F'],
                   'Height':[5.5, 5.7, 4.7, 5.9, 4.6]})
print(df, '\n')
values=["Joseph","Jordan", "Lily"]
# isin() returns True for every row whose Name appears in the list
print(df[df["Name"].isin(values)])

The rows for “Joseph”, “Jordan” and “Lily” (index 0, 3 and 4) are retrieved.
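The same mask can be inverted with “~” to exclude the listed values instead of keeping them. This is a short sketch under the same sample data; the name “excluded” is hypothetical:

```python
import pandas

df = pandas.DataFrame({'Name': ["Joseph", "Anna", "Jason", "Jordan", "Lily"],
                       'Age': [22, 25, 23, 24, 26],
                       'Sex': ['M', 'F', 'M', 'M', 'F'],
                       'Height': [5.5, 5.7, 4.7, 5.9, 4.6]})

values = ["Joseph", "Jordan", "Lily"]

# "~" flips the boolean mask, so rows whose Name is NOT in the list are kept
excluded = df[~df["Name"].isin(values)]
print(excluded)
```

With this data, “excluded” should contain only the rows for Anna and Jason.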

Method 3: Select Pandas Row by Value Using “DataFrame.where()” Method

The “DataFrame.where()” method can also be used to filter DataFrame rows in Python. Unlike boolean indexing, it does not drop anything: it returns a copy of the DataFrame with the same shape, in which the rows that do not satisfy the condition are replaced with “NaN”. In the code below, “df.where()” keeps the rows that have an age greater than or equal to “24” and replaces all other rows with “NaN” values:

import pandas
df = pandas.DataFrame({'Name':["Joseph","Anna","Jason","Jordan","Lily"],
                   'Age' :[22,25,23,24,26],'Sex':['M','F','M', 'M', 'F'],
                   'Height':[5.5, 5.7, 4.7, 5.9, 4.6]})
print(df, '\n')
# Rows failing the condition are kept in place but filled with NaN
df1 = df.where(df['Age'] >= 24)
print(df1)

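In the output, the rows at index 0 and 2 (“Joseph” and “Jason”) are filled with “NaN”, and the “Age” column is displayed as floats because “NaN” is a floating-point value.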
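If the non-matching rows should disappear rather than show up as “NaN”, two options are sketched below. The first chains “dropna()” after “df.where()”, which is only safe here because the sample data has no missing values of its own. The second uses “numpy.where()”, a separate NumPy function that returns the positions where a condition is true. The names “kept”, “positions” and “by_position” are illustrative:

```python
import numpy
import pandas

df = pandas.DataFrame({'Name': ["Joseph", "Anna", "Jason", "Jordan", "Lily"],
                       'Age': [22, 25, 23, 24, 26],
                       'Sex': ['M', 'F', 'M', 'M', 'F'],
                       'Height': [5.5, 5.7, 4.7, 5.9, 4.6]})

# Option 1: mask with where(), then drop the rows that were filled with NaN
kept = df.where(df['Age'] >= 24).dropna()

# Option 2: numpy.where() with a single argument returns a tuple of index arrays;
# the first element holds the integer positions of the matching rows
positions = numpy.where((df['Age'] >= 24).to_numpy())[0]
by_position = df.iloc[positions]

print(kept, '\n')
print(by_position)
```

Both results should contain Anna, Jordan and Lily. The second keeps the original integer “Age” values, while the first shows them as floats.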

Method 4: Select Pandas Row by Value Using “DataFrame.apply()” Method

The “DataFrame.apply()” method can also be used to select Pandas rows. Here it is combined with the “isin()” method of the “Age” column, which builds a boolean mask. With its default “axis=0”, “DataFrame.apply()” calls the lambda function once per column, and each column is filtered with that mask, so the result contains only the rows having the age values “25” and “22”:

import pandas
df = pandas.DataFrame({'Name':["Joseph","Anna","Jason","Jordan","Lily"],
                   'Age' :[22,25,23,24,26],'Sex':['M','F','M', 'M', 'F'],
                   'Height':[5.5, 5.7, 4.7, 5.9, 4.6]})
print(df, '\n')
# With the default axis=0 the lambda receives one column at a time, not a row
df1 = df.apply(lambda col: col[df['Age'].isin([25, 22])])
print(df1)

The rows for “Joseph” and “Anna” (index 0 and 1) are retrieved.
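A row-wise variant is also possible by passing “axis=1”, in which case the lambda really does receive one row at a time and returns True or False for it. The following is a sketch rather than a recommended approach, since row-wise “apply()” is usually slower than a vectorized mask; “mask” and “selected” are hypothetical names:

```python
import pandas

df = pandas.DataFrame({'Name': ["Joseph", "Anna", "Jason", "Jordan", "Lily"],
                       'Age': [22, 25, 23, 24, 26],
                       'Sex': ['M', 'F', 'M', 'M', 'F'],
                       'Height': [5.5, 5.7, 4.7, 5.9, 4.6]})

# axis=1 passes each row to the lambda as a Series indexed by column name
mask = df.apply(lambda row: row['Age'] in (22, 25), axis=1)

# The resulting boolean Series is then used for ordinary boolean indexing
selected = df[mask]
print(selected)
```

With the sample data, “selected” should again hold the rows for Joseph and Anna.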

Method 5: Select Pandas Row by Value Using “DataFrame.loc[]”

The “DataFrame.loc[]” indexer accepts a boolean mask and returns the rows for which the mask is true. In the code below, the mask is built with the “==” operator to select the DataFrame rows having the “Sex” column value “F”:

import pandas
df = pandas.DataFrame({'Name':["Joseph","Anna","Jason","Jordan","Lily"],
                   'Age' :[22,25,23,24,26],'Sex':['M','F','M', 'M', 'F'],
                   'Height':[5.5, 5.7, 4.7, 5.9, 4.6]})
print(df, '\n')

# loc[] keeps the rows where the mask is True
df1 = df.loc[df['Sex'] == "F"]
print(df1)

The rows for “Anna” and “Lily” (index 1 and 4) satisfy the condition and are retrieved.
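One practical reason to prefer “loc[]” over plain brackets is that it can take a column selection as a second argument, so rows and columns are filtered in a single step. A minimal sketch, with “subset” as an illustrative name:

```python
import pandas

df = pandas.DataFrame({'Name': ["Joseph", "Anna", "Jason", "Jordan", "Lily"],
                       'Age': [22, 25, 23, 24, 26],
                       'Sex': ['M', 'F', 'M', 'M', 'F'],
                       'Height': [5.5, 5.7, 4.7, 5.9, 4.6]})

# First argument: row mask; second argument: list of columns to keep
subset = df.loc[df['Sex'] == 'F', ['Name', 'Age']]
print(subset)
```

The result should contain only the “Name” and “Age” columns for Anna and Lily.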

Method 6: Select Pandas Row by Value Using “DataFrame.query()” Method

The “DataFrame.query()” method evaluates a string expression passed as an argument, in which column names can be written directly, and returns the rows for which the expression is true. We can use it to select Pandas rows by a specified column value:

import pandas
df = pandas.DataFrame({'Name':["Joseph","Anna","Jason","Jordan","Lily"],
                   'Age' :[22,25,23,24,26],'Sex':['M','F','M', 'M', 'F'],
                   'Height':[5.5, 5.7, 4.7, 5.9, 4.6]})
print(df, '\n')

# Column names can be referenced directly inside the query string
df1 = df.query("Age == 23")
print(df1)

The row having an age value equal to “23” (“Jason”, index 2) is retrieved.
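Query strings can also combine conditions with “and” / “or” and refer to Python variables by prefixing them with “@”. The sketch below assumes a local variable named “min_age”, which is not part of the original example:

```python
import pandas

df = pandas.DataFrame({'Name': ["Joseph", "Anna", "Jason", "Jordan", "Lily"],
                       'Age': [22, 25, 23, 24, 26],
                       'Sex': ['M', 'F', 'M', 'M', 'F'],
                       'Height': [5.5, 5.7, 4.7, 5.9, 4.6]})

min_age = 24  # hypothetical threshold, referenced inside the query with "@"

# String values inside the expression need their own quotes
result = df.query("Age >= @min_age and Sex == 'M'")
print(result)
```

With the sample data, only the row for Jordan should match both conditions.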

Conclusion

Comparison operators, “isin()”, “df.where()”, “df.apply()”, “df.loc[]” and “DataFrame.query()” can all be used to select Pandas rows by value. Each of them selects single or multiple rows based on the specified column value, with the difference that “df.where()” keeps the non-matching rows as “NaN” instead of dropping them. This write-up demonstrated each approach with an example.

Source: linuxhint.com