By Arround The Web

Pandas Groupby Max

In Python, the “Pandas” library provides methods for many data operations, such as DataFrame creation, data selection, and data extraction. The “groupby()” method is one of them: it splits a DataFrame into groups based on the values of one or more columns. Calling the “max()” method on the grouped data returns the maximum value within each group.

This article explains how to find the maximum value of selected columns after grouping on a single column or on multiple columns. The sections below cover the syntax and four examples:

How to Determine the Max Value From the Grouped Data of Pandas DataFrame?

To determine the max value from the grouped data, call the “max()” method on the result of “df.groupby()”. In the syntax below, “Col1” is the column to group by and “Col2” is the column whose maximum is computed:

df.groupby([Col1])[Col2].max()
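As a minimal sketch of this syntax, the snippet below applies it to a small made-up DataFrame. The column names “Col1” and “Col2” and the values are placeholders chosen for illustration, not data from a real source.

```python
import pandas

# Hypothetical sample data; the column names and values are placeholders.
df = pandas.DataFrame({'Col1': ['a', 'a', 'b'], 'Col2': [1, 3, 2]})

# Group the rows by Col1, select Col2, and take the maximum within each group.
result = df.groupby(['Col1'])['Col2'].max()
print(result)
```

The result holds one entry per distinct value of “Col1”.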

For further understanding of the “df.groupby()” method, you can check this detailed guide.

Now, let’s explore this method using the following examples:

Example 1: Find the Maximum Value From the Grouped Data of the Single Column

Consider the following example:

import pandas
data = pandas.DataFrame({'Team': ['X', 'X', 'X', 'Y', 'Y', 'Y'],'Players': [10, 20, 30, 5, 22, 33],'Points': [242, 321, 221, 318, 319, 212],'Medals': [4, 5, 2, 5, 2, 1]})
print(data, '\n')
print(data.groupby('Team')['Points'].max())

In the above code:

  • The “pandas.DataFrame()” function constructs a DataFrame.
  • The “groupby()” method is utilized to group the data based on the “Team” column.
  • The “max()” method is then applied to the “Points” column of each group to find the maximum number of points scored by each team.

Output

The result is a Series rather than a DataFrame: its index is “Team” and its values are the maximum “Points” for each team (321 for team “X” and 319 for team “Y”).
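If a DataFrame with “Team” as a regular column is preferred, one option is the “as_index=False” parameter of “groupby()”. The sketch below reuses the “Team” and “Points” columns from the example and compares the two result types; the variable names are illustrative.

```python
import pandas

data = pandas.DataFrame({
    'Team': ['X', 'X', 'X', 'Y', 'Y', 'Y'],
    'Points': [242, 321, 221, 318, 319, 212],
})

# Selecting a single column yields a Series indexed by Team.
as_series = data.groupby('Team')['Points'].max()
print(type(as_series).__name__)

# as_index=False keeps Team as an ordinary column and returns a DataFrame.
as_frame = data.groupby('Team', as_index=False)['Points'].max()
print(as_frame)
```

Calling “reset_index()” on the Series is another way to get the same DataFrame.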

Example 2: Find the Maximum Value From the Grouped Data of Multiple Columns

The following code demonstrates this:

import pandas
data = pandas.DataFrame({'Team': ['X', 'X', 'X', 'Y', 'Y', 'Y'],'Players': ['A', 'B', 'B', 'A', 'B', 'A'],'Points': [242, 321, 221, 318, 319, 212],'Medals': [4, 5, 2, 5, 2, 1]})
print(data, '\n')
print(data.groupby(['Team', 'Players'])['Points'].max())

In the above code:

  • The “pandas.DataFrame()” function of the “pandas” module is used to create a DataFrame.
  • The “groupby()” method groups the data on the “Team” and “Players” columns together, and the “max()” method determines the maximum value of the selected “Points” column for each group.

Output


The maximum value of the “Points” column has been determined for each combination of “Team” and “Players”. The result is a Series with a two-level index, one level per grouping column.
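Grouping on two columns produces a result indexed by both of them. As a hedged sketch, the snippet below looks up one group by its (“Team”, “Players”) pair and then uses “unstack()” to pivot the “Players” level into columns; the variable names “grouped” and “wide” are illustrative.

```python
import pandas

data = pandas.DataFrame({
    'Team': ['X', 'X', 'X', 'Y', 'Y', 'Y'],
    'Players': ['A', 'B', 'B', 'A', 'B', 'A'],
    'Points': [242, 321, 221, 318, 319, 212],
})

# The result is indexed by (Team, Players) pairs.
grouped = data.groupby(['Team', 'Players'])['Points'].max()

# Look up a single group by its pair of keys.
print(grouped.loc[('X', 'B')])

# Pivot the Players level into columns: one row per team, one column per player.
wide = grouped.unstack('Players')
print(wide)
```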

Example 3: Group Data By a Specific Column and Extract Maximum Value From Multiple Columns

The following code demonstrates this:

import pandas
data = pandas.DataFrame({'Team': ['X', 'X', 'X', 'Y', 'Y', 'Y'],'Players': [10, 20, 30, 5, 22, 33],'Points': [242, 321, 221, 318, 319, 212],'Medals': [4, 5, 2, 5, 2, 1]})
print(data, '\n')
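# Select several columns with a list (double brackets); recent pandas versions reject the bare 'Points', 'Players' form.
print(data.groupby('Team')[['Points', 'Players']].max())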

Here in this code:

  • The “groupby()” method groups the data of DataFrame based on the “Team” column.
  • The “max()” method is then applied to the “Points” and “Players” columns of each group to find the maximum value of each column independently.

Output

The output is a DataFrame indexed by “Team” that holds the maximum “Points” and the maximum “Players” for each team (321 and 30 for team “X”, 319 and 33 for team “Y”).
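Because each column's maximum is computed separately, the values in one output row can come from different input rows. If the goal is instead the complete row that holds each team's highest “Points”, one possible approach is “idxmax()” combined with “loc”, sketched below with the same data; this is an alternative technique, not part of the original example.

```python
import pandas

data = pandas.DataFrame({
    'Team': ['X', 'X', 'X', 'Y', 'Y', 'Y'],
    'Players': [10, 20, 30, 5, 22, 33],
    'Points': [242, 321, 221, 318, 319, 212],
})

# Column-wise maxima: Points and Players may come from different rows.
per_column_max = data.groupby('Team')[['Points', 'Players']].max()
print(per_column_max)

# idxmax() returns the row label of the highest Points in each team;
# loc then fetches those complete rows.
best_rows = data.loc[data.groupby('Team')['Points'].idxmax()]
print(best_rows)
```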

Example 4: Determining and Sorting the Maximum Value

To sort the maximum values of the groups, use the code below:

import pandas
data = pandas.DataFrame({'Team': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],'Players': [10, 20, 30, 5, 22, 33],'Points': [242, 321, 221, 318, 319, 212],'Medals': [4, 5, 2, 5, 2, 1]})
print(data, '\n')
print(data.groupby('Team')['Points'].max().reset_index().sort_values(['Points'], ascending=True))

In this example:

  • The “groupby()” method groups the data based on the “Team” column and the “max()” method determines the maximum value of the selected column “Points”.
  • The “reset_index()” method converts the resulting Series into a DataFrame with “Team” as a regular column, and “sort_values()” sorts the rows by the maximum “Points” in ascending order.

Output

The maximum points of the teams have been sorted in ascending order.
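The “reset_index()” step is optional when only the ordering matters. As a small sketch using the same data, the Series returned by “max()” can be sorted directly, here in descending order so that the highest-scoring team comes first; the variable name “ranked” is illustrative.

```python
import pandas

data = pandas.DataFrame({
    'Team': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
    'Points': [242, 321, 221, 318, 319, 212],
})

# Sort the per-team maxima from highest to lowest without resetting the index.
ranked = data.groupby('Team')['Points'].max().sort_values(ascending=False)
print(ranked)
```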

Conclusion

The “DataFrame.groupby()” method is used along with the “max()” method to calculate the maximum value within each group of data. The “groupby()” method can group the data on one or more columns, and “max()” can be applied to one or several selected columns. The “sort_values()” method can then be chained after “groupby()” and “max()” to sort the maximum values. This tutorial has demonstrated Pandas “groupby” max with four examples.


Source: linuxhint.com
