by Arround The Web

Pandas Groupby Average

The “average”, or mean, is calculated by adding all the given values and dividing the sum by the number of values. When working with grouped data in a Pandas DataFrame, we often need the mean of specific columns within each group. The “df.groupby()” method is used along with the “mean()” method to determine the mean of one or more DataFrame columns for each group.

This post provides a comprehensive tutorial on determining the mean/average of the DataFrame group data.

How to Determine the Mean/Average by Group in Pandas DataFrame?

The “groupby()” method splits the DataFrame rows into groups based on the values of one or more columns, and the “mean()” method then calculates the average of the selected column or columns within each group.
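As a rough illustration of what this combination does, the sketch below computes the group means by hand and then with “groupby()” and “mean()”. The “team” and “score” columns are hypothetical sample data, not taken from the examples in this post.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
df = pd.DataFrame({
    "team": ["a", "b", "a", "b"],
    "score": [10, 20, 30, 40],
})

# Manual version: pick the rows of each group, then divide the sum by the count.
manual = {}
for team in df["team"].unique():
    values = df.loc[df["team"] == team, "score"]
    manual[team] = values.sum() / len(values)

# groupby() + mean() performs the same split, average, and combine steps in one call.
grouped = df.groupby("team")["score"].mean()

print(manual)
print(grouped)
```

Both approaches give 20.0 for group “a” and 30.0 for group “b”; the “groupby()” version simply avoids the explicit loop.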

Let’s explore this method by utilizing the below example code:

Example 1: Determine the Mean of a Column Grouped by a Single DataFrame Column

Let’s utilize the below code to determine the mean of a column that is grouped by a single column:

import pandas
data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Henry', 'Joseph', 'Anna'],'Age': [15, 23, 32, 18, 14, 32],'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)
print(df, '\n')
df1 = df.groupby(['Name'])['Age'].mean()
print(df1)

 
Here in this code:

    • The “pandas” module is imported.
    • The “pandas.DataFrame()” method takes the dictionary data as an argument and creates the DataFrame.
    • The “df.groupby()” method is used to group the data based on the single column “Name”.
    • After grouping data based on a single column, the “mean()” method is used to determine the mean or average of another column named “Age”, based on the group data.

Output


The mean/average of the “Age” column has been calculated for each “Name” group.
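The result of this example is a Series whose index holds the group labels. If a regular DataFrame with “Name” as an ordinary column is more convenient, one possible sketch, reusing the data from Example 1, is shown below. The variable names “means”, “table”, and “table2” are illustrative only.

```python
import pandas as pd

# Same data as Example 1.
data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Henry', 'Joseph', 'Anna'],
         'Age': [15, 23, 32, 18, 14, 32],
         'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pd.DataFrame(data1)

# Series indexed by Name: Anna 32.0, Henry 18.0, Joseph 14.5, Lily 23.0
means = df.groupby('Name')['Age'].mean()

# Option 1: move the group labels from the index into a column.
table = means.reset_index()

# Option 2: ask groupby() not to use the group labels as the index.
table2 = df.groupby('Name', as_index=False)['Age'].mean()

print(table)
print(table2)
```

Both options should produce a two-column table with “Name” and “Age”; which one reads better is mostly a matter of preference.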

Example 2: Determine the Mean of a Column Grouped by Multiple DataFrame Columns

Let’s go through the below code:

import pandas
data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Lily', 'Joseph', 'Anna'],'Age': [15, 32, 23, 18, 15, 23],'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)
print(df, '\n')
df1 = df.groupby(['Name', 'Age'])['Height'].mean()
print(df1)

 
In the above code:

    • The “df.groupby()” method groups data based on the multiple columns “Name” and “Age”.
    • The “mean()” method is used along with the “groupby()” method to determine the mean or average of the single column “Height” for each group.

Output


The mean/average of the single column “Height” has been calculated for each group formed by the multiple columns “Name” and “Age”.
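Grouping by more than one column gives a result with a two-level index (a MultiIndex). The sketch below, which reuses the data from Example 2, shows a few ways such a result might be read. The variable names “lily_18”, “lily_all”, and “flat” are illustrative only.

```python
import pandas as pd

# Same data as Example 2.
data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Lily', 'Joseph', 'Anna'],
         'Age': [15, 32, 23, 18, 15, 23],
         'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pd.DataFrame(data1)
df1 = df.groupby(['Name', 'Age'])['Height'].mean()

# A single value: the group where Name is 'Lily' and Age is 18.
lily_18 = df1.loc[('Lily', 18)]

# All 'Lily' groups: a Series indexed by the remaining level, Age.
lily_all = df1.loc['Lily']

# Turn both index levels back into ordinary columns.
flat = df1.reset_index()

print(lily_18)
print(lily_all)
print(flat)
```

Here “Lily” appears with two different ages, so she forms two separate groups, while the two “Joseph” rows and the two “Anna” rows share an age and are each averaged together.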

Example 3: Determine the Mean of Multiple Columns Grouped by a Single DataFrame Column

This example is used to determine the mean of multiple columns based on the group data:

import pandas
data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Lily', 'Joseph', 'Anna'],'Age': [15, 32, 23, 18, 15, 23],'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)
print(df, '\n')
df1 = df.groupby(['Name'])[['Age','Height']].mean()
print(df1)

 
In the above code:

    • The “df.groupby()” method is used along with the “mean()” method to determine the mean of the multiple columns “Age” and “Height” for the data grouped by the single column “Name”.
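Selecting the columns explicitly, as in this example, matters most when the DataFrame also holds non-numeric columns. The sketch below adds a hypothetical “City” column (not part of the original data) and compares explicit selection with the “numeric_only=True” argument of “mean()”.

```python
import pandas as pd

# Data from Example 3 plus a hypothetical text column 'City'.
data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Lily', 'Joseph', 'Anna'],
         'City': ['Oslo', 'Rome', 'Lima', 'Rome', 'Oslo', 'Lima'],
         'Age': [15, 32, 23, 18, 15, 23],
         'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pd.DataFrame(data1)

# Explicit selection: only 'Age' and 'Height' are averaged.
selected = df.groupby('Name')[['Age', 'Height']].mean()

# numeric_only=True: non-numeric columns such as 'City' are left out.
numeric = df.groupby('Name').mean(numeric_only=True)

print(selected)
print(numeric)
```

For this data both calls should give the same two columns; explicit selection is the more predictable choice when only some of the numeric columns are of interest.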

Output

Alternative Method: Using the “agg()” Function to Determine the Mean/Average of DataFrame Groups

The “agg()” function can also be used to determine the mean/average of the Pandas DataFrame data grouped by single or multiple columns. Let’s apply this method in the below example:

import pandas
data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Henry', 'Joseph', 'Anna'],'Age': [15, 23, 32, 18, 14, 32],'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)
print(df, '\n')
df1 = df.groupby(['Name'])['Age'].agg('mean')
print(df1)

 
In the above code:

    • The “df.groupby()” method groups the data of the DataFrame based on the single column named “Name”.
    • The “agg()” method takes the string “mean” as an argument and determines the mean/average of the “Age” column for each group.
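One reason to reach for “agg()” is that it can apply several aggregations in a single call. The sketch below reuses the data from this example; the output column names “avg_age” and “avg_height” are made up for illustration.

```python
import pandas as pd

# Same data as the agg() example.
data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Henry', 'Joseph', 'Anna'],
         'Age': [15, 23, 32, 18, 14, 32],
         'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pd.DataFrame(data1)

# agg('mean') and mean() give the same per-group result.
via_agg = df.groupby('Name')['Age'].agg('mean')
via_mean = df.groupby('Name')['Age'].mean()

# A list of function names returns one column per aggregation.
stats = df.groupby('Name')['Age'].agg(['mean', 'count'])

# Named aggregation: new_column=(source_column, function).
summary = df.groupby('Name').agg(avg_age=('Age', 'mean'),
                                 avg_height=('Height', 'mean'))

print(via_agg)
print(stats)
print(summary)
```

When only the mean is needed, calling “mean()” directly is shorter; “agg()” becomes more useful once extra statistics or custom column names are wanted.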

Output


The mean/average has been determined successfully.

Conclusion

In Pandas, the “groupby()” method is used along with the “mean()” method to determine the mean/average of single or multiple columns for each group of a DataFrame. The data can be grouped by a single column or by multiple columns. The “agg()” method can also be utilized as an alternative to determine the mean/average for each group. This write-up presented a guide on finding the mean/average for each group using several examples.


Source: linuxhint.com
