by Arround The Web

Pandas Sum Column

In Python, the “Pandas” library is used for data analysis and manipulation on both small and large datasets. It provides methods for tasks ranging from simple to complex. One such task is summing the values of one or more columns in a DataFrame, which can be done with the “DataFrame.sum()” method of Pandas.

This blog presents a detailed guide on how to sum columns in a Pandas DataFrame.

What is the “DataFrame.sum()” Method in Python?

The “DataFrame.sum()” method calculates the sum of the values along an axis of a DataFrame. By default it sums down each column; with “axis=1” it sums across each row.

Syntax

DataFrame.sum(axis=0, skipna=True, numeric_only=False, min_count=0, **kwargs)

In the above syntax:

  • The “axis” parameter is optional and specifies the axis to sum along. With 0 (index), the values are summed down each column; with 1 (columns), they are summed across each row. The default in pandas 2.x is 0; older releases list the default as None, which behaves like 0.
  • The “skipna” parameter determines whether NA/null values are ignored when computing the sum. The default is “True”, which means “NA” values are ignored.
  • The “numeric_only” parameter indicates whether to include only numeric columns in the sum. The default is “False”, which means all columns are included. In recent pandas versions, a row-wise sum that mixes text and numeric columns then raises a “TypeError”.
  • The “min_count” parameter sets the minimum number of valid (non-NA) values required to perform the sum. If fewer are present, the result is NA. The default is 0.
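As a rough illustration of the parameters above, the following sketch sums a small DataFrame that contains one missing value. The data and variable names are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value, used only for illustration.
df = pd.DataFrame(
    {"Marks1": [1.0, np.nan, 3.0], "Marks2": [7.0, 9.0, 5.0]},
    index=["A", "B", "C"],
)

col_totals = df.sum(axis=0)                 # one total per column; NaN is skipped
row_totals = df.sum(axis=1)                 # one total per row; NaN is skipped
strict_cols = df.sum(axis=0, skipna=False)  # a NaN in a column makes its total NaN
need_two = df.sum(axis=1, min_count=2)      # rows with fewer than 2 valid values give NaN

print(col_totals, row_totals, strict_cols, need_two, sep="\n\n")
```

Here row “B” has only one valid value, so it still gets a total with the default “min_count=0” but becomes NaN when “min_count=2” is requested.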

Return Value

The “DataFrame.sum()” method returns a Series containing the sum of the values over the requested axis.
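To make the return value concrete, here is a minimal sketch on hypothetical data. It also shows the simplest case of summing a single column, where “sum()” is called on the selected column (a Series) and returns a single number.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({"Marks1": [1, 5, 3], "Marks2": [7, 9, 5]}, index=["A", "B", "C"])

per_column = df.sum()              # Series indexed by column name
single_total = df["Marks1"].sum()  # scalar total of one column

print(type(per_column).__name__)
print(per_column)
print(single_total)
```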

Example 1: Adding all the Columns of Pandas DataFrame

The following example adds all the numeric columns of a Pandas DataFrame row by row:

import pandas
data1 = {"Name":["anna","henry","joseph"],"Marks1" :[1, 5, 3],"Marks2" :[7, 9, 5]}
df = pandas.DataFrame(data1, index=['A', 'B', 'C'])
print('Given DataFrame:\n', df)
# numeric_only=True skips the text column "Name"; without it, recent pandas versions raise a TypeError
print(df.sum(axis=1, numeric_only=True))

In the above code, “pandas.DataFrame()” creates the DataFrame with the specified index labels. After that, “df.sum()” is called with “axis=1” to add the column values across each row, and “numeric_only=True” restricts the sum to the numeric columns “Marks1” and “Marks2”.

Output

The output shows one sum per row: 8 for “A”, 14 for “B”, and 8 for “C”.

Example 2: Adding Specific Columns of Pandas DataFrame

This example adds specific columns of a Pandas DataFrame:

import pandas
data1 = {"Name":["anna","henry","joseph"],"Marks1" :[1, 5, 3],"Marks2" :[7, 9, 5], "Marks3" :[4, 9, 8]}
df = pandas.DataFrame(data1, index=['A', 'B', 'C'])
print('Given DataFrame:\n', df)
df['Sum'] = df[['Marks1', 'Marks3']].sum(axis=1)
print('\n',df)

In the above code, the columns “Marks1” and “Marks3” are first selected by passing a list of column names to “df[...]”. Calling “sum(axis=1)” on that selection adds the two columns row by row, and the result is stored in a new column named “Sum”.

Output

The new “Sum” column holds the sum of “Marks1” and “Marks3” for each row: 5, 14, and 11.

Example 3: Adding Specific Columns Using “DataFrame.iloc[]” or “DataFrame.loc[]” Along With the “DataFrame.sum()” Method

The “DataFrame.iloc[]” indexer selects rows and columns by integer position. It can be used to sum columns based on their positions, given either as a list or as a range (slice) of positions.

Let’s explore this method using the following code:

import pandas
data1 = {"Name":["anna","henry","joseph"],"Marks1" :[1, 5, 3],"Marks2" :[7, 9, 5], "Marks3" :[4, 9, 8]}
df = pandas.DataFrame(data1, index=['A', 'B', 'C'])
print('Given DataFrame:\n', df)
df['Sum']=df.iloc[:,[1,3]].sum(axis=1)
print('\n',df)

In the above example code, “df.iloc[:, [1, 3]]” selects all rows and the columns at positions 1 and 3, and “sum(axis=1)” adds those columns row by row.

Output

The columns at positions 1 and 3 are “Marks1” and “Marks3”, so the “Sum” column holds 5, 14, and 11.
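A range of positions can be summed in the same way by passing a slice to “DataFrame.iloc[]”. The sketch below reuses the example data; the column name “Total” is only an illustrative choice.

```python
import pandas as pd

data1 = {
    "Name": ["anna", "henry", "joseph"],
    "Marks1": [1, 5, 3],
    "Marks2": [7, 9, 5],
    "Marks3": [4, 9, 8],
}
df = pd.DataFrame(data1, index=["A", "B", "C"])

# Positions 1, 2 and 3 are Marks1, Marks2 and Marks3; the slice end (4) is excluded.
df["Total"] = df.iloc[:, 1:4].sum(axis=1)
print(df)
```

Because the slice starts at position 1, the text column “Name” at position 0 is left out of the sum.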

We can also use the “DataFrame.loc[]” indexer, which selects rows and columns by label or by a Boolean array. It can therefore be used to sum column values for specific labels or for the rows that satisfy a condition.

Here is an example:

import pandas
data1 = {"Name":["anna","henry","joseph"],"Marks1" :[1, 5, 3],"Marks2" :[7, 9, 5], "Marks3" :[4, 9, 8]}
df = pandas.DataFrame(data1, index=['A', 'B', 'C'])
print('Given DataFrame:\n', df)
df['Sum'] = df.loc['B':'C',['Marks1','Marks3']].sum(axis = 1)
print('\n',df)

In the above code, “df.loc[]” selects the rows labeled “B” through “C” (both endpoints are included) and the columns “Marks1” and “Marks3”, and “sum(axis=1)” adds the selected columns for each of those rows. Because row “A” is not part of the selection, its “Sum” value is NaN after the assignment.

Output

The “Sum” column holds 14.0 for row “B” and 11.0 for row “C”, while row “A” shows NaN.
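Summing based on a condition can be sketched by passing a Boolean mask to “DataFrame.loc[]”. The threshold and the variable names below are assumptions chosen only for illustration.

```python
import pandas as pd

data1 = {
    "Name": ["anna", "henry", "joseph"],
    "Marks1": [1, 5, 3],
    "Marks2": [7, 9, 5],
    "Marks3": [4, 9, 8],
}
df = pd.DataFrame(data1, index=["A", "B", "C"])

# Boolean mask: rows where Marks1 is greater than 2 (rows B and C here).
mask = df["Marks1"] > 2

# Sum one column over the matching rows; the result is a scalar.
marks2_total = df.loc[mask, "Marks2"].sum()

# Sum several columns over the matching rows; the result has one total per column.
totals = df.loc[mask, ["Marks1", "Marks3"]].sum()

print(marks2_total)
print(totals)
```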

Alternative Method: Adding Columns of Pandas DataFrame Using the “DataFrame.eval()” Function

The “DataFrame.eval()” function takes a string expression as an argument and evaluates it against the DataFrame columns. It can be used to sum the values of multiple columns using arithmetic operators.

Example:

This example adds the “Marks1” and “Marks2” columns of a Pandas DataFrame using the “eval()” function:

import pandas
data1 = {"Name":["anna","henry","joseph"],"Marks1" :[1, 5, 3],"Marks2" :[7, 9, 5], "Marks3" :[4, 9, 8]}
df = pandas.DataFrame(data1, index=['A', 'B', 'C'])
print('Given DataFrame:\n', df)
df2 = df.eval('Sum = Marks1 + Marks2')
print('\n', df2)

In the above code, the “df.eval()” function takes the string expression “Sum = Marks1 + Marks2” as an argument, adds the two columns row by row, and returns a new DataFrame that contains the result in a “Sum” column.

Output

The “Sum” column of the new DataFrame holds the sum of “Marks1” and “Marks2” for each row: 8, 14, and 8.
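As a quick cross-check, the sketch below compares an “eval()” expression over all three marks columns with the equivalent “sum(axis=1)” call. The names “df2” and “expected” are illustrative only.

```python
import pandas as pd

data1 = {
    "Name": ["anna", "henry", "joseph"],
    "Marks1": [1, 5, 3],
    "Marks2": [7, 9, 5],
    "Marks3": [4, 9, 8],
}
df = pd.DataFrame(data1, index=["A", "B", "C"])

# eval() returns a new DataFrame by default; df itself is left unchanged.
df2 = df.eval("Sum = Marks1 + Marks2 + Marks3")

# The same totals computed with sum() for comparison.
expected = df[["Marks1", "Marks2", "Marks3"]].sum(axis=1)

print(df2)
print("Sum" in df.columns)              # is the original DataFrame modified?
print((df2["Sum"] == expected).all())   # do both approaches agree?
```

With “eval()” every column has to be named in the expression, so “sum(axis=1)” on a column selection tends to be more convenient when many columns are involved.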

Conclusion

The “DataFrame.sum()” and “DataFrame.eval()” methods can be used to add all or specific columns of a Pandas DataFrame in Python. To sum specific columns, select them by name or with the “DataFrame.loc[]” and “DataFrame.iloc[]” indexers, and then call “sum()” on the selection. The “DataFrame.eval()” method can also sum specific columns based on the string expression passed to it. This blog demonstrated these approaches with several examples.

Source: linuxhint.com