
Geometric Mean Pandas

In statistics, the “geometric mean” is a measure of central tendency that represents the average value of a collection of numbers by utilizing the product of those numbers rather than their sum. A “geometric mean” is a useful tool for analyzing data that follows an exponential growth pattern, such as rates of change or investment returns.

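This post explains how to calculate the geometric mean in Python, both in plain Python and for pandas data. It covers what the geometric mean is, how to calculate it in plain Python, how to calculate it for a pandas column with SciPy, and an alternative approach using the “statistics” module.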

What is Geometric Mean?

The “geometric mean” is a measure of the central tendency of a set/collection of numbers. Unlike the “arithmetic mean”, which adds up all the numbers and then divides the sum by the total number of items, the geometric mean multiplies all the numbers and then takes the “nth” root of the resulting product, where “n” represents the number of items in the set.

The formula for calculating the “geometric mean” of a set/collection of numbers is shown below:

GM = (x1 * x2 * x3 * ... * xn) ^ (1/n)

Here, “x1”, “x2”, “x3”, …, and “xn” are the numbers in the set, and “n” refers to the number of items in the set.
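As a quick worked example of the formula above, the following sketch computes the geometric mean of two values, 2 and 8, and compares it with their arithmetic mean. The sample values and variable names are assumptions chosen purely for illustration.

```python
# Worked example of GM = (x1 * x2 * ... * xn) ^ (1/n)
# The sample values are illustrative only.
values = [2, 8]
n = len(values)

# Product of the values: 2 * 8 = 16
product = 1
for x in values:
    product *= x

# nth root of the product: 16 ^ (1/2)
geometric_mean = product ** (1 / n)

# Arithmetic mean for comparison: (2 + 8) / 2
arithmetic_mean = sum(values) / n

print("Geometric mean:", geometric_mean)    # 4.0
print("Arithmetic mean:", arithmetic_mean)  # 5.0
```

For these two values, the geometric mean (4.0) is lower than the arithmetic mean (5.0).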

How to Calculate Geometric Mean Using Python?

Python provides built-in and standard-library functions for mathematical calculations. To compute/calculate the geometric mean of a collection of numbers in Python, combine the “math.prod()” function with the built-in “pow()” and “len()” functions, as in the example below.

Example
Consider the below-given code:

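import math
numbers = [2, 4, 6, 8, 10]
# Multiply all values together (math.prod() requires Python 3.8 or later)
product = math.prod(numbers)
# Take the nth root of the product, where n is the number of values
geometric_mean = pow(product, 1 / len(numbers))
print("Geometric Mean:", geometric_mean)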

In the above code snippet:

  • The “math” module is imported and a list of numbers is initialized.
  • The “math.prod()” function is used to compute the product of all the numbers in the list.
  • The “pow()” function is used to calculate the geometric mean by raising that product to the power of “1/5”, where 5 is the length of the list. This is equivalent to taking the fifth root of the product.

Output

Running the code prints a geometric mean of approximately 5.21.
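For long lists, the product can grow too large to convert to a float, which makes the “pow()” approach fail. A common alternative is to average the logarithms of the values and exponentiate the result. The sketch below assumes that every value is positive; the helper name “geometric_mean_logs” and the sample data are hypothetical and used only for illustration.

```python
import math

def geometric_mean_logs(values):
    """Geometric mean via logarithms; assumes every value is positive."""
    if not values:
        raise ValueError("values must not be empty")
    # Averaging the logs and exponentiating equals the nth root of the product
    log_sum = math.fsum(math.log(x) for x in values)
    return math.exp(log_sum / len(values))

numbers = [2, 4, 6, 8, 10]
print(geometric_mean_logs(numbers))  # close to the pow()-based result

# With a long list, the integer product is too large to convert to a float
big = [1000] * 200
try:
    pow(math.prod(big), 1 / len(big))
except OverflowError as exc:
    print("pow() approach failed:", exc)

print(geometric_mean_logs(big))  # close to 1000
```

Working with logarithms keeps the intermediate values small, so the calculation stays within floating-point range.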

How to Calculate Geometric Mean Using Pandas?

Pandas does not provide a dedicated geometric mean method, but it works well with SciPy. To compute/calculate the geometric mean of a column in a pandas DataFrame, use the “gmean()” function from the “scipy.stats” module.

Example
Go through the below-provided lines of code:

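import pandas as pd
from scipy.stats import gmean
# Build a DataFrame with a single numeric column
df = pd.DataFrame({'numbers': [2, 4, 6, 8, 10]})
# gmean() returns the geometric mean of the values in the column
geometric_mean = gmean(df['numbers'])
print("Geometric Mean:", geometric_mean)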

In the above code block:

  • The “pandas” module and the “gmean()” function from the “scipy.stats” module are imported.
  • After that, the “pd.DataFrame()” function is used to create a DataFrame.
  • Lastly, the “gmean()” function is utilized to find/calculate the geometric mean of the numbers in the “numbers” column of the given DataFrame.

Output

Running the code prints a geometric mean of approximately 5.21, which matches the result of the plain Python approach.
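If SciPy is not available, the same result can be obtained with NumPy by averaging the logarithms of the column and exponentiating the result. The following is a minimal sketch that assumes all values are positive; the helper “log_gmean”, the column names “group” and “value”, and the sample data are hypothetical. It also shows one way to compute a geometric mean per group.

```python
import numpy as np
import pandas as pd

def log_gmean(series):
    """Geometric mean of a pandas Series; assumes all values are positive."""
    return float(np.exp(np.log(series).mean()))

# Hypothetical sample data for illustration
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [2, 8, 3, 27],
})

# Geometric mean of a single column (about 6.0 for this data)
print(log_gmean(df["value"]))

# Geometric mean of each group (about 4.0 for "a" and 9.0 for "b")
per_group = df.groupby("group")["value"].apply(log_gmean)
print(per_group)
```

Note that “Series.mean()” skips missing values by default, so NaN entries in the column are ignored by this helper.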

Alternative Approach: Calculate Geometric Mean Using the “statistics” Module

Python's standard library also includes the “statistics” module, whose “geometric_mean()” function (available since Python 3.8) calculates the geometric mean of a collection of numbers. To use this function with a pandas Series or a DataFrame column, a straightforward approach is to convert the data to a list first.

Example
Consider the below-stated code:

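import pandas as pd
import statistics as stats
data = pd.Series([2, 4, 6, 8, 10])
# Convert the Series to a plain Python list
data_list = data.tolist()
# Compute the geometric mean of the list with the standard library
gm = stats.geometric_mean(data_list)
print("The geometric mean is:", gm)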

In the above code lines:

  • Firstly, the “pandas” and “statistics” modules are imported.
  • In the next step, the “pd.Series()” function is utilized to create/make the Series object.
  • Now, the “data.tolist()” method converts the Series object into a list.
  • Finally, the “stats.geometric_mean()” function takes the converted list as an argument and calculates the geometric mean of the passed list.

Output

Running the code prints a geometric mean of approximately 5.21 for the passed list.
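Real-world data often contains missing or invalid values, so it can help to wrap the call in a small helper. The sketch below is one possible approach, not the only one: the helper name “series_geometric_mean” is hypothetical, and returning “None” on failure is an assumption made for illustration.

```python
import statistics as stats
import pandas as pd

def series_geometric_mean(series):
    """Return the geometric mean of a Series, or None if it cannot be computed."""
    # Drop missing values and convert to plain Python numbers
    values = series.dropna().tolist()
    try:
        return stats.geometric_mean(values)
    except stats.StatisticsError:
        # Raised, for example, when the data is empty or contains a negative value
        return None

print(series_geometric_mean(pd.Series([1, 4, 16])))        # about 4.0
print(series_geometric_mean(pd.Series([2.0, None, 8.0])))  # about 4.0, NaN dropped
print(series_geometric_mean(pd.Series([-1, 4])))           # None
print(series_geometric_mean(pd.Series([], dtype=float)))   # None
```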

Conclusion

To calculate the “geometric mean” in Python, combine “math.prod()” with the “pow()” and “len()” functions, use the “gmean()” function from “scipy.stats”, or use the “geometric_mean()” function from the “statistics” module. The first and last approaches need only the Python standard library, whereas “gmean()” requires SciPy and works directly on lists, arrays, and pandas Series or DataFrame columns. This post covered how to calculate the geometric mean with each of these approaches.


Source: linuxhint.com
