AMZ DIGICOM

Digital Communication

Calculating averages with the pandas mean() function

The Python pandas function DataFrame.mean() calculates the average value along the rows or columns of a DataFrame. It is essential for data analysis because it summarizes the typical value of each column or row in a single number.

Syntax of the pandas DataFrame.mean() function

The pandas mean() function accepts up to three parameters and follows a simple basic syntax:

DataFrame.mean(axis=0, skipna=True, numeric_only=False)

What are the relevant parameters?

By setting different parameters, you can adapt the behavior of the pandas DataFrame.mean() function to your own use case.

Parameter      Description                                                    Default value
axis           Axis to average along: 0 gives one mean per column, 1 per row  0
skipna         If True, NaN values are ignored                                True
numeric_only   If True, only numeric data types are included                  False
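
As a minimal sketch of how numeric_only behaves, the example below assumes a small hypothetical DataFrame that mixes text and numbers. The names orders, name, price and quantity are illustrative and do not come from the article's own data.

```python
import pandas as pd

# Hypothetical data for illustration: one text column and two numeric columns.
orders = pd.DataFrame({
    "name": ["pen", "book", "lamp"],
    "price": [1.5, 12.0, 30.0],
    "quantity": [10, 2, 1],
})

# numeric_only=True restricts the calculation to numeric columns,
# so the text column "name" is left out of the result.
numeric_means = orders.mean(numeric_only=True)
print(numeric_means)
```

In pandas 2.x, calling mean() on such a frame without numeric_only=True raises a TypeError, because the text column cannot be averaged.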

Applying the pandas mean() function

The pandas DataFrame.mean() function can be applied in several ways.

Calculate the averages for each column

In the code examples below, we use a pandas DataFrame containing the following data:

import pandas as pd

# Sample data: three numeric columns with four rows each
data = {
    'A': [1, 2, 3, 4],
    'B': [4, 5, 6, 7],
    'C': [7, 8, 9, 10]
}
df = pd.DataFrame(data)
print(df)

The resulting DataFrame is as follows:

   A  B   C
0  1  4   7
1  2  5   8
2  3  6   9
3  4  7  10

To calculate the average value of each column, we can call the pandas mean() function with the default parameter axis=0:

column_means = df.mean()
print(column_means)

In this way, the average of each column (A, B and C) is calculated by dividing the sum of its elements by the number of elements in the column. The result is the following pandas Series:

A 2.5
B 5.5
C 8.5
dtype: float64
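
The calculation described above can be checked by hand. The sketch below rebuilds the same DataFrame and compares df.mean() with the sum of each column divided by the number of rows. The variable name manual_means is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [4, 5, 6, 7],
    "C": [7, 8, 9, 10],
})

# Sum of each column divided by the number of rows.
manual_means = df.sum() / len(df)

# Compare the manual result with the built-in column means.
print(manual_means.equals(df.mean()))
```

This shortcut only matches mean() when the data has no missing values, since len(df) counts every row.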

Calculate the averages for each row

If you would rather calculate the average of each row, simply set the axis parameter to 1:

row_means = df.mean(axis=1)
print(row_means)

The pandas mean() function now calculates the average of each row by dividing the sum of its elements by the number of elements in the row. The function call produces the following output:

0 4.0
1 5.0
2 6.0
3 7.0
dtype: float64
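
A possible follow-up is to keep the row averages next to the data. As a sketch, the following stores them in a new column and also averages a subset of columns. The column name row_mean and the choice of columns A and B are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [4, 5, 6, 7],
    "C": [7, 8, 9, 10],
})

# Average of columns A and B only, computed row by row.
ab_means = df[["A", "B"]].mean(axis=1)

# Store the row averages over all three columns in a new column.
df["row_mean"] = df[["A", "B", "C"]].mean(axis=1)
print(df)
```

Selecting the columns explicitly before calling mean(axis=1) keeps the new row_mean column out of later averages if the line is run again.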

Ignore NaN values

In the following example, we consider another DataFrame that contains some NaN ("Not a Number") values:

import pandas as pd
import numpy as np

# Sample data: each column contains one missing value (NaN)
data = {
    'A': [1, 2, np.nan, 4],
    'B': [4, np.nan, 6, 7],
    'C': [7, 8, 9, np.nan]
}
df = pd.DataFrame(data)
print(df)

The code above produces the following DataFrame:

     A    B    C
0  1.0  4.0  7.0
1  2.0  NaN  8.0
2  NaN  6.0  9.0
3  4.0  7.0  NaN

The skipna parameter controls how NaN values are handled when the average is calculated. Its default is True, which means that the mean() function automatically ignores NaN values. With skipna=False, the average of every column containing at least one NaN value would itself be NaN.

mean_with_nan = df.mean()
print(mean_with_nan)

Calling the pandas mean() function then returns:

A 2.333333
B 5.666667
C 8.000000
dtype: float64
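
To see the effect of skipna=False described above, the sketch below reuses the same data. The variable name strict_means is illustrative. The call to count() is added only to show the divisor that mean() uses when NaN values are skipped.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, np.nan, 4],
    "B": [4, np.nan, 6, 7],
    "C": [7, 8, 9, np.nan],
})

# With skipna=False, a single NaN makes the whole column average NaN.
strict_means = df.mean(skipna=False)
print(strict_means)

# count() returns the number of non-NaN values per column, which is
# the divisor mean() uses when skipna=True (3 for every column here).
print(df.count())
```

Since every column here contains one NaN, all three averages come out as NaN with skipna=False. With the default setting, column A is averaged over its three remaining values: (1 + 2 + 4) / 3 ≈ 2.333333.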
