The pandas function DataFrame.mean() calculates the average value along one axis of a DataFrame, either for each column or for each row. It is essential for data analysis, as it provides valuable information about the distribution and average of the data.
Syntax of the pandas DataFrame.mean() function
The pandas mean() function accepts up to three parameters and follows a simple basic syntax:
DataFrame.mean(axis=0, skipna=True, numeric_only=False)
What are the relevant parameters?
By setting these parameters, you can adapt the behavior of DataFrame.mean() to your own use case.
| Parameter | Description | Default value |
|---|---|---|
| axis | Determines whether the mean is calculated for each column (axis=0) or for each row (axis=1) | 0 |
| skipna | If True, NaN values are ignored | True |
| numeric_only | If True, only numeric data types are included in the calculation | False |
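The numeric_only parameter is easiest to understand on a DataFrame that mixes text and numbers. The following is a minimal sketch; the DataFrame df_mixed and its column names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical example data: one text column and two numeric columns
df_mixed = pd.DataFrame({
    "name": ["a", "b", "c"],
    "price": [10.0, 20.0, 30.0],
    "quantity": [1, 2, 3],
})

# Only the numeric columns are averaged; the text column "name" is left out
numeric_means = df_mixed.mean(numeric_only=True)
print(numeric_means)
```

The result contains one entry for price and one for quantity. Depending on the pandas version, calling mean() on such a DataFrame without numeric_only=True may raise an error instead of skipping the text column.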
Applying the pandas mean() function
The pandas DataFrame.mean() function can be applied in different ways.
Calculate the averages for each column
In the code examples below, we use a pandas DataFrame with the following data:
import pandas as pd
data = {
    'A': [1, 2, 3, 4],
    'B': [4, 5, 6, 7],
    'C': [7, 8, 9, 10]
}
df = pd.DataFrame(data)
print(df)
The resulting DataFrame is as follows:
   A  B   C
0  1  4   7
1  2  5   8
2  3  6   9
3  4  7  10
To calculate the average value of each column, we can call the pandas mean() function with its default setting, axis=0:
column_means = df.mean()
print(column_means)
In this way, the average of each column (A, B and C) is calculated by dividing the sum of its elements by the number of elements in that column. The result is the following pandas Series:
A 2.5
B 5.5
C 8.5
dtype: float64
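The calculation described above can be checked by hand. The sketch below rebuilds the same DataFrame and divides each column sum by the number of rows; the variable name manual_means is chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [4, 5, 6, 7], "C": [7, 8, 9, 10]})

# Sum of each column divided by the number of rows
manual_means = df.sum() / len(df)
print(manual_means)

# Built-in column-wise mean for comparison
print(df.mean())
```

Both print statements should show the same values for A, B and C.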
Calculate the averages for each row
If you would rather calculate the average of each row, simply set the axis parameter to 1:
row_means = df.mean(axis=1)
print(row_means)
The average of each row is calculated by dividing the sum of its elements by the number of elements in that row. The function call produces the following output:
0 4.0
1 5.0
2 6.0
3 7.0
dtype: float64
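Because the numeric axis values are easy to mix up, it can help to know that pandas also accepts the string aliases "index" (for 0) and "columns" (for 1). A short sketch using the same data as above:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [4, 5, 6, 7], "C": [7, 8, 9, 10]})

# axis="index" is the same as axis=0: one mean per column
col_means = df.mean(axis="index")

# axis="columns" is the same as axis=1: one mean per row
row_means = df.mean(axis="columns")

print(col_means)
print(row_means)
```

The axis names the direction that is collapsed: with axis="columns", the columns are averaged away and one value per row remains.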
Ignore NaN values
In the following example, we use another DataFrame that contains some NaN ("Not a Number") values:
import pandas as pd
import numpy as np
data = {
    'A': [1, 2, np.nan, 4],
    'B': [4, np.nan, 6, 7],
    'C': [7, 8, 9, np.nan]
}
df = pd.DataFrame(data)
print(df)
The code above produces the following DataFrame:
     A    B    C
0  1.0  4.0  7.0
1  2.0  NaN  8.0
2  NaN  6.0  9.0
3  4.0  7.0  NaN
The skipna parameter controls how NaN values are handled when calculating the average. Its default value is True, which means that NaN values are automatically ignored by mean(). With skipna=False, the average of every column containing at least one NaN value would also be NaN.
mean_with_nan = df.mean()
print(mean_with_nan)
The call to the pandas mean() function then returns:
A 2.333333
B 5.666667
C 8.000000
dtype: float64
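For comparison, the skipna=False case mentioned earlier can be sketched as follows, together with a manual check of the default behavior. The variable names strict_means and manual_means are chosen only for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, np.nan, 4],
    "B": [4, np.nan, 6, 7],
    "C": [7, 8, 9, np.nan],
})

# With skipna=False, a single NaN makes the mean of that column NaN
strict_means = df.mean(skipna=False)
print(strict_means)

# With the default skipna=True, the sum of the remaining values is
# divided by the number of non-NaN values, which count() returns
manual_means = df.sum() / df.count()
print(manual_means)
```

Since every column here contains one NaN, all three entries of strict_means are NaN, while manual_means matches the output of df.mean() shown above.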

