The Python pandas method DataFrame.fillna() is used to replace missing values in a DataFrame. This is useful in many cases to simplify data cleaning or to prepare data for analysis.
Definition
A NaN value ("Not a Number") represents missing or undefined data in a pandas DataFrame. It occurs when data is absent or when an operation cannot produce a valid result. pandas uses NaN to mark these gaps, which makes it possible to handle them effectively with functions such as fillna().
Pandas fillna(): what does the method's syntax look like?
The fillna() method takes up to five main parameters and is structured as follows:
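As a brief illustration of how NaN values arise and how they can be detected, the sketch below uses two small made-up Series (the names s and ratio are illustrative only):

```python
import numpy as np
import pandas as pd

# A Series with one explicitly missing entry
s = pd.Series([1.0, np.nan, 3.0])

# isna() flags each missing entry; sum() counts the flagged entries
print(s.isna())
print(s.isna().sum())

# An operation without a valid result also yields NaN: here 0.0 / 0.0
ratio = pd.Series([0.0, 4.0]) / pd.Series([0.0, 2.0])
print(ratio)
```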
DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None)
What are the relevant parameters?
The behavior of the pandas DataFrame.fillna() method can be adjusted with the following parameters:
| Parameter | Description | Default value |
|---|---|---|
| value | A scalar, Python dictionary or Series used to replace the NaN values | None |
| method | Filling method: forward fill (ffill) or backward fill (bfill) | None |
| axis | Axis along which the operation is carried out (0 or index for rows, 1 or columns for columns) | None (treated as 0) |
| inplace | If True, the changes are made directly in the original DataFrame | False |
| limit | Integer limiting the number of NaN values to replace | None |
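The following sketch illustrates how two of these parameters behave. The DataFrame and the replacement values are made up for the illustration. Here limit is used without method, in which case pandas caps the number of values filled in each column:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [None, None, 3.0],
    'B': [1.0, None, None],
})

# value as a dictionary: a different replacement for each column
by_column = df.fillna({'A': 0, 'B': -1})

# limit=1: at most one NaN is filled in each column
limited = df.fillna(0, limit=1)

print(by_column)
print(limited)
```

In the second result, one NaN remains in each column because only the first gap per column is filled.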
Note
The method parameter has been deprecated since pandas 2.1 and will probably no longer be supported in future versions. Programmers can use obj.ffill() or obj.bfill() instead. These two methods have the same effect as the corresponding value of the method parameter.
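As a sketch of what this could look like, assuming a pandas version that provides DataFrame.ffill() and DataFrame.bfill(), the two calls below fill gaps without the method parameter (the small DataFrame is made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [1.0, None, 3.0], 'B': [None, 5.0, None]})

# Forward fill: each NaN takes the last valid value above it
forward = df.ffill()

# Backward fill: each NaN takes the next valid value below it
backward = df.bfill()

print(forward)
print(backward)
```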
When is the pandas DataFrame.fillna() method used?
The pandas fillna() method can be used in different ways:
Replacing NaN values with a fixed value
First, we define a DataFrame:
import pandas as pd
# Example DataFrame with various values
data = {
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4]
}
df = pd.DataFrame(data)
print(df)
The DataFrame we have just defined looks like this:
     A    B    C
0  1.0  NaN  1.0
1  2.0  2.0  NaN
2  NaN  3.0  3.0
3  4.0  4.0  4.0
Note
Note that in pandas, the value None is interpreted as NaN in DataFrames and Series, especially in numeric or floating-point columns.
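A quick way to see this conversion is to build a numeric Series that contains None and inspect the result (a minimal sketch; the variable name s is illustrative):

```python
import pandas as pd

# None in an otherwise numeric list is stored as NaN,
# and the integers are upcast to float64 to make room for it
s = pd.Series([1, 2, None])

print(s)
print(s.dtype)
```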
To replace the missing values with the value 0, we can use the pandas fillna() method:
# Replace the missing values with the value 0
df_filled = df.fillna(0)
print(df_filled)
As a result, every NaN has been replaced by the value 0 passed to the method:
     A    B    C
0  1.0  0.0  1.0
1  2.0  2.0  0.0
2  0.0  3.0  3.0
3  4.0  4.0  4.0
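Instead of a single constant, the value argument can also be a Series, which makes it possible to fill each column with its own statistic. The sketch below uses the column mean, which is one possible choice and not something prescribed by the example above. It redefines the same DataFrame so that it runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# df.mean() returns one mean per column (NaN values are skipped);
# fillna() matches the index of that Series against the column names
df_mean = df.fillna(df.mean())
print(df_mean)
```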
Using the forward fill method ffill
If NaN values should be filled with the preceding value in each column, the ffill method can be passed as a parameter to the function:
# Replace all NaN values with the preceding value
df_ffill = df.fillna(method='ffill')
print(df_ffill)
In this example, the NaN values in columns A and C have been replaced by the preceding values in the same column. Because column B has no preceding value, its NaN value remains:
     A    B    C
0  1.0  NaN  1.0
1  2.0  2.0  1.0
2  2.0  3.0  3.0
3  4.0  4.0  4.0
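If the leading NaN in column B should be filled as well, one option is to chain a backward fill after the forward fill. The sketch below uses the DataFrame.ffill() and DataFrame.bfill() methods rather than the method parameter, and it redefines the DataFrame so that it is self-contained:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# Forward fill first, then backward fill whatever is still missing
df_both = df.ffill().bfill()
print(df_both)
```

Whether copying a later value backwards is appropriate depends on the data, for example it may be misleading in a time series.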
Row-wise use of the backward fill method bfill
NaN values can also be replaced by the following values in the same row. To do this, it is not enough to use the bfill method; the axis parameter must also be set to 1:
df_bfill = df.fillna(method='bfill', axis=1)
print(df_bfill)
The NaN values in the first and third rows were replaced by their respective successors. Only one NaN value remains, in column C of the second row, because no following value is available in that row:
     A    B    C
0  1.0  1.0  1.0
1  2.0  2.0  NaN
2  3.0  3.0  3.0
3  4.0  4.0  4.0

