AMZ DIGICOM

Digital Communication

Pandas fillna(): the method explained

The pandas function DataFrame.fillna() is used to replace the missing values in a DataFrame. This is useful for cleaning data and for preparing it for analysis.

Definition

A NaN value ("Not a Number") represents missing or undefined data in a pandas DataFrame. It occurs when data is absent or when an operation cannot produce a valid result. pandas uses NaN to mark these gaps, which makes it possible to handle them efficiently with functions such as fillna().
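
Both sources of NaN mentioned above can be seen in a minimal sketch. The sample values and variable names are hypothetical and only serve as an illustration; isna() is the pandas method that flags missing entries.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one entry is simply absent
s = pd.Series([1.0, np.nan, 3.0])

# isna() flags each missing entry, sum() counts the flags
missing_mask = s.isna()
missing_count = int(missing_mask.sum())
print(missing_mask.tolist())
print(missing_count)

# An operation without a valid result (0.0 / 0.0) also yields NaN
ratio = pd.Series([0.0, 1.0]) / pd.Series([0.0, 2.0])
print(ratio.isna().tolist())
```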

Pandas fillna(): what does the syntax of the method look like?

The five parameters of fillna() covered here follow this syntax:

DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None)

What are the relevant parameters?

The behavior of the pandas method DataFrame.fillna() can be adjusted using several parameters:

Parameter | Description | Default value
value | A scalar, a Python dictionary, or a Series used to replace the NaN values | None
method | Filling method: forward fill (ffill) or backward fill (bfill) | None
axis | Axis along which the operation is carried out (0 or 'index' fills down each column, 1 or 'columns' fills across each row) | None (acts as 0)
inplace | If True, the changes are made directly in the original DataFrame | False
limit | Integer limiting the number of NaN values to replace | None
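
To make the value and limit parameters more concrete, here is a minimal sketch. The replacement values 0 and -1 and the small Series are hypothetical choices for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# A dictionary maps column names to their own replacement value;
# column C is not listed, so its NaN is left untouched
per_column = df.fillna(value={'A': 0, 'B': -1})
print(per_column)

# limit=1 replaces at most one NaN; the second one stays missing
s = pd.Series([None, None, 3.0])
limited = s.fillna(0, limit=1)
print(limited)
```

Passing a dictionary is handy when each column needs its own sensible default, while limit keeps long runs of missing data from being filled completely.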

Note

The method parameter is deprecated since pandas 2.1 and will probably no longer be supported in future versions. Programmers can use obj.ffill() or obj.bfill() instead. These two functions have the same effect as the corresponding value of the method parameter.
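
As a minimal sketch of these alternatives, the following calls use ffill() and bfill() directly on a small DataFrame. The variable names are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# Forward fill down each column, without the method parameter
df_ffill = df.ffill()
print(df_ffill)

# Backward fill across each row, again without the method parameter
df_bfill_rows = df.bfill(axis=1)
print(df_bfill_rows)
```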

In which cases is the pandas method DataFrame.fillna() used?

The pandas function fillna() can be used in different ways:

Replacing NaN values with a fixed value

First, we define a DataFrame:

import pandas as pd
# Example DataFrame with various values, some of them missing
data = {
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4]
}
df = pd.DataFrame(data)
print(df)

The DataFrame we have just defined looks as follows:

     A    B    C
0  1.0  NaN  1.0
1  2.0  2.0  NaN
2  NaN  3.0  3.0
3  4.0  4.0  4.0

Note

Note that in pandas, the value None is interpreted as NaN in DataFrames and Series, especially for numeric (floating-point) columns.
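
This conversion can be checked with a short sketch. The Series below is a hypothetical example, not part of the DataFrame above.

```python
import pandas as pd

# None in a numeric Series is stored as NaN, so the dtype becomes float64
s = pd.Series([1, None, 3])
print(s.dtype)
print(s.isna().tolist())
```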

To replace the missing values with the value 0, we can use the pandas function fillna():

# Replace the missing values with the value 0
df_filled = df.fillna(0)
print(df_filled)

As a result, each NaN was replaced by the value 0 passed to the function:

     A    B    C
0  1.0  0.0  1.0
1  2.0  2.0  0.0
2  0.0  3.0  3.0
3  4.0  4.0  4.0
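
A fixed value does not have to be the same for every column. As a minimal sketch, and assuming the column mean is an acceptable substitute for the data at hand, a Series of per-column means can be passed as value:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# df.mean() returns a Series indexed by column name (NaN is skipped);
# fillna() matches it against the columns of the DataFrame
df_mean = df.fillna(df.mean())
print(df_mean)
```

Whether the mean is an appropriate replacement depends on the data; it is shown here only as one common choice.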

Using the forward-fill method ffill

If NaN values should be filled with the previous value in each column, the ffill method can be passed to the function as a parameter:

# Replace every NaN value with the previous value in the same column
# (method= is deprecated since pandas 2.1; df.ffill() is the equivalent call)
df_ffill = df.fillna(method='ffill')
print(df_ffill)

In this example, the NaN values in columns A and C were replaced by the previous values in the same column. Since there is no previous value in column B, the NaN value remains there:

     A    B    C
0  1.0  NaN  1.0
1  2.0  2.0  1.0
2  2.0  3.0  3.0
3  4.0  4.0  4.0
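
If the leading NaN in column B should be filled as well, one possible approach is to chain a backward fill after the forward fill. This is a sketch of one option, not the only way to handle leading gaps:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# Forward fill first, then backward fill whatever is still missing
# at the top of a column
df_both = df.ffill().bfill()
print(df_both)
```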

Row-wise use of the backward-fill method bfill

NaN values can also be replaced by the following value in the same row. To do this, it is not enough to use the bfill method; the axis parameter must also be set to 1:

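# Replace every NaN value with the next value in the same row
# (method= is deprecated since pandas 2.1; df.bfill(axis=1) is the equivalent call)
df_bfill = df.fillna(method='bfill', axis=1)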
print(df_bfill)

The NaN values in the first and third rows were replaced by their respective successors. Only one NaN value remains, in column C of the second row, because no following value is available in that row.

     A    B    C
0  1.0  1.0  1.0
1  2.0  2.0  NaN
2  3.0  3.0  3.0
3  4.0  4.0  4.0
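
For comparison, here is a minimal sketch of the same backward fill along the default axis, where each NaN takes the next value further down in its own column. The variable name is chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# Default axis: each NaN is replaced by the next value in its column
df_bfill_cols = df.bfill()
print(df_bfill_cols)
```

With this data, every NaN has a following value in its column, so no missing values remain.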
