The pandas function DataFrame.where() is used for conditional data handling. It lets programmers replace or mask values in a pandas DataFrame based on a specific condition.
Syntax of pandas DataFrame.where()
The function where() accepts up to five parameters and follows the basic syntax shown below:
DataFrame.where(cond, other=nan, inplace=False, axis=None, level=None)
Here, the function is applied to a DataFrame, and only the values that fulfill the specified condition (cond) remain unchanged. All other values are replaced by the value specified in other.
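As a minimal sketch of this behavior, the toy DataFrame below keeps every value that satisfies cond and swaps the rest for other. The column names a and b, the threshold 4, and the replacement value -1 are made up for illustration and are not part of the sales example used later.

```python
import pandas as pd

# Hypothetical toy data, chosen only to illustrate cond and other
df_demo = pd.DataFrame({"a": [1, 5, 9], "b": [8, 2, 6]})

# cond is a boolean DataFrame with the same shape as df_demo
cond = df_demo > 4

# Values where cond is True are kept; all others become -1
result = df_demo.where(cond, other=-1)
print(result)
```

Because inplace defaults to False, df_demo itself is left untouched and the result is returned as a new DataFrame.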
Relevant parameters
Pandas DataFrame.where() accepts several parameters that allow data handling to be adapted flexibly:
| Parameter | Description | Default value |
|---|---|---|
| cond | Condition that must be fulfilled for values to be kept in the DataFrame | Required |
| other | Replacement value for values that do not meet the condition | NaN |
| inplace | If True, the operation is carried out directly on the existing DataFrame | False |
| axis | Indicates whether the operation is aligned along the rows (0) or the columns (1) | None |
| level | Indicates at which level of the MultiIndex the condition should be applied | None |
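The axis parameter mostly matters when other is a Series rather than a single value, because pandas then needs to know how to line it up with the DataFrame. The sketch below assumes a hypothetical per-row fallback Series (row_fallback) and made-up columns a and b; it also shows inplace=True on a copy of the data.

```python
import pandas as pd

# Hypothetical data: two numeric columns and one fallback value per row
df_rows = pd.DataFrame({"a": [1, -2, 3], "b": [-4, 5, -6]})
row_fallback = pd.Series([10, 20, 30])

# other is a Series, so axis=0 tells pandas to align it with the row index:
# each replaced value takes the fallback belonging to its own row
aligned = df_rows.where(df_rows > 0, other=row_fallback, axis=0)
print(aligned)

# inplace=True modifies the DataFrame itself and returns None
df_inplace = df_rows.copy()
df_inplace.where(df_inplace > 0, other=0, inplace=True)
print(df_inplace)
```

Working on a copy, as done here, keeps the original data available in case the in-place replacement needs to be checked or undone.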
Using pandas DataFrame.where()
The function where() is useful in many situations that call for conditional data handling, for example data cleaning or the creation of new columns based on conditions.
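For the second use case, one possible approach is Series.where(), the single-column counterpart of the method. The sketch below is only an illustration: the column names sales and sales_capped and the cap of 1000 are assumptions, not part of the example that follows.

```python
import pandas as pd

# Hypothetical order data
orders = pd.DataFrame({"sales": [500, 1500, 800, 2500]})

# Keep values up to the cap; anything larger is replaced by the cap itself
cap = 1000
orders["sales_capped"] = orders["sales"].where(orders["sales"] <= cap, cap)
print(orders)
```

The original sales column stays as it was, so the raw and the derived values can be compared side by side.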
Conditional replacement of values
Suppose you have a DataFrame containing a company's sales results and you want to display only positive results. All negative results must be replaced by 0. This can be done with pandas DataFrame.where(). First, a DataFrame is created:
import pandas as pd
# Create an example DataFrame
data = {
'Région': ['Nord', 'Sud', 'Est', 'Ouest'],
'Ventes_Q1': [15000, -5000, 3000, -1000],
'Ventes_Q2': [20000, 25000, -7000, 5000]
}
df = pd.DataFrame(data)
print(df)
The above code returns the following DataFrame:
Région Ventes_Q1 Ventes_Q2
0 Nord 15000 20000
1 Sud -5000 25000
2 Est 3000 -7000
3 Ouest -1000 5000
By calling where(), you can now replace all negative values with 0. To do this, make sure that only columns containing numeric values are included; otherwise the comparison will fail, because the text column Région cannot be compared with a number.
# Conditional replacement of values
df_positive = df.copy()
df_positive[['Ventes_Q1', 'Ventes_Q2']] = df[['Ventes_Q1', 'Ventes_Q2']].where(df[['Ventes_Q1', 'Ventes_Q2']] > 0, 0)
print(df_positive)
The resulting DataFrame df_positive contains only positive sales results, with all negative values replaced by 0 as desired:
Région Ventes_Q1 Ventes_Q2
0 Nord 15000 20000
1 Sud 0 25000
2 Est 3000 0
3 Ouest 0 5000
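The numeric columns do not have to be listed by hand. As a sketch of one alternative, select_dtypes() can collect them automatically, which may be convenient when a DataFrame has many columns. The variable name numeric_cols is an assumption introduced for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Région": ["Nord", "Sud", "Est", "Ouest"],
    "Ventes_Q1": [15000, -5000, 3000, -1000],
    "Ventes_Q2": [20000, 25000, -7000, 5000],
})

# Collect the names of all numeric columns instead of typing them out
numeric_cols = df.select_dtypes(include="number").columns

# Apply where() to the numeric columns only; the text column is left alone
df_positive = df.copy()
df_positive[numeric_cols] = df[numeric_cols].where(df[numeric_cols] > 0, 0)
print(df_positive)
```

This produces the same df_positive as the explicit column list, as long as every numeric column is meant to be included in the replacement.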
Conditional masking of values
Pandas DataFrame.where() can also be used to mask values, that is, to show only certain parts of a DataFrame. In the following example, the DataFrame should display only the values above a certain threshold (here, 10000). Here again, make sure that only numeric columns are taken into account:
# Show only values greater than 10,000
df_masked = df.copy()
df_masked[['Ventes_Q1', 'Ventes_Q2']] = df[['Ventes_Q1', 'Ventes_Q2']].where(df[['Ventes_Q1', 'Ventes_Q2']] > 10000)
print(df_masked)
In this case, the resulting DataFrame df_masked displays only the values higher than 10000. All other values are displayed as NaN:
Région Ventes_Q1 Ventes_Q2
0 Nord 15000.0 20000.0
1 Sud NaN 25000.0
2 Est NaN NaN
3 Ouest NaN NaN
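The decimal points in this output (15000.0 instead of 15000) are a side effect of the masking: NaN cannot be stored in a regular integer column, so pandas converts the affected columns to floating point. The short sketch below checks this and counts how many values passed the threshold; the names sales_cols, masked, and kept_per_column are assumptions introduced for the illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Région": ["Nord", "Sud", "Est", "Ouest"],
    "Ventes_Q1": [15000, -5000, 3000, -1000],
    "Ventes_Q2": [20000, 25000, -7000, 5000],
})

sales_cols = ["Ventes_Q1", "Ventes_Q2"]
masked = df[sales_cols].where(df[sales_cols] > 10000)

# The masked columns are now float64 because they contain NaN
print(masked.dtypes)

# Count how many values per column passed the threshold
kept_per_column = masked.notna().sum()
print(kept_per_column)
```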
The pandas where() function is therefore a powerful tool for filtering, transforming, and cleaning data efficiently in a DataFrame while preserving its original structure.

