AMZ DIGICOM

Digital Communication

Pandas DataFrame: indexing – ionos

Indexing in Python pandas gives efficient, direct access to specific data within a DataFrame. Using a DataFrame index, you can select specific rows and columns, which considerably simplifies data analysis.

What happens during indexing?

Indexing a pandas DataFrame makes it easier to select specific elements of the DataFrame. Rows and columns can be selected by position or by label. Indexes help find and process data faster by providing a kind of "address system" for the data structure.

Pandas DataFrame.index syntax

You can view the index values of a pandas DataFrame with its index property, written as DataFrame.index.
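As a minimal sketch of that property in use, assuming a small DataFrame built without an explicit index (the name df and its contents are illustrative only):

```python
import pandas as pd

# Illustrative data, not taken from a real dataset
df = pd.DataFrame({'Nom': ['Alice', 'Bob', 'Charlie'], 'Âge': [25, 30, 35]})

# With no explicit index, pandas assigns a RangeIndex starting at 0
print(df.index)         # RangeIndex(start=0, stop=3, step=1)
print(list(df.index))   # [0, 1, 2]
```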

DataFrame indexing syntax

There are several ways to index pandas DataFrames. The indexing syntax varies depending on the desired operation.

Indexing with labels (column names)

pandas DataFrames can be indexed by column name. To demonstrate, we first create an example DataFrame:

import pandas as pd
# Create an example DataFrame from a dictionary of columns
données = {
    'Nom': ['Alice', 'Bob', 'Charlie'],
    'Âge': [25, 30, 35],
    'Ville': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(données)
print(df)

python

The DataFrame looks like this:

       Nom  Âge        Ville
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago

To access all the values of a specific column, use its name with the [] operator. Specify the column name in the indexing operator as a Python string:

# Access the 'Âge' column
print(df['Âge'])

python

The result contains the age values:

0    25
1    30
2    35
Name: Âge, dtype: int64

If you are interested in several columns rather than one, pass a list of their names to the indexing operator, for example df[['Nom', 'Ville']].
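A short sketch of multi-column selection, assuming the example DataFrame from above (the variable name subset is illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Alice', 'Bob', 'Charlie'],
    'Âge': [25, 30, 35],
    'Ville': ['New York', 'Los Angeles', 'Chicago'],
})

# Note the double brackets: the inner pair is a Python list of column names
subset = df[['Nom', 'Ville']]
print(subset)

# A single name returns a Series, a list of names returns a DataFrame
print(type(df['Nom']).__name__)   # Series
print(type(subset).__name__)      # DataFrame
```

Passing the names without the inner list, as in df['Nom', 'Ville'], is interpreted as a single tuple key and raises a KeyError on this DataFrame.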

Indexing with loc[] (row label)

To access a specific row of your DataFrame, use the pandas indexer loc[] and pass it the label of the row. With the default index, the row labels are the row numbers 0, 1, 2 and so on. Using the same DataFrame as before, df.loc[0] extracts the first row, which contains the values for "Alice".

As expected, the values corresponding to "Alice" are visible in the result:
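A minimal sketch of that lookup, assuming the example DataFrame from above with its default integer index, so that the label of the first row is 0 (the variable name first_row is illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Alice', 'Bob', 'Charlie'],
    'Âge': [25, 30, 35],
    'Ville': ['New York', 'Los Angeles', 'Chicago'],
})

# loc[] selects by label; with the default index the first row has label 0
first_row = df.loc[0]
print(first_row)
```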

Nom         Alice
Âge            25
Ville    New York
Name: 0, dtype: object

Indexing with iloc[] (row and column numbers)

Another way to access specific elements of your DataFrame is to use row and column numbers. To work with the integer positions of a pandas DataFrame, use the DataFrame's iloc[] indexer.

# Access the first row
print(df.iloc[0])
# Access the value in the first row and the second column
print(df.iloc[0, 1])

python

The output of the two iloc[] calls contains the expected values:

Nom         Alice
Âge            25
Ville    New York
Name: 0, dtype: object
25
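The difference between loc[] and iloc[] is easiest to see when the row labels are not the default 0, 1, 2. The following sketch assumes a hypothetical variant of the example in which the 'Nom' column has been made the index with set_index; the name by_name is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Alice', 'Bob', 'Charlie'],
    'Âge': [25, 30, 35],
    'Ville': ['New York', 'Los Angeles', 'Chicago'],
})

# Hypothetical variant: use the names as row labels instead of 0, 1, 2
by_name = df.set_index('Nom')

# loc[] looks rows up by label, iloc[] by integer position
print(by_name.loc['Bob', 'Ville'])   # Los Angeles
print(by_name.iloc[1, 1])            # Los Angeles

# iloc[] also accepts slices; the end position is excluded
print(by_name.iloc[0:2])
```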

Indexing individual values

If you only need a single value from your DataFrame, the at indexer is an efficient way to extract it. Specify the row and the column of the value by their labels. For example, to find Bob's city of residence, we need the 'Ville' column and the row labeled 1:

print(df.at[1, 'Ville'])

python

As requested, the output is Bob's city of residence, i.e. "Los Angeles".

You can also use the iat indexer, which works in the same way as at but expects positions instead of labels. The same result as in the previous code example is obtained with iat:

print(df.iat[1, 2])

python
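Both accessors can also be used to assign a single value. The sketch below assumes the example DataFrame from earlier and works on a copy so the original stays unchanged; the name updated and the new city are arbitrary illustrations.

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Alice', 'Bob', 'Charlie'],
    'Âge': [25, 30, 35],
    'Ville': ['New York', 'Los Angeles', 'Chicago'],
})

# at[] addresses a cell by row label and column name,
# iat[] addresses the same cell by row and column position
assert df.at[1, 'Ville'] == df.iat[1, 2]

# Work on a copy so the original DataFrame is left untouched
updated = df.copy()
updated.at[1, 'Ville'] = 'Boston'   # arbitrary example value
print(updated.iat[1, 2])            # Boston
print(df.at[1, 'Ville'])            # Los Angeles
```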

Boolean indexing

Subsets of a DataFrame can be selected based on a condition. This is called Boolean indexing. The condition must evaluate to True or False for each row and is placed directly in the indexing operator. To extract only the rows in which the person is over 30 years old, we can proceed as follows:

# Select the rows where the age is greater than 30
print(df[df['Âge'] > 30])

python

The condition above is met only by "Charlie", aged 35. The output is therefore as follows:

       Nom  Âge    Ville
2  Charlie   35  Chicago
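The condition itself is an ordinary Series of True/False values, which can be stored and reused. As a sketch under the same assumptions as the example above (the variable names mask and names_over_30 are illustrative only), it can also be passed to loc[] to filter rows and pick a column in one step:

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Alice', 'Bob', 'Charlie'],
    'Âge': [25, 30, 35],
    'Ville': ['New York', 'Los Angeles', 'Chicago'],
})

# The comparison yields one Boolean per row
mask = df['Âge'] > 30
print(mask.tolist())            # [False, False, True]

# Rows where the mask is True, restricted to the 'Nom' column
names_over_30 = df.loc[mask, 'Nom']
print(names_over_30.tolist())   # ['Charlie']
```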

Indexing is a fundamental pandas tool that gives efficient access to data and lets you extract the subsets relevant to an analysis.

Note

In Boolean indexing, you can use any comparison operator that evaluates to True or False. To find out more about the different Python operators, consult our guide article on the subject.
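As a brief sketch of other comparisons on the example DataFrame (the thresholds, city name and variable names are arbitrary illustrations): individual conditions can use operators such as ==, != or >=, and several conditions can be combined with & (and), | (or) and ~ (not), with each condition wrapped in parentheses.

```python
import pandas as pd

df = pd.DataFrame({
    'Nom': ['Alice', 'Bob', 'Charlie'],
    'Âge': [25, 30, 35],
    'Ville': ['New York', 'Los Angeles', 'Chicago'],
})

# Equality comparison
in_chicago = df[df['Ville'] == 'Chicago']
print(in_chicago['Nom'].tolist())    # ['Charlie']

# Combine two conditions with &; the parentheses are required
both = df[(df['Âge'] >= 30) & (df['Ville'] != 'Chicago')]
print(both['Nom'].tolist())          # ['Bob']

# ~ negates a condition
not_over_30 = df[~(df['Âge'] > 30)]
print(not_over_30['Nom'].tolist())   # ['Alice', 'Bob']
```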
