As in almost every other programming language, developers working in R can store items of the same data type in an array and take advantage of the features this data structure offers.
Why do we use arrays in R?¶
Arrays are a standard data structure in R, like character strings, and mainly help with organizing and accessing data. Thanks to indexing, programmers can efficiently access the data stored in an array.
Even operations that concern the entire data set can be executed efficiently and easily with arrays. R arrays with several dimensions also make it possible to represent multidimensional data, for example in matrices or tensors.
Creating arrays in R: procedure¶
Arrays in R can have any number of dimensions. Whether you want to represent a simple vector or a complex multidimensional structure, an array can hold it. Two-dimensional arrays, which take the form of a table or a matrix, are used most often.
For example, a simple two-dimensional array containing the numbers 1 to 6 can be created as follows using the “array()” function:
exempledarray <- array(1:6, dim = c(2, 3))
R
In this example, you pass two arguments to the “array()” function. The first is the range of values “1:6” that the array contains. The second, “dim”, specifies the dimensions of the array: here, a 2×3 array with 2 rows and 3 columns is created.
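To make the layout of the values more concrete, here is a minimal sketch. The names “example_2d” and “example_3d” are invented for this illustration and do not appear in the original code. It shows that R fills an array column by column, and that a third entry in “dim” adds a further dimension.

```r
# Hypothetical names, chosen for this sketch only
example_2d <- array(1:6, dim = c(2, 3))

# R fills arrays column by column: the first column holds 1 and 2,
# the second 3 and 4, the third 5 and 6
print(example_2d)

# A third entry in dim adds "layers": 2 rows, 3 columns, 2 layers
example_3d <- array(1:12, dim = c(2, 3, 2))
print(dim(example_3d))
```

Because of the column-wise filling, “example_2d[1, 2]” is 3 and not 2.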
Note that an array in R can only hold elements of a single data type. If you want to store different types of data in the same data structure, consider the R data type “list” instead.
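The following sketch illustrates what this restriction means in practice. The variable names “mixed” and “mixed_list” are assumptions made for this example. When values of different types are combined into one array, R converts them to a common type, whereas a list keeps the type of each element.

```r
# Mixing numbers, text and logical values: R converts everything to character
mixed <- array(c(1, "a", TRUE, 2.5), dim = c(2, 2))
print(class(mixed[1, 1]))   # "character"

# A list keeps the original type of each element
mixed_list <- list(1, "a", TRUE, 2.5)
print(sapply(mixed_list, class))
```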
With “array()”, you can not only create new arrays, but also convert vectors or matrices that already exist in your code into arrays. To do this, call the function with the object you want to convert and specify the desired dimensions:
# Convert a vector into an array
vector <- 1:9
vector_als_array <- array(vector, dim = c(3, 3))
# Convert a matrix into an array
matrix <- matrix(1:9, nrow = 3, ncol = 3)
matrix_als_array <- array(matrix, dim = dim(matrix))
R
Indexing: learn how to access array elements¶
You can access the elements of your array using indexing. Indices of the desired element, as in many other programming languages, are given in square brackets. For multidimensional arrays, you can also display an entire row or column in addition to specific elements.
exempledarray <- array(1:6, dim = c(2, 3))
# Access the element in the first row, second column
element <- exempledarray[1, 2]
# Access the first row
ligne <- exempledarray[1, ]
# Access the first column
colonne <- exempledarray[, 1]
R
If you already use other programming languages but are new to R, its indexing may surprise you. Unlike in many other languages, counting starts at 1 rather than 0, just as it does in natural language.
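The same bracket notation extends to arrays with more than two dimensions. The sketch below uses hypothetical names (“cube”, “grid”) and example data that are not part of the original code. It also shows the “drop = FALSE” argument, which keeps the dimensions of the result instead of collapsing it to a plain vector.

```r
cube <- array(1:24, dim = c(2, 3, 4))   # hypothetical example data

# Row 2, column 3, layer 4: the last element, because indices start at 1
last_value <- cube[2, 3, 4]
print(last_value)   # 24

# An empty index selects everything along that dimension
first_layer <- cube[, , 1]   # a 2x3 matrix

# By default a single row is returned as a vector;
# drop = FALSE keeps it as a 1x3 array
grid <- array(1:6, dim = c(2, 3))
row_as_vector <- grid[1, ]
row_as_array <- grid[1, , drop = FALSE]
```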
Calculate with arrays¶
With arrays, you can apply various mathematical operations to an entire data set at once. For example, you can calculate the sum of two arrays, which you can think of as adding two matrices. To do this, make sure the arrays have the same dimensions and lengths, which you can check with the R functions “dim()” and “length()”.
array1 <- array(1:4, dim = c(2,2))
array2 <- array(5:8, dim = c(2,2))
résultat <- array1 + array2
R
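As a hedged sketch of how such a check might look, the example below compares the dimensions before calculating. The names “a”, “b”, “total”, “product” and “matrix_product” are chosen for this illustration. It also contrasts the element-wise operator “*” with the matrix multiplication operator “%*%”.

```r
a <- array(1:4, dim = c(2, 2))
b <- array(5:8, dim = c(2, 2))

# dim() returns the extent of each dimension, length() the number of elements
if (identical(dim(a), dim(b))) {
  total <- a + b              # element-wise sum
  product <- a * b            # element-wise product
  matrix_product <- a %*% b   # matrix multiplication in the algebraic sense
}
print(total)
```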
Apart from the basic arithmetic operations, available as R operators, various functions are also defined on arrays in R to help you perform calculations. For example, you can calculate the average of all the elements of an array with the R function “mean()”.
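A minimal sketch of such a call could look like this. The array “values” and the result names are assumptions made for the example.

```r
values <- array(1:6, dim = c(2, 3))   # hypothetical example data

# mean() and sum() treat the array as one set of six values
overall_mean <- mean(values)
overall_sum <- sum(values)

print(overall_mean)   # 3.5
print(overall_sum)    # 21
```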
Another advantage is that the function “apply(array, MARGIN, FUN)” lets you apply a function along a dimension of your choice. It takes the following parameters:
- array: the array being examined (named “X” in R's own documentation).
- MARGIN: the dimension along which the function is applied, where 1 stands for rows and 2 for columns.
- FUN: the function to apply; it receives each row or column as a vector and typically returns a single value, as “mean()” does.
Here is an example of using “apply()”:
# Create an array
testarray <- array(1:6, dim = c(2, 3))
# Apply mean() to each column (MARGIN = 2)
moyenne_colonnes <- apply(testarray, MARGIN = 2, FUN = mean)
# Print the result
print(moyenne_colonnes)
R
Running the code above prints three values, one per column, each giving that column's average: 1.5, 3.5 and 5.5.
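For comparison, the following sketch applies the same idea to the rows by setting “MARGIN = 1”. The result names “row_means” and “col_means” are chosen for this illustration. It also uses the base R function “colMeans()”, a shortcut that should give the same column averages as the “apply()” call.

```r
testarray <- array(1:6, dim = c(2, 3))

# MARGIN = 1 applies mean() to each row: (1, 3, 5) and (2, 4, 6)
row_means <- apply(testarray, MARGIN = 1, FUN = mean)
print(row_means)   # 3 4

# colMeans() is a base R shortcut for the column averages
col_means <- colMeans(testarray)
print(col_means)   # 1.5 3.5 5.5
```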