Dynamic time warping is used to check similarities between different lengths of sequences of values as part of a pattern comparison. For this purpose, the dynamic temporal warping algorithm generates warping paths which allow, thanks to backtracking and differential measurements, to highlight similarities by disregarding temporal warping or speed differences. This concept is notably used in the context of voice recognition, digital signature recognition or financial market analysis.
What does “dynamic time warping” mean?¶
The term “dynamic time warping” is translated, in French, by the concept of “dynamic temporal deformation”. If this technology may seem straight out of Star Trek, it is not. In truth, this algorithm is simply used to compare different sequences of times or values. To do this, he puts them in correspondence before confronting them with each other. Thanks to the “warping” of these sequences, a term which moreover gave its name to the algorithm, it is possible to highlight common points and correspondences between the models of the different sequences, even if their length or their speed are not the same.
For example, if two different sequences involving walking have to be analyzed to look for matches, the algorithm is able to recognize identical models, even when there are differences in walking speed or distance travelled. With regard to the comparison of signatures of the same person, it is also possible to highlight common points, even if the size of the writing is not the same.
What is the purpose of dynamic time warping?¶
It must be recognized that at first sight, the DTW has everything of an abstract concept without any real practical utility. And yet: dynamic time warping fulfills an important analytical function in many fields of application, in particular the study of video and audio sequences and the sequential analysis of graphs or statistical models, that is, sequences that can be reproduced and compared in a linear fashion. By identifying similarities, the DTW enables the triggering of specific actions, signals or functions.
Furthermore, measuring and recognizing patterns in sequences of values make it possible to study the evolution of similar systems, even over different periods. The dynamic time warping algorithm can therefore also be used with advanced technologies, such as machine learning, supervised learning or artificial neural networks to train the analysis and reaction capacities of self-learning systems. and evaluate datasets more efficiently.
How does dynamic time warping work?
In order to identify patterns and similarities in different sequences of values, the DTW searches for optimal matches, also known as ” optimal matches ” in English. The principles “One-to-many” or “Many-to-one” can be useful here. Various rules and conditions then apply:
- Within a sequence, each value must be compared to one or more values in the second sequence (and vice versa).
- The first value of a sequence must be compared to the first value of the second sequence.
- The last value of one sequence must be compared to the last value of the second sequence.
- The mapping of the series of values from the first sequence to the series of values from the second sequence must increase uniformly. The positions of the start and end values of the sequences must match, with no omissions or duplicates.
For an optimal match, all specifications and conditions must be met. To do this, it is possible to use a cost function. This makes it possible to create a measurement of the difference between the values of the sequences. To achieve an optimal match, the cost function should be as low as possible.
Furthermore, the algorithm creates a strain path which is able to identify optimal matches in sequences of different lengths. The trace return makes it possible to create this deformation path, in which the algorithm matches one or more values of a sequence with points of the second sequence. Despite this temporal deformation, or rather thanks to it, it is then possible to highlight concordances, even when the sequences do not have the same length.
What are the fields of application of dynamic time warping?¶
It is possible to use dynamic time warping in all areas where datasets can be represented and compared as linear sequences. The DTW is therefore often used as a pre- or post-study as part of data analysis. This can be audio, video and alphanumeric data, or datasets based on Big Data.
The following examples also belong to the fields of application of the DTW:
- Pattern recognition in audio sequences: voice recognition is a very important application case for the DTW. In this particular case, recorded and saved linguistic models are compared with each other thanks to the DTW. For audio sequences that do not have the same length or the same speed, the DTW uses so-called “adaptive” time warping to create a warping path. This process makes it possible to identify matches in vowels and consonants, even if they are not pronounced at the same speed or if their length is different.
- Pattern recognition in video sequences: For gesture recognition and motion detection, the DTW can map video sequences, including motion sequences, onto each other. This makes it possible to compare these sequences of movements and to recognize certain patterns in them, even when their length or speed is different.
- Pattern recognition in financial data: financial markets and corporate finance are also important areas of application. By using the DTW, it is possible to make forecasts regarding financial cycles, turnover curves or stock market trends. The DTW makes it possible, over different periods, to identify and visualize certain similar or identical cycles and trends in market and company data. This form of study applies to forecasts as well as commercial and financial data collected.
Which tools apply dynamic time warping?¶
The DTW algorithm is present in the libraries of several open source software. Here are some examples:
- DTW Suitewhich offers programming packages for Python and R;
- FastDTWwhich provides a Java implementation for DTW’s algorithms;
- MatchBoxwhich proposes to implement the DTW to audio signals;
- mlpy and pydtw, which also provide functions of DTW as Python libraries for machine learning;
- Gesture Recognitionwhich offers DTW algorithms for real-time gesture recognition.
To use the DTW in the most efficient way, it is also possible to use the rapid calculation techniques following:
- PlumDTW
- SparseDTW
- FastDTW
- MultiscaleDTW
Example: the dynamic time warping algorithm in Python¶
To illustrate the complexity of dynamic time warping, below is an example of a DTW algorithm in Python code:
def dtw(s, t, window):
n, m = len(s), len
w = np.max([window, abs(n-m)])
dtw_matrix = np.zeros((n+1, m+1))
for i in range(n+1):
for j in range(m+1):
dtw_matrix[i, j] = np.inf
dtw_matrix[0, 0] = 0
for i in range(1, n+1):
for j in range(np.max([1, i-w]), np.min([m, i+w])+1):
dtw_matrix[i, j] = 0
for i in range(1, n+1):
for j in range(np.max([1, i-w]), np.min([m, i+w])+1):
cost = abs(s[i-1] - t[j-1])
# take last min from a square box
last_min = np.min([dtw_matrix[i-1, j], dtw_matrix[i, j-1], dtw_matrix[i-1, j-1]])
dtw_matrix[i, j] = cost + last_min
return dtw_matrix
Python
Three settings are first filled in the function shown in the sample code: first, the two signals that need to be analyzed are filled in the parameters “ s ” And ” you “. Typically, signals correspond to arrays or vectors. The parameter ” windows is used to specify the number of other elements with which a match is possible.
In the function, a matrix is first created and then infinitely initialized with the corresponding value. The central stage of the DTW takes place in the last two nested for loops : the value saved in the variable ” last_min is added to the previous costs, saved in the variable “ costs ”, which result from the distance between the two input values at the corresponding index.
This value results from already calculated values, given the dynamic programming. The lowest of the neighboring values already calculated beforehand is selected, then added to the cost also calculated previously. During this step, it is a question of deciding between the direct matching of two elements, the addition of an element or the deletion of an element.
Once the function is executed, it produces a distance matrix from which the deformation path can be read.
Alternatives to dynamic time warping¶
In addition to DTW, other solutions for recognizing patterns in value and time sequences can be used, including:
- the concept of ” correlation optimized warping (COW, deformation optimized by correlation);
- functional data analysis;
- the hidden Markov model;
- the Viterbi algorithm.