The longest common sequence computer problem

ven, 13 octobre 2023

PARTAGEZ

The search for the longest common sequence is among the classic problems in computer science. Solutions to LCS problems are often covered in programming talks and algorithm courses.

Longest common sequence problem: what is it?

This problem consists of find the longest common subsequence (“longest common subsequence”) with two sequences. A subsequence comes from an original sequence. The elements of a subsequence are still in the same order, but some of them may have been removed. Discover with us some examples that illustrate this principle:

Sequence `X`	Sequence `Y`	`LCS(X, Y)`
`PÈRE`	`MÈRE`	`ÈRE`
`PAPA`	`PAPI`	`PAP`
`DAVID`	`DANIEL`	`DAI`

It is appropriate to differentiate the longest common subsequence from the longest common substring (“longest common substring”). Unlike subsequence, substring cannot contain any spaces. In the example “|DAVID|DANIEL|DAI| ”, the longest common substring would therefore be “ DA”, because the “I” is found after a “V” or an “N”.

Practical example of the LCS problem

The problem of the longest common sequence is of importance in all areas where working with derived sequences is relevant. Solutions to the LCS problem are particularly used to examine similarities and differences between two texts, which helps detect plagiarism. The popular utility diffwhich highlights changes made to source text files, is also based on the LCS problem.

In the bioinformatics field, the problem of the longest common sequence is also encountered in theDNA sequence analysis. Over time, different mutations can change DNA bases at isolated locations. The presence of a long subsequence common to two sequences then indicates a strong genetic link. It is therefore possible to understand genetic changes between species in the context of evolution.

Linguists use the problem of the longest common sequence to study theevolution of languages over the centuries. If, in two different languages, words share the same meaning and a long common subsequence, this means that these languages are of the same origin. It is therefore possible to deduce a historical evolution from the correspondence of different letters.

How do solutions to the longest common sequence problem work?

The LCS problem can first be solved by a so-called “naive” approach. This strategy is the most obvious; it does not require any optimization or special method. It consists of determining all the subsequences of the two sequences, and finding the longest subsequence common to them. This approach is unfortunately not very effective, it is only suitable for the shortest sequences.

Discover three effective solutions to the LCS problem:

Recursive approach
Optimization by memoization
Dynamic programming

These approaches have one thing in common : they distinguish three different cases in relation to the two sequences they examine:

The last letter is identical.
The last letter is different.
The length of one of the sequences is zero.

However, they have differences in terms of temporal complexity (asymptotic execution) and spatial complexity (memory requirement):

Approach	Execution	Memory requirement
Naive approach	`O(n * n²)`	`O(n)`
Recursive approach	`O(n²)`	`O(1)`
Optimization by memoization	`O(n * m)`	`O(n * m)`
Dynamic programming	`O(n * m)`	`O(n * m)`

The algorithms presented below determine, for each case, the length of the longest common subsequence. It sometimes happens that several concrete subsequences of the same length exist, these can be determined by other steps.

Determine the longest common sequence recursively

The LCS problem presents an “optimal substructure”; this means that the problem can be divided into several sub-problems. To resolve them, it is then appropriate to adopt a recursive approach. Discover with us the implementation of such an algorithm in three popular programming languages.

Python

def lcs(X, Y, m, n):
  # Base case
  if m == 0 or n == 0:
    return 0
  # Last elements are equal
  elif X[m-1] == Y[n-1]:
    return 1 + lcs(X, Y, m-1, n-1)
  # Last elements differ
  else:
    return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n))


X, Y = "DANIEL", "DAVID"
lcs_len = lcs(X, Y, len(X), len(Y))
print(f"Length of LCS is: {lcs_len}")

python

Java

import java.io.*;

class LCS {
  public static int lcs(String A, String B, int m, int n)
  {
    // Base case
    if (m == 0 || n == 0)
      return 0;
    // Last elements are equal
    if (A.charAt(m - 1) == B.charAt(n - 1))
      return 1 + lcs(A, B, m - 1, n - 1);
    // Last elements differ
    else
      return Math.max(lcs(A, B, m, n - 1),
       lcs(A, B, m - 1, n));
  }
  
  // Let’s test
  public static void main(String[] args)
    {
      String X = "DAVID";
      String Y = "DANIEL";
      int lcsLength = LCS.lcs(X, Y, X.length(), Y.length());
      System.out.println("Length of LCS is: " + lcsLength);
    }
}

Java

Cheap Internet domain

Much more than just a domain!

Personalize your online presence with a relevant domain name.

E-mail

SSL certificate

24/7 support

C++

#include <iostream> using namespace std; int lcs(string X, string Y, int m, int n) { // Base case if (m == 0 || n == 0) { return 0; } // Last elements are equal if (X[m - 1] == Y[n - 1]) { return 1 + lcs(X, Y, m - 1, n - 1); } // Last elements differ else { return max(lcs(X, Y, m, n - 1), lcs(X, Y, m - 1, n)); } } // Let’s test int main() { // Initialize variables string X = "DAVID"; string Y = "DANIEL"; // Compute and output length of LCS cout << "Length of LCS is " << lcs(X, Y, X.size(), Y.size()); return 0; }

c++

Optimize the recursive approach by memoization

Under the recursive approach, overlapping subsequences are calculated. This property, commonly called “overlapping subproblems”, is present in the Fibonacci sequence. In both cases, sequences are calculated recursively and repeatedly until the solution is obtained. To optimize the efficiency of the process, it may be relevant to use memorization. In other words, it involves caching the sub-problems already calculated in a two-dimensional matrix.

Python

def lcs(X, Y, m, n, table): # Base case if (m == 0 or n == 0): return 0 # Already computed value at given position if (table[m][n] != -1): return table[m][n] # Last elements are equal if X[m - 1] == Y[n - 1]: table[m][n] = 1 + lcs(X, Y, m - 1, n - 1, table) # Last elements differ else: table[m][n] = max(lcs(X, Y, m, n - 1, table), lcs(X, Y, m - 1, n, table)) return table[m][n] # Let’s test X = "DAVID" Y = "DANIEL" m, n = len(X), len(Y) # Initialize table fields to `-1` table = [[-1 for i in range(n + 1)] for j in range(m + 1)] // Compute and output length of LCS print(f"Length of LCS is: {lcs(X, Y, m, n, table)}")

python

Java

import java.io.*; class LCS { public static int lcs(String X, String Y, int m, int n, int[][] table) { // Base case if (m == 0 || n == 0) { return 0; } // Already computed value at given position if (table[m][n] != -1) { return table[m][n]; } // Last elements are equal if(X.charAt(m - 1) == Y.charAt(n - 1)) { table[m][n] = 1 + lcs(X, Y, m - 1, n - 1, table); return table[m][n]; } // Last elements differ else { table[m][n] = Math.max(lcs(X, Y, m, n - 1, table),lcs(X, Y, m - 1, n, table)); return table[m][n]; } } // Let’s test public static void main(String args[]){ // Initialize variables String X = "DAVID"; String Y = "DANIEL"; int m = X.length(); int n = Y.length(); int[][] table = new int[m + 1][n + 1]; // Initialize table fields to `-1` for(int i=0; i < m + 1; i++) { for(int j=0; j < n + 1; j++) { table[i][j] = -1; } } // Compute and output length of LCS int lcsLength = lcs(X, Y, m, n, table); System.out.println("Length of LCS is: " + lcsLength); } }

Java

C++

#include <bits/stdc++.h> using namespace std; int lcs(char* X, char* Y, int m, int n, vector<vector<int>>& table) { // Base case if (m == 0 || n == 0) return 0; // Already computed value at given position if (table[m][n] != -1) { return table[m][n]; } // Last elements are equal if (X[m - 1] == Y[n - 1]) { table[m][n] = 1 + lcs(X, Y, m - 1, n - 1, table); return table[m][n]; } // Last elements differ else { table[m][n] = max(lcs(X, Y, m, n - 1, table), lcs(X, Y, m - 1, n, table)); return table; } } // Let’s test int main() { // Initialize variables char X[] = "DAVID"; char Y[] = "DANIEL"; int m = strlen(X); int n = strlen(Y); // Initialize table with `-1` vector<vector<int>> table(m + 1, vector<int>(n + 1, -1)); // Compute and output length of LCS cout << "Length of LCS is: " << lcs(X, Y, m, n, table); return 0; }

c++

Program the longest common sequence dynamically

Dynamic programming is a non-recursive technique that allows you to solve an optimization problem by dividing it into multiple subproblems, then solved using a bottom-up approach. Dynamic programming is notably used as a solution to problems related to path-finding algorithms. The longest common sequence problem can also be solved using dynamic programming, via a two-dimensional matrix.

Python

def lcs(X , Y, m, n): # Initialize dynamic programming table fields to `None` table = [[None] * (n + 1) for _ in range(m + 1)] # Compute table values for i in range(m + 1): for j in range(n + 1): # Base case if i == 0 or j == 0 : table[i][j] = 0 # Last elements are equal elif X[i - 1] == Y[j - 1]: table[i][j] = table[i - 1][j - 1]+ 1 # Last elements differ else: table[i][j] = max(table[i - 1][j] , table[i][j - 1]) return table[m][n] # Let’s test X = "DAVID" Y = "DANIEL" # Compute and output length of LCS lcs_len = lcs(X, Y, len(X), len(Y)) print(f"Length of LCS is: {lcs_len}")

python

Java

import java.io.*; class LCS { public static int lcs(String X, String Y, int m, int n) { // Initialize dynamic programming table fields int table[][] = new int[m + 1][n + 1]; for (int i = 0; i <= m; i++) { for (int j = 0; j <= n; j++) { // Base case if (i == 0 || j == 0) table[i][j] = 0; // Last elements are equal else if (X.charAt(i - 1) == Y.charAt(j - 1)) table[i][j] = table[i - 1][j - 1] + 1; // Last elements differ else table[i][j] = Math.max(table[i - 1][j], table[i][j - 1]); } } return table[m][n]; } // Let’s test public static void main(String args[]){ // Initialize variables String X = "DAVID"; String Y = "DANIEL"; int m = X.length(); int n = Y.length(); // Compute and output length of LCS int lcsLength = lcs(X, Y, m, n); System.out.println("Length of LCS is: " + lcsLength); } }

Java

C++

#include <bits/stdc++.h> using namespace std; int lcs(string X, string Y, int m, int n) { // Initialize dynamic programming table int table[m + 1][n + 1]; // Compute table values for (int i = 0; i <= m; i++) { for (int j = 0; j <= n; j++) { // Base case if (i == 0 || j == 0) table[i][j] = 0; // Last elements are equal else if (X[i - 1] == Y[j - 1]) table[i][j] = table[i - 1][j - 1] + 1; // Last elements differ else table[i][j] = max(table[i - 1][j], table[i][j - 1]); } } return table[m][n]; } // Let’s test int main() { // Initialize variables string X = "DAVID"; string Y = "DANIEL"; int m = X.size(); int n = Y.size(); // Compute and output length of LCS cout << "Length of LCS is " << lcs(X, Y, m, n); return 0; }

c++

Télécharger notre livre blanc

Comment construire une stratégie de marketing digital ?

Le guide indispensable pour promouvoir votre marque en ligne

Email

PrécédentArticle précédentInstall MySQL on Ubuntu 22.04: step by step instructions

Article suivantUbuntu install MySQL: installation on Ubuntu 20.04Suivant

En savoir plus

Web Marketing

What does “RTFM” mean? Origin and use of the abbreviation

Abbreviations like “LOL” are extremely popular on the internet. In this article, discover with us the meaning of the abbreviation “RTFM”, its origin and its

26 juillet 2024

Web Marketing

BigDecimal in Java: everything you need to know about the class

With the Java BigDecimal class, it is possible to process complex floating point numbers precisely. These must first be created, and can then be used

26 juillet 2024

Souhaitez vous Booster votre Business?

écrivez-nous et restez en contact

Contactez-nous

AMZ DIGICOM

AMZ DIGICOM

The longest common sequence computer problem

PARTAGEZ

Longest common sequence problem: what is it?

Practical example of the LCS problem

How do solutions to the longest common sequence problem work?

Determine the longest common sequence recursively

Python

Java

C++

Optimize the recursive approach by memoization

Python

Java

C++

Program the longest common sequence dynamically

Python

Java

C++

Télécharger notre livre blanc

Comment construire une stratégie de marketing digital ?

Le guide indispensable pour promouvoir votre marque en ligne

En savoir plus

What does “RTFM” mean? Origin and use of the abbreviation

BigDecimal in Java: everything you need to know about the class

Souhaitez vous Booster votre Business?

écrivez-nous et restez en contact

Suivez-nous:

© 2024 AMZ DIGICOM All Rights Reserved

AMZ DIGICOM

AMZ DIGICOM

The longest common sequence computer problem

PARTAGEZ

Longest common sequence problem: what is it?¶

Practical example of the LCS problem¶

How do solutions to the longest common sequence problem work?

Determine the longest common sequence recursively¶

Python¶

Java¶

C++¶

Optimize the recursive approach by memoization¶

Python¶

Java¶

C++¶

Program the longest common sequence dynamically¶

Python¶

Java¶

C++¶

Télécharger notre livre blanc

Comment construire une stratégie de marketing digital ?

Le guide indispensable pour promouvoir votre marque en ligne

En savoir plus

What does “RTFM” mean? Origin and use of the abbreviation

BigDecimal in Java: everything you need to know about the class

Souhaitez vous Booster votre Business?

écrivez-nous et restez en contact

Suivez-nous:

© 2024 AMZ DIGICOM All Rights Reserved

Longest common sequence problem: what is it?

Practical example of the LCS problem

Determine the longest common sequence recursively

Python

Java

C++

Optimize the recursive approach by memoization

Python

Java

C++

Program the longest common sequence dynamically

Python

Java

C++