The Longest Common Subsequence problem is a classic example of dynamic programming. It asks us to find the longest sequence of characters that appear left-to-right (not necessarily contiguously) in both input strings. For example, the LCS of "abcde"
and "ace"
is "ace"
with a length of 3. This problem is fundamental in text comparison, DNA sequence alignment, and file differencing tools.
To solve the LCS problem efficiently, we use a 2D dynamic programming table. Suppose the input strings are text1
and text2
of lengths m
and n
respectively. We create a table dp
where dp[i][j]
stores the length of the LCS between text1[0..i-1]
and text2[0..j-1]
.
We initialize a matrix of size (m+1) x (n+1)
with zeros. Then we iterate through the characters of both strings. If the characters at current positions match, we extend the LCS by 1; otherwise, we take the maximum LCS length from the previous row or column.
m
and n
are the lengths of the two input strings. We fill a matrix of size m×n.
If memory is a concern, you can reduce the space usage by only storing the previous row of the DP matrix. Since each cell dp[i][j]
only depends on dp[i-1][j]
, dp[i][j-1]
, and dp[i-1][j-1]
, a single row or two rows suffice.
The Longest Common Subsequence is a cornerstone problem in computer science education. Mastering its dynamic programming solution provides a strong foundation for tackling more complex string processing challenges. Whether you're preparing for coding interviews or implementing text comparison tools, the LCS algorithm is essential knowledge.