Longest Common Subsequence
Given two sequences and wish to find a maximum-length subsequence of X and Y.
Definition
Subsequence: Given a sequence X = <x1, x2, ......, xm>, another sequence Z = <z1, z1, ......zk> is a subsequence of X if there exists a strictly increasing sequence <i1, i2, ......ik> of indices of X such that for all j = 1, 2, ...... k, we have xij = zj.
Common Subsequence: Given two sequences X and Y, we say that s sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y.
Theorem: Optimal substructure of an LCS
Let X = <x1, x2,..., xm> and Y = <y1, y2,...,yn> be sequences, and
let Z = <z1, z2,...,zk> be any LCS of X and Y.
1. If xm = yn, then z_k = x_m = y_n and Z_(k-1) is an LCS of X_(m-1)
and Y_(n-1).
2. If xm != yn, then z_k != x_m implies that Z is an LCS of X_(m-1)
and Y.
3. If xm != yn, then z_k != y_n implies that Z is an LCS of X and
Y_(n-1).
Longest Common Subsequence
Analysis:
Subproblem: f[i][j]: the length of LCS ending at position i of string A and position j of string B
Initialize: f[i][0]=0 and f[0][j]=0. Length of LCS for a string and an empty string is 0
Function: if A.charAt(i-1)==B.charAt(j-1), f[i][j]=f[i-1][j-1]+1; if A.charAt(i-1) != B.charAt(j-1), f[i][j] = MAX(f[i-1][j], f[i][j-1])
Answer: f[A.length()][B.length()]
public int longestCommonSubsequence(String A, String B) {
if (A == null || B == null)
return 0;
int M = A.length();
int N = B.length();
int[][] LCS = new int[M+1][N+1];
// Initialization
// Could skip cause Java automatically initialize arrays
for (int i=0; i<=M; i++) {
LCS[i][0] = 0;
}
for (int j=1; j<=N; j++) {
LCS[0][j] = 0;
}
// Function
for (int i=1; i<=M; i++) {
for (int j=1; j<=N; j++) {
LCS[i][j] = Math.max(LCS[i-1][j], LCS[i][j-1]);
if (A.charAt(i-1) == B.charAt(j-1))
LCS[i][j] = LCS[i-1][j-1]+1;
}
}
// Answer
return LCS[M][N];
}
Longest Common Substring
Analysis: Substring should occur continuously in original string. This is different with subsequence.
Subproblem: f[i][j]: common substring length with first i elements of s1 and first j elements of s2, not need to be the longest!
Initialize: f[i][0]=f[0][1]=0, cause no common substring with empty string
Function: if s1.charAt(i-1)==s2.charAt(j-1), f[i][j]=f[i-1][j-1]+1; else, f[i][j]=0
Answer: Need to search through the whole 2-dimentional array to get the max_value
public int longestCommonSubstring(String A, String B) {
if (A == null || B == null || A.length() == 0 || B.length() == 0)
return 0;
int M = A.length();
int N = B.length();
int[][] dp = new int[M+1][N+1];
// Initialize
// No need to initialize, arrays initialize to 0 in Java
// Function
for (int i=1; i<=M; i++) {
for (int j=1; j<=N; j++) {
if (A.charAt(i-1) == B.charAt(j-1))
dp[i][j] = dp[i-1][j-1]+1;
else
dp[i][j] = 0;
}
}
// Answer
// Max value of the 2-dimentional array
int max = 0;
for (int i=0; i<=M; i++) {
for (int j=0; j<=N; j++) {
max = Math.max(max, dp[i][j]);
}
}
return max;
}