This will be a complete deep understanding guide so you can use LCS intuition for many DP problems.
Table of Contents
Subsequence vs Substring
- Subsequence: Characters in order, but not necessarily contiguous.
- Substring: Characters must be continuous.
#Subsequence
String = "ABCDE"
Valid subsequences:
A
ACE
BDE
ABCDE
#Substring
ABC ✔
BCD ✔
ACE ❌PythonProblem Definition
Given two strings. Find the length of the longest subsequence common to both strings.
Brute Force Thinking (Important for intuition)
At each character we have two choices
For strings
A B C D
A C B D
We can:
take a char
skip a char
Brute force explores all subsequences.
Total subsequences of length n = 2^n
So brute force complexity becomes: O(2^(n+m)) #Too slow.
n length of string 1
m length of string 2PythonKey Insight
Compare characters: text1[i] and text2[j]
Case 1: Characters match
- text1[i] == text2[j]
- Then the character must be part of LCS.
- So: LCS(i, j) = 1 + LCS(i+1, j+1)
Case 2: Characters don’t match
- We skip one character.
- Two possibilities:
- skip text1[i]
- skip text2[j]
- So: LCS(i, j) = max(LCS(i+1, j), LCS(i, j+1))
LCS(i,j)
/ \
(match) / \ (mismatch)
/ \
1 + LCS(i+1,j+1) max(LCS(i+1,j),LCS(i,j+1))PythonRecursive Code
#Variation 1
def lcs(i, j):
if i == len(text1) or j == len(text2):
return 0
if text1[i] == text2[j]:
return 1 + lcs(i+1, j+1)
return max(
lcs(i+1, j),
lcs(i, j+1)
)
#Variation 2
def lcs(i, j):
if i < 0 or j < 0:
return 0
if text1[i] == text2[j]:
return 1 + lcs(i-1, j-1)
return max(
lcs(i-1, j),
lcs(i, j-1)
)
PythonMemoization (Top-Down DP)
The Repeating Subproblem Appears
LCS(0,0)
/ \
LCS(1,0) LCS(0,1)
| |
LCS(2,1) LCS(1,2)
\ /
\ /
LCS(2,2)
LCS(2,2) is computed multiple times. PythonCode
def lcs(i, j):
if i == len(text1) or j == len(text2):
return 0
if (i, j) in memo:
return memo[(i, j)]
if text1[i] == text2[j]:
ans = 1 + lcs(i + 1, j + 1)
else:
ans = max(
lcs(i + 1, j),
lcs(i, j + 1)
)
memo[(i, j)] = ans
return ans
text1 = "abcde"
text2 = "ace"
memo = {}
print(lcs(0, 0))Python- Time : O(nm)
- Space : O(nm)
Bottom-Up Code
Example
text1 = ABCD
text2 = ACBD
DP table
Initialize
A C B D
0 0 0 0 0
A 0
B 0
C 0
D 0
Final table
A C B D
0 0 0 0 0
A 0 1 1 1 1
B 0 1 1 2 2
C 0 1 2 2 2
D 0 1 2 2 3
PythonCode
def longestCommonSubsequence(text1, text2):
n = len(text1)
m = len(text2)
dp = [[0]*(m+1) for _ in range(n+1)]
for i in range(1, n+1):
for j in range(1, m+1):
if text1[i-1] == text2[j-1]:
dp[i][j] = 1 + dp[i-1][j-1]
else:
dp[i][j] = max(
dp[i-1][j],
dp[i][j-1]
)
return dp[n][m]PythonReconstruct the LCS String
It’s so simple , just save the characters
if text1[i-1] == text2[j-1]:
dp[i][j] = 1 + dp[i-1][j-1]
#save charaters herePythonNoop
we cannot use the same for loops for reconstruction is because reconstruction does not follow a fixed order like the DP filling phase.
Reconstruction is Different
- Reconstruction does not visit every cell.
- Instead, it follows one path from: (n, m) → (0, 0)
# Reconstruct LCS
i = n
j = m
lcs = []
while i > 0 and j > 0:
if text1[i-1] == text2[j-1]:
lcs.append(text1[i-1])
i -= 1
j -= 1
elif dp[i-1][j] > dp[i][j-1]:
i -= 1
else:
j -= 1
return ''.join(reversed(lcs))Python- We move in reverse direction, then reversed the end string
- dp[i-1][j] > dp[i][j-1] Meaning the best LCS came from: dp[i-1][j] So we move up. i -= 1
- dp[i][j-1] >= dp[i-1][j] Meaning the best LCS came from: dp[i][j-1] So we move left. j -= 1
Longest Common Substring
s1 = "abcdgh"
s2 = "acdghr"
a c d g h r
0 0 0 0 0 0 0
a 0 1 0 0 0 0 0
b 0 0 0 0 0 0 0
c 0 0 1 0 0 0 0
d 0 0 0 2 0 0 0
g 0 0 0 0 3 0 0
h 0 0 0 0 0 4 0Pythondef longestCommonSubstring(s1, s2):
n = len(s1)
m = len(s2)
dp = [[0]*(m+1) for _ in range(n+1)]
longest = 0
for i in range(1, n+1):
for j in range(1, m+1):
if s1[i-1] == s2[j-1]:
dp[i][j] = 1 + dp[i-1][j-1]
longest = max(longest, dp[i][j])
else:
dp[i][j] = 0
return longestPython- Time : O(n * m)
- Space : O(n * m)
Space Optimized Version
We only need previous row.
def longestCommonSubstring(s1, s2):
n = len(s1)
m = len(s2)
prev = [0]*(m+1)
longest = 0
for i in range(1, n+1):
curr = [0]*(m+1)
for j in range(1, m+1):
if s1[i-1] == s2[j-1]:
curr[j] = 1 + prev[j-1]
longest = max(longest, curr[j])
prev = curr
return longestPython- Space O(m)
Print the Substring
Track the ending index.
def longestCommonSubstring(s1, s2):
n = len(s1)
m = len(s2)
dp = [[0]*(m+1) for _ in range(n+1)]
longest = 0
end_index = 0
for i in range(1, n+1):
for j in range(1, m+1):
if s1[i-1] == s2[j-1]:
dp[i][j] = 1 + dp[i-1][j-1]
if dp[i][j] > longest:
longest = dp[i][j]
end_index = i
return s1[end_index-longest:end_index]Python