10. Regular Expression Matching - LeetCode Solution

Code Implementation

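Python:

class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        m, n = len(s), len(p)
        # dp[i][j] is True when s[:i] matches p[:j]
        dp = [[False] * (n + 1) for _ in range(m + 1)]
        dp[0][0] = True  # empty string matches empty pattern

        # Empty string vs. pattern prefix: "x*" may match zero characters
        for j in range(2, n + 1):
            if p[j - 1] == '*':
                dp[0][j] = dp[0][j - 2]

        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if p[j - 1] == '.' or p[j - 1] == s[i - 1]:
                    # Single-character match: consume one character from each
                    dp[i][j] = dp[i - 1][j - 1]
                elif p[j - 1] == '*':
                    # Zero occurrences: drop "x*" from the pattern
                    dp[i][j] = dp[i][j - 2]
                    if p[j - 2] == '.' or p[j - 2] == s[i - 1]:
                        # One more occurrence: consume s[i - 1], keep "x*"
                        dp[i][j] = dp[i][j] or dp[i - 1][j]
        return dp[m][n]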
    
C++:

#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    bool isMatch(string s, string p) {
        int m = s.size(), n = p.size();
        vector<vector<bool>> dp(m + 1, vector<bool>(n + 1, false));
        dp[0][0] = true;
        for (int j = 2; j <= n; ++j) {
            if (p[j - 1] == '*') {
                dp[0][j] = dp[0][j - 2];
            }
        }
        for (int i = 1; i <= m; ++i) {
            for (int j = 1; j <= n; ++j) {
                if (p[j - 1] == '.' || p[j - 1] == s[i - 1]) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else if (p[j - 1] == '*') {
                    dp[i][j] = dp[i][j - 2];
                    if (p[j - 2] == '.' || p[j - 2] == s[i - 1]) {
                        dp[i][j] = dp[i][j] || dp[i - 1][j];
                    }
                }
            }
        }
        return dp[m][n];
    }
};
    
Java:

class Solution {
    public boolean isMatch(String s, String p) {
        int m = s.length(), n = p.length();
        boolean[][] dp = new boolean[m + 1][n + 1];
        dp[0][0] = true;
        for (int j = 2; j <= n; j++) {
            if (p.charAt(j - 1) == '*') {
                dp[0][j] = dp[0][j - 2];
            }
        }
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (p.charAt(j - 1) == '.' || p.charAt(j - 1) == s.charAt(i - 1)) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else if (p.charAt(j - 1) == '*') {
                    dp[i][j] = dp[i][j - 2];
                    if (p.charAt(j - 2) == '.' || p.charAt(j - 2) == s.charAt(i - 1)) {
                        dp[i][j] = dp[i][j] || dp[i - 1][j];
                    }
                }
            }
        }
        return dp[m][n];
    }
}
    
JavaScript:

var isMatch = function(s, p) {
    const m = s.length, n = p.length;
    const dp = Array.from({length: m + 1}, () => Array(n + 1).fill(false));
    dp[0][0] = true;
    for (let j = 2; j <= n; j++) {
        if (p[j - 1] === '*') {
            dp[0][j] = dp[0][j - 2];
        }
    }
    for (let i = 1; i <= m; i++) {
        for (let j = 1; j <= n; j++) {
            if (p[j - 1] === '.' || p[j - 1] === s[i - 1]) {
                dp[i][j] = dp[i - 1][j - 1];
            } else if (p[j - 1] === '*') {
                dp[i][j] = dp[i][j - 2];
                if (p[j - 2] === '.' || p[j - 2] === s[i - 1]) {
                    dp[i][j] = dp[i][j] || dp[i - 1][j];
                }
            }
        }
    }
    return dp[m][n];
};
    

Problem Description

Given a string s and a pattern p, implement regular expression matching with support for '.' and '*':

  • '.' matches any single character.
  • '*' matches zero or more of the preceding element.

The matching should cover the entire input string (not partial). Return true if s matches the pattern p, otherwise return false.
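
For the restricted syntax used here, Python's built-in `re.fullmatch` appears to follow the same rules, so it can serve as a quick reference oracle when checking a hand-written solution. The helper name `is_match_reference` is hypothetical, and this is only a sanity-check sketch that assumes s contains lowercase letters only.

```python
import re


def is_match_reference(s: str, p: str) -> bool:
    # re.fullmatch succeeds only if the pattern covers the entire string,
    # which mirrors the "match the whole input" requirement.
    return re.fullmatch(p, s) is not None


print(is_match_reference("aa", "a"))    # False: "a" covers only one character
print(is_match_reference("aa", "a*"))   # True: '*' repeats the preceding 'a'
print(is_match_reference("ab", ".*"))   # True: ".*" matches any sequence
```

An interview solution is normally expected to implement the matching itself; the oracle is useful mainly for testing.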

Constraints:

  • s consists of only lowercase English letters; p consists of lowercase English letters, '.', and '*'.
  • Every '*' in p is preceded by a valid element to repeat, so p never starts with '*' and never contains two consecutive '*'.
  • There is only one valid answer for each input (no ambiguity).
  • All characters of s must be matched; you cannot reuse characters in s.

Thought Process

At first glance, this problem looks like a typical string matching question, but the presence of '.' and '*' makes it much more complex. The '.' wildcard is straightforward, but '*' means we have to consider multiple matching scenarios for each character in the pattern. A brute-force approach would try every possible way '*' could expand, leading to an exponential number of possibilities.
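
The brute-force idea described above can be sketched as a plain recursion. This is an illustrative version, not the solution used on this page; the name `is_match_brute` is hypothetical, and the sketch assumes the pattern is well formed (every '*' has a preceding element).

```python
def is_match_brute(s: str, p: str) -> bool:
    # An empty pattern matches only an empty string.
    if not p:
        return not s

    # Does the first pattern element match the first character of s?
    first = bool(s) and p[0] in (s[0], '.')

    if len(p) >= 2 and p[1] == '*':
        # Branch 1: skip the "x*" element entirely (zero occurrences).
        # Branch 2: consume one character of s and keep "x*" for reuse.
        return is_match_brute(s, p[2:]) or (first and is_match_brute(s[1:], p))

    # No '*': both sides must advance by one character.
    return first and is_match_brute(s[1:], p[1:])
```

The two branches under '*' are what make the running time blow up: the same suffix pairs get re-explored many times.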

To optimize, we realize that many subproblems repeat (for the same indices of s and p), so dynamic programming is a natural fit. We can use a 2D table where each entry represents whether a substring of s matches a substring of p, and build up our answer from smaller subproblems.
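
One way to see the repeated subproblems is to rewrite the recursion over index pairs and cache the results. The sketch below is a top-down alternative to the table shown in the implementation section; `is_match_memo` and the inner `match` are hypothetical names, and the pattern is assumed to be well formed.

```python
from functools import lru_cache


def is_match_memo(s: str, p: str) -> bool:
    @lru_cache(maxsize=None)
    def match(i: int, j: int) -> bool:
        # Answers: does the suffix s[i:] match the suffix p[j:]?
        if j == len(p):
            return i == len(s)

        first = i < len(s) and p[j] in (s[i], '.')

        if j + 1 < len(p) and p[j + 1] == '*':
            # Skip "x*", or consume s[i] and stay on the same "x*".
            return match(i, j + 2) or (first and match(i + 1, j))

        return first and match(i + 1, j + 1)

    return match(0, 0)
```

Because `match` depends only on (i, j), there are at most (m+1) * (n+1) distinct calls, which is the same state space the 2D table covers.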

Solution Approach

We use dynamic programming to solve this problem efficiently. Here's how:

  1. DP Table Definition:
    • Let dp[i][j] be true if the first i characters of s match the first j characters of p.
  2. Base Case:
    • dp[0][0] = true, meaning an empty string matches an empty pattern.
    • The empty string can still match a non-empty pattern prefix when every element in it is starred: whenever p[j-1] is '*', set dp[0][j] = dp[0][j-2].
  3. Filling the Table:
    • For each character in s (index i) and each character in p (index j):
      • If p[j-1] is '.' or equals s[i-1], set dp[i][j] = dp[i-1][j-1]. If it is a letter that differs from s[i-1], dp[i][j] stays false.
      • If p[j-1] is '*':
        • First, assume '*' means zero occurrence: dp[i][j] = dp[i][j-2].
        • Then, if p[j-2] matches s[i-1] (or is '.'), dp[i][j] |= dp[i-1][j] (meaning, we use one more occurrence of the character before '*').
  4. Final Answer:
    • The value at dp[len(s)][len(p)] tells us if the entire string matches the entire pattern.

This approach avoids redundant calculations and ensures we consider all valid ways '*' can be interpreted.
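
The base case is the step that is easiest to get wrong, so here is a small sketch that computes only the first row, dp[0][j], for a given pattern. The function name `empty_string_row` is hypothetical; the loop mirrors the base-case rule described above.

```python
from typing import List


def empty_string_row(p: str) -> List[bool]:
    # row[j] is True when the empty string matches the first j characters of p.
    row = [False] * (len(p) + 1)
    row[0] = True
    for j in range(2, len(p) + 1):
        if p[j - 1] == '*':
            # "x*" can match zero characters, so inherit the result
            # from the pattern prefix that ends just before "x".
            row[j] = row[j - 2]
    return row


print(empty_string_row("c*a*b"))  # [True, False, True, False, True, False]
```

The True entries at j = 2 and j = 4 say that "c*" and "c*a*" both match the empty string, while any prefix ending in an unstarred letter does not.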

Example Walkthrough

Let's walk through the example s = "aab", p = "c*a*b":

  1. The pattern "c*" can match zero 'c's, so we move to "a*b".
  2. "a*" can match two 'a's in s, so after consuming "aa", we're left with "b" in s and "b" in p.
  3. The last 'b' matches, so the whole string matches the pattern.
  4. In the DP table, dp[0][2] and dp[0][4] are true because "c*" and "c*a*" both match the empty string; dp[1][4] and dp[2][4] are true because "a*" absorbs each 'a'; and the final 'b' gives dp[3][5] = dp[2][4] = true, the bottom-right corner.

Each step in the table considers whether to use '*' as zero or more occurrences, and whether the current character matches.
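
To follow the walkthrough cell by cell, the table can be printed directly. The sketch below repeats the same recurrence as the solution but returns the whole table; `build_table` is a hypothetical helper introduced only for this illustration.

```python
from typing import List


def build_table(s: str, p: str) -> List[List[bool]]:
    m, n = len(s), len(p)
    dp = [[False] * (n + 1) for _ in range(m + 1)]
    dp[0][0] = True
    for j in range(2, n + 1):
        if p[j - 1] == '*':
            dp[0][j] = dp[0][j - 2]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if p[j - 1] == '.' or p[j - 1] == s[i - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            elif p[j - 1] == '*':
                dp[i][j] = dp[i][j - 2]
                if p[j - 2] == '.' or p[j - 2] == s[i - 1]:
                    dp[i][j] = dp[i][j] or dp[i - 1][j]
    return dp


table = build_table("aab", "c*a*b")
for row in table:
    # One line per prefix of s (row 0 is the empty prefix).
    print(" ".join("T" if cell else "F" for cell in row))
# T F T F T F
# F F F T T F
# F F F F T F
# F F F F F T
```

Row i corresponds to the first i characters of s and column j to the first j characters of p, so the last cell of the last row is the final answer.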

Time and Space Complexity

Brute-force Approach:

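  • Each '*' branches into two choices (skip the starred element or consume one more character), so plain recursion without memoization takes exponential time in the worst case, roughly O(2^{m+n}).

Optimized DP Approach:
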
  • We fill a table of size (m+1) * (n+1), so time complexity is O(mn).
  • Space complexity is also O(mn) for the DP table.

This is a huge improvement over brute-force, as each subproblem is solved only once.
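
As an optional variation that is not part of the solution above: each row of the table depends only on itself and the row directly above it, so the space can likely be reduced to O(n) by keeping two rows. The sketch below assumes a well-formed pattern (no leading '*'); `is_match_two_rows` is a hypothetical name.

```python
def is_match_two_rows(s: str, p: str) -> bool:
    m, n = len(s), len(p)

    # prev plays the role of dp[i - 1]; it starts as the empty-string row.
    prev = [False] * (n + 1)
    prev[0] = True
    for j in range(2, n + 1):
        if p[j - 1] == '*':
            prev[j] = prev[j - 2]

    for i in range(1, m + 1):
        # curr plays the role of dp[i]; curr[0] stays False because a
        # non-empty string never matches an empty pattern.
        curr = [False] * (n + 1)
        for j in range(1, n + 1):
            if p[j - 1] == '*':
                curr[j] = curr[j - 2]                # zero occurrences
                if p[j - 2] in (s[i - 1], '.'):
                    curr[j] = curr[j] or prev[j]     # one more occurrence
            elif p[j - 1] in (s[i - 1], '.'):
                curr[j] = prev[j - 1]                # single-character match
        prev = curr

    return prev[n]
```

The time complexity is unchanged at O(mn); only the memory footprint shrinks.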

Summary

The Regular Expression Matching problem is a classic example of using dynamic programming to efficiently handle overlapping subproblems. By carefully defining our DP state and considering the special roles of '.' and '*', we can solve the problem in polynomial time. The key insight is to recognize when to skip or use '*', and to build our solution from small subproblems. This approach is both efficient and elegant, making it a great demonstration of DP techniques in string processing.