140. Word Break II - Leetcode Solution

Code Implementation

from typing import List

class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> List[str]:
        word_set = set(wordDict)
        memo = {}

        def dfs(start):
            if start == len(s):
                return ['']
            if start in memo:
                return memo[start]
            res = []
            for end in range(start + 1, len(s) + 1):
                word = s[start:end]
                if word in word_set:
                    for sub in dfs(end):
                        if sub:
                            res.append(word + ' ' + sub)
                        else:
                            res.append(word)
            memo[start] = res
            return res

        return dfs(0)

#include <vector>
#include <string>
#include <unordered_set>
#include <unordered_map>

using namespace std;

class Solution {
public:
    vector<string> wordBreak(string s, vector<string>& wordDict) {
        unordered_set<string> wordSet(wordDict.begin(), wordDict.end());
        unordered_map<int, vector<string>> memo;
        return dfs(s, 0, wordSet, memo);
    }

private:
    vector<string> dfs(const string& s, int start, unordered_set<string>& wordSet, unordered_map<int, vector<string>>& memo) {
        if (start == s.size()) return {""};
        if (memo.count(start)) return memo[start];
        vector<string> res;
        for (int end = start + 1; end <= s.size(); ++end) {
            string word = s.substr(start, end - start);
            if (wordSet.count(word)) {
                vector<string> subs = dfs(s, end, wordSet, memo);
                for (const string& sub : subs) {
                    if (sub.empty())
                        res.push_back(word);
                    else
                        res.push_back(word + " " + sub);
                }
            }
        }
        memo[start] = res;
        return res;
    }
};

import java.util.*;

class Solution {
    public List<String> wordBreak(String s, List<String> wordDict) {
        Set<String> wordSet = new HashSet<>(wordDict);
        Map<Integer, List<String>> memo = new HashMap<>();
        return dfs(s, 0, wordSet, memo);
    }

    private List<String> dfs(String s, int start, Set<String> wordSet, Map<Integer, List<String>> memo) {
        if (start == s.length()) return Arrays.asList("");
        if (memo.containsKey(start)) return memo.get(start);
        List<String> res = new ArrayList<>();
        for (int end = start + 1; end <= s.length(); end++) {
            String word = s.substring(start, end);
            if (wordSet.contains(word)) {
                List<String> subs = dfs(s, end, wordSet, memo);
                for (String sub : subs) {
                    if (sub.isEmpty())
                        res.add(word);
                    else
                        res.add(word + " " + sub);
                }
            }
        }
        memo.put(start, res);
        return res;
    }
}

/**
 * @param {string} s
 * @param {string[]} wordDict
 * @return {string[]}
 */
var wordBreak = function(s, wordDict) {
    const wordSet = new Set(wordDict);
    const memo = new Map();

    function dfs(start) {
        if (start === s.length) return [''];
        if (memo.has(start)) return memo.get(start);
        const res = [];
        for (let end = start + 1; end <= s.length; ++end) {
            const word = s.substring(start, end);
            if (wordSet.has(word)) {
                const subs = dfs(end);
                for (const sub of subs) {
                    if (sub) res.push(word + ' ' + sub);
                    else res.push(word);
                }
            }
        }
        memo.set(start, res);
        return res;
    }

    return dfs(0);
};

Problem Description

Given a non-empty string s and a list of strings wordDict (called the word dictionary), return all possible sentences where s can be segmented into a sequence of one or more dictionary words, with spaces added between the words. Each word in wordDict can be used multiple times in the segmentation.

Return all valid sentences that can be formed.
Each word in wordDict can be reused any number of times.
The answer should include all possible ways to break s into words from the dictionary.
If there is no valid segmentation, return an empty list.

Example:
Input: s = "catsanddog", wordDict = ["cat","cats","and","sand","dog"]
Output: ["cats and dog", "cat sand dog"]

Thought Process

At first glance, the problem seems to ask for all ways to break a string into words, similar to generating all possible sentences using the dictionary. A naive approach would be to try every possible way to split the string, and for each split, check if the left part is a valid word, then recursively try to break the remaining right part. This brute-force approach quickly becomes inefficient as the number of possible splits grows exponentially with the string length.

To optimize, we can notice that many subproblems repeat: for example, the way to break the substring starting at a certain index doesn't depend on how we got there. This is a typical case for memoization (caching results), where we store the results of subproblems and reuse them when needed, avoiding redundant work.

The key insight is to use recursion to explore all possible splits, but to cache the results for each starting index. This way, if we need to solve the same subproblem again, we can just look up the answer instead of recomputing it.

Solution Approach

Step 1: Represent the Dictionary Efficiently
Store wordDict as a set for O(1) word lookups.
Step 2: Use Recursion to Explore All Splits
Create a recursive helper function (let's call it dfs(start)) that returns all sentences that can be formed from s[start:].
Step 3: Try Every Possible Split
For each possible end index from start+1 to len(s), check if s[start:end] is in the dictionary.
- If it is, recursively call dfs(end) to get all ways to break the rest of the string.
- For each result from the recursive call, append the current word (and a space if needed) to form a new sentence.
Step 4: Use Memoization
Store the result of dfs(start) in a hash map (memo). If we call dfs(start) again, return the cached value.
Step 5: Base Case
If start == len(s), return a list with an empty string, representing a valid sentence ending.
Step 6: Collect and Return Results
Start recursion from index 0. The final result is all possible sentences breaking the entire string.

This approach ensures that each unique substring is only processed once, making the solution efficient even for larger inputs.

Example Walkthrough

Input: s = "catsanddog", wordDict = ["cat","cats","and","sand","dog"]

Start at index 0. Try all possible prefixes:
- "c" (not in dict), "ca" (not in dict), "cat" (in dict)
Take "cat", recursively solve for "sanddog" (i.e., dfs(3)):
- At index 3, "s" (not in dict), "sa" (not in dict), "san" (not in dict), "sand" (in dict)
Take "sand", recursively solve for "dog" (dfs(7)):
- "dog" (in dict), so return ["dog"]
Build up: "cat sand dog"
Back at index 0, try "cats" (in dict), recursively solve for "anddog" (dfs(4)):
- At index 4, "a" (not in dict), "an" (not in dict), "and" (in dict)
Take "and", recursively solve for "dog" (dfs(7)), which we already solved, so use memoized result ["dog"].
Build up: "cats and dog"
No other valid splits are found. The final answer is ["cat sand dog", "cats and dog"].

Time and Space Complexity

Brute-force Approach:
- Time Complexity: Exponential, O(2ⁿ), since every character might be a split point.
- Space Complexity: Also exponential, as all possible sentences are generated and stored.
Optimized Memoized Approach:
- Time Complexity: O(n² * k), where n is the length of s and k is the number of sentences generated (since each substring is considered only once, and for each, we try all possible splits).
- Space Complexity: O(n * k), due to memoization and storage of all possible sentences for each substring.
The actual number of sentences can be very large, so the output size dominates the complexity.

Summary

The Word Break II problem is a classic example of recursive problem solving with overlapping subproblems, making it ideal for memoization. By caching results for each substring starting index, we avoid redundant work and efficiently generate all possible sentences. The elegance of the solution comes from combining recursion (to explore all possibilities) with memoization (to avoid repeated calculations), resulting in a solution that is both correct and performant for reasonable input sizes.

AlgoMap.io

Want Help Cracking FAANG?