The Blind 75 Leetcode Series: Word Break

Photo by Chris Ried on Unsplash

Today, we are working on 139. Word Break

Given a string and a dictionary of strings , return if can be segmented into a space-separated sequence of one or more dictionary words.

Note that the same word in the dictionary may be reused multiple times in the segmentation.

We are given a string and a list of words. We want to see if the string can be segmented into one or more words listed in the word list. To put it in a simpler way, if we see some segments that do not belong to the list, we return false. Otherwise we return true.

Several things to check first:

  1. do we expect empty string or empty wordDict?
  2. do we treat uppercase and lowercase the same?
  3. do we expect repeated words in the wordDict?

If the interviewer tells us that we will always see some values for the string and the wordDict, and we will only see lowercase letters. On top of that, each word in the wordDict is unique, then we can eliminate bunch of corner cases where we have to deal with zero-length of these inputs.

We can go ahead and start then.

We need to segment the string into different words and check if these words exist in the wordDict. To say this in a different way, we check if a substring of the overall string belongs to a word in wordDict. We can keep elongate the substring until we reach the end of string, and we keep track of result as we go.

This sounds like a build-it-up dynamic programming problem.

We first build a tracker and set it to have . The reason for the is because the first element refers to an empty substring. Taking the first example, the would have length of 9. First element refers to the empty substring , then index=1 refers to , index=2 refers to and so on.

Starting from first letter, we iterate through each word in wordDict and check if this substring is a word in the wordDict. We want to check for 2 things:

  1. does this substring contain a word in wordDict? Again using the first example, as we reach as our substring, this specific word doesn’t belong to the wordDict, but it does end with a word in wordDict, so we check if the substring a word.
  2. Because we only check in first condition, we need to make sure the previous word is in the wordDict as well, namely . The way we do it is through check the we build up. Subtract the length of the found word from your and check if that position in is set to true.

Again using the example. will be set to true. which refers to will be set to true because is found in wordDict, and which is is true, so we set to true.

Similarly, once we reach , which endswith , a word in wordDict, = = true, so we can set = true.

Once we’ve gone through the entire string, we just check for the last element of our and see if that is true. If it is, it means we can segment the string into different word(s) from the wordDict.

To put everything in code:

def wordBreak(s, wordDict):
# building tracker to track if a word can be formed at each position
dp = [False] * (len(s) + 1)
dp[0] = True # 0th position marks an "empty segment"
for i in range(1, len(s) + 1):
for word in wordDict:
if s[:i].endswith(word) and (dp[i - len(word)]):
# check if this substring contains a word in wordDict and check if previous word is in the wordDict as well
dp[i] = True
return dp[-1]

What’s the time complexity? We loop through the string once, and for each we loop through the wordDict. Inside the if condition, we have a slicing operation which will be linear to the length of string again. If we use for length of string and for length of wordDict, we are looking at O(n²m) for time complexity.

def wordBreak(s, wordDict):
dp = [False] * (len(s) + 1)
dp[0] = True
for i in range(1, len(s) + 1): # O(n)
for word in wordDict: # O(m)
if s[:i].endswith(word) and (dp[i - len(word)]): # O(n)
dp[i] = True
return dp[-1]

Do we have another solution?

For the dynamic programming solution, we kept adding more letters to the substring and check again the wordDict. What if we instead take away letters and check if the string goes to empty at the end? If it does, we know that all parts of string can be found in the wordDict.

We can achieve this through a BFS approach.

As other BFS solutions, we need to build a queue and keep feeding new elements into the queue. Pop from the beginning, check against some conditions, and have a base condition to return.

In this problem, the elements to feed into the queue is the substrings. The conditions to check against is similar but slightly different to the dynamic programming approach. Instead of , we check as now we want to remove part of the string as we go.

Once we find a word in wordDict from the substring, we take a slice of the current substring and form a new candidate substring. If this candidate is an empty string, we know we have segmented all words, so we return true. Otherwise we check if we have seen this substring before, if not, we add this candidate back to the queue then repeat the process.

If we finish all elements in the queue without finding an empty string, we know there are some substring that we cannot match onto a word in wordDict, so we return false.

To put everything in code:

def wordBreak(s, wordDict):
# start with the entire string
queue = deque([s])
seen = set()
while queue:
next_word = queue.popleft()
for word in wordDict:
# ex. "leetocde" starts with "leet"
if next_word.startswith(word):
# new candidate is the substring after taking out the found word
candidate = next_word[len(word):]
if candidate == '':
return True
# if we haven't seen this new substring, add it to queue and seen
if candidate not in seen:
seen.add(candidate)
queue.append(candidate)
# no more substring found in queue but we didn't see an empty string, return False
return False

What’s the time complexity? This one is a bit more complicated. In the worse case, every letter will be found in wordDict and we will insert new candidate to queue at every iteration. This gives us O(n) for the while loop. The for-loop is another O(m) and slicing operation is another O(n). Once again we are looking at O(n²m).

This problem tests our ability to deal with different scenarios when a string can be segmented in different ways. We have to take care all of those. If you can walk through the problem even without coding, you should have some ideas of how to approach it. The rest is simply finding a good way to track. We can either build it up from smaller string, but break it down from larger string. Make sure to test it out when you finish writing. It’ll give the interviewer (and yourself) a clearer idea what’s going on.

That’s it! Another problem down.

Buy my a coffee: https://www.buymeacoffee.com/jonathanckz

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Jonathan Chao

Jonathan Chao

219 Followers

I am a software developer who has been in this industry for close to a decade. I share my experience to people who are or want to get into the industry