The Blind 75 Leetcode Series: Minimum Window Substring

--

Today, we are working on 76. Minimum Window Substring

Given two strings `s` and `t` of lengths `m` and `n` respectively, return the minimum window substring of `s` such that every character in `t` (including duplicates) is included in the window. If there is no such substring, return the empty string `""`.

The testcases will be generated such that the answer is unique.

A substring is a contiguous sequence of characters within the string.

This is another “string A inside string B” type of question. We have two strings. We want to know if string A’s letters are all found in string B, and if so, what’s the smallest substring that contains all string A’s letter. Couple questions we want to consider:

1. do we consider empty strings?
2. do uppercase and lowercase count as the same letter or different letter?

If the interviewer tells us that empty string CAN happen and lowercase letters do not count as the same, then we have

Given two strings `s` and `t` of lengths `m` and `n` respectively, return the minimum window substring of `s` such that every character in `t` (including duplicates) is included in the window. If there is no such substring, return the empty string `""`. Uppercase and lowercase of the same letter count as different letters. Either string can be empty.

Let’s do it.

We want to find a minimum substring, so we need to keep track the length of the substring every time we find a match (i.e. all target’s letters are in the substring). We want to have a way to expand the substring and/or shrink the substring as we go. Every time we change the substring, we check if the target’s letters are all inside the substring. If so, check if the length is the minimum, and if that’s the case, record the substring.

This looks like a job for a sliding window. To build a sliding window, we need 2 pointers, one for the left bound and one for the right bound. The window then will be `s[left_bound : right_bound + 1]`

We start with both indices at 0-th position. Keep moving the right pointer to the right and consequentially increase the length of the substring. As we move, we check if all `t`'s letters are in the substring. Once we confirm that, we start moving left pointer to the right, thus decreasing the length of substring until we no longer find `t` inside the substring.

To achieve this, we also need to track the number of letters in a substring, and we need a reference for checking if all letters are found. To track the number, we can use a dictionary with letter as the key and count as value. Python builtin function provides a simple function `Counter`. You can pass in a string or a list, and it will transform the input into a dict using each element as key and its count as value.

We can build the reference for `t` using `Counter(t)` . If `t = 'abc'` , then `Counter(t) == {1: 1, 2: 1, 3: 1}` (NOTE: it’s actually a Counter class instead of a dict, but all operations of dict can be done the same). On the other hand, we can use a `defaultdict` as the tracker of the substring and the letter count. Simply use `defaultdict(int)` to build a dictionary with non-existing element default to 0 (so we can do `d['a'] += 1` without first checking if `a` is inside `d`)

Last but not least, we want to check how many letters are satisfied in the substring (i.e. number of a letter in `t` == number of the same letter in substring). Let’s call it `letter_found` and set it to `0` .

To put everything into code:

`def minWindow(s, t):    if len(t) > len(s):        return ''    if len(t) == 0 or len(s) == 0:        return ''    # 2 pointers for the sliding window    left = right = 0    n = len(s)    # count of letters in t    target_counter = Counter(t)    # len of smallest sliding window & the indices of bounds    shortest, left_bound, right_bound = float('inf'), None, None    # track s' the letter counts    s_tracker = defaultdict(int)    # num of letters satisfying the target's count    letter_found = 0    while right < n:        # move right pointer and increase tracker's count        s_tracker[s[right]] += 1        # check if tracker's count for a letter == letter count in t        if s[right] in target_counter and s_tracker[s[right]] == target_counter[s[right]]:            letter_found += 1        # if all letters found, start decreasing window by moving left index & decrease count in tracker as we take out letters        while left <= right and letter_found == len(target_counter):            # check min length and record            if right - left + 1 < shortest:                shortest, left_bound, right_bound = right - left + 1, left, right            s_tracker[s[left]] -= 1            # make sure to keep track of num of letters satisfying the target's count            if s[left] in target_counter and s_tracker[s[left]] < target_counter[s[left]]:                letter_found -= 1            left += 1        right += 1    return "" if shortest == float('inf') else s[left_bound: right_bound + 1]`

What’s the time complexity for this? We are potentially visiting the entire `s` twice, once with the right pointer, and once with the left pointer. We are building `t` into a dict, so we visit it once. Therefore if `len(s) == m` and `len(t) == n`, then we have O(m + n) as our overall time complexity.

Leetcode actually has an improved version of the sliding window, but that only applies to certain cases. I’m going to skip that. Feel free to check the improved sliding window approach yourself though.

The idea of the approach is simple, but because we need to keep tracking many components, very often we lost track of what is where at which time. My advice for this is, besides more practice, to write comments or notes to keep reminding yourself in a very quick way. If you’re doing online interview and you’re actually typing, keep a docstring section on top to list all variables and what they are doing. Use it when you’re walking through your test cases and write down the snapshot of these variables. If you’re whiteboarding, it’s even easier. Find a corner on the whiteboard and keep track of these.

Sliding window problem is another common theme I’ve seen. Do you want to pick the right amount of fruits (the big G)? Do you like to know how many packages you’ve picked up (the fruit that keeps doctors away)? Do you know if the assembly line has the correct machine parts (online auction)? And I’m guessing this problem originally comes from Adobe judging from the examples they use.

It’s a good practice to know how to deal with different types of sliding window problems. Just remember that you want to have 2 pointers for sliding, several dictionaries for tracking, and the “answer” (whether it’s minimum length, actual elements, or whatever) that you want to compare and return at the end. Know what conditionals you’re checking. Know when to slide these pointers and what consequence it would have (increase/decrease trackers, adding new elements, etc.)

There you go! Another problem down.