The Blind 75 Leetcode Series: Longest Palindromic Substring

Today, we are working on 5. Longest Palindromic Substring

Given a string s, return the longest palindromic substring in s.

First things first, a palindrome, or a palindromic string is a string where reading from the left is the same as reading from the right

Once again a very short description, so we want to look for some clarifications:

  1. do special characters come into play? i.e. do strings like abc##cba count as palindromic string?
  2. Do uppercase and lowercase of the same letter count as the same letter or different letter? i.e. do strings like AbCBa count as palindromic string?
  3. Do spaces matter? will we see spaces at all? i.e. how do we treat Race Car where spaces exit in a string but otherwise is a palindrome?

Let’s say the answers are yes, yes, and yes, then the question becomes

Given a string s, return the longest palindromic substring in s where s may contain special characters and uppercase and lowercase of the same letter count as the same letter. The string might also contain spaces which should not affect the palindromic status.

Great, let’s get started. As usual, we start with brute force approach. The easiest approach would be iterating through each character, and check the later characters to see if a palindrome forms. If so, check the length and compare to max length. If not, move on.

We first want to develop a util function to check if a given string is a palindrome. In python, simplest way would be

def is_palindrome(s):
return s == s[::-1] # flip the string and check if they are the same.

Unfortunately, we cannot use this function as is since we have to worry about the upper/lower case and the spaces, so we can modify it as

def is_palindrome(s):
s = s.replace(' ', '') # replace space with nothing
s = s.lower() # make everything lowercase so case sensitivity doesn't exist
return s == s[::-1]

With the brute force approach, we start with a for-loop and use the character as starting point. Then we start another for-loop from that point to get a substring. The code will look like

def longestPalindrome(s):
max_sub_string = ''
for i in range(len(s)): # start with a letter
for j in range(i, len(s)): # loop through the rest to form substring
substring = s[i:j+1]
if self.is_palindrome(substring):
max_sub_string = max([max_sub_string, substring], key=len) # this takes the longer string of the two
return max_sub_string

I want to emphasize that Leetcode does not treat the full string as a substring of the string, so abcba would have the longest substring of bcb instead of abcba . It doesn’t make sense to me but that’s technicality. I will use Leetcode’s cases as gold sample.

This solution does check all combinations of a string, but it’s simply too slow.

def is_palindrome(s):
s = s.replace(' ', '') # O(1)
s = s.lower() # O(1)
return s == s[::-1] # O(n)
def longestPalindrome(s):
max_sub_string = ''
for i in range(len(s)): # O(n)
for j in range(i, len(s)): # O(n)
substring = s[i:j+1]
if self.is_palindrome(substring): # O(n)
max_sub_string = max([max_sub_string, substring], key=len) # O(1)
return max_sub_string

We are looking at an O(n³) solution. Definitely not good. What can we do to improve it?

We can look into the characteristics of a palindrome. Since a word has to read forward and backward the same, We can divide the word into 2 halves, and each half is a mirror image of the other. Thus, if we choose a place to be the center, we can expand left and right and check if it’s still a valid palindrome. If it is, we keep going, and we return the length when it’s no longer valid or when we reach one end of the string.

To put that into code, we can create a function:

def expand_and_check(s, idx):
str_length = 0
i = j = idx
while i >= 0 and j <= len(s) - 1 and s[i] == s[j]: # check if the indices have reached the end of string and if the two letters are the same
i -= 1
j += 1
return s[i+1: j]

And now we need another for loop for each letter in the string, and we perform this operation for each letter. The expand_and_check is an O(n) operation (since it goes through the string each time) and the for-loop itself is another O(n), so we are looking at O(n²). We are good, right?

Unfortunately, we are only covering half of the cases. This approach works for palindromes that have a single letter as the center. However, we also have cases like abba where the center is between 2 letters. How do we take care of these cases?

Simplest approach: we do the same thing for these cases again then find max length between the two types of cases.

We can modify the utility function to take 2 indices. For the single-letter-as-center case, we pass in the same index; for the center-in-between case, we pass in the two indices next to the center.

For example, abcba, we pass in (2, 2), and for abba, we pass in (1, 2) to get correct answer.

The modified util function becomes:

def expand_and_check(s, idx1, idx2):
str_length = 0
i = idx1
j = idx2
while i >= 0 and j <= len(s) - 1 and s[i] == s[j]:
i -= 1
j += 1
return s[i+1: j]

And we have the overall solution:

def longestPalindrome(s):
max_sub = ''
for i in range(len(s)):
pal_len_1 = expand_and_check(s, i, i) # single-letter
pal_len_2 = expand_and_check(s, i, i + 1) # in-between
max_sub = max(max_sub, len(pal_len_1), len(pal_len_2))
return max_subdef expand_and_check(s, idx1, idx2):
s = s.replace(' ', '').lower()
str_length = 0
i = idx1
j = idx2
while i >= 0 and j <= len(s) - 1 and s[i] == s[j]:
i -= 1
j += 1
return s[i+1: j]

And this works! Let’s look at the time complexity one last time

def longestPalindrome(s):
max_sub = ''
for i in range(len(s)): # O(n)
pal_len_1 = expand_and_check(s, i, i) # O(n)
pal_len_2 = expand_and_check(s, i, i + 1) # O(n)
max_sub = max(max_sub, len(pal_len_1), len(pal_len_2)) # O(1)
return max_subdef expand_and_check(s, idx1, idx2):
s = s.replace(' ', '').lower()
str_length = 0
i = idx1
j = idx2
while i >= 0 and j <= len(s) - 1 and s[i] == s[j]: # O(n)
i -= 1
j += 1
return s[i+1: j]

Since the util function runs O(n) and we do it twice for each loop for n loops, we got total of O(2n²) which becomes O(n²). That’s our solution.

But is it the only viable solution?

Let’s introduce the concept of Dynamic Programming here.

Link to Wikipedia for Dynamic Programming

To put it simply, we start small, build it as we go, and then eventually get to our solution. Think of Fibonacci sequence: if we want to know the 11th number of the sequence, we start with first number 1, and second number 1, get the next number 2, and then get the number after that 3, and so on, eventually we will get our desired number.

How does this apply in our case? We check for some base cases, say a single letter, track the results, and build from it. In order to track the result, we need to build a truth table.

We design the truth table in a way which row and column represent the index of a string. This truth table, or matrix, will be a square with width and height being the length of the string. For example, aba will give us a 3 x 3 matrix, and element [1][2] represent “if starting from 1st element and ending on 2nd element, is the substring a palindrome?” We initially set all elements inside the matrix to be False (or false-y), then fill in each element according to conditions.

n = len(s)
dp = [[False] * n for _ in range(n)]
"""
dp = [
[False, False, False],
[False, False, False],
[False, False, False]
]
"""

Since rows and columns are representing the index of the letters, we know that if i == j, the value should always be true as s[i] == s[j] because they represent the exact same letter. From there, we want to traverse only the top-right part of the matrix. Why? Because the way we set up the matrix makes the top-right a forward check and bottom-left a reverse check. Since we only care about the forward check, we can ignore the bottom-left half.

We then want to iterate from the bottom-right BACKWARD to the top-left for the “forward check” half of the matrix. This mean we start from the largest i possible, then check from i to len(s) for j.

for i in range(n - 1, -1, -1):  # n is len(s)
for j in range(i, n):

Last, we need to check for 3 conditions.

  1. if i == j, s[i] and s[j] represent the same exact letter, mark it as true
  2. if i != j but s[i] == s[j], and i and j are only 1 letter apart, mark it as true. Ex. aa, i=0, j=1, s[i] == s[j] and j-i = 1
  3. if i != j but s[i] == s[j], and i and j are more than 1 letter apart, check the inner string from i+1 to j-1 and see if it is a palindrome (i.e. if the position is marked as True). Ex. abcba, i=0, j=4, s[i] == s[j], then check if bcb is a palindrome (dp[i+1][j-1] == 1?). The answer is yes, so mark dp[i][j] = True as well.

Combine all the parts, we get

s = s.replace(' ', '').lower()
n = len(s)
dp = [[False] * n for _ in range(n)]
longest_palindrome = ''
for i in range(n - 1, -1, -1): # n is len(s)
for j in range(i, n):
if i == j:
dp[i][j] = True
elif s[i] == s[j]:
dp[i][j] = (i == j-1) or dp[i + 1][j - 1]

Now we have to check if the new palindrome is the longest.

if dp[i][j]:  # from i-th letter to j-th letter is a palindrome
if len(longest_palindrome) < len(s[i:j+1]): # check max length
longest_palindrom = s[i:j+1]

Now putting everything together

s = s.replace(' ', '').lower()
n = len(s)
dp = [[False] * n for _ in range(n)]
longest_palindrome = ''
for i in range(n - 1, -1, -1):
for j in range(i, n):
if i == j:
dp[i][j] = True
elif s[i] == s[j]:
dp[i][j] = (i == j-1) or dp[i + 1][j - 1]
if dp[i][j]:
if len(longest_palindrome) < len(s[i:j+1]):
longest_palindrom = s[i:j+1]

And there’s our DP solution. Last but not least, let’s check the time complexity:

s = s.replace(' ', '').lower()
n = len(s)
dp = [[False] * n for _ in range(n)]
longest_palindrome = ''
for i in range(n - 1, -1, -1): # O(n)
for j in range(i, n): # O(n)
if i == j: # O(1)
dp[i][j] = True
elif s[i] == s[j]:
dp[i][j] = (i == j-1) or dp[i + 1][j - 1]
if dp[i][j]: # O(1)
if len(longest_palindrome) < len(s[i:j+1]): # O(1)
longest_palindrom = s[i:j+1]

So the total is O(n²), same as the expand_and_check solution. The difference is the space needed. For expand_and_check, the space needed is O(n) (we are storing a string/sub-string). For DP, we need O(n²) space since we are tracking a 2D matrix with each side with length n.

These days, space is less of a problem compare to time complexity (storage is cheaper and abundant. You can easily get more memory than a better CPU), so we care about time complexity almost always. Of course, you can impress the interviewers more if you know the difference in space complexity between these methods thoroughly.

This is it! Another problem down!

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store