The Blind 75 Leetcode Series: Group Anagrams

Photo by Chris Ried on Unsplash

Today, we’ll be working on 49. Group Anagrams

Given an array of strings strs, group the anagrams together. You can return the answer in any order.

An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

Constraints:

1 <= strs.length <= 104

0 <= strs[i].length <= 100

strs[i] consists of lowercase English letters.

This one is quite simple. We have a list of strings, each string consists of lowercase English letters, and we want to group them by the same anagrams.

Without the constraints, some questions we can ask would be

  1. Are the strings always gonna be alphanumerical?
  2. Are there uppercase and lowercase of the same letter? Do we treat them as the same letter?

If there are both uppercase and lowercase of the same letter and we should treat them the same, then our approach will need some tweaking, but for now, let’s just work with lowercase.

The intuition is that, for the anagrams, they consist of the same letters, so if we have a way to turn them into the same sequence, then they will be the exactly same strings. The easiest way to do it? Use sort.

sorted_string = sorted(s)

with Python, sorted returns a new list while sort sorts in place. We are using sorted here as we want to modify the string even more.

We now need a tracker to store these sorted lists. As usual, a dictionary will do the job just fine. However, we cannot use a list as key since it’s mutable, so we want to transform it back to a string.

from collections import defaultdict # python3 import. py2, which you shouldn't be using anymore, has a different routetracker = defaultdict(list)
sorted_s = ''.join(sorted(s)) # s is a string in the list
tracker[sorted_s].append(s)

A bit on explanation for defaultdict for those unfamiliar. defaultdict is a simple way to deal with newly added keys. Without it, we will need to do something like

if sorted_s in tracker:
tracker[sorted_s].append(s)
else:
tracker[sorted_s] = [s]

defaultdict gets rid of the if-else and you can assume the key exists. This dictionary defaults each key to an empty list, and you can directly append values to it. If a non-existing key is called, it’ll return the default value.

NOTE: defaultdict(int) would default each key to 0 and defaultdict(str) defaults each key to '' and so on.

We now just have to sort every string and check against the tracker. All strings with the same sorted string will be grouped together under the same key, and we just need to return all the values of the tracker.

We can add some more logic to do early return on the one-element of no-element list.

It will look like

def groupAnagrams(strs):
if len(strs) <= 1:
return [strs]
tracker = defaultdict(list)
for s in strs:
sorted_s = ''.join(sorted(s))
tracker[sorted_s].append(s)
return tracker.values()

What’s the time complexity for this?

Say we have n strings in the input, and each string needs a sort, which is O(m logm), where m is the length of the word, as we’ve found in previous problems, so the total time complexity would be O(n * m logm)

def groupAnagrams(strs):
if len(strs) <= 1:
return [strs]
tracker = defaultdict(list)
for s in strs: # O(n)
sorted_s = ''.join(sorted(s)) # O(m logm)
tracker[sorted_s].append(s)
return tracker.values()

Is there a better way to do it?

We spend a great deal on the sorting. What if there’s a way to avoid sorting? Since we only have to deal with lowercase alphabets, we have only 26 letters. We can record the position of the letter for each letter in a string, then form a key with it.

Let’s say we have a list of 26 integers, first one being the position of “a”, second being the position of “b”, and so on. A string “cat” will give us

[1, 0, 1, 0..............0, 1, 0, 0...........0]
0th 2nd 19th

With only 1s on position 0, 2, and 19. An anagram of “cat”, say “act”, will also have the same positions. We then turn this list into a tuple or a string, making it immutable. Now we effectively made keys for our tracker again.

def groupAnagrams(strs):
tracker = defaultdict(list)
position = [0] * 26 # for 26 lowercase alphabets
for s in strs:
for letter in s:
# ord returns the unicode position of the input
# a - z are continuous in unicode, so we can find
# the diff within 26

position[ord(letter)-ord('a')] += 1
tracker[tuple(position)].append(s)
position = [0] * 26
return tracker.values()

What’s the time complexity now? We got rid of the sort, but we still need to go through each letter in a string, so we have O(m) for each string and O(n) for n string, effectively making it O(nm).

def groupAnagrams(strs):
tracker = defaultdict(list)
position = [0] * 26 # for 26 lowercase alphabets
# O(n)
for s in strs:
# O(m)
for letter in s:
position[ord(letter)-ord('a')] += 1
tracker[tuple(position)].append(s)
position = [0] * 26
return tracker.values()

In an interview, if you give me the first answer, that’s a pass for me. If you give me the second answer, I’ll give you bonus point. First approach is intuitive but it already shows if you have knowledge for list, dict, and grouping. The second approach shows you can find a way to optimize the time complexity.

That’s it! Another problem down!

Buy my a coffee: https://www.buymeacoffee.com/jonathanckz

--

--

--

I am a software developer who has been in this industry for close to a decade. I share my experience to people who are or want to get into the industry

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

PiCharts: Easy JavaScript charts with Ruby

Self-Driven Databases: Intelligent Parameter Tuning for Alibaba Databases Explained

VSCode Extensions You Need To Start Using Right Now

Deploying JanusGraph on Alibaba Cloud ECS

Software Engineering Bootcamp Advice

Learning Kafka from Scratch: A Guide to Kafka (Part 2)

Quick Guide to Load Balancing on Alibaba Cloud

Event Driven Programming

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Jonathan Chao

Jonathan Chao

I am a software developer who has been in this industry for close to a decade. I share my experience to people who are or want to get into the industry

More from Medium

Leetcode 207: Course Schedule

The Blind 75 Leetcode Series: Container With Most Water

Leetcode Blind 75 Practice — Product of Array Except Self

1570. Dot Product of Two Sparse Vectors