The Blind 75 Leetcode Series: Merge K Sorted Lists

Today, we are working on Merge k Sorted Lists.

You are given an array of k linked-lists lists, each linked-list is sorted in ascending order.

Merge all the linked-lists into one sorted linked-list and return it.

A very brief description. But we are provided with examples:

Input: lists = [[1,4,5],[1,3,4],[2,6]]
Output: [1,1,2,3,4,4,5,6]
Explanation: The linked-lists are:
[
1->4->5,
1->3->4,
2->6
]
merging them into one sorted list:
1->1->2->3->4->4->5->6

And we are given constraints:

Constraints:

k == lists.length

0 <= k <= 104

0 <= lists[i].length <= 500

-104 <= lists[i][j] <= 104

lists[i] is sorted in ascending order.

The sum of lists[i].length will not exceed 104.

If you’ve been following the series, you would remember a similar problem we’ve done: Merging 2 sorted lists.

Essentially, we do the following things in that problem:

  1. check for both lists. If both exist, compare value and choose the smaller one. Move the pointer of that list.
  2. Once one list runs out, check if the other list has something remaining, if so, append that to the end of the list

Can we do something similar with > 2 lists?

We can iterate through the lists, check for each linked list inside and find the first element. Compare them, then choose the smallest value. Once we’ve decided on which value to choose and which list to choose from, we move the pointer of that list. We just need to make sure that when we no longer find any existing node (i.e. all pointers point to null), we jump out of the loop.

def mergeKLists(lists):
head = ListNode()
begin = head
tracker = {} # track tracker {position: node}
while True:
for i in range(len(lists)):
if lists[i]:
tracker[i] = lists[i]
if not tracker:
break
target_linked_list = min(tracker.items(), key=lambda x: x[1].val)[0] # find min val of the nodes and return the position
head.next = lists[target_linked_list] # link the result list to this node
lists[target_linked_list] = lists[target_linked_list].next # move the node's list pointer
head = head.next
tracker = {} # reset tracker for next iteration of tracking

return begin.next

What is the time complexity for this approach?

If we have k linked lists and the list has total of N nodes, we compare k nodes each round and we have N rounds. That gives us O(kN) for time complexity.

def mergeKLists(lists):
head = ListNode()
begin = head
tracker = {}
while True: # O(N) for total of N nodes in a list
for i in range(len(lists)): # O(k) lists
if lists[i]:
tracker[i] = lists[i]
if not tracker:
break
target_linked_list = min(tracker.items(), key=lambda x: x[1].val)[0]
head.next = lists[target_linked_list]
lists[target_linked_list] = lists[target_linked_list].next
head = head.next
tracker = {}

return begin.next

Of course, another way is to simply do the merge-2-lists k-1 times and we will get the same result.

def mergeTwoLists(list1, list2):
head = ListNode()
begin = head

while list1 and list2:
if list1.val <= list2.val:
head.next = list1
list1 = list1.next
else:
head.next = list2
list2 = list2.next
head = head.next

if list1:
head.next = list1
if list2:
head.next = list2

return begin.next
def mergeKLists(lists):
if len(lists) == 0:
return None
if len(lists) == 1:
return lists[0]
next_list = lists[0]
for i in range(1, len(lists)):
curList = self.mergeTwoLists(next_list, lists[i])
return next_list

We’ll see that this approach doesn’t give us a better time complexity. From the Merge 2 Lists problem, we know that the algorithm takes O(N) where N is the number of total nodes. We are going through the process k-1 times, so we will again get O(kN) for time complexity.

Can we do better on top of this?

We are comparing values, and these values are pointing the the corresponding nodes. This puts these nodes in a rank, or priority, when it comes to which to pick next. A priority queue may fit this job.

A priority queue, … each element additionally has a “priority” associated with it … element with high priority is served before an element with low priority … two elements have the same priority, they are served according to the order in which they were enqueued.

So node.val = priority in our case. We can use a heap to implement this queue, as described here.

Intuitively, our priority is the value of each node since that’s what we are comparing, so (priority, node) should be added, right? But what happens if the values are the same? The comparison will move to the next level: comparing two nodes. Unfortunately, this is where things break. Simply adding (priority, node) will break as the comparison cannot be done on the same priority (as we cannot compare two nodes directly).

Therefore, we need to add an additional value for ordering. Since we don’t really care what order nodes with same priority goes in, we can simply pick the first-one-in, namely a smaller index in the list. Our tuple now becomes (priority, idx, node). idx is the index of the linked list in the input list.

The rest is easy, as the heap will automatically sort our nodes based on given node.val and index, we can safely just pick them up one by one.

import heapqdef mergeKLists(lists):
head = ListNode()
begin = head
h = []

# push all linked lists into heap, order is n.val -> index
for i, n in enumerate(lists):
if n:
heapq.heappush(h, (n.val, i, n))

# keep taking value out and link to the merged linked list
while h:
target = heapq.heappop(h)
node = target[2]
head.next = node
head = head.next

# if the next node exists, add back to the heap with increased index. "i" at this point is len(lists) - 1
if node.next:
i+=1
heapq.heappush(h, (node.next.val, i, node.next))
return begin.next

How much does this improve? We still have total of N nodes, but now the comparison takes O(log k) where k is the number of linked lists. The reason for it being k and not N is because we always only have k elements in the heap if not fewer. We pop and node, check if there’s the next node, and insert it back if the next node is present, so always k elements.

import heapqdef mergeKLists(lists):
head = ListNode()
begin = head
h = []

# O(k)
for i, n in enumerate(lists):
if n:
heapq.heappush(h, (n.val, i, n))

# total of N nodes, so O(N)
while h:
# O(log k)
target = heapq.heappop(h)
node = target[2]
head.next = node
head = head.next

# if the next node exists, add back to the heap with increased index. "i" at this point is len(lists) - 1
if node.next:
i+=1
# O(log k)
heapq.heappush(h, (node.next.val, i, node.next))
return begin.next

Technically we have O(N * log2k) + O(k). We can deduce it to O(N log k).

Now, in our k-1 merge-2-lists approach, we constantly revisit the same linked list over and over, which is a lot of time wasted. What if we can reduce the time wasted checking the same list?

Instead of merging list1 & list2 -> (list1+list2) & list3 -> (list1+list2+list3) & list4 …, what if we do list1 & list3 , list2 & list4 , list5 & list7 , list6 and list8 and so on, then do the next iteration with (list1+list3) & (list2 + list4) , (list5 + list7) & (list6 + list8) and so on, and eventually come up with one single list? That way we don’t keep visiting list1 again and again.

We then will create a divide and conquer algorithm to implement this.

We will still use our merge-2-list mechanism, but we can change how we pass in the lists.

def mergeTwoLists(list1, list2):
# the same O(N) algorithm
def mergeKLists(lists):
k = len(lists)
if k == 0:
return
step = 1
while step < k:
# we are not changing the size of the lists but each linked list itself, so we do every-other, then every-fourth, then every-eighth, and so on
for i in range(0, k - step, step * 2):
# merge the 2 lists
lists[i] = self.mergeTwoLists(lists[i], lists[i+step])
step += step
# at the end, only the first list matter as it has all nodes linked/lists merged
return lists[0]

With this improvement, what time complexity are we looking at?

Since we are dealing with two lists at a time, we dissect the problem into, say, 8 -> 4 -> 2 -> 1, that’s logarithm, and our size would be k once again, and each iteration is a merge-2-list which is O(N), so we are looking at O(N * log k), much the same as our priority queue solution.

def mergeTwoLists(list1, list2):
# the same O(N) algorithm
def mergeKLists(lists):
k = len(lists)
if k == 0:
return
step = 1
while step < k:
# O(logk)
for i in range(0, k - step, step * 2):
# O(N)
lists[i] = self.mergeTwoLists(lists[i], lists[i+step])
step += step
return lists[0]

If I’m an interviewer who asks this question, I’d start with the merge-2-list problem and see if the candidates can build from that solution to the merge-k-lists solution. You are more than welcome to use either approach (priority or divide & conquer) but I’ll give extra point if you can build from the previous work to this solution. Still, every person is different, so use the one you’re familiar with.

That’s it! Another problem down.

Buy my a coffee: https://www.buymeacoffee.com/jonathanckz

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Jonathan Chao

Jonathan Chao

I am a software developer who has been in this industry for close to a decade. I share my experience to people who are or want to get into the industry