# The Blind 75 Leetcode Series: Merge K Sorted Lists

Today, we are working on Merge k Sorted Lists.

You are given an array of

`k`

linked-lists`lists`

, each linked-list is sorted in ascending order.

Merge all the linked-lists into one sorted linked-list and return it.

A very brief description. But we are provided with examples:

**Input:** lists = [[1,4,5],[1,3,4],[2,6]]

**Output:** [1,1,2,3,4,4,5,6]

**Explanation:** The linked-lists are:

[

1->4->5,

1->3->4,

2->6

]

merging them into one sorted list:

1->1->2->3->4->4->5->6

And we are given constraints:

Constraints:

`k == lists.length`

`0 <= k <= 104`

`0 <= lists[i].length <= 500`

`-104 <= lists[i][j] <= 104`

`lists[i]`

is sorted inascending order.The sum of

`lists[i].length`

will not exceed`104`

.

If you’ve been following the series, you would remember a similar problem we’ve done: Merging 2 sorted lists.

Essentially, we do the following things in that problem:

- check for both lists. If both exist, compare value and choose the smaller one. Move the pointer of that list.
- Once one list runs out, check if the other list has something remaining, if so, append that to the end of the list

Can we do something similar with > 2 lists?

We can iterate through the lists, check for each linked list inside and find the first element. Compare them, then choose the smallest value. Once we’ve decided on which value to choose and which list to choose from, we move the pointer of that list. We just need to make sure that when we no longer find any existing node (i.e. all pointers point to `null`

), we jump out of the loop.

`def mergeKLists(lists):`

head = ListNode()

begin = head

tracker = {} # track tracker {position: node}

while True:

for i in range(len(lists)):

if lists[i]:

tracker[i] = lists[i]

if not tracker:

break

target_linked_list = min(tracker.items(), key=lambda x: x[1].val)[0] # find min val of the nodes and return the position

head.next = lists[target_linked_list] # link the result list to this node

lists[target_linked_list] = lists[target_linked_list].next # move the node's list pointer

head = head.next

tracker = {} # reset tracker for next iteration of tracking

return begin.next

What is the time complexity for this approach?

If we have `k`

linked lists and the list has total of `N`

nodes, we compare k nodes each round and we have N rounds. That gives us O(kN) for time complexity.

`def mergeKLists(lists):`

head = ListNode()

begin = head

tracker = {}

while True: # O(N) for total of N nodes in a list

for i in range(len(lists)): # O(k) lists

if lists[i]:

tracker[i] = lists[i]

if not tracker:

break

target_linked_list = min(tracker.items(), key=lambda x: x[1].val)[0]

head.next = lists[target_linked_list]

lists[target_linked_list] = lists[target_linked_list].next

head = head.next

tracker = {}

return begin.next

Of course, another way is to simply do the merge-2-lists k-1 times and we will get the same result.

def mergeTwoLists(list1, list2):

head = ListNode()

begin = head

while list1 and list2:

if list1.val <= list2.val:

head.next = list1

list1 = list1.next

else:

head.next = list2

list2 = list2.next

head = head.next

if list1:

head.next = list1

if list2:

head.next = list2

return begin.nextdef mergeKLists(lists):

if len(lists) == 0:

return None

if len(lists) == 1:

return lists[0]

next_list = lists[0]

for i in range(1, len(lists)):

curList = self.mergeTwoLists(next_list, lists[i])

return next_list

We’ll see that this approach doesn’t give us a better time complexity. From the Merge 2 Lists problem, we know that the algorithm takes O(N) where N is the number of total nodes. We are going through the process k-1 times, so we will again get O(kN) for time complexity.

Can we do better on top of this?

We are comparing values, and these values are pointing the the corresponding nodes. This puts these nodes in a rank, or priority, when it comes to which to pick next. A priority queue may fit this job.

A

priority queue,… each element additionally has a “priority” associated with it … element with high priority is served before an element with low priority … two elements have the same priority, they are served according to the order in which they were enqueued.

So node.val = priority in our case. We can use a heap to implement this queue, as described here.

Intuitively, our priority is the value of each node since that’s what we are comparing, so `(priority, node)`

should be added, right? But what happens if the values are the same? The comparison will move to the next level: comparing two nodes. Unfortunately, this is where things break. Simply adding `(priority, node)`

will break as the comparison cannot be done on the same priority (as we cannot compare two nodes directly).

Therefore, we need to add an additional value for ordering. Since we don’t really care what order nodes with same priority goes in, we can simply pick the first-one-in, namely a smaller index in the list. Our tuple now becomes `(priority, idx, node)`

. `idx`

is the index of the linked list in the input list.

The rest is easy, as the heap will automatically sort our nodes based on given node.val and index, we can safely just pick them up one by one.

import heapqdef mergeKLists(lists):

head = ListNode()

begin = head

h = []

# push all linked lists into heap, order is n.val -> index

for i, n in enumerate(lists):

if n:

heapq.heappush(h, (n.val, i, n))

# keep taking value out and link to the merged linked list

while h:

target = heapq.heappop(h)

node = target[2]

head.next = node

head = head.next

# if the next node exists, add back to the heap with increased index. "i" at this point is len(lists) - 1

if node.next:

i+=1

heapq.heappush(h, (node.next.val, i, node.next))

return begin.next

How much does this improve? We still have total of N nodes, but now the comparison takes O(log k) where k is the number of linked lists. The reason for it being `k`

and not `N`

is because we always only have `k`

elements in the heap if not fewer. We pop and node, check if there’s the next node, and insert it back if the next node is present, so always `k`

elements.

import heapqdef mergeKLists(lists):

head = ListNode()

begin = head

h = []

# O(k)

for i, n in enumerate(lists):

if n:

heapq.heappush(h, (n.val, i, n))

# total of N nodes, so O(N)

while h:

# O(log k)

target = heapq.heappop(h)

node = target[2]

head.next = node

head = head.next

# if the next node exists, add back to the heap with increased index. "i" at this point is len(lists) - 1

if node.next:

i+=1

# O(log k)

heapq.heappush(h, (node.next.val, i, node.next))

return begin.next

Technically we have O(N * log2k) + O(k). We can deduce it to O(N log k).

Now, in our k-1 merge-2-lists approach, we constantly revisit the same linked list over and over, which is a lot of time wasted. What if we can reduce the time wasted checking the same list?

Instead of merging `list1 & list2`

-> `(list1+list2) & list3`

-> `(list1+list2+list3) & list4`

…, what if we do `list1 & list3`

, `list2 & list4`

, `list5 & list7`

, `list6 and list8`

and so on, then do the next iteration with `(list1+list3) & (list2 + list4)`

, `(list5 + list7) & (list6 + list8)`

and so on, and eventually come up with one single list? That way we don’t keep visiting `list1`

again and again.

We then will create a **divide and conquer **algorithm to implement this.

We will still use our merge-2-list mechanism, but we can change how we pass in the lists.

def mergeTwoLists(list1, list2):

# the same O(N) algorithmdef mergeKLists(lists):

k = len(lists)

if k == 0:

return

step = 1

while step < k:

# we are not changing the size of the lists but each linked list itself, so we do every-other, then every-fourth, then every-eighth, and so on

for i in range(0, k - step, step * 2):

# merge the 2 lists

lists[i] = self.mergeTwoLists(lists[i], lists[i+step])

step += step # at the end, only the first list matter as it has all nodes linked/lists merged

return lists[0]

With this improvement, what time complexity are we looking at?

Since we are dealing with two lists at a time, we dissect the problem into, say, 8 -> 4 -> 2 -> 1, that’s logarithm, and our size would be k once again, and each iteration is a merge-2-list which is O(N), so we are looking at O(N * log k), much the same as our priority queue solution.

def mergeTwoLists(list1, list2):

# the same O(N) algorithmdef mergeKLists(lists):

k = len(lists)

if k == 0:

return

step = 1

while step < k:

# O(logk)

for i in range(0, k - step, step * 2):

# O(N)

lists[i] = self.mergeTwoLists(lists[i], lists[i+step])

step += step

return lists[0]

If I’m an interviewer who asks this question, I’d start with the merge-2-list problem and see if the candidates can build from that solution to the merge-k-lists solution. You are more than welcome to use either approach (priority or divide & conquer) but I’ll give extra point if you can build from the previous work to this solution. Still, every person is different, so use the one you’re familiar with.

That’s it! Another problem down.

*Buy my a coffee: **https://www.buymeacoffee.com/jonathanckz*