# The Blind 75 Leetcode Series: Merge K Sorted Lists Photo by Chris Ried on Unsplash

Today, we are working on Merge k Sorted Lists.

You are given an array of `k` linked-lists `lists`, each linked-list is sorted in ascending order.

Merge all the linked-lists into one sorted linked-list and return it.

A very brief description. But we are provided with examples:

`Input: lists = [[1,4,5],[1,3,4],[2,6]]Output: [1,1,2,3,4,4,5,6]Explanation: The linked-lists are:[  1->4->5,  1->3->4,  2->6]merging them into one sorted list:1->1->2->3->4->4->5->6`

And we are given constraints:

Constraints:

`k == lists.length`

`0 <= k <= 104`

`0 <= lists[i].length <= 500`

`-104 <= lists[i][j] <= 104`

`lists[i]` is sorted in ascending order.

The sum of `lists[i].length` will not exceed `104`.

If you’ve been following the series, you would remember a similar problem we’ve done: Merging 2 sorted lists.

Essentially, we do the following things in that problem:

1. check for both lists. If both exist, compare value and choose the smaller one. Move the pointer of that list.
2. Once one list runs out, check if the other list has something remaining, if so, append that to the end of the list

Can we do something similar with > 2 lists?

We can iterate through the lists, check for each linked list inside and find the first element. Compare them, then choose the smallest value. Once we’ve decided on which value to choose and which list to choose from, we move the pointer of that list. We just need to make sure that when we no longer find any existing node (i.e. all pointers point to `null`), we jump out of the loop.

`def mergeKLists(lists):    head = ListNode()    begin = head    tracker = {}  # track tracker {position: node}    while True:        for i in range(len(lists)):            if lists[i]:                tracker[i] = lists[i]        if not tracker:            break        target_linked_list = min(tracker.items(), key=lambda x: x.val)  # find min val of the nodes and return the position        head.next = lists[target_linked_list]  # link the result list to this node        lists[target_linked_list] = lists[target_linked_list].next  # move the node's list pointer        head = head.next        tracker = {} # reset tracker for next iteration of tracking                return begin.next`

What is the time complexity for this approach?

If we have `k` linked lists and the list has total of `N` nodes, we compare k nodes each round and we have N rounds. That gives us O(kN) for time complexity.

`def mergeKLists(lists):    head = ListNode()    begin = head    tracker = {}    while True:  # O(N) for total of N nodes in a list        for i in range(len(lists)):  # O(k) lists            if lists[i]:                tracker[i] = lists[i]        if not tracker:            break        target_linked_list = min(tracker.items(), key=lambda x: x.val)          head.next = lists[target_linked_list]          lists[target_linked_list] = lists[target_linked_list].next          head = head.next        tracker = {}                return begin.next`

Of course, another way is to simply do the merge-2-lists k-1 times and we will get the same result.

`def mergeTwoLists(list1, list2):    head = ListNode()    begin = head    while list1 and list2:        if list1.val <= list2.val:            head.next = list1            list1 = list1.next        else:            head.next = list2            list2 = list2.next        head = head.next    if list1:        head.next = list1    if list2:        head.next = list2    return begin.nextdef mergeKLists(lists):    if len(lists) == 0:        return None    if len(lists) == 1:        return lists    next_list = lists    for i in range(1, len(lists)):        curList = self.mergeTwoLists(next_list, lists[i])    return next_list`

We’ll see that this approach doesn’t give us a better time complexity. From the Merge 2 Lists problem, we know that the algorithm takes O(N) where N is the number of total nodes. We are going through the process k-1 times, so we will again get O(kN) for time complexity.

Can we do better on top of this?

We are comparing values, and these values are pointing the the corresponding nodes. This puts these nodes in a rank, or priority, when it comes to which to pick next. A priority queue may fit this job.

A priority queue, … each element additionally has a “priority” associated with it … element with high priority is served before an element with low priority … two elements have the same priority, they are served according to the order in which they were enqueued.

So node.val = priority in our case. We can use a heap to implement this queue, as described here.

Intuitively, our priority is the value of each node since that’s what we are comparing, so `(priority, node)` should be added, right? But what happens if the values are the same? The comparison will move to the next level: comparing two nodes. Unfortunately, this is where things break. Simply adding `(priority, node)` will break as the comparison cannot be done on the same priority (as we cannot compare two nodes directly).

Therefore, we need to add an additional value for ordering. Since we don’t really care what order nodes with same priority goes in, we can simply pick the first-one-in, namely a smaller index in the list. Our tuple now becomes `(priority, idx, node)`. `idx` is the index of the linked list in the input list.

The rest is easy, as the heap will automatically sort our nodes based on given node.val and index, we can safely just pick them up one by one.

`import heapqdef mergeKLists(lists):    head = ListNode()    begin = head    h = []        # push all linked lists into heap, order is n.val -> index    for i, n in enumerate(lists):        if n:            heapq.heappush(h, (n.val, i, n))        # keep taking value out and link to the merged linked list    while h:        target = heapq.heappop(h)	node = target	head.next = node	head = head.next                # if the next node exists, add back to the heap with increased index. "i" at this point is len(lists) - 1        if node.next:  	    i+=1	    heapq.heappush(h, (node.next.val, i, node.next))    return begin.next`

How much does this improve? We still have total of N nodes, but now the comparison takes O(log k) where k is the number of linked lists. The reason for it being `k` and not `N` is because we always only have `k` elements in the heap if not fewer. We pop and node, check if there’s the next node, and insert it back if the next node is present, so always `k` elements.

`import heapqdef mergeKLists(lists):    head = ListNode()    begin = head    h = []        # O(k)    for i, n in enumerate(lists):        if n:            heapq.heappush(h, (n.val, i, n))        # total of N nodes, so O(N)    while h:        # O(log k)        target = heapq.heappop(h)	node = target	head.next = node	head = head.next                # if the next node exists, add back to the heap with increased index. "i" at this point is len(lists) - 1        if node.next:  	    i+=1	    # O(log k)            heapq.heappush(h, (node.next.val, i, node.next))    return begin.next`

Technically we have O(N * log2k) + O(k). We can deduce it to O(N log k).

Now, in our k-1 merge-2-lists approach, we constantly revisit the same linked list over and over, which is a lot of time wasted. What if we can reduce the time wasted checking the same list?

Instead of merging `list1 & list2` -> `(list1+list2) & list3` -> `(list1+list2+list3) & list4` …, what if we do `list1 & list3` , `list2 & list4` , `list5 & list7` , `list6 and list8` and so on, then do the next iteration with `(list1+list3) & (list2 + list4)` , `(list5 + list7) & (list6 + list8)` and so on, and eventually come up with one single list? That way we don’t keep visiting `list1` again and again.

We then will create a divide and conquer algorithm to implement this.

We will still use our merge-2-list mechanism, but we can change how we pass in the lists.

`def mergeTwoLists(list1, list2):    # the same O(N) algorithmdef mergeKLists(lists):    k = len(lists)    if k == 0:        return    step = 1    while step < k:        # we are not changing the size of the lists but each linked list itself, so we do every-other, then every-fourth, then every-eighth, and so on        for i in range(0, k - step, step * 2):            # merge the 2 lists            lists[i] = self.mergeTwoLists(lists[i], lists[i+step])        step += step    # at the end, only the first list matter as it has all nodes linked/lists merged    return lists`

With this improvement, what time complexity are we looking at?

Since we are dealing with two lists at a time, we dissect the problem into, say, 8 -> 4 -> 2 -> 1, that’s logarithm, and our size would be k once again, and each iteration is a merge-2-list which is O(N), so we are looking at O(N * log k), much the same as our priority queue solution.

`def mergeTwoLists(list1, list2):    # the same O(N) algorithmdef mergeKLists(lists):    k = len(lists)    if k == 0:        return    step = 1    while step < k:        # O(logk)        for i in range(0, k - step, step * 2):            # O(N)            lists[i] = self.mergeTwoLists(lists[i], lists[i+step])        step += step    return lists`

If I’m an interviewer who asks this question, I’d start with the merge-2-list problem and see if the candidates can build from that solution to the merge-k-lists solution. You are more than welcome to use either approach (priority or divide & conquer) but I’ll give extra point if you can build from the previous work to this solution. Still, every person is different, so use the one you’re familiar with.

That’s it! Another problem down. 