How to sort complex objects with custom criteria in Python

Objects with several member variables can be sorted in more than one way: by overloading the “<” operator, by passing a function to the key parameter, or by using a comparator function.

In most cases sorting is a simple task, but sometimes the data is a bit more complicated, and Python provides elegant ways to handle those scenarios. This article walks through one such scenario.

class Laptop:
    def __init__(self, cpu, ram, ssd) -> None:
        self.cpu = cpu
        self.ram = ram
        self.ssd = ssd

A = Laptop("Ryzen 7", 8,  256)
B = Laptop("Ryzen 5", 8,  512)
C = Laptop("Ryzen 7", 16, 128)
D = Laptop("Ryzen 5", 16, 128)

arr = [A,B,C,D]

The same laptops can also be represented as plain lists,

A = [ "Ryzen 7", 8, 256 ]
B = [ "Ryzen 5", 8, 512 ]
C = [ "Ryzen 7", 16, 128 ]
D = [ "Ryzen 5", 16, 128 ]

arr = [A,B,C,D]

Let’s say our priority is in this order, cpu > ram > ssd

# As class objects
arr.sort(key=lambda x:(x.cpu,x.ram, x.ssd), reverse=True)

# As list items
arr.sort(key=lambda x:(x[0], x[1], x[2]), reverse=True)

The result is,

Ryzen 7, 16, 128
Ryzen 7, 8, 256
Ryzen 5, 16, 128
Ryzen 5, 8, 512
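
The same tuple key can also be built without a lambda by using operator.attrgetter from the standard library. This is only a small sketch of an alternative spelling; the rows variable is introduced here just to make the result easy to inspect:

```python
from operator import attrgetter

class Laptop:
    def __init__(self, cpu, ram, ssd) -> None:
        self.cpu = cpu
        self.ram = ram
        self.ssd = ssd

arr = [
    Laptop("Ryzen 7", 8, 256),
    Laptop("Ryzen 5", 8, 512),
    Laptop("Ryzen 7", 16, 128),
    Laptop("Ryzen 5", 16, 128),
]

# With several names, attrgetter returns a callable that produces
# the tuple (x.cpu, x.ram, x.ssd), just like the lambda above
arr.sort(key=attrgetter("cpu", "ram", "ssd"), reverse=True)

# Illustrative helper: collect the sorted attributes for inspection
rows = [(x.cpu, x.ram, x.ssd) for x in arr]
for row in rows:
    print(*row, sep=", ")
```

Whether this reads better than the lambda is a matter of taste; the sort order is the same.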

For simpler cases, when you have only one criterion, you don’t need a tuple,

arr.sort(key=lambda x:x.cpu, reverse=True)

Let’s make the scenario more complex by introducing Intel,

E = Laptop("Intel i7", 16, 512)

arr = [A,B,C,D,E]

If we don’t change anything, the result will be,

Ryzen 7, 16, 128
Ryzen 7, 8, 256
Ryzen 5, 16, 128
Ryzen 5, 8, 512
Intel i7, 16, 512

  • This may not be what we want. The earlier results only looked right because string comparison with the “<” operator happens to put “Ryzen 7” above “Ryzen 5”. The same character-by-character comparison puts “Intel i7” below both.

  • For the sake of demonstration, let’s define the precedence this way, Ryzen 7 > Intel i7 > Ryzen 5
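
The string comparison described in the first point can be checked directly. A small sketch using only the CPU names from this article (the cpus and ordered names are illustrative):

```python
cpus = ["Ryzen 7", "Ryzen 5", "Intel i7"]

# Strings are compared character by character, so "I" sorts before "R"
# and "5" sorts before "7"; the result has nothing to do with CPU quality
print("Intel i7" < "Ryzen 5" < "Ryzen 7")

# Descending order therefore puts the Intel name last
ordered = sorted(cpus, reverse=True)
print(ordered)
```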

Here is one way to achieve this result using operator overloading,

class Laptop:
    def __init__(self, cpu, ram, ssd) -> None:
        self.cpu = cpu
        self.ram = ram
        self.ssd = ssd

    def __lt__(a, b):
        brand_a, model_a = a.cpu.split(" ")
        brand_b, model_b = b.cpu.split(" ")

        if brand_a == brand_b:
            if model_a != model_b:
                return model_a < model_b
        else:
            if brand_a == "Intel":
                return b.cpu == "Ryzen 7"
            elif brand_b == "Intel":
                return a.cpu == "Ryzen 5"

        if a.ram != b.ram:
            return a.ram < b.ram

        return a.ssd < b.ssd

In this function, returning True means a is smaller than b, and returning False means it is not.

When the “<” operator is defined, you can sort by simply calling

arr.sort(reverse=True)
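
To check the overloaded operator end to end, here is a sketch that repeats the class from above so it runs on its own. The rows and lowest names are added only for inspection, and min() is shown as one more built-in that relies on “<”:

```python
class Laptop:
    def __init__(self, cpu, ram, ssd) -> None:
        self.cpu = cpu
        self.ram = ram
        self.ssd = ssd

    def __lt__(a, b):
        brand_a, model_a = a.cpu.split(" ")
        brand_b, model_b = b.cpu.split(" ")

        if brand_a == brand_b:
            # Same brand: compare the model part ("5" < "7")
            if model_a != model_b:
                return model_a < model_b
        else:
            # Different brands: Ryzen 5 < Intel i7 < Ryzen 7
            if brand_a == "Intel":
                return b.cpu == "Ryzen 7"
            elif brand_b == "Intel":
                return a.cpu == "Ryzen 5"

        # Same cpu: fall back to ram, then ssd
        if a.ram != b.ram:
            return a.ram < b.ram
        return a.ssd < b.ssd

arr = [
    Laptop("Ryzen 7", 8, 256),
    Laptop("Ryzen 5", 8, 512),
    Laptop("Ryzen 7", 16, 128),
    Laptop("Ryzen 5", 16, 128),
    Laptop("Intel i7", 16, 512),
]

arr.sort(reverse=True)
rows = [(x.cpu, x.ram, x.ssd) for x in arr]

# min() also compares with "<", so it uses the same ordering
smallest = min(arr)
lowest = (smallest.cpu, smallest.ram, smallest.ssd)
```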

The third option is a comparator function, wrapped with cmp_to_key so that it can be passed as the key,

from functools import cmp_to_key

def comparator(a, b):
    return a.ram - b.ram

arr.sort(key=cmp_to_key(comparator), reverse=True)

You can write logic of similar complexity with a comparator function.

With this method, returning -1 or any other negative number means a is smaller, and returning a positive number means b is smaller. If 0 is returned, they have the same priority and their relative order is kept.

  • The benefit of a comparator function is that you don’t have to overload the class operators, and you can keep several comparators around for different use cases.
  • If you have already overloaded the “<” operator but a scenario arises where the criteria are a bit different, a comparator function may be what you need.
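
As a sketch of the first point, two comparators can be applied to the same list without touching the class. The function names by_ram and by_ssd_then_ram are made up for this example:

```python
from functools import cmp_to_key

class Laptop:
    def __init__(self, cpu, ram, ssd) -> None:
        self.cpu = cpu
        self.ram = ram
        self.ssd = ssd

def by_ram(a, b):
    # Negative: a is smaller, positive: b is smaller, 0: same priority
    return a.ram - b.ram

def by_ssd_then_ram(a, b):
    # Compare ssd first and use ram only to break ties
    if a.ssd != b.ssd:
        return a.ssd - b.ssd
    return a.ram - b.ram

arr = [
    Laptop("Ryzen 7", 8, 256),
    Laptop("Ryzen 5", 8, 512),
    Laptop("Ryzen 7", 16, 128),
    Laptop("Ryzen 5", 16, 128),
    Laptop("Intel i7", 16, 512),
]

# sorted() returns a new list, so each ordering is independent of the other
ram_first = sorted(arr, key=cmp_to_key(by_ram), reverse=True)
ssd_first = sorted(arr, key=cmp_to_key(by_ssd_then_ram), reverse=True)

print([x.ram for x in ram_first])
print([(x.ssd, x.ram) for x in ssd_first])
```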

The cpu handling can be simplified by using a dictionary to define the priority of the cpu models,

mp = {
    "Ryzen 5"  : 0,
    "Intel i7" : 1,
    "Ryzen 7"  : 2
}

Then the logic becomes simpler,

def __lt__(a, b):
    if a.cpu != b.cpu:
        return mp[a.cpu] < mp[b.cpu]

    if a.ram != b.ram:
        return a.ram < b.ram

    return a.ssd < b.ssd

To turn this into a comparator function, replace each less than ("<") operator with a minus ("-") operator, so that the function returns a difference instead of True or False
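
A minimal sketch of that substitution, assuming the same Laptop class and mp dictionary (both repeated here so the snippet runs on its own; rows is only there for inspection):

```python
from functools import cmp_to_key

class Laptop:
    def __init__(self, cpu, ram, ssd) -> None:
        self.cpu = cpu
        self.ram = ram
        self.ssd = ssd

mp = {
    "Ryzen 5"  : 0,
    "Intel i7" : 1,
    "Ryzen 7"  : 2
}

def comparator(a, b):
    # Each "<" from __lt__ becomes "-": the sign of the result gives the order
    if a.cpu != b.cpu:
        return mp[a.cpu] - mp[b.cpu]

    if a.ram != b.ram:
        return a.ram - b.ram

    return a.ssd - b.ssd

arr = [
    Laptop("Ryzen 7", 8, 256),
    Laptop("Ryzen 5", 8, 512),
    Laptop("Ryzen 7", 16, 128),
    Laptop("Ryzen 5", 16, 128),
    Laptop("Intel i7", 16, 512),
]

arr.sort(key=cmp_to_key(comparator), reverse=True)
rows = [(x.cpu, x.ram, x.ssd) for x in arr]
```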

You may have already guessed that you can make it even simpler without using operator overloading

arr.sort(key=lambda x:(mp[x.cpu], x.ram, x.ssd),reverse=True)

This will be the result,

Ryzen 7, 16, 128
Ryzen 7, 8, 256
Intel i7, 16, 512
Ryzen 5, 16, 128
Ryzen 5, 8, 512

All three ways have their place, but if you can do the job with a lambda in the key parameter, then you should stick to that.