How does the min/max function on a nested list work?
Lets say, there is a nested list, like:
my_list = [[1, 2, 21], [1, 3], [1, 2]]
When the function min() is called on this:
The output received is
Why and How does it work? What are some use cases of it?
How are lists and other sequences compared in Python?
Lists (and other sequences) in Python are compared lexicographically and not based on any other parameter.
Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted.
What is lexicographic sorting?
From the Wikipedia page on lexicographic sorting
lexicographic or lexicographical order (also known as lexical order, dictionary order, alphabetical order or lexicographic(al) product) is a generalization of the way the alphabetical order of words is based on the alphabetical order of their component letters.
The min function returns the smallest value in the iterable. So the lexicographic value of [1,2] is the least in that list. You can check by using [1,2,21]
>>> my_list=[[1,2,21],[1,3],[1,2]] >>> min(my_list) [1, 2]
What is happening in this case of min?
Going element wise on my_list, firstly [1,2,21] and [1,3]. Now from the docs
If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively.
Thus the value of [1,1,21] is less than [1,3], because the second element of [1,3], which is, 3 is lexicographically higher than the value of the second element of [1,1,21], which is, 1.
Now comparing [1,2] and [1,2,21], and adding another reference from the docs
If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one.
[1,2] is an initial sub-sequence of [1,2,21]. Therefore the value of [1,2] on the whole is smaller than that of [1,2,21]. Hence [1,2] is returned as the output.
This can be validated by using the sorted function
>>> sorted(my_list) [[1, 2], [1, 2, 21], [1, 3]]
What if the list has multiple minimum elements?
If the list contains duplicate min elements the first is returned
>>> my_list=[[1,2],[1,2]] >>> min(my_list) [1, 2]
This can be confirmed using the id function call
>>> my_list=[[1,2],[1,2]] >>> [id(i) for i in my_list] [140297364849368, 140297364850160] >>> id(min(my_list)) 140297364849368
What do I need to do to prevent lexicographic comparison in min?
If the required comparison is not lexicographic then the key argument can be used (as mentioned by Padraic)
The min function has an additional optional argument called key. The key argument takes a function.
The optional key argument specifies a one-argument ordering function like that used for list.sort(). The key argument, if supplied, must be in keyword form (for example, min(a,b,c,key=func)).
For example, if we need the smallest element by length, we need to use the len function.
>>> my_list=[[1,2,21],[1,3],[1,2]] >>> min(my_list,key=len) # Notice the key argument [1, 3]
As we can see the first shortest element is returned here.
What if the list is heterogeneous?
If the list is heterogeneous type names are considered for ordering, check Comparisions,
Objects of different types except numbers are ordered by their type names
Hence if you put an int and a list there you will get the integer value as the smallest as i is of lower value than l. Similarly '1' would be of higher value than both of this.
>>> my_list=[[1,1,21],1,'1'] >>> min(my_list) 1
Python3 and onwards
However this confusing technique was removed in Python3. It now raises a TypeError. Read What's new in Python 3.0
The ordering comparison operators (<, <=, >=, >) raise a TypeError exception when the operands don’t have a meaningful natural ordering. Thus, expressions like 1 < '', 0 > None or len <= len are no longer valid, and e.g. None < None raises TypeError instead of returning False. A corollary is that sorting a heterogeneous list no longer makes sense – all the elements must be comparable to each other.
>>> my_list=[[1,1,21],1,'1'] >>> min(my_list) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unorderable types: int() < list()
But it works for Comparable types, For example
>>> my_list=[1,2.0] >>> min(my_list) 1
Here we can see that the list contains float values and int values. But as float and int are comparable types, min function works in this case.
One simple use case for lexicographical sorting is with making a sortable namedtuple class.
from collections import namedtuple Time = namedtuple('Time', ['hours', 'minutes', 'seconds']) t1 = Time(hours=8, minutes=15, seconds=30) t2 = Time(hours=8, minutes=15, seconds=0) t3 = Time(hours=8, minutes=30, seconds=30) t4 = Time(hours=7, minutes=15, seconds=30) assert min(t1, t2, t3, t4) == t4 assert max(t1, t2, t3, t4) == t3
Two lists are compared element wise
Even if sizes of two lists are different the two lists are compared element wise starting the comparison from the first element.
Now suppose that every element of a list has been checked and they are the same and there is no next element in the shorter list. Then the shorter list is declared to be smaller than the longer one.
>>> [1,2]<[1,3] True >>> [1,2]<[1,2,21] True >>> [1,3]<[1,2,21] False >>>[1,2,22]<[1,2,21] False >>><[1,2,21] True >>>
it compares the lists elementwise:
>>> [1,2]<[1,3] True >>> [1,2]<[1,2,21] True >>>