Thursday, 11 February 2016

python - How to make a flat list out of list of lists?



I wonder whether there is a shortcut to make a simple list out of list of lists in Python.



I can do that in a for loop, but maybe there is some cool "one-liner"? I tried it with reduce(), but I get an error.



Code




l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
reduce(lambda x, y: x.extend(y), l)


Error message



Traceback (most recent call last):
File "", line 1, in
File "", line 1, in
AttributeError: 'NoneType' object has no attribute 'extend'


Answer



Given a list of lists l,



flat_list = [item for sublist in l for item in sublist]



which means:



flat_list = []
for sublist in l:

for item in sublist:
flat_list.append(item)


is faster than the shortcuts posted so far. (l is the list to flatten.)



Here is the corresponding function:



flatten = lambda l: [item for sublist in l for item in sublist]



As evidence, you can use the timeit module in the standard library:



$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop



Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists -- as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.



The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.


No comments:

Post a Comment

c++ - Does curly brackets matter for empty constructor?

Those brackets declare an empty, inline constructor. In that case, with them, the constructor does exist, it merely does nothing more than t...