Skip to content


Python ‘all’ odity

[update] Question solved, see bottom of post.

Since Python 2.5 the language got a new built-in method ‘all’ (and it’s nephew ‘any’). I wanted to play around with this a little, combined with generators, so I created a little testcase to test performance.

Here’s the test-case: take a list L of X random numbers in a given range [A, B], and check whether

  • all elements in L are >= A
  • all elements in L are >= (A + Z) where Z is a number in [0, (B - A)]

The first test should always result True, the second test could result to False.

Here’s the output of a test-run:

In [1]: import random, sys

In [2]: a = [random.randint(100, sys.maxint) for i in xrange(2000000)]

In [3]: len(a)
Out[3]: 2000000

In [4]: #Check whether all elements are >= 100 

In [5]: %timeit all(i >= 100 for i in a)
10 loops, best of 3: 515 ms per loop

In [6]: %timeit any(i < 100 for i in a)
10 loops, best of 3: 454 ms per loop

In [7]: def f(l):
   ...:     for i in l:
   ...:         if i < 100:
   ...:             return False
   ...:     return True
   ...: 

In [8]: %timeit f(a)
10 loops, best of 3: 292 ms per loop

In [9]: #Same thing for 100000, since now the list shouldn't be completely iterated

In [10]: %timeit all(i >= 100000 for i in a)
100 loops, best of 3: 4.73 ms per loop

In [11]: %timeit any(i < 100000 for i in a)
100 loops, best of 3: 4.29 ms per loop

In [12]: def g(l):
   ....:     for i in l:
   ....:         if i < 100000:
   ....:             return False
   ....:     return True
   ....: 

In [13]: %timeit g(a)
100 loops, best of 3: 2.82 ms per loop

In [14]: #For reference

In [15]: %timeit False in (i >= 100 for i in a)
10 loops, best of 3: 531 ms per loop

In [16]: %timeit False in (i >= 100000 for i in a)
100 loops, best of 3: 5.03 ms per loop

It’s as if ‘all’, ‘any’ or ‘in’ don’t break/return when a first occurence of False (or True, obviously) is found. Is this the desired behaviour, and if it is, why? The calculation time difference between using all/any/in or a custom-made function (which is, unlike all etc, not written in C) which breaks whenever it can, is pretty astonishing.

[update] Question solved. It’s pretty normal the function-based approach performs better, since it combines what ‘all’ and the generator provided to ‘all’ do, taking away the generator function-call overhead. Damn :-)

Posted in Development.

Tagged with , .


3 Responses

Stay in touch with the conversation, subscribe to the RSS feed for comments on this post.

  1. Manlome says

    Hmm, hier snap ik niets van, maar bedankt voor je berichtje op mn blog en succes met de cello! (Staat ook nog op mijn lijstje)

  2. Nicke says

    Ik kan goed programmeren, maar toch niet in deze taal. I like :D

    Bedankt voor je berichtje

  3. jwm says

    Today I was suspecting the same thing, that all() did not bail early.
    Here is some code that demonstrates that it does bail on the first False


    x = []
    def testnum5(n):
    x.append(1)
    return n==5

    print all( (testnum5(i) for i in [5,5,5,1,5,5]) )
    print 'testnum5() called:',sum(x)
    # should be 4 if all() bails early, which it is



Some HTML is OK

or, reply to this post via trackback.