Ikke's blog

'cause this is what I do

Re: Python recursion performance test

(This is a reply on a post by Ahmed Soliman on recursion performance in (C)Python, and CPython function call overhead in general. I started to write this as a comment on his post, but it turned out much longer, so sending it over here in the end.)

Hey,

As discussed before, this is not a fair comparison, since the non-recursive version is much ‘smarter’ than the recursive one: it calculates values and will never recalculates them, whilst the recursive version calculates everything over and over again.

Adding some simple memoization helps a lot. First, my testing code:

Here are the benchmarks on my MacBook Pro Intel Core2Duo 2.33GHz with 3GB RAM (running quite a lot of applications). Do note the ‘dumb’ version calculates fib(35), whilst the slightly optimized versions, which still use recursion but much less recursive calls (as they should) or your second version calculate fib(150).

Using MacOS X 10.5.6 stock CPython 2.5.1:

MacBook:Projects nicolas $ python -V
Python 2.5.1

MacBook:Projects nicolas $ python fib.py 35 150
fib(35) = 9227465
Calculation took 12.8542108536 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.00020694732666 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.00141310691833 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.000310182571411 seconds

Overall it looks like fib2 and memoize_constant_list perform fairly similar, I guess function call overhead and list.append have a similar influence on performance in this case.

Using Jython 2.5.0 from the binary distribution on the Java HotSpot 64bit Server VM as shipped for OS X 10.5.6:

MacBook:Projects nicolas $ ./Jython/jython2.5.0/jython -V 
Jython 2.5.0

MacBook:Projects nicolas $ ./Jython/jython2.5.0/jython fib.py 35 150
fib(35) = 9227465
Calculation took 12.5539999008 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.0519998073578 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.00399994850159 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.00300002098083 seconds

The ‘dumb’ fib implementation performs similar in both CPython and Jython. Jython performs significantly less good on the other implementations though, but maybe todays news could help here, not sure how much locking on dict and list access Jython introduces.

Finally, using Unladen Swallow 2009Q2, self-compiled from SVN on the same system, using standard settings:

MacBook:Projects nicolas $ ./unladen-swallow/unladen-2009Q2-inst/bin/python -V
Python 2.6.1

MacBook:Projects nicolas $ ./unladen-swallow/unladen-2009Q2-inst/bin/python fib.py 35 150
fib(35) = 9227465
Calculation took 12.2675719261 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.000118970870972 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.000972986221313 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.00036096572876 seconds

which is similar to, slighly better or slightly worse than the CPython run, and when enforcing JIT (which introduces a significant startup time, which is not measured here):

MacBook:Projects nicolas $ ./unladen-swallow/unladen-2009Q2-inst/bin/python -j always fib.py 35 150
fib(35) = 9227465
Calculation took 14.6129109859 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.0432291030884 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.0363459587097 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.0335609912872 seconds

which, to my surprise, performs pretty worse than the default settings.

Overall: your first implementation performs tons and tons of function calls, whilst the second one, which resembles memoize_list_fib in my code (which is recursive), performs significantly less function calls and in the end memoize_list_fib performs almost as good as your second version (it performs +- the same number of function calls as the number of times you’re going through your loop).

So whilst I do agree function calls in Python are reasonably slow compared to plain C function calls (which is just a jmp, no frame handling etc. etc. required), your comparison between your recursive and non-recursive implementation is completely unfair, and even if calculating fib(35) takes several seconds, consider you’re doing a pretty impressive 29860703 function calls to perform the calculation.

Time to get some sleep.

Posted in Development, Technology.

Tagged with Development, python.

By Nicolas – July 16, 2009

6 Responses

Stay in touch with the conversation, subscribe to the RSS feed for comments on this post.

James Henstridge says

In your fib2() function, you construct a list of all the Fibonacci numbers up to the point in the sequence you’re asked for. This means that it’s memory requirements will be proportional to the number you ask for: the same problem with naive recursive solutions.

But each iteration of the loop only requires access to the previous two items in the sequence, so you could get away with only storing those two items. Something like this should be sufficient:

def fib3(n):
if n == 0 or n == 1: return n
x, y = 0, 1
for i in xrange(n-1):
x, y = y, x + y
return y

It’d be interesting to see how Unladen Swallow fairs with this version: as well as getting rid of the list reallocation, it also moves the logic to be in terms of local variables which should be easier to optimise.

July 16, 2009, 10:02 Reply
Mohamed Fawzy says

it is an intelligent solution James,But am i supposed not to use recursion at all in Python,i wish there is a plan for optimizing the python interpreter functions call.
I think programmers should be able to choose their problem solving technique,not to be compelled to avoid using someone i mean recursion here.

July 16, 2009, 19:36 Reply
James Henstridge says

Mohamed: I do use recursion in Python (e.g. for processing tree data structures) — I just don’t use it in cases where I am going to hit the recursion limit. Python has other tools for those sort of situations (ones that leave you with a readable stack trace when things go wrong too).

Furthermore, the naive recursive fib() implementation given here can’t take advantage of tail call optimisation anyway, due to it not simply returning the result of another function call.

To take advantage of tail call optimisation, you’d need an implementation like:

def fib_helper(x, y, n):
if n == 0: return x
return fib_helper(y, x + y, n – 1)
def fib4(n):
return fib_helper(0, 1, n)

While I agree that my fib3() implementation is slightly more difficult to read than the original naive fib() implementation, it isn’t clear to me that fib4() is better. If I was using a different language, things might tip in the other direction, but we are talking about Python here.

July 17, 2009, 04:28 Reply
Mohammed Gamal says

Great reply! However, I think that even with memoization you’re very likely to hit recursion limits in python, since you still call the function recursively, and in my experience python doesn’t really hold a huge space for the call stack. Try for example running fib2(1000) from first run, where you’ll hit the limit, now try fib2(900) then fib2(1000), you’ll immediately get the result since you only did 100 recursive calls this time.
Bottom line, you may use recursion in python, but you gotta bit a little bit more careful when designing your algorithms than other compiled programming languages

August 1, 2009, 00:49 Reply
Nicolas says

More discussion going on at http://geekfromcairo.wordpress.com/2009/08/04/to-understand-recursion-you-need-to-understand-recursion-take-2/#comment-39 and before…

August 4, 2009, 19:20 Reply

Continuing the Discussion

To Understand Recursion You Need To Understand Recursion… « Another Geek from Cairo linked to this post on August 1, 2009
[...] rightly concluded that recursion in Python is slow. (An interesting counter-gument can be found here as [...]

« First Clojure experiments Microsoft to release Linux HyperV drivers as GPLv2 »

@donsbot Any reason Data.ByteString.zipWith' isn't exported/public API? 12:30:27 AM December 17, 2011 from web in reply to donsbot Reply Retweet Favorite
RT @raichoo: "#Ocaml has an OOP extension… that nobody uses […] not even its inventor" -Yaron Minsky, Janestreet. 09:28:04 PM December 16, 2011 from web Reply Retweet Favorite
Whenever you launch a tech startup, don't look for office space: head to Starbucks. Coffee, power & free wifi, all you need is a laptop! 06:30:52 PM December 16, 2011 from web Reply Retweet Favorite
@viktorklang Bit me a couple of time in the Haskell bindings as well. Simple approach to make a library "threadsafe"? 03:19:38 PM December 16, 2011 from web in reply to viktorklang Reply Retweet Favorite
W00t ^_^ RT @johtib Screenshot teaser for something I've been working on lately: http://t.co/OyGog474 01:19:31 PM December 16, 2011 from web Reply Retweet Favorite

@eikke

Proudly powered by WordPress and Carrington.

Re: Python recursion performance test

6 Responses

Continuing the Discussion

About Ikke's blog

Meta

Friends

Me

Planets

License

Me @ Twitter

Re: Python recursion performance test

6 Responses

Continuing the Discussion

Subscribe

About Ikke's blog

Meta

Friends

Me

Planets

License

Me @ Twitter

Tags