Ikke's blog » python (http://eikke.com)
'cause this is what I do
Sun, 13 Feb 2011 14:58:55 +0000

Book review: Python Testing – Beginner’s Guide
Nicolas, Sun, 14 Mar 2010 20:49:54 +0000
http://eikke.com/book-review-python-testing-beginners-guide-2/

As mentioned before, some days ago I received a copy of a recent book from Packt Publishing titled “Python Testing – Beginner’s Guide” by Daniel Arbuckle. I read the whole book (it’s not huge, around 220 pages), and wrote a review, as requested by Packt.

The book targets people who know Python (it doesn’t contain a language introduction chapter or something alike, which would be rather pointless anyway), and want to start testing the code they write. Even though the author starts by explaining basic tools like doctests and the unittest framework contained in the Python standard library, it could be a useful read even if you used these tools before, e.g. when the Mock library is explained, or in the chapter on web application testing using Twill.

The text is easy to read, and contains hands-on code examples and explanations as well as tasks for the reader and quiz questions. I did not audit all the code for correctness, nor all the quizzes, although in my opinion some more time should have been invested here before the book was published: some code samples contain errors, even invalid syntax (p45: “self.integrated_error +q= err * delta”), which is not what I expect in a book about code testing. The quizzes could’ve used some more care as well, e.g. on p94 one can read:

What is the unittest equivalent of this doctest?

>>> try:
...     int('123')
... except ValueError:
...     pass
... else:
...     print 'Expected exception was not raised'

I was puzzled by this, since as far as I could remember, int('123') works just fine, and I didn’t have a computer at hand to check. Checked now, and it works as I expected, so maybe I’m missing something here? The solution found in the back of the book is a literal unittest-port of the above doctest, and should fail, if I’m not mistaken:

>>> def test_exceptions(TestCase):
...     def test_ValueError(self):
...         self.assertRaises(ValueError, int, '123')

This example also shows one more negative point of the book, IMHO: the code samples don’t follow PEP-8 (or similar) capitalization, which makes code rather hard to read sometimes.
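A quick way to check the point about int('123') is a small unittest sketch like the one below. The class, method and helper names are made up for illustration and are not taken from the book; the last test shows that asserting a ValueError for '123', as the book's answer does, would itself fail.

```python
import unittest


class IntConversionTest(unittest.TestCase):
    def test_valid_literal_converts(self):
        # int('123') simply returns 123, no exception involved
        self.assertEqual(int('123'), 123)

    def test_invalid_literal_raises(self):
        # A string that is not a number does raise ValueError
        self.assertRaises(ValueError, int, 'abc')

    def test_book_answer_would_fail(self):
        # assertRaises fails (raises AssertionError) when the callable
        # does not raise the expected exception
        with self.assertRaises(AssertionError):
            self.assertRaises(ValueError, int, '123')


def run_checks():
    """Run the sketch above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(IntConversionTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling run_checks() runs the three tests without going through a command-line runner.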

The solutions for the last quiz questions are missing as well, and as it happens those were the ones I did want to read.

Don’t be mistaken though: these issues don’t reduce the overall value of the book, it’s certainly worth your time, as long as you keep in mind not to be too confused by the mistakes as shown above.

Topic overview

The book starts with a short overview of types of testing, including unit, integration and system testing, and why testing is worth the effort. This is a very short overview of 3 pages.

Starting from chapter 2, the doctest system is introduced. I think it’s an interesting approach to start with doctest instead of using unittest, which is modeled after the more ‘standard’ xUnit packages. Doctests are useful during specification writing as well, which is in most projects the first stage, before any unittestable code is written. The chapter also gives an overview of the doctest directives, which was useful to read.
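As an illustration of what a doctest with a directive looks like, here is a minimal sketch. The function squares and the helper run_doctests are made-up names, not examples from the book; the ELLIPSIS directive lets "..." in the expected output stand for any text.

```python
import doctest


def squares(n):
    """Return the squares of 0 .. n-1.

    >>> squares(4)
    [0, 1, 4, 9]
    >>> squares(100)  # doctest: +ELLIPSIS
    [0, 1, 4, ..., 9801]
    """
    return [i * i for i in range(n)]


def run_doctests():
    """Run the examples in squares' docstring; returns (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(squares.__doc__, {'squares': squares},
                              'squares', None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)
```

The same examples would work unchanged in a stand-alone text file, which is the style the book starts with.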

Chapter 3 gives an example of the development of a small project and all stages involved, including how doctests fit into every stage.

Maybe a sample of Sphinx and its doctest integration would have been a nice addition to one of the previous chapters, since the book introduced doctest as part of stand-alone text files, not as part of code docstrings (although it does talk about those as well). When writing documentation in plain text files, Sphinx is certainly the way to go, and its doctest plugin is a useful extra.

Starting in chapter 4, the Python ‘mocking’ library is introduced. The chapter itself is a rather good introduction to mock-based testing, but I don’t think mocks should be used in doctests, which should be rather small, examplish snippets. Mock definitions don’t belong there, IMO. This chapter also shows some lack of pre-publishing review in a copy-paste error: the block explaining how to install mocker on page 62 tells the reader that from now on Nose is ready to be used.

Chapter 5, which you can read here, introduces the unittest framework, its assertion methods, fixtures and mocking integration.

In chapter 6 ‘nose‘ is introduced, a tool to find and run tests in a project. I use nose myself in almost every project, and it’s certainly a good choice. The chapter gives a pretty good overview of the useful features nose provides. It does contain a strange example of module-level setup and teardown methods, whilst IMHO subclassing TestCase would be more suited (and more portable).
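For comparison, a fixture kept inside a TestCase subclass might look like the sketch below, which both nose and the plain unittest runner can pick up. The class name and the temporary-directory fixture are made up for illustration; this is not the book's example.

```python
import os
import shutil
import tempfile
import unittest


class TempDirTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method: create a scratch directory
        self.workdir = tempfile.mkdtemp()

    def tearDown(self):
        # Runs after every test method, even when the test failed
        shutil.rmtree(self.workdir)

    def test_can_write_and_read_file(self):
        path = os.path.join(self.workdir, 'data.txt')
        with open(path, 'w') as handle:
            handle.write('abc')
        with open(path) as handle:
            self.assertEqual(handle.read(), 'abc')


def run_fixture_tests():
    """Run TempDirTest and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TempDirTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```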

Chapter 7 implements a complete project from specification to implementation and maintenance. Useful to read, but I think the chapter contains too much code, and it’s repeated too often.

Chapter 8 introduces web application testing using Twill, which I never used before (nor did I ever test a web application before). Useful to read, but Twill might be a strange choice, since there have been no releases since the end of 2007… Selenium might have been a better choice?

A large part of the chapter is dedicated to listing all possible Twill commands as well, which I think is a waste of space: these can easily be found in the Twill language reference.

Chapter 9 introduces integration and system testing. Interesting to read, the diagram-drawing method used is certainly useful, but it also contains too many code listings.

Finally, chapter 10 gives a short overview of some other testing tools. First coverage.py is explained, which is certainly useful. Then integration of test execution with version control systems is explained. I think this is certainly useful, but not at this level of detail. Setting up a Subversion repository is not exactly what I expect here, especially not when non-anonymous, password-based authentication over svn:// is used (which is a method which should be avoided, AFAIK).
Finally, continuous integration using Buildbot is tackled. No comments here, although I tend to use Hudson myself ;-)

Final words

Is this book worth your time and money? If you’re into Python and you don’t have lots of experience with testing Python code, it certainly is. Even if you wrote tests using unittest or doctests before, you’ll most likely learn some new things, like using mocks.

I’m glad Packt gave me the opportunity to read and review the book. I’d advise them to put some more effort in pre-publishing reviews for future titles, but the overall quality of the non-code content was certainly OK, and I hope lots of readers will enjoy and learn from this book.

Re: Python recursion performance test
Nicolas, Thu, 16 Jul 2009 01:00:57 +0000
http://eikke.com/re-python-recursion-performance-test/

(This is a reply on a post by Ahmed Soliman on recursion performance in (C)Python, and CPython function call overhead in general. I started to write this as a comment on his post, but it turned out much longer, so sending it over here in the end.)

Hey,

As discussed before, this is not a fair comparison, since the non-recursive version is much ‘smarter’ than the recursive one: it calculates values and never recalculates them, whilst the recursive version calculates everything over and over again.

Adding some simple memoization helps a lot. First, my testing code:
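The listing itself is not included in this copy of the post. As a stand-in, here is a minimal sketch of the idea, not the code that produced the numbers below: fib_step, naive_fib and count_naive_calls are made-up names, fib2 and memoize_dict only borrow their names from the benchmark output, and the memoize_constant_list variant is left out.

```python
def fib_step(n, recurse):
    """One step of the Fibonacci recurrence; `recurse` handles the sub-calls."""
    return n if n < 2 else recurse(n - 1) + recurse(n - 2)


def naive_fib(n):
    """The 'dumb' version: every sub-result is recomputed over and over."""
    return fib_step(n, naive_fib)


def fib2(n):
    """Iterative version: each value is computed exactly once."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def memoize_dict(step):
    """Wrap a recurrence step so results are cached in a dict."""
    cache = {}

    def memoized(n):
        if n not in cache:
            # Sub-calls go through `memoized` again, so they hit the cache
            cache[n] = step(n, memoized)
        return cache[n]

    return memoized


def count_naive_calls(n):
    """Count how many calls the naive recursion needs for fib(n)."""
    calls = [0]

    def counted(k):
        calls[0] += 1
        return fib_step(k, counted)

    counted(n)
    return calls[0]
```

With this shape, memoize_dict(fib_step)(150) evaluates the recurrence only 151 times, while the naive recursion needs 2 * fib(n + 1) - 1 calls, which for n = 35 gives the 29860703 calls reported below.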

Here are the benchmarks on my MacBook Pro Intel Core2Duo 2.33GHz with 3GB RAM (running quite a lot of applications). Do note the ‘dumb’ version calculates fib(35), whilst the slightly optimized versions (which still use recursion, but with far fewer recursive calls, as they should) and your second version calculate fib(150).

Using MacOS X 10.5.6 stock CPython 2.5.1:

MacBook:Projects nicolas $ python -V
Python 2.5.1

MacBook:Projects nicolas $ python fib.py 35 150
fib(35) = 9227465
Calculation took 12.8542108536 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.00020694732666 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.00141310691833 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.000310182571411 seconds

Overall it looks like fib2 and memoize_constant_list perform fairly similarly; I guess function call overhead and list.append have a similar influence on performance in this case.

Using Jython 2.5.0 from the binary distribution on the Java HotSpot 64bit Server VM as shipped for OS X 10.5.6:

MacBook:Projects nicolas $ ./Jython/jython2.5.0/jython -V 
Jython 2.5.0

MacBook:Projects nicolas $ ./Jython/jython2.5.0/jython fib.py 35 150
fib(35) = 9227465
Calculation took 12.5539999008 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.0519998073578 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.00399994850159 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.00300002098083 seconds

The ‘dumb’ fib implementation performs similarly in both CPython and Jython. Jython performs significantly worse on the other implementations though. Maybe today’s news could help here; I’m not sure how much locking on dict and list access Jython introduces.

Finally, using Unladen Swallow 2009Q2, self-compiled from SVN on the same system, using standard settings:

MacBook:Projects nicolas $ ./unladen-swallow/unladen-2009Q2-inst/bin/python -V
Python 2.6.1

MacBook:Projects nicolas $ ./unladen-swallow/unladen-2009Q2-inst/bin/python fib.py 35 150
fib(35) = 9227465
Calculation took 12.2675719261 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.000118970870972 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.000972986221313 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.00036096572876 seconds

which is similar to the CPython run (slightly better or slightly worse, depending on the implementation). When enforcing JIT (which introduces a significant startup time, which is not measured here):

MacBook:Projects nicolas $ ./unladen-swallow/unladen-2009Q2-inst/bin/python -j always fib.py 35 150
fib(35) = 9227465
Calculation took 14.6129109859 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.0432291030884 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.0363459587097 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.0335609912872 seconds

This, to my surprise, performs quite a bit worse than the default settings.

Overall: your first implementation performs tons and tons of function calls, whilst the second one, which resembles memoize_list_fib in my code (which is recursive), performs significantly fewer function calls. In the end memoize_list_fib performs almost as well as your second version (it performs roughly the same number of function calls as the number of times you’re going through your loop).

So whilst I do agree function calls in Python are reasonably slow compared to plain C function calls (which is just a jmp, no frame handling etc. etc. required), your comparison between your recursive and non-recursive implementation is completely unfair, and even if calculating fib(35) takes several seconds, consider you’re doing a pretty impressive 29860703 function calls to perform the calculation.

Time to get some sleep.

Python value swap
Nicolas, Wed, 22 Apr 2009 18:44:55 +0000
http://eikke.com/python-value-swap/

Been looking (again) at XMPP recently. While browsing through existing source code and samples in several languages, there’s one pattern which comes back quite frequently in ‘echobot’ demos: when a message comes in, the to and from attributes are swapped, and the message is sent.

The most common approach is something like (pseudocode):

temp = from
from = to
to = temp

In Python there’s an easier approach though, which seems to be unknown to several developers. It uses the multi-assignment/expansion syntax (written with from_ here, since from is a reserved word in Python and can’t be used as a variable name):

from_, to = to, from_

Basically, the tuple on the right (to, from_) is constructed, then expanded to locals ‘from_’ and ‘to’.
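Applied to the echobot case, the swap could look like the sketch below. It assumes a message is a plain dict with 'from' and 'to' keys, which is a simplification made up for illustration and not how any particular XMPP library represents stanzas.

```python
def make_reply(message):
    """Return a copy of `message` with sender and recipient swapped."""
    reply = dict(message)
    # The right-hand tuple is built first, then unpacked into both targets
    reply['from'], reply['to'] = message['to'], message['from']
    return reply


# The same idiom on two plain local names
first, second = 'alice@example.org', 'bot@example.org'
first, second = second, first
```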

Just a hint :-) It’s a pretty elegant line of code IMHO.

Erlang, Python and Twisted mashup using TwOTP
Nicolas, Sun, 19 Apr 2009 19:03:30 +0000
http://eikke.com/erlang-python-and-twisted-mashup-using-twotp/

Recently, I’ve been toying around with Erlang again. After creating some simple apps I wanted to integrate some Erlang code inside a Python application (since that’s still my favorite day-to-day language, it’s used at work and I’m sort-of convinced Erlang would be a good choice for several of the applications we need to develop, integrated with our existing Python code). The most obvious solution would be to use an Erlang port, but this is IMHO rather cumbersome: it requires a developer to define a messaging format, parsing code for incoming messages, etc. There’s a tutorial available if you want to take this route.

A more elegant solution is creating a node using Python, similar to JInterface and equivalents. Luckily there’s an existing project working on a library to create Erlang nodes using Python and Twisted: TwOTP.

One downside: it’s rather underdocumented… So here’s a very quick demo of how to call functions on an Erlang node from within a Twisted application.

First of all we’ll create 2 Erlang functions: one which returns a simple “Hello” message, one which uses an extra process to return ‘pong’ messages on calls to ‘ping’, and counts those.

The code:

-module(demo).
-export([hello/1, ping/0, start/0]).

hello(Name) ->
    Message = "Hello, " ++ Name,
    io:format(Message ++ "~n", []),
    Message.

ping_loop(N) ->
    receive
        {get_id, From} ->
            From ! {pong, N},
            ping_loop(N + 1)
    end.

ping() ->
    pingsrv ! {get_id, self()},
    receive
        {pong, N} -> ok
    end,
    {pong, N}.

start() ->
    Pid = spawn_link(fun() -> ping_loop(1) end),
    register(pingsrv, Pid).

This should be straight-forward if you’re familiar with Erlang (which I assume).

The Python code is not that hard to get either: it follows the basic Twisted pattern. First one should create a connection to EPMD, the Erlang Port Mapper Daemon (used to find other nodes), then a connection to the server node should be created, and finally functions can be called (calls happen the same way as Erlang’s RPC module).

Here’s the code. I’d advise to read it bottom-to-top:

import sys

from twisted.internet import reactor
import twotp

def error(e):
    '''A generic error handler'''
    print 'Error:'
    print e
    reactor.stop()

def do_pingpong(proto):
    def handle_pong(result):
        # Parse the result
        # 'ping' returns a tuple of an atom ('pong') and an integer (the pong
        # id)
        # In TwOTP, an Atom object has a 'text' attribute, which is the string
        # form of the atom
        text, id_ = result[0].text, result[1]
        print 'Got ping result: %s %d' % (text, id_)
        # Recurse
        reactor.callLater(1, do_pingpong, proto)

    # Call the 'ping' function of the 'demo' module
    d = proto.factory.callRemote(proto, 'demo', 'ping')
    # Add an RPC call handler
    d.addCallback(handle_pong)
    # And our generic error handler
    d.addErrback(error)

def call_hello(proto, name):
    def handle_hello(result):
        print 'Got hello result:', result
        # Erlang strings are lists of numbers
        # The default encoding is Latin1, this might need to be changed if your
        # Erlang node uses another encoding
        text = ''.join(chr(c) for c in result).decode('latin1')
        print 'String form:', text
        # Start pingpong loop
        do_pingpong(proto)

    # Call the 'hello' function of the 'demo' module, and pass in argument
    # 'name'
    d = proto.factory.callRemote(proto, 'demo', 'hello', name)
    # Add a callback for this function call
    d.addCallback(handle_hello)
    # And our generic error handler
    d.addErrback(error)

def launch(epmd, remote, name):
    '''Entry point of our demo application'''
    # Connect to a node. This returns a deferred
    d = epmd.connectToNode(remote)
    # Add a callback, called when the connection to the node is established
    d.addCallback(call_hello, name)
    # And add our generic error handler
    d.addErrback(error)

def main():
    remote = sys.argv[1]
    name = sys.argv[2]
    # Read out the Erlang cookie value
    cookie = twotp.readCookie()
    # Create a name for this node
    this_node = twotp.buildNodeName('demo_client')
    # Connect to EPMD
    epmd = twotp.OneShotPortMapperFactory(this_node, cookie)
    # Call our entry point function when the Twisted reactor is started
    reactor.callWhenRunning(launch, epmd, remote, name)
    # Start the reactor
    reactor.run()

if __name__ == '__main__':
    main()

Finally, to run it, you should first start a server node, and run the ‘pingsrv’ process:

MacBook:pyping nicolas$ erl -sname test@localhost
Erlang (BEAM) emulator version 5.6.5 [source] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.6.5  (abort with ^G)
(test@localhost)1> c(demo).
{ok,demo}
(test@localhost)2> demo:start().
true

Notice we started erl providing test@localhost as short node name.

Now we can launch our client:

(pythonenv)MacBook:pyping nicolas$ python hello.py 'test' Nicolas
Got hello result: [72, 101, 108, 108, 111, 44, 32, 78, 105, 99, 111, 108, 97, 115]
String form: Hello, Nicolas
Got ping result: pong 1
Got ping result: pong 2
Got ping result: pong 3

‘test’ is the shortname of the server node.
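The list of integers in the output above is how the Erlang string arrives on the Python side. A minimal sketch of the conversion, assuming (as the client code does) that every element is a Latin-1 code point below 256; erlang_string_to_text is a made-up helper name, and the bytearray form is used so the snippet also runs on Python 3:

```python
def erlang_string_to_text(codes, encoding='latin1'):
    """Turn a list of byte values (an Erlang string) into a text string."""
    return bytes(bytearray(codes)).decode(encoding)


# The exact list printed by the client in the session above
hello = [72, 101, 108, 108, 111, 44, 32, 78, 105, 99, 111, 108, 97, 115]
text = erlang_string_to_text(hello)
```

If the Erlang side sends code points above 255, this sketch raises a ValueError instead of silently producing wrong text.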

You can stop the ping loop using CTRL-C. If you restart the client afterwards, you can see the ping IDs were retained:

(pythonenv)MacBook:pyping nicolas$ python hello.py 'test' Nicolas
Got hello result: [72, 101, 108, 108, 111, 44, 32, 78, 105, 99, 111, 108, 97, 115]
String form: Hello, Nicolas
Got ping result: pong 4
Got ping result: pong 5

That’s about it. Using TwOTP you can also develop a node which exposes functions, which can be called from an Erlang node using rpc:call/4. Check the documentation provided with TwOTP for a basic example of this feature.

Combining Erlang applications as distributed, fault tolerant core infrastructure and Python/Twisted applications for ‘everyday coding’ can be an interesting match in several setups, and TwOTP provides all the functionality required to integrate the 2 platforms easily.

Python gotcha
Nicolas, Fri, 26 Sep 2008 19:35:37 +0000
http://eikke.com/python-gotcha/

Don’t ever do this unless it’s really what you want:

import os

def some_func(fd):
    f = os.fdopen(fd, 'w')
    f.write('abc')

fd = get_some_fd()
some_func(fd)
some_other_func(fd)

Here’s what goes wrong: when some_func comes to an end, f (which is a file-like object) goes out of scope and is destructed, which causes fd to be closed. I think this is pretty weird behavior (an object closing an fd it didn’t open itself), but well.

Here’s a better version, for reference:

def some_func(fd):
    f = os.fdopen(os.dup(fd), 'w')
    #Use f here

Try this on fd 0/1/2 in an (I)Python shell ;-)
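A safer way to see the difference than experimenting on fd 0/1/2 is a throwaway temporary file. The sketch below uses made-up helper names and closes the file objects explicitly with a with-block, so it does not depend on when the object is garbage collected.

```python
import os
import tempfile


def write_and_close(fd):
    # The file object takes ownership of fd and closes it on exit
    with os.fdopen(fd, 'w') as f:
        f.write('abc')


def write_via_dup(fd):
    # Only the duplicate is closed; the caller's fd stays usable
    with os.fdopen(os.dup(fd), 'w') as f:
        f.write('abc')


def fd_is_open(fd):
    try:
        os.fstat(fd)
    except OSError:
        return False
    return True


def demo():
    """Return (fd open after dup variant, fd open after plain variant)."""
    fd, path = tempfile.mkstemp()
    try:
        write_via_dup(fd)
        open_after_dup = fd_is_open(fd)
        write_and_close(fd)
        open_after_plain = fd_is_open(fd)
    finally:
        if fd_is_open(fd):
            os.close(fd)
        os.remove(path)
    return open_after_dup, open_after_plain
```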

Embedding JavaScript in Python
Nicolas, Mon, 25 Aug 2008 23:05:16 +0000
http://eikke.com/embedding-javascript-in-python/

Reading some posts about embedding languages/runtimes in applications on Planet GNOME reminded me I still had to announce some really quick and incomplete code blob I created some days after last GUADEC edition (which was insanely cool, thanks guys).

It takes WebKit’s JavaScriptCore and allows you to embed it in some Python program, so you, as a Python developer, can allow consumers to write plugins using JavaScript. Don’t ask me whether it’s useful, maybe it’s not, but anyway.

There’s one catch: currently there is no support to expose custom Python objects to the JavaScript runtime: you’re able to use JavaScript objects and functions etc. from within Python, but not the other way around. I started working on this, but the JSCore API lacked some stuff to be able to implement this cleanly (or I missed a part of it, that’s possible as well), maybe it has changed by now… There is transparent translation of JavaScript base types: unicode strings, booleans, null (which becomes None in Python), undefined (which becomes jscore.UNDEFINED) and floats.

I did not work on the code for quite a long time because of too much real-job-work, maybe it no longer compiles, sorry… Anyway, it’s available in git here, patches welcome etc. I guess this is the best sample code around. It’s using Cython for compilation (never tried with Pyrex, although this might work as well). If anyone can use it, great, if not, too bad, I did learn Cython doing this ;-)

Python ‘all’ oddity
Nicolas, Thu, 01 May 2008 13:57:00 +0000
http://eikke.com/python-all-odity/

[update] Question solved, see bottom of post.

Since Python 2.5 the language has a new built-in function ‘all’ (and its nephew ‘any’). I wanted to play around with this a little, combined with generators, so I created a little testcase to test performance.

Here’s the test-case: take a list L of X random numbers in a given range [A, B], and check whether

  • all elements in L are >= A
  • all elements in L are >= (A + Z) where Z is a number in [0, (B - A)]

The first test should always result in True, the second test could result in False.

Here’s the output of a test-run:

In [1]: import random, sys

In [2]: a = [random.randint(100, sys.maxint) for i in xrange(2000000)]

In [3]: len(a)
Out[3]: 2000000

In [4]: #Check whether all elements are >= 100 

In [5]: %timeit all(i >= 100 for i in a)
10 loops, best of 3: 515 ms per loop

In [6]: %timeit any(i < 100 for i in a)
10 loops, best of 3: 454 ms per loop

In [7]: def f(l):
   ...:     for i in l:
   ...:         if i < 100:
   ...:             return False
   ...:     return True
   ...: 

In [8]: %timeit f(a)
10 loops, best of 3: 292 ms per loop

In [9]: #Same thing for 100000, since now the list shouldn't be completely iterated

In [10]: %timeit all(i >= 100000 for i in a)
100 loops, best of 3: 4.73 ms per loop

In [11]: %timeit any(i < 100000 for i in a)
100 loops, best of 3: 4.29 ms per loop

In [12]: def g(l):
   ....:     for i in l:
   ....:         if i < 100000:
   ....:             return False
   ....:     return True
   ....: 

In [13]: %timeit g(a)
100 loops, best of 3: 2.82 ms per loop

In [14]: #For reference

In [15]: %timeit False in (i >= 100 for i in a)
10 loops, best of 3: 531 ms per loop

In [16]: %timeit False in (i >= 100000 for i in a)
100 loops, best of 3: 5.03 ms per loop

It’s as if ‘all’, ‘any’ or ‘in’ don’t break/return when a first occurrence of False (or True, obviously) is found. Is this the desired behaviour, and if it is, why? The calculation time difference between using all/any/in or a custom-made function (which is, unlike all etc, not written in C) which breaks whenever it can, is pretty astonishing.

[update] Question solved. It’s pretty normal the function-based approach performs better, since it combines what ‘all’ and the generator provided to ‘all’ do, taking away the generator function-call overhead. Damn :-)
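That ‘all’ does stop at the first false value can be checked by counting how many items it pulls from the generator. A minimal sketch; count_consumed is a hypothetical helper, not part of the test run above:

```python
def count_consumed(predicate, values):
    """Run all() over a generator and report how many items it consumed."""
    consumed = [0]

    def checks():
        for value in values:
            consumed[0] += 1
            yield predicate(value)

    result = all(checks())
    return result, consumed[0]


# Stops at the third element, the first one below the threshold
early = count_consumed(lambda v: v >= 3, [5, 4, 1, 7, 8])
# Has to look at every element when all of them pass
full = count_consumed(lambda v: v >= 0, [5, 4, 1, 7, 8])
```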

Python if/else in lambda
Nicolas, Sat, 16 Feb 2008 21:24:43 +0000
http://eikke.com/python-ifelse-in-lambda/

Scott, in your “Functional Python” introduction you write:

The one limitation that most disappoints me is that Python lacks is a functional way of writing if/else. Sometimes you just want to do something like this:

lambda x : if_else(x>100, "big number", "little number")

(This would return the string “big number” if x was greater than 100, and “little number” otherwise.) Sometimes I get around this by defining my own if_else that I can use in lambda-functions:

def if_else(condition, a, b) :
   if condition : return a
   else         : return b

Actually, you don’t need this helper if_else function at all:

In [1]: f = lambda x: x > 100 and 'big' or 'small'
In [2]: for i in (1, 10, 99, 100, 101, 110):
...:     print i, 'is', f(i)
...:
1 is small
10 is small
99 is small
100 is small
101 is big
110 is big
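For what it’s worth, the and/or trick has a known pitfall: if the value for the true case is itself falsy (an empty string, 0, None), the expression falls through to the ‘or’ branch. Python 2.5 added a conditional expression that avoids this. A minimal sketch, with made-up function names:

```python
# and/or idiom: goes wrong when the value for the true case is falsy
label_and_or = lambda x: x > 100 and '' or 'small'

# Conditional expression (Python 2.5+): the choice depends on the condition only
label_if_else = lambda x: '' if x > 100 else 'small'

# The example from above, rewritten with the conditional expression
size = lambda x: 'big' if x > 100 else 'small'
```

Here label_and_or(101) yields 'small' even though the condition holds, while label_if_else(101) yields the empty string as intended.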

James, obviously you’re right… Stupid me didn’t think about that. Your version won’t work when a discriminator isn’t known at import time. But even then a function taking *args and **kwargs with a class-like name, returning a correct class instance, would do the job.

Regarding the module/plugin stuff, I’d rather use setuptools/pkg_resources :-)

Python factory-like type instances
Nicolas, Mon, 11 Feb 2008 19:31:20 +0000
http://eikke.com/python-factory-like-type-instances/

When designing applications or libraries, sometimes you need to be able to create instances of a certain interface (in a liberal sense) at runtime without knowing at write/compile time which specific implementation (class) you’ll need to use, as this could depend on runtime variables.

An example of this is an interface providing some functionality which should be implemented differently on different platforms, eg Linux and Windows.

There are some standard patterns how to achieve this. One of them is the factory pattern, which works somewhat like this Python example (let’s pretend ‘PLATFORM’ is 'linux2' or 'win32', ie sys.platform):

#Pretend we use sys.platform instead of PLATFORM where we use it
PLATFORM = 'linux2'

class FooBase(object):
    def say_foo(self):
        print 'foo'

class PlatformFoo(FooBase):
    def say_platform_foo(self):
        raise NotImplementedError

    @staticmethod
    def get_class():
        #Several ways to get this (dict, introspection, if-tree,...), pick yours
        klass = {
            'linux2': LinuxFoo,
            'win32': WindowsFoo,
        }.get(PLATFORM, None)
        if not klass:
            raise Exception('Platform not supported')
        return klass

class WindowsFoo(PlatformFoo):
    def say_platform_foo(self):
        print 'win32 foo'

class LinuxFoo(PlatformFoo):
    def say_platform_foo(self):
        print 'linux foo'

def main():
    foo_class = PlatformFoo.get_class()
    foo = foo_class()
    foo.say_platform_foo()

if __name__ == '__main__':
    main()

Executing this code will, as expected, write ‘linux foo’ to the console. Obviously the PlatformFoo function could return an actual instance instead of the platform-specific class; that’s up to you.

Python allows you to handle this situation somewhat nicer though, without introducing any intermediate functions, by using metaclasses.

I won’t explain what metaclasses are here, or how they work, there are several resources on the internet explaining them. Let’s just get to the code:

#Pretend we use sys.platform instead of PLATFORM where we use it
PLATFORM = 'linux2'

class FooBase(object):
    def say_foo(self):
        print 'foo'

    def say_platform_foo(self):
        raise NotImplementedError

class WindowsFoo(FooBase):
    def say_platform_foo(self):
        print 'win32 foo'
         
class LinuxFoo(FooBase):
    def say_platform_foo(self):
        print 'linux foo'


class FooMeta(type):
    def __new__(cls, name, bases, attrs):
        #Several ways to get this (dict, introspection, if-tree,...), pick yours
        klass = {
            'linux2': LinuxFoo,
            'win32': WindowsFoo,
        }.get(PLATFORM, None)
        if not klass:
            raise Exception('Platform not supported')
        return klass

class Foo:
    __metaclass__ = FooMeta


def main():
    foo = Foo()
    foo.say_platform_foo()

if __name__ == '__main__':
    main()

See we don’t need any getter-function here, but we can just create a ‘Foo’ instance? The resulting object will be a ‘LinuxFoo’ or a ‘WindowsFoo’ as expected, depending on the value of ‘PLATFORM’. The above code also displays ‘linux foo’.

There’s something nifty about it too: you don’t lose any class inheritance information:

In [1]: from test2 import Foo, LinuxFoo, FooBase

In [2]: f = Foo()

In [3]: f.say_platform_foo()
linux foo

In [4]: type(f)
Out[4]: <class 'test2.LinuxFoo'>

In [5]: isinstance(f, LinuxFoo)
Out[5]: True

In [6]: isinstance(f, FooBase)
Out[6]: True

In [7]: isinstance(f, Foo)
Out[7]: True

This shouldn’t surprise you, as ‘Foo’ actually became an alias for ‘LinuxFoo’.

Maybe this pattern will be useful in one of your projects one day, who knows :-)
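On Python 3 the __metaclass__ attribute is ignored; the metaclass is passed as a keyword in the class statement instead. A minimal sketch of the same trick under that assumption, with the methods returning strings rather than printing so the result is easy to check:

```python
# Pretend this is sys.platform, as in the listings above
PLATFORM = 'linux2'


class FooBase(object):
    def say_platform_foo(self):
        raise NotImplementedError


class WindowsFoo(FooBase):
    def say_platform_foo(self):
        return 'win32 foo'


class LinuxFoo(FooBase):
    def say_platform_foo(self):
        return 'linux foo'


class FooMeta(type):
    def __new__(mcls, name, bases, attrs):
        # Instead of building a new class, hand back an existing one
        implementations = {'linux2': LinuxFoo, 'win32': WindowsFoo}
        try:
            return implementations[PLATFORM]
        except KeyError:
            raise NotImplementedError('Platform not supported: %r' % PLATFORM)


# Python 3 spelling of "__metaclass__ = FooMeta"
class Foo(metaclass=FooMeta):
    pass
```

As in the Python 2 version, Foo ends up being just another name for LinuxFoo.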

How not to write Python code
Nicolas, Fri, 08 Feb 2008 20:50:16 +0000
http://eikke.com/how-not-to-write-python-code/

Lately I’ve been reading some rather unclean Python code. Maybe this is mainly because the author(s) of the code had no in-depth knowledge of the Python language itself, the ‘platform’ delivered with CPython,… Here’s a list of some of the mistakes you should really try to avoid when writing Python code:

  • Remember Python comes batteries included
    Python is shipped with a whole bunch of standard modules implementing a broad range of functionality, including text handling, various data types, networking stuff (both low- and high-level), document processing, file archive handling, logging, etc. All these are documented in the Python Library Documentation, so it is a must to browse at least through the list of available modules, so you get some notion of what you can use by default. An example: don’t introduce a dependency on Twisted to implement a very basic and simple custom HTTP server if you don’t have any performance needs, use BaseHTTPServer and derivatives.
  • Python is Python, don’t try to emulate bad coding patterns from other languages
    Python is a mature programming language which provides great flexibility, but also has some pretty specific patterns which you might not know in other languages you used before.
    As an example, don’t try to emulate PHP’s ‘include’ or ‘require’ function, at all. This could be done, somewhat, by writing the code to be included (and executed on inclusion) in a module on the top level (ie. not in functions/classes/…), and using something like ‘from foo import *’ where you want this code to be executed. This will work, but it can become hard to maintain this. Modules are not meant to be used like this, so don’t. If you need to execute some code at some point, put it in a module as a function, import the function and call it wherever you want.
  • Don’t pollute the global namespace
    Do not use ‘from foo import *’: it pulls in every public name defined in foo, including the modules foo itself imports and anything foo pulled in through star-imports of its own. Instead, ‘import foo’ and use foo.whatever, or use ‘from foo import whatever, somethingelse’. Explicit imports make code much more readable, and make it much easier to figure out where a name used in the current module is defined, which is hard to tell when it arrives through one of many star-imports.
  • Use Pythonesque coding patterns
    This is closely related to the previous item, obviously. Python has some well-known constructs to handle some situations. Get to know and understand them.
    As you might know, Python has no switch/case construct. There’s a very neat way to implement this though, by simply using a dict and function objects (or lambda functions). An example:

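    # One handler function per case
    def handle_one():
        return 'one'

    def handle_two():
        return 'two'

    # Fallback for keys without a handler
    def handle_default():
        return 'unknown'

    # Map each case to a callable; a lambda works for one-liners
    cases = {
        'one': handle_one,
        'two': handle_two,
        'three': lambda: 'three',
    }

    for i in ('one', 'two', 'three', 'four'):
        # dict.get returns handle_default when i is not a key
        handler = cases.get(i, handle_default)
        print(handler())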

    We’re using the dict.get method here, which takes an optional ‘default’ argument that is returned when the key is missing. Pretty neat, huh?

  • Don’t reinvent the wheel
    Related to the first item. An example? Python contains a great ‘logging’ module, which includes advanced functionality like logging over the network, over HTTP, defining multiple logging targets, target trees,… No need to reimplement this yourself!
  • Document your code
    Python has this great language feature called docstrings. Sprinkle them throughout your code rigorously. Do this while writing your functions/classes, not afterwards: everyone knows documenting code after the fact is extremely boring and depressing.
  • Write tests
    Write testing code. Python includes at least two ways to write tests: standard unit tests (the ‘unittest’ module), or doctests, which are test code snippets included in your docstrings and so are both useful and illustrative. There’s no way to know some code refactoring went well if you can’t test the result.
  • Use error reporting wisely
    Python includes exception handling. Use this wisely: when something goes wrong in a function which normally returns a string to be displayed to the user, don’t just return a normal string with some error message inside, as if everything went well. Raise an exception carrying the message instead, so the calling code will know something went wrong (and maybe act on this information), whilst still being able to display the error message to the user.
    Next to this, subclass Exception (or a more specific Exception child class); don’t just raise bare Exceptions, except in some very basic circumstances. An exception class doesn’t need to be huge: ‘class FooException(Exception): pass’ does the job.
  • Don’t turn off error reporting during development
    In some cases it’s useful to make sure your application keeps on running, no matter what happens (this is e.g. how Twisted handles server handler exceptions). Python provides some ways to achieve this, so in case you need it you can use it, but make sure you provide a way to disable this, so you can tell your application to crash hard on exceptions during development. This way you’ll certainly notice the issue and you’ll be able to fix it early.
  • Search the web!
    Lots of great people wrote thousands of Python modules for lots of things. Many of them use the very liberal Python license, which allows you to re-use this code even in a closed-source environment. PyPI can be a great place to start.
  • Use Python basic built-in functions
    A basic example: to check whether a function parameter is of a certain type, don’t use something like ‘arg.__class__ == MyClass’, use ‘isinstance(arg, MyClass)’, which also accepts instances of subclasses. Did you know isinstance’s second argument can be a tuple of types (a list is not accepted)? If it is, arg’s type will be checked against all types in this tuple, so there’s no need to do several ‘isinstance’ calls. Other useful built-ins are getattr/setattr/hasattr (obviously), issubclass,…
  • Use non-instance-specific class methods where useful
    Just like many other programming languages, Python allows you to add static methods to a class. Just decorate your method using the ‘staticmethod’ decorator!
    Next to static methods, Python knows the concept of class methods (the ‘classmethod’ decorator), which receive the class as their first argument. You most likely won’t need these often.
  • Learn ‘functional programming’ basics
    At first it can be hard to wrap your head around functional programming patterns, but they allow a very convenient and clean way to handle several situations.
  • Don’t mess with sys.path
    If you need to import ‘external’ modules, try not to mess with sys.path. Use distutils functions to discover modules, ship them as eggs,… If you want to alter sys.path anyway, try not to hardcode any ‘base’ paths: generic paths are a major plus, and removing hardcoded path stuff can be a PITA.
  • Use an interactive shell
    A Python shell like IPython is a must-have. I’m completely addicted to the tab-completion and documentation shortcuts it provides.
  • Use a code metrics tool
    I personally use PyLint (with some rules disabled). This tool will check your code for various things: missing imports, typos, wrong variable/function/class/module naming, syntax errors,… which could be in your code even if your test suite runs fine. Maybe you can even add a hook to the VCS you’re using which refuses check-ins unless the code gets a PyLint score of e.g. 7 or more. Extremely useful!
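
To make the ‘Write tests’ item above a bit more concrete, here is a minimal doctest sketch. The clamp function is a made-up example, not something from the code I was reading; the point is only that the examples in the docstring double as documentation and as tests.

```python
import doctest


def clamp(value, low, high):
    """Limit value to the range [low, high] (hypothetical example function).

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


if __name__ == '__main__':
    # Runs every docstring example in this module; silent when all pass.
    doctest.testmod()
```

Running the file directly executes the examples; passing -v on the command line makes doctest report each example it tried.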
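
The ‘Use error reporting wisely’ item can be sketched roughly like this. All names here (ReportError, build_report, show) are hypothetical and only serve the illustration: the function raises a small custom exception instead of returning an error string, and the caller can still show the message to the user.

```python
class ReportError(Exception):
    """Raised when a report cannot be built (hypothetical example class)."""


def build_report(records):
    """Return a one-line summary string, or raise ReportError."""
    if not records:
        # Signal the failure instead of returning an error string.
        raise ReportError('no records to summarise')
    return '%d record(s) processed' % len(records)


def show(records):
    """Caller: display the result, but know whether it was an error."""
    try:
        return 'OK: ' + build_report(records)
    except ReportError as exc:
        # The message is still available for display to the user.
        return 'FAILED: ' + str(exc)
```

Because only ReportError is caught, an unexpected bug (say, a TypeError) still propagates and crashes loudly during development.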
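
For the built-in functions item, a small sketch of isinstance with a tuple of types, plus getattr with a default. Circle, Square and describe are invented for the example.

```python
class Circle(object):
    radius = 1.0


class Square(object):
    side = 2.0


def describe(shape):
    """Return a short description of a supported shape."""
    # One isinstance call checks against every type in the tuple.
    if not isinstance(shape, (Circle, Square)):
        raise TypeError('unsupported shape: %r' % (shape,))
    # getattr with a default avoids a separate hasattr check.
    size = getattr(shape, 'radius', None)
    if size is None:
        size = getattr(shape, 'side')
    return '%s of size %s' % (type(shape).__name__, size)
```

Passing a list such as [Circle, Square] as the second argument would raise a TypeError, so the tuple matters.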
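
And for the item on non-instance-specific methods, a sketch of the difference between the two decorators. The Temperature classes are hypothetical; the static method is a plain helper living in the class namespace, while the class method receives the class it was called on and so works as an alternative constructor, also for subclasses.

```python
class Temperature(object):
    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def fahrenheit_to_celsius(value):
        # No self or cls: a plain function kept in the class namespace.
        return (value - 32) * 5.0 / 9.0

    @classmethod
    def from_fahrenheit(cls, value):
        # cls is the class the method was called on, so subclasses
        # get instances of their own type back.
        return cls(cls.fahrenheit_to_celsius(value))


class BodyTemperature(Temperature):
    pass
```

Calling BodyTemperature.from_fahrenheit(...) returns a BodyTemperature, which is the main reason to prefer a class method over a static method for this kind of factory.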

Some days ago RealNitro pointed me at this list of essential Python readings. “Idiomatic Python” is a must-read, even for experienced Python developers.

That’s about it for now, maybe I’ll add some more items to this list later on. If you have some other hints, leave a comment!
