

How not to write Python code

Lately I’ve been reading some rather unclean Python code. Maybe this is mainly because the author(s) of the code had no in-depth knowledge of the Python language itself or of the ‘platform’ delivered with CPython. Here’s a list of some of the mistakes you should really try to avoid when writing Python code:

  • Remember Python comes batteries included
    Python ships with a whole bunch of standard modules implementing a broad range of functionality, including text handling, various data types, networking (both low- and high-level), document processing, file archive handling, logging, etc. All of these are documented in the Python Library Documentation, so it is a must to browse at least through the list of available modules to get a notion of what you can use by default. An example: don’t introduce a dependency on Twisted to implement a very basic custom HTTP server if you don’t have any performance needs; use BaseHTTPServer (http.server in Python 3) and derivatives.
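As a rough illustration of the point above, here is a minimal sketch of a standard-library HTTP server. It assumes Python 3, where BaseHTTPServer lives in http.server; the names HelloHandler and fetch_once are made up for this example, and the server binds to a throwaway local port purely so the sketch can query itself.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a fixed text body."""

    def do_GET(self):
        body = b"hello from the standard library\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the demo quiet; the default implementation logs to stderr.
        pass


def fetch_once():
    """Start the server on a free local port, issue one GET, shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        port = server.server_address[1]
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        status, body = response.status, response.read()
        connection.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()
```

In a real script you would simply call server.serve_forever() in the main thread; the background thread is only there so the example can act as its own client.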
  • Python is Python, don’t try to emulate bad coding patterns from other languages
    Python is a mature programming language which provides great flexibility, but it also has some pretty specific patterns which you might not know from other languages you used before.
    As an example, don’t try to emulate PHP’s ‘include’ or ‘require’ function, at all. This could be done, somewhat, by writing the code to be included (and executed on inclusion) at the top level of a module (i.e. not in functions/classes/…), and using something like ‘from foo import *’ where you want this code to be executed. This will work, but only the first time: a module’s top-level code runs once, on first import, and the result quickly becomes hard to maintain. Modules are not meant to be used like this, so don’t. If you need to execute some code at some point, put it in a module as a function, import the function and call it wherever you want.
  • Don’t pollute the global namespace
    Do not use ‘from foo import *’: it pulls every public name in foo into your namespace, including the modules foo itself imported (unless foo restricts this with __all__). Prefer ‘import foo’ and use foo.whatever, or use ‘from foo import whatever, somethingelse’. Explicit imports make code much more readable, and make it much easier to figure out where a name used in the current module is defined, which is hard to tell when it arrives through one of many wildcard imports.
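To make the namespace argument concrete, the sketch below builds a small throwaway module in memory (called demo_foo here, a made-up name) and compares what a wildcard import and an explicit import leave behind. Building the module with exec is only a trick to keep the example in a single file; it is not something to copy into real code.

```python
import sys
import types

# Build a stand-in for foo.py in memory so the example is self-contained.
demo_foo = types.ModuleType("demo_foo")
exec(
    "import os\n"
    "def whatever():\n"
    "    return 'whatever'\n"
    "_private = 1\n",
    demo_foo.__dict__,
)
sys.modules["demo_foo"] = demo_foo


def public_names(namespace):
    """Return the names in a namespace that do not start with an underscore."""
    return sorted(name for name in namespace if not name.startswith("_"))


# Wildcard import: everything public in demo_foo arrives, 'os' included.
star_namespace = {}
exec("from demo_foo import *", star_namespace)

# Explicit import: only the name that was asked for.
explicit_namespace = {}
exec("from demo_foo import whatever", explicit_namespace)

print(public_names(star_namespace))      # ['os', 'whatever']
print(public_names(explicit_namespace))  # ['whatever']
```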
  • Use Pythonic coding patterns
    This is closely related to the previous item, obviously. Python has some well-known constructs to handle some situations. Get to know and understand them.
    As you might know, Python has no switch/case construct (a ‘match’ statement only arrived in Python 3.10). There’s a very neat way to implement this though, by simply using a dict and function objects (or lambda functions). An example:

    # One small handler per case.
    def handle_one():
        return 'one'
    def handle_two():
        return 'two'
    def handle_default():
        return 'unknown'
    # Map each case value to the callable that handles it.
    cases = {
        'one': handle_one,
        'two': handle_two,
        'three': lambda: 'three',
    }
    for i in ('one', 'two', 'three', 'four'):
        # Fall back to handle_default when no case matches.
        handler = cases.get(i, handle_default)
        print(handler())

    We’re using the dict.get method here, which can take an optional ‘default’ argument. Pretty neat, huh?

  • Don’t reinvent the wheel
    Related to the first item. An example? Python contains a great ‘logging’ module, which includes advanced functionality like logging over the network, over HTTP, defining multiple logging targets, target trees,… No need to reimplement this yourself!
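A minimal sketch of the logging module in use, assuming Python 3. The logger name demo.app and the helper build_logger are invented for the example; the handler writes to an in-memory buffer only so the result is easy to inspect, where a real application would more likely attach a file or network handler.

```python
import io
import logging


def build_logger(stream):
    """Return a logger whose handler only lets WARNING and above through."""
    logger = logging.getLogger("demo.app")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # do not also send records to the root logger

    handler = logging.StreamHandler(stream)
    handler.setLevel(logging.WARNING)
    handler.setFormatter(
        logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.debug("not shown: below the handler's level")
log.warning("disk space is low")
output = buffer.getvalue()
```

Adding a second target is a matter of attaching another handler (for instance logging.FileHandler) with its own level and formatter; the calling code does not change.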
  • Document your code
    Python has this great language feature called docstrings. Sprinkle them throughout your code rigorously. Do this while writing your functions/classes, not afterwards: everyone knows documenting code after the fact is extremely boring and depressing.
  • Write tests
    Write testing code. Python includes at least 2 ways to write tests: standard unit tests (the ‘unittest’ module), or doctests, which are test code snippets included in your docstrings and are both useful and illustrative. There’s no way to know some code refactoring went well if you can’t test the result.
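The two previous items can be sketched together: a docstring that doubles as a doctest, plus a small unittest case. The function slugify and the helpers run_doctests and run_unit_tests are hypothetical names chosen for this example; in a normal project you would more likely run the tests through the unittest or doctest command-line entry points.

```python
import doctest
import io
import unittest


def slugify(title):
    """Return a lowercase, dash-separated version of title.

    >>> slugify("How not to write Python code")
    'how-not-to-write-python-code'
    """
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    def test_collapses_whitespace(self):
        self.assertEqual(slugify("  Hello   World "), "hello-world")


def run_doctests(func):
    """Run the examples in func's docstring; return (failures, tries)."""
    runner = doctest.DocTestRunner(verbose=False)
    finder = doctest.DocTestFinder()
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.failures, runner.tries


def run_unit_tests(case):
    """Run one TestCase class quietly; return True if every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return result.wasSuccessful()
```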
  • Use error reporting wisely
    Python includes exception handling. Use this wisely: when something goes wrong in a function which should normally return a string to be displayed to the user, don’t just return a normal string with some error message inside, as if everything went well. Raise an exception carrying the message instead, so the calling code will know something went wrong (and maybe act on this information), whilst still being able to display the error message to the user.
    Next to this, subclass Exception (or a more specific Exception child class); don’t just raise base Exceptions, except in some basic circumstances. An exception class doesn’t need to be huge: ‘class FooException(Exception): pass’ does the job.
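A small sketch of that advice. ConfigError, read_port and describe are made-up names, and the settings dict stands in for whatever your application actually reads; the point is only that the low-level function raises, and the caller decides how to show the message.

```python
class ConfigError(Exception):
    """Hypothetical application-specific error for bad settings."""


def read_port(settings):
    """Return the configured port as an int, or raise ConfigError."""
    try:
        return int(settings["port"])
    except KeyError:
        raise ConfigError("no 'port' setting found")
    except ValueError:
        raise ConfigError("port must be a number, got %r" % settings["port"])


def describe(settings):
    """Caller: shows the error to the user, but knows it was an error."""
    try:
        return "listening on port %d" % read_port(settings)
    except ConfigError as error:
        return "configuration problem: %s" % error


print(describe({"port": "8080"}))  # listening on port 8080
print(describe({}))                # configuration problem: no 'port' setting found
```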
  • Don’t turn off error reporting during development
    In some cases it’s useful to make sure your application keeps on running, no matter what happens (this is eg how Twisted handles server handler exceptions). Python provides some ways to achieve this, so in case you need it you can use it, but make sure you provide a way to disable this, so you can tell your application to crash hard on exceptions during development. This way you’ll certainly notice the issue and you’ll be able to fix it early.
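One possible shape for such a switch, as a sketch only: run_handlers and its crash_on_error flag are invented for this example and do not mirror how Twisted does it. In production mode a failing handler is logged and skipped; with the flag set, the exception propagates so the failure is impossible to miss during development.

```python
import logging


def run_handlers(handlers, crash_on_error=False):
    """Call each handler; log failures, or re-raise them when asked to."""
    results = []
    for handler in handlers:
        try:
            results.append(handler())
        except Exception:
            if crash_on_error:
                raise  # development mode: fail loudly, with full traceback
            logging.getLogger("demo.handlers").exception(
                "handler %r failed", handler)
            results.append(None)
    return results
```

The flag could be wired to a command-line option or an environment variable, so the same code runs in both modes.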
  • Search the web!
    Lots of great people wrote thousands of Python modules for lots of things. Many of them use the very liberal Python license, which allows you to re-use this code even in a closed-source environment. PyPI can be a great place to start.
  • Use Python basic built-in functions
    A basic example: to check whether a function parameter is of a certain type, don’t use something like ‘arg.__class__ == MyClass’, use ‘isinstance(arg, MyClass)’. Did you know isinstance’s second argument can be a tuple of types? If it is, arg is checked against every type in the tuple, so there’s no need for several ‘isinstance’ calls (note that a list is not accepted here). Other useful built-ins are getattr/setattr/hasattr (obviously), issubclass,…
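A short sketch of these built-ins. MyClass, MyChild and describe are placeholder names; the behaviour shown (tuple accepted, subclasses matched, getattr with a default) is that of the built-ins themselves.

```python
class MyClass(object):
    pass


class MyChild(MyClass):
    pass


def describe(arg):
    """Classify arg using isinstance with a tuple and getattr with a default."""
    if isinstance(arg, (int, float)):   # one call, several accepted types
        return "number"
    if isinstance(arg, MyClass):        # also true for subclasses
        return "MyClass or subclass"
    # getattr's third argument avoids a separate hasattr check.
    return getattr(arg, "name", "unknown")


print(describe(3))          # number
print(describe(MyChild()))  # MyClass or subclass
print(describe(object()))   # unknown
```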
  • Use non-instance-specific class methods where useful
    Just like many other programming languages, Python allows you to add static methods to a class. Just decorate your method using the ‘staticmethod’ decorator!
    Next to static methods, Python knows the concept of class methods, which get the class as argument. You most likely won’t need these often.
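The difference between the two is easiest to see side by side. The Temperature class below is a made-up example: the static method is a plain helper that happens to live on the class, while the class method receives the class itself and therefore works as an alternative constructor, including for subclasses.

```python
class Temperature(object):
    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def fahrenheit_to_celsius(degrees):
        # No access to the instance or the class: a namespaced helper.
        return (degrees - 32) * 5.0 / 9.0

    @classmethod
    def from_fahrenheit(cls, degrees):
        # cls is whichever class the method was called on.
        return cls(cls.fahrenheit_to_celsius(degrees))


class OvenTemperature(Temperature):
    pass


boiling = Temperature.from_fahrenheit(212)
oven = OvenTemperature.from_fahrenheit(32)
print(boiling.celsius)       # 100.0
print(type(oven).__name__)   # OvenTemperature
```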
  • Learn ‘functional programming’ basics
    At first it can be hard to wrap your head around functional programming patterns, but they allow a very convenient and clean way to handle several situations.
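As a taste of what is meant, here is a small sketch using only standard tools: filter, map, functools.partial and functools.reduce. The price data is invented, and amounts are kept in whole cents so the arithmetic stays exact.

```python
import operator
from functools import partial, reduce

prices_in_cents = [1250, 300, 4000, 725]

# partial fixes the first argument of operator.mul: double(x) == 2 * x.
double = partial(operator.mul, 2)

# Keep the items above 500 cents, double them, then add everything up.
selected = filter(lambda cents: cents > 500, prices_in_cents)
doubled = list(map(double, selected))
total = reduce(operator.add, doubled, 0)

print(doubled)  # [2500, 8000, 1450]
print(total)    # 11950
```

The same pipeline can be written as a list comprehension plus sum(); which one reads better is largely a matter of taste.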
  • Don’t mess with sys.path
    If you need to import ‘external’ modules, try not to mess with sys.path. Use distutils functions to discover modules, ship them as eggs,… If you want to alter sys.path anyway, try not to hardcode any ‘base’ paths: generic paths are a major plus, and removing hardcoded path stuff can be a PITA.
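If sys.path really has to be touched, one way to avoid hardcoded base paths is to derive the directory from the location of a known file. This is only a sketch: add_relative_path is a hypothetical helper, and the ‘lib’ directory name is an assumption.

```python
import os
import sys


def add_relative_path(anchor_file, *parts):
    """Append a directory, located relative to anchor_file, to sys.path."""
    base = os.path.dirname(os.path.abspath(anchor_file))
    path = os.path.join(base, *parts)
    if path not in sys.path:
        # Appending keeps the standard library ahead of project modules.
        sys.path.append(path)
    return path
```

A script would typically call add_relative_path(__file__, "lib") once, near its entry point, rather than scattering path changes through library code.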
  • Use an interactive shell
    A Python shell like IPython is a must-have. I’m completely addicted to the tab-completion and documentation shortcuts it provides.
  • Use a code metrics tool
    I personally use PyLint (with some rules disabled). This tool will check your code for various things: missing imports, typos, wrong variable/function/class/module naming, syntax errors,… which could be in your code even if your test suite runs fine. Maybe you can even add a hook to the VCS you’re using, which doesn’t allow you to check in code unless it got a PyLint score of eg. 7. Extremely useful!

Some days ago RealNitro pointed me at this list of essential Python readings. “Idiomatic Python” is a must-read, even for experienced Python developers.

That’s about it for now, maybe I’ll add some more items to this list later on. If you have some other hints, leave a comment!

Posted in Development, Technology.



41 Responses


  1. Eduardo Padoan says

    “If you got some other hints, comments!”

    * Don’t abuse isinstance() – when possible, prefer to catch the exception raised when the wrong type is given (so you can pass objects of unexpected types that implement the desired interface)
    * Instead of testing “if some_condition(): do_stuff()”, prefer “try: do_stuff(); except SomeError: recover()”.
    * If a method does not manipulate/depend on the state of the object (e.g. a classmethod), it probably should be a function.
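The first two hints can be sketched as follows. first_line is a hypothetical function; it accepts anything with a readline() method instead of checking for a particular class, and turns the failure into a clearer error for the caller.

```python
import io


def first_line(source):
    """Return the first line of any object that offers readline()."""
    try:
        line = source.readline()
    except AttributeError:
        # Not file-like at all: report it in terms the caller understands.
        raise TypeError("expected an object with a readline() method")
    return line.rstrip("\n")


print(first_line(io.StringIO("alpha\nbeta\n")))  # alpha
```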

  2. Nicolas says

    Eduardo: I don’t completely agree on your first point. What will you do when a given parameter only implements a part of the expected interface? Imagine you do several calls on the given argument, which work, you do other calls/operations by using info they returned (maybe even changing remote data), then do a last call, which is not implemented. At that moment it will be most likely extremely hard, if not impossible, to revert state. I tend to use some sort of “abstract base class”, where the base class for an expected ‘type’ got all functions defined but raises NotImplementedError when no reasonable default can be provided. All actual types should inherit from this abstract class/interface (depending on whether you actually provide any implementation), so isinstance or issubclass can work fine.
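The “abstract base class” approach described in this comment might look roughly like the sketch below. Storage, MemoryStorage and copy_item are invented names; the idea is that every method exists on the base class, and the type check happens before any state is changed.

```python
class Storage(object):
    """Interface: every method exists, even without a sensible default."""

    def load(self, key):
        raise NotImplementedError

    def save(self, key, value):
        raise NotImplementedError

    def describe(self):
        # A reasonable default can be provided here.
        return type(self).__name__


class MemoryStorage(Storage):
    def __init__(self):
        self._items = {}

    def load(self, key):
        return self._items[key]

    def save(self, key, value):
        self._items[key] = value


def copy_item(source, target, key):
    """Check both arguments up front, then copy one item across."""
    for storage in (source, target):
        if not isinstance(storage, Storage):
            raise TypeError("%r is not a Storage" % (storage,))
    target.save(key, source.load(key))
```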

  3. Chris Cunningham says

    The Twisted website is .com, not .org (which is a squatter site).

    – Chris

  4. Nicolas says

    Fixed, thanks Chris!

  5. Paddy3118 says

    Hi Nicolas,
    I tend to agree more with Eduardo on the use of isinstance. I think the new-to-Python programmer mentioned in the article will tend to overuse isinstance and needs to get into the habit of NOT automatically checking types. When they become more experienced they can make a more informed choice between “just do it” (relying on exceptions), hasattr, and isinstance.

    - Paddy.

  6. Dieter_be says

    Great read for a python newbie like me.

    Maybe useful to mention the webpage of distutils :
    http://www.python.org/community/sigs/current/distutils-sig/

  7. What is Python? says

    Thank you for the info. Could you elaborate more on the avoiding php-style includes. I use that often in my web applications. For instance, database connections which contain passwords and usernames I do not want to keep in the web tree.

  8. Nicolas says

    Paddy: I’ll take it into account :-)

    Dieter: glad you like it, thanks for the link, I forgot to add it indeed.

    ‘What is Python?’: I know this is a very common pattern in PHP-style coding. Exporting variables might not be so bad, I was rather talking about functional statements to be executed on include (ie stuff that sets up some environment things etc).
    About the database connection parameters: first of all, you mention you don’t want to have connection parameter settings inside your web tree. Well, when writing web applications using Python, you’ll most likely use some framework to make this easier (check Django!), and most of these frameworks (maybe even all of them!) don’t require your code files to be in any web tree at all, as they’re not served by your webserver the same way it serves PHP files, which (in most setups) should be stored inside your DocumentRoot/htdocs folder indeed.
    As for creating a database connection, there are several possibilities. You could create a factory class which returns a database connection, you could use some ‘settings.py’ file which lists a whole bunch of variables, which you import (by name) when you need them, you could store the settings in some dict which you pass to the connection creation function using **kwargs,… I guess this also depends on what database access method you use: are you using a low-level database-specific package, like you would use the mysql_* functions in PHP? I’d advise against this: better use some more generic access method, maybe provided by some framework you use (e.g. Django’s ORM), or something like SQLAlchemy or equivalent. These packages might force you to use some specific way to store database connection settings.
    Anyway, when starting to learn Python, try to forget lots of the PHP stuff you know, as you should not try to reproduce these patterns in Python.

  9. Alan says

    “if you don’t have any performance needs, use BaseHTTPServer and derivates”

    Disagree. Maybe you want an http server you can test from standard in, or hook up to an arbitrary process that’s speaking http on one of its file descriptors. BaseHTTPServer derives from SocketServer.TCPServer so it can’t be used in a network-independent way. An http parser that operates on a file object would have been more useful, wouldn’t you agree?

    There are often good reasons *not* to reuse.

  10. karim says

    Great read! Thank you!

  11. Greg says

    About the sys.path advice: sometimes I’ll have a module that has to stay in a really weird location like a mapped network drive. Don’t I need to use sys.path to get at something like that? If there’s a better way I’d love to hear it.

  12. Danz says

    “don’t use something like ‘arg.__class__ == MyClass’, use ‘isinstance(arg, MyClass)’”

    Nitpicking, but these aren’t equivalent once subclasses are involved. isinstance is often more useful – but not always.

    >>> class parent:
    ...     pass
    >>> class child(parent):
    ...     pass
    >>> instance = child()

    >>> isinstance(instance, parent)
    True
    >>> instance.__class__ == parent
    False

  13. Danz says

    oh, and let me add one: use local functions to keep long code under control. Maybe this one is just me – I don’t see it used much even by experienced pythonistas. But I do find it makes my code a lot easier to understand when I come back to it later.

    Here’s what I mean. Suppose you have something like this :

    def doSomething(foo):
        # 20 lines of setting something up
        for item in bigLongList:
            pass  # 30 lines of code inside the for loop
        # 20 more lines outside the loop, tidying up

    70 lines in a function is a pain to read, so the obvious first step is to break it into separate functions:

    def setUpSomething(foo):
        return []  # 20 lines of code, building the list
    def processItem(item):
        pass  # 30 lines of code
    def tidyUpSomething(foo):
        pass  # 20 lines of code
    def doSomething(foo):
        bigLongList = setUpSomething(foo)
        for item in bigLongList:
            processItem(item)
        tidyUpSomething(foo)

    but if those 3 functions are only ever going to be used together, it’s messy and confusing to have them all in the global namespace. I find it much neater to do:

    def doSomething(foo):
        # The helpers are local: nothing outside doSomething can see them.
        def setUpSomething(foo):
            return []  # 20 lines of code, building the list
        def processItem(item):
            pass  # 30 lines of code
        def tidyUpSomething(foo):
            pass  # 20 lines of code
        bigLongList = setUpSomething(foo)
        for item in bigLongList:
            processItem(item)
        tidyUpSomething(foo)

  14. Danz says

    …well, that would have been much more readable, if it weren’t for the system eating leading whitespace in comments.

  15. James says

    I hate to be a nitpicker, your article is very simple and straightforward with good advice. Unfortunately it can be painful to read due to your continued use of the word “got” when you should be using the word “has.” Python has some well known constructs, not Python got some well known constructs. Fixing things like that may help many more people take your writing seriously.

  16. elzapp says

    Nice post. bookmarked. :)

  17. Hamish M says

    Very interesting article. There are lots of things I don’t do very often, like documenting my code, and creating test cases.

    By the way, you said ‘docstings’ at one point, I think you mean docstrings. :)

  18. Nicolas says

    Danz: you’re right about the isinstance/subclass thing, although the cases where you want to do exact class comparison, not taking subclasses into account, are rather limited.
    About spaces: that’s how HTML works, you don’t want me to put all comments inside a ‘pre’, right?

    James: thanks for the hint, I’ll try to take it into account. I’m not a native English speaker though. I’ll fix the obvious mistakes in a minute.

    Hamish: Thanks, fixing.

  19. Nicolas says

    Forgot one. Greg: a pattern I saw several times before is using a ‘wrapper’ script around the actual Python application, which sets PYTHONPATH. Maybe not so nice either, but imho definitely better than changing sys.path in non-wrapper code.

  20. Dennis Forbes says

    I found this entry very confusing — are these points how NOT to program? Most of them seem like “how TO” program in Python, apart from perhaps oddities like making members static.

  21. Nicolas says

    Dennis: you’re right, there’s some ambiguity. The entry was inspired by some frustrations of working with code which was written in a way Python code should *not* be written. Sorry for the confusion.

  22. Hugh Bien says

    Great tips! Two of them I really enjoyed were ‘write docs’ and ‘unit test’. Those two together, along with the clear syntax of Python, means it’s pretty easy to go back to a project after being away from it for a year or two.

  23. Spacebat says

    I often start a script with sys.path.insert(0, 'lib') as I have zero experience with distutils and eggs. I’ll have to look into them. Perhaps a less awful approach would be sys.path.append('lib') so that I get my project specific modules while being sure that any standard modules can still be found.

  24. Ionut says

    I also agree with Eduardo! Duck typing is usually preferred in Python instead of the “common base class” pattern which is common in statically typed languages. There is no point in checking isinstance when you could check whether the method exists. This will simplify testing.
    The problem in your counterexample can easily be fixed by checking for all the methods before starting the process.
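Checking for all required methods up front, as suggested here, could be sketched like this. missing_methods and the Half class are hypothetical; the check uses getattr with a default and callable, so it works on any object without requiring a common base class.

```python
def missing_methods(obj, names):
    """Return the names from `names` that obj does not offer as callables."""
    return [name for name in names
            if not callable(getattr(obj, name, None))]


class Half(object):
    """Implements only part of the expected interface."""

    def load(self, key):
        return key


print(missing_methods(Half(), ["load", "save"]))  # ['save']
```

A function doing several dependent calls can run this check first and refuse to start when the returned list is not empty.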

  25. ale says

    It’s a pity that pythonistas end up seeing things like the emulated C-switch as “pretty neat”, instead of “yeah, it sucks, but it’s the best you can do”. This is a kludge. A mechanism. Something that should stay behind the scenes and, even in primitive languages like C, does stay behind the scenes.

    When most programmers are thinking of that bit of code, they don’t think: “I want a hash that dispatches function calls depending on some variable”, they think “I want this bit of code to run on this input, this on this input, etc” and the code should reflect the semantics, not what the machine does at the back. That is what we have low-level languages for. Python is supposed to be runnable pseudo-code, and pseudo-code wouldn’t look like that.

    I say it is a pity because it leads to staleness in Python. It is a flaw in its design. Something that should be corrected, not seen as a symbol of greatness.

  26. Himanshu says

    Thanks for the useful advice.

  27. Paddy3118 says

    Hi ale,
    On your comment about switch statements: it is swings and roundabouts. In a lot of cases where a switch statement with small case bodies would be used in a language like C, Python’s dynamism means you simply don’t need the switch statement, and when the contents of each case block are large, encapsulating them in a function is preferable. That leaves not enough reason to add another statement to Python. Sometimes, when translating an algorithm to Python, it would make the translation more literal though…

    - Paddy.

  28. Issyk-Kul Karakol says

    Great tips! Thanks. I’m still a beginner in Python but wrote a few scripts and your article is useful (thanks for the PyPI link – didn’t know it before)!

  29. ale says

    @Paddy
    Python does remove lots of occurrences of the switch statement, but it still leaves a lot of them in there (enough for the dict method to make it into something like this list, and enough for me to have developed a disliking for it).

    More importantly, the fact that most leftover cases are large where whatever has to be called is big enough to deserve their own function is orthogonal to my objection, since even if you have to call a function, it is better to do it inside the switch because the code is semantically clearer (having a switch doesn’t prevent the cases from being function calls). When any programmer sees a switch statement it is obvious what is going on.

    The dict method is not obvious. I even tend to write it in the if – elif – elif sometimes because of this. Note that it is perfectly possible to use the dict semantics in pretty much every other language, but it is rarely used outside of python, afaik.

    A less-important problem with the dict dispatch method is that all the functions being dispatched have to have the same signature, which either forces you to write functions just for the switch statement, regardless of whether they could have been reused from elsewhere, or use partials/currying and thus add another bit of trickiness for something that should be in the language in the first place.

    Finally, a switch statement to me seems more pythonic: there should be one obvious way to do it. The fact that someone felt that something as basic as the dispatch method needed to be added to a tips-and-tricks list is a clear sign that it is failing the obviousness criterion.
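For reference, the partial-based workaround for differing signatures mentioned in this comment might look like the sketch below. resize, rotate, unknown and apply_action are made-up functions; functools.partial pre-fills the extra arguments so every entry in the dict can be called with just the image.

```python
from functools import partial


def resize(image, width, height):
    return "%s resized to %dx%d" % (image, width, height)


def rotate(image, degrees):
    return "%s rotated by %d degrees" % (image, degrees)


def unknown(image):
    return "%s left untouched" % image


# Each value takes exactly one positional argument once partial is applied.
actions = {
    "thumbnail": partial(resize, width=64, height=64),
    "flip": partial(rotate, degrees=180),
}


def apply_action(action, image):
    """Dispatch on the action name, falling back to `unknown`."""
    return actions.get(action, unknown)(image)


print(apply_action("thumbnail", "cat.png"))  # cat.png resized to 64x64
```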

  30. Paddy3118 says

    Hi ale,
    Dispatch is handled by the dict method, there is no mention that it cannot do what switch might be used for in other languages. Switch semantics are far from uniform in other languages (I dislike those that allow the statements of one case to flow into those of the next without an explicit break type statement at the end of a case clause).
    Python’s function arguments are very rich and would allow some flexibility in function signature for the uniform dispatch used.
    The dict method is not obvious to those knowing languages that have switch statements – it is a Python idiom that needs to be learned, but a powerful idiom.
    The tips and tricks list is necessary to remind those new to Python, and especially those who program in other languages, that Python is best written using its own idioms rather than trying to use or adapt idioms from other languages. It is not as if the inclusion of a switch statement isn’t discussed by the developers – it is, and maybe your comments were meant for that audience, but if you are going to add new statements to Python, I’d rather have a bigger payback for the resultant language bloat.

    - Paddy.

  31. ale says

    @Paddy
    I think you are exaggerating the variability of switch statement semantics for effect. The variation is minor and it only concerns the fall-through behaviour in C/C++/Java/PHP. Ruby, Lisp, ML and Haskell all share identical semantics, and in all of these cases, including the fall-through ones, the meaning is clear from a naive point of view since the semantics are _explicit_. (There’s another pythonic argument for it)

    Do you honestly think that the dispatch method is as intuitive as a switch statement to a non-programmer? (If yes, then that is all you need to answer) I don’t even think it is as intuitive to a Python programmer: I’ve been a Python programmer for a long time and I can still spot a switch statement’s intentions much quicker, not to mention find the right method being called, since while Python enforces indenting for standard blocks, it can do nothing about a dictionary’s contents.

    Note also that the list of tips is very generic and most would apply equally well to pretty much any other language, but definitely not the dispatch. The point about not carrying bad patterns over from other languages is already made explicit in another item. The dispatch hint just sits by itself, as a non-obvious solution to a problem that people will obviously want to solve.

    But my main objection was to the “pretty neat”, which pervades the c.l.p newsgroup, exulting in the idioms that were invented to paper over the language’s flaws. It’s as if, had the ‘if’ statement not been included, they would defend the use of the dispatch mechanism for that too. If the language can afford a ‘with’, it can definitely afford a ‘switch’ or a ‘case’ or a ‘when’ (I like ‘when’ best). It is very far from bloated still.

  32. Paddy3118 says

    Hi ale,
    I do not think the lack of an explicit switch statement is a major failing of Python. Whether it is a failing leads to some healthy debate, and shows me how Python developers take care with what they add to the language knowing that it is far harder to remove something that proves to be a mistake.
    It is right that such proposals should be debated and the onus is on those who wish to make a change to prove their change would be advantageous.
    I personally think Python gains from having a smaller syntax in general – I am one of those Python users who use several other languages as well as Python, but prefer Python for a lot of my personal projects. Switching languages is helped by Python being readable and concise, so I tend to be cautious about new additions.
    On comp.lang.python, those answering questions are proud of their language and I guess it may show. But there is also a very strong wish to be helpful and to be polite as well as to preserve c.l.p as a good place for people to get help. I am sure if you posted examples of c.l.p being “pretty neat” to the newsgroup, that it would be read.
    If you want to improve Python by giving it a switch statement then you might try trawling the developers mailing list as well as c.l.p. and seeing how you can advance the argument. (One of the more persuasive arguments I have seen is for proponents of change to trawl the standard library and show how “better” it could be with the new feature using it for examples and statistics, as well as doing an implementation, so people can try it out).

    - Paddy.

  33. Markus Jais says

    Great list. I especially agree with you about not reinventing the wheel. I’ve seen it dozens of times that programmers write libraries, modules and other stuff that was already available in the standard library. I’ve seen this in Python as well as Ruby, C++, Java and other languages.
    In most cases the solution that already comes with the language (or is available on the web) is a very good and reliable one and will almost always be better than the one you can create for yourself.

  34. Joe Smith says

    Sweet post! Thanks for the advice :)

  35. dikar says

    thanks

  36. Tom says

    Another nice way to do a switch case is like this:

    def handle_one():
        pass
    def handle_two():
        pass

    someVariable = 'case1'
    # Look up the handler and call it in a single expression.
    switchcase = {
        'case1': handle_one,
        'case2': handle_two
    }[someVariable]()

  37. Trevor says

    Regarding the switch statement discussion going on, Ale is correct, his points are all logically sound. The question comes down to this: is the concept of the switch statement a good one?

    The reason I say this is because you can apply his points to any programming construct, so if you believe the concept of switch is good (regardless of how poorly other languages implement it) then it should be added.

    Anyone who says otherwise is biased by their emotions for some strange reason.



