Ikke's blog » Development
http://eikke.com
'cause this is what I do

Scala tail recursion and decompiler adventures
http://eikke.com/scala-tail-recursion-decompiler/
Wed, 12 Aug 2009 22:31:03 +0000, Nicolas

I’ve been into Scala lately. More about it will follow later, but there’s something I found out which I really like.

Over the last couple of days I wrote some very basic Scala snippets containing constructs which would be non-trivial or ‘unusual’ to write in Java, compiled them to class files, and then used a Java decompiler to figure out how the Scala compiler maps those constructs to JVM bytecode.

There’s one thing which caught my attention: it looks like (basic) tail-recursive functions are optimized into while-loops! This only happens if the last thing a function does is call itself (the most basic form of tail recursion), but it’s an interesting feature anyway… No more need to put socket accept handling in an infinite while loop :-)

A little demo. First, here’s a Scala object which implements a very basic ‘reduce’ function:

object Reducer {
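  // Folds 'values' from left to right, starting from 'initial'.
  def reduce[T, V](fun: (V, T) => V, values: List[T], initial: V): V = {
    if (values.isEmpty)
      return initial
    val next = fun(initial, values.head)
    // The self-call is the last thing the function does: a tail call.
    return reduce(fun, values.tail, next)
  }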

  def main(args: Array[String]): Unit = {
    val values = List(1, 2, 3, 4)
    val sum = reduce[Int, Int]((x, y) => x + y, values, 0)
    println("Result: " + sum)
  }
}

We can compile and run this, and it’ll output the expected result ‘10’:

MacBook:reduce nicolas $ scalac Reducer.scala 
MacBook:reduce nicolas $ scala Reducer
Result: 10

Now we can open the generated class files in JD. There are a couple of them (it’s interesting to take a look at all of them and figure out what they represent exactly), but in this case we need ‘Reducer$.class’, which contains the implementations of our public functions, including ‘reduce’.

Here’s the Java version of the ‘reduce’ function:

public <T, V> V reduce(Function2<V, T, V> fun, List<T> values, V initial)
{
  while (true)
  {
    if (values.isEmpty())
      return initial;
    Object next = fun.apply(initial, values.head());
    initial = next;
    values = values.tail();
  }
}

‘Function2’ is a built-in Scala type which represents a function taking 2 parameters. As you can see, this code does exactly the same as our Scala version, and is most likely the way we’d write the code manually as well (the only thing I don’t get is why ‘next’ is an Object and not a ‘V’; type erasure looks like the likely culprit, since the bytecode the decompiler reads no longer knows about ‘V’). The nice part is that Scala doesn’t force us to write the imperative code ourselves, whilst still producing bytecode which will most likely show the best performance on the JVM, which currently has no tail-call optimization support (although that might change one day).
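The same rewrite can be done by hand in any language. Here is a minimal sketch in Python, which (like the JVM) does not eliminate tail calls itself; the names reduce_recursive and reduce_loop are made up for this illustration and are not part of the Scala code above:

```python
def reduce_recursive(fun, values, initial):
    """Tail-recursive reduce, mirroring the Scala version."""
    if not values:
        return initial
    return reduce_recursive(fun, values[1:], fun(initial, values[0]))


def reduce_loop(fun, values, initial):
    """The same function after turning the tail call into a loop."""
    while True:
        if not values:
            return initial
        # Instead of calling ourselves, rebind the parameters and jump
        # back to the top of the loop.
        initial = fun(initial, values[0])
        values = values[1:]


if __name__ == "__main__":
    add = lambda x, y: x + y
    print(reduce_recursive(add, [1, 2, 3, 4], 0))
    print(reduce_loop(add, [1, 2, 3, 4], 0))
```

Both calls give the same result for short lists. On long lists the recursive variant runs into Python’s recursion limit, whilst the loop keeps working, which is exactly why the compiler doing this rewrite for us is nice.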

I like it :-)

[update]
For reference, here’s a slightly more Scala-ish implementation of reduce, showing the same time performance characteristics during some basic profiling. I was not able to get either JD or jad to generate any usable decompiled code though:

def reduce[T, V](fun: (V, T) => V, values: List[T], initial: V): V = {
    values match {
        case List() => initial;
        case head :: tail => reduce(fun, tail, fun(initial, head))
    }
}

It uses Scala’s “List” pattern matching functionality.

Re: Python recursion performance test
http://eikke.com/re-python-recursion-performance-test/
Thu, 16 Jul 2009 01:00:57 +0000, Nicolas

(This is a reply to a post by Ahmed Soliman on recursion performance in (C)Python, and CPython function call overhead in general. I started to write this as a comment on his post, but it turned out much longer, so I’m posting it over here in the end.)

Hey,

As discussed before, this is not a fair comparison, since the non-recursive version is much ‘smarter’ than the recursive one: it calculates each value once and never recalculates it, whilst the recursive version calculates everything over and over again.

Adding some simple memoization helps a lot. First, my testing code:

Here are the benchmarks on my MacBook Pro (Intel Core2Duo 2.33GHz, 3GB RAM, running quite a lot of applications). Do note the ‘dumb’ version calculates fib(35), whilst the slightly optimized versions (which still use recursion, but with far fewer recursive calls, as they should) and your second version calculate fib(150).

Using MacOS X 10.5.6 stock CPython 2.5.1:

MacBook:Projects nicolas $ python -V
Python 2.5.1

MacBook:Projects nicolas $ python fib.py 35 150
fib(35) = 9227465
Calculation took 12.8542108536 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.00020694732666 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.00141310691833 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.000310182571411 seconds

Overall it looks like fib2 and memoize_constant_list perform fairly similarly; I guess function call overhead and list.append have a similar influence on performance in this case.

Using Jython 2.5.0 from the binary distribution on the Java HotSpot 64bit Server VM as shipped for OS X 10.5.6:

MacBook:Projects nicolas $ ./Jython/jython2.5.0/jython -V 
Jython 2.5.0

MacBook:Projects nicolas $ ./Jython/jython2.5.0/jython fib.py 35 150
fib(35) = 9227465
Calculation took 12.5539999008 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.0519998073578 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.00399994850159 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.00300002098083 seconds

The ‘dumb’ fib implementation performs similarly in both CPython and Jython. Jython performs significantly worse on the other implementations though. Maybe today’s news could help here; I’m not sure how much locking on dict and list access Jython introduces.

Finally, using Unladen Swallow 2009Q2, self-compiled from SVN on the same system, using standard settings:

MacBook:Projects nicolas $ ./unladen-swallow/unladen-2009Q2-inst/bin/python -V
Python 2.6.1

MacBook:Projects nicolas $ ./unladen-swallow/unladen-2009Q2-inst/bin/python fib.py 35 150
fib(35) = 9227465
Calculation took 12.2675719261 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.000118970870972 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.000972986221313 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.00036096572876 seconds

which is similar to, slightly better or slightly worse than the CPython run, and when enforcing JIT (which introduces a significant startup time, which is not measured here):

MacBook:Projects nicolas $ ./unladen-swallow/unladen-2009Q2-inst/bin/python -j always fib.py 35 150
fib(35) = 9227465
Calculation took 14.6129109859 seconds

Calculating the amount of recursive calls to calculate fib(35)
Calculating fib(35) = 9227465 took 29860703 calls

fib2(150) = 9969216677189303386214405760200
Calculation took 0.0432291030884 seconds

memoize_dict(fib)(150) = 9969216677189303386214405760200
Calculation took 0.0363459587097 seconds

memoize_constant_list(151, fib)(150) = 9969216677189303386214405760200
Calculation took 0.0335609912872 seconds

which, to my surprise, performs considerably worse than the default settings.

Overall: your first implementation performs tons and tons of function calls, whilst the second one, which resembles memoize_list_fib in my code (which is recursive), performs significantly fewer function calls, and in the end memoize_list_fib performs almost as well as your second version (it performs roughly the same number of function calls as the number of times you’re going through your loop).

So whilst I do agree function calls in Python are reasonably slow compared to plain C function calls (which need little more than a call instruction and some minimal stack handling), your comparison between your recursive and non-recursive implementations is completely unfair, and even if calculating fib(35) takes several seconds, consider that you’re doing a pretty impressive 29860703 function calls to perform the calculation.
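To make the difference in call counts concrete, here is a minimal sketch in Python 3 syntax. It is not the benchmark script used for the runs above: the helper names make_counting_fib and make_memoized_fib, and the dict-based cache, are assumptions chosen for illustration.

```python
def make_counting_fib():
    """Returns a naive recursive fib and a one-element call counter."""
    calls = [0]

    def fib(n):
        calls[0] += 1
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)

    return fib, calls


def make_memoized_fib():
    """Returns a recursive fib that remembers results in a dict."""
    calls = [0]
    cache = {}

    def fib(n):
        calls[0] += 1
        if n < 2:
            return n
        if n not in cache:
            # Every value is computed only once; later requests for it
            # are answered from the cache.
            cache[n] = fib(n - 1) + fib(n - 2)
        return cache[n]

    return fib, calls


if __name__ == "__main__":
    naive, naive_calls = make_counting_fib()
    print(naive(20), naive_calls[0])

    memoized, memo_calls = make_memoized_fib()
    print(memoized(150), memo_calls[0])
```

With this base case the naive version needs 2·fib(n+1) − 1 calls, which for fib(35) is exactly the 29860703 calls reported above, whilst the memoized version needs a number of calls that only grows linearly with n.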

Time to get some sleep.

First Clojure experiments
http://eikke.com/first-clojure-experiments/
Sun, 12 Jul 2009 00:28:25 +0000, Nicolas

Some weeks ago I attended JavaOne (a pretty neat conference, even for non-Java-heads like me) and got in touch with several non-Java languages running on the JVM (nothing really new next to Project Fortress, but I never got into most for real).

Since I wanted to learn some language not resembling any other I already know (even a little), I decided some hours ago to start digging into Clojure, which is a LISP dialect running on the JVM using STM (Software Transactional Memory) and created with concurrency in mind. Check the website for more information.

After some hacking I got a first ‘application’ running. Since there’s recently been a little meme at work regarding echo servers, I decided to write a very basic line-oriented echo server in Clojure.

The result is a server using one thread per connection which just sends back lines to a connected client as-is. Nothing fancy, but might be a useful start for developing basic network applications using Clojure.
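The one-thread-per-connection design itself is independent of the language. As a rough sketch of the idea in Python rather than Clojure (so not the code described above; the names EchoHandler, EchoServer and start_server are made up for this example), the standard socketserver module is enough:

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    """Handles one client: read a line, send it back unchanged, repeat."""

    def handle(self):
        # self.rfile and self.wfile are file-like wrappers around the socket.
        for line in self.rfile:
            self.wfile.write(line)


class EchoServer(socketserver.ThreadingTCPServer):
    # ThreadingTCPServer starts a new thread for every accepted connection.
    allow_reuse_address = True
    daemon_threads = True


def start_server(host="127.0.0.1", port=0):
    """Starts the echo server in a background thread and returns it.

    Port 0 asks the operating system for any free port.
    """
    server = EchoServer((host, port), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_server()
    host, port = server.server_address
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(b"hello\n")
        print(client.makefile("rb").readline())
    server.shutdown()
    server.server_close()
```

The accept loop lives inside serve_forever; the handler only has to deal with a single client, which keeps the per-connection code as simple as the description above suggests.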

Enjoy!

Erlang, Python and Twisted mashup using TwOTP
http://eikke.com/erlang-python-and-twisted-mashup-using-twotp/
Sun, 19 Apr 2009 19:03:30 +0000, Nicolas

Recently, I’ve been toying around with Erlang again. After creating some simple apps I wanted to integrate some Erlang code inside a Python application (since that’s still my favorite day-to-day language, it’s used at work and I’m sort-of convinced Erlang would be a good choice for several of the applications we need to develop, integrated with our existing Python code). The most obvious solution would be to use an Erlang port, but this is IMHO rather cumbersome: it requires a developer to define a messaging format, parsing code for incoming messages, etc. There’s a tutorial available if you want to take this route.

A more elegant solution is creating a node using Python, similar to JInterface and equivalents. Luckily there’s an existing project working on a library to create Erlang nodes using Python and Twisted: TwOTP.

One downside: it’s rather underdocumented… So here’s a very quick demo of how to call functions on an Erlang node from within a Twisted application.

First of all we’ll create 2 Erlang functions: one which returns a simple “Hello” message, one which uses an extra process to return ‘pong’ messages on calls to ‘ping’, and counts those.

The code:

-module(demo).
-export([hello/1, ping/0, start/0]).

hello(Name) ->
    Message = "Hello, " ++ Name,
    % Print via "~s" so a "~" inside Name is not taken as a format directive
    io:format("~s~n", [Message]),
    Message.

ping_loop(N) ->
    receive
        {get_id, From} ->
            From ! {pong, N},
            ping_loop(N + 1)
    end.

ping() ->
    pingsrv ! {get_id, self()},
    receive
        {pong, N} -> ok
    end,
    {pong, N}.

start() ->
    Pid = spawn_link(fun() -> ping_loop(1) end),
    register(pingsrv, Pid).

This should be straight-forward if you’re familiar with Erlang (which I assume).

The Python code is not that hard to get either: it follows the basic Twisted pattern. First one should create a connection to EPMD, the Erlang Port Mapper Daemon (used to find other nodes), then a connection to the server node should be created, and finally functions can be called (calls happen the same way as Erlang’s RPC module).

Here’s the code. I’d advise reading it bottom-to-top:

import sys

from twisted.internet import reactor
import twotp

def error(e):
    '''A generic error handler'''
    print 'Error:'
    print e
    reactor.stop()

def do_pingpong(proto):
    def handle_pong(result):
        # Parse the result
        # 'ping' returns a tuple of an atom ('pong') and an integer (the pong
        # id)
        # In TwOTP, an Atom object has a 'text' attribute, which is the string
        # form of the atom
        text, id_ = result[0].text, result[1]
        print 'Got ping result: %s %d' % (text, id_)
        # Recurse
        reactor.callLater(1, do_pingpong, proto)

    # Call the 'ping' function of the 'demo' module
    d = proto.factory.callRemote(proto, 'demo', 'ping')
    # Add an RPC call handler
    d.addCallback(handle_pong)
    # And our generic error handler
    d.addErrback(error)

def call_hello(proto, name):
    def handle_hello(result):
        print 'Got hello result:', result
        # Erlang strings are lists of numbers
        # The default encoding is Latin1, this might need to be changed if your
        # Erlang node uses another encoding
        text = ''.join(chr(c) for c in result).decode('latin1')
        print 'String form:', text
        # Start pingpong loop
        do_pingpong(proto)

    # Call the 'hello' function of the 'demo' module, and pass in argument
    # 'name'
    d = proto.factory.callRemote(proto, 'demo', 'hello', name)
    # Add a callback for this function call
    d.addCallback(handle_hello)
    # And our generic error handler
    d.addErrback(error)

def launch(epmd, remote, name):
    '''Entry point of our demo application'''
    # Connect to a node. This returns a deferred
    d = epmd.connectToNode(remote)
    # Add a callback, called when the connection to the node is established
    d.addCallback(call_hello, name)
    # And add our generic error handler
    d.addErrback(error)

def main():
    remote = sys.argv[1]
    name = sys.argv[2]
    # Read out the Erlang cookie value
    cookie = twotp.readCookie()
    # Create a name for this node
    this_node = twotp.buildNodeName('demo_client')
    # Connect to EPMD
    epmd = twotp.OneShotPortMapperFactory(this_node, cookie)
    # Call our entry point function when the Twisted reactor is started
    reactor.callWhenRunning(launch, epmd, remote, name)
    # Start the reactor
    reactor.run()

if __name__ == '__main__':
    main()

Finally, to run it, you should first start a server node, and run the ‘pingsrv’ process:

MacBook:pyping nicolas$ erl -sname test@localhost
Erlang (BEAM) emulator version 5.6.5 [source] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.6.5  (abort with ^G)
(test@localhost)1> c(demo).
{ok,demo}
(test@localhost)2> demo:start().
true

Notice we started erl providing test@localhost as short node name.

Now we can launch our client:

(pythonenv)MacBook:pyping nicolas$ python hello.py 'test' Nicolas
Got hello result: [72, 101, 108, 108, 111, 44, 32, 78, 105, 99, 111, 108, 97, 115]
String form: Hello, Nicolas
Got ping result: pong 1
Got ping result: pong 2
Got ping result: pong 3

‘test’ is the shortname of the server node.
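The ‘Got hello result’ line shows what an Erlang string looks like once it arrives in Python: a plain list of character codes. A small stand-alone sketch of the conversion in Python 3 syntax (the helper name erlang_string_to_text is made up, and Latin-1 is assumed as in the client code above):

```python
def erlang_string_to_text(codes, encoding="latin-1"):
    """Turns an Erlang string (a list of small integers) into Python text.

    Assumes every element is a byte value (0-255) in the given encoding.
    """
    return bytes(codes).decode(encoding)


if __name__ == "__main__":
    hello = [72, 101, 108, 108, 111, 44, 32, 78, 105, 99, 111, 108, 97, 115]
    print(erlang_string_to_text(hello))  # prints "Hello, Nicolas"
```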

You can stop the ping loop using CTRL-C. If you restart the client afterwards, you can see the ping IDs were retained:

(pythonenv)MacBook:pyping nicolas$ python hello.py 'test' Nicolas
Got hello result: [72, 101, 108, 108, 111, 44, 32, 78, 105, 99, 111, 108, 97, 115]
String form: Hello, Nicolas
Got ping result: pong 4
Got ping result: pong 5

That’s about it. Using TwOTP you can also develop a node which exposes functions, which can be called from an Erlang node using rpc:call/4. Check the documentation provided with TwOTP for a basic example of this feature.

Combining Erlang applications as distributed, fault tolerant core infrastructure and Python/Twisted applications for ‘everyday coding’ can be an interesting match in several setups, and TwOTP provides all required functionality to integrate the 2 platforms easily.

Python factory-like type instances
http://eikke.com/python-factory-like-type-instances/
Mon, 11 Feb 2008 19:31:20 +0000, Nicolas

When designing applications or libraries, sometimes you need to be able to create instances of a certain interface (in a liberal sense) at runtime without knowing at write/compile time which specific implementation (class) you’ll need to use, as this could depend on runtime variables.

An example of this is an interface providing some functionality which should be implemented differently on different platforms, e.g. Linux and Windows.

There are some standard patterns for achieving this. One of them is the factory pattern, which works somewhat like this Python example (let’s pretend ‘PLATFORM’ is ‘linux2’ or ‘win32’, i.e. sys.platform):

#Pretend we use sys.platform instead of PLATFORM where we use it
PLATFORM = 'linux2'

class FooBase(object):
    def say_foo(self):
        print 'foo'

class PlatformFoo(FooBase):
    def say_platform_foo(self):
        raise NotImplementedError

    @staticmethod
    def get_class():
        #Several ways to get this (dict, introspection, if-tree,...), pick yours
        klass = {
            'linux2': LinuxFoo,
            'win32': WindowsFoo,
        }.get(PLATFORM, None)
        if not klass:
            raise Exception('Platform not supported')
        return klass

class WindowsFoo(PlatformFoo):
    def say_platform_foo(self):
        print 'win32 foo'

class LinuxFoo(PlatformFoo):
    def say_platform_foo(self):
        print 'linux foo'

def main():
    foo_class = PlatformFoo.get_class()
    foo = foo_class()
    foo.say_platform_foo()

if __name__ == '__main__':
    main()

Executing this code will, as expected, write ‘linux foo’ to the console. Obviously, the PlatformFoo function could return an actual instance instead of the platform-specific class; that’s up to you.

Python allows you to handle this situation somewhat nicer though, without introducing any intermediate functions, by using metaclasses.

I won’t explain what metaclasses are here, or how they work; there are several resources on the internet explaining them. Let’s just get to the code:

#Pretend we use sys.platform instead of PLATFORM where we use it
PLATFORM = 'linux2'

class FooBase(object):
    def say_foo(self):
        print 'foo'

    def say_platform_foo(self):
        raise NotImplementedError

class WindowsFoo(FooBase):
    def say_platform_foo(self):
        print 'win32 foo'
         
class LinuxFoo(FooBase):
    def say_platform_foo(self):
        print 'linux foo'


class FooMeta(type):
    def __new__(cls, name, bases, attrs):
        #Several ways to get this (dict, introspection, if-tree,...), pick yours
        klass = {
            'linux2': LinuxFoo,
            'win32': WindowsFoo,
        }.get(PLATFORM, None)
        if not klass:
            raise Exception('Platform not supported')
        return klass

class Foo:
    __metaclass__ = FooMeta


def main():
    foo = Foo()
    foo.say_platform_foo()

if __name__ == '__main__':
    main()

See how we don’t need any getter function here, and can just create a ‘Foo’ instance? The resulting object will be a ‘LinuxFoo’ or a ‘WindowsFoo’ as expected, depending on the value of ‘PLATFORM’. The above code also displays ‘linux foo’.

There’s something nifty about it too: you don’t lose any class inheritance information:

In [1]: from test2 import Foo, LinuxFoo, FooBase

In [2]: f = Foo()

In [3]: f.say_platform_foo()
linux foo

In [4]: type(f)
Out[4]: <class 'test2.LinuxFoo'>

In [5]: isinstance(f, LinuxFoo)
Out[5]: True

In [6]: isinstance(f, FooBase)
Out[6]: True

In [7]: isinstance(f, Foo)
Out[7]: True

This shouldn’t surprise you, as ‘Foo’ actually became an alias for ‘LinuxFoo’.
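The listings above use Python 2 syntax, where the metaclass is set through the ‘__metaclass__’ attribute. Python 3 ignores that attribute and expects a ‘metaclass’ keyword in the class statement instead. Here’s a minimal sketch of the same trick in that spelling; ‘PLATFORM’ is still a stand-in for sys.platform, and the methods return their text instead of printing it, purely to keep the example easy to check:

```python
# Stand-in for sys.platform, as in the examples above.
# (Python 3 reports 'linux' rather than 'linux2'.)
PLATFORM = 'linux'


class FooBase:
    def say_platform_foo(self):
        raise NotImplementedError


class WindowsFoo(FooBase):
    def say_platform_foo(self):
        return 'win32 foo'


class LinuxFoo(FooBase):
    def say_platform_foo(self):
        return 'linux foo'


class FooMeta(type):
    def __new__(mcs, name, bases, attrs):
        # Ignore the class being defined and hand back the platform class.
        klass = {'linux': LinuxFoo, 'win32': WindowsFoo}.get(PLATFORM)
        if klass is None:
            raise Exception('Platform not supported')
        return klass


class Foo(metaclass=FooMeta):
    pass


if __name__ == '__main__':
    foo = Foo()
    print(foo.say_platform_foo())
    print(Foo is LinuxFoo)
```

Because ‘__new__’ returns an existing class, the name ‘Foo’ ends up bound to ‘LinuxFoo’ itself, which is why all the isinstance checks in the session above succeed.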

Maybe this pattern will be useful in one of your projects one day, who knows :-)

How not to write Python code
http://eikke.com/how-not-to-write-python-code/
Fri, 08 Feb 2008 20:50:16 +0000, Nicolas

Lately I’ve been reading some rather unclean Python code. Maybe this is mainly because the author(s) of the code had no in-depth knowledge of the Python language itself, the ‘platform’ delivered with CPython,… Here’s a list of some of the mistakes you should really try to avoid when writing Python code:

  • Remember Python comes batteries included
    Python is shipped with a whole bunch of standard modules implementing a broad range of functionality, including text handling, various data types, networking stuff (both low- and high-level), document processing, file archive handling, logging, etc. All these are documented in the Python Library Documentation, so it is a must to browse at least through the list of available modules, so you get some notion of what you can use by default. An example: don’t introduce a dependency on Twisted to implement a very basic and simple custom HTTP server if you don’t have any performance needs, use BaseHTTPServer and its derivatives.
  • Python is Python, don’t try to emulate bad coding patterns from other languages
    Python is a mature programming language which provides great flexibility, but also has some pretty specific patterns which you might not know in other languages you used before.
    As an example, don’t try to emulate PHP’s ‘include’ or ‘require’ function, at all. This could be done, somewhat, by writing the code to be included (and executed on inclusion) in a module on the top level (ie. not in functions/classes/…), and using something like ‘from foo import *’ where you want this code to be executed. This will work, but it can become hard to maintain this. Modules are not meant to be used like this, so don’t. If you need to execute some code at some point, put it in a module as a function, import the function and call it wherever you want.
  • Don’t pollute the global namespace
    Do not use ‘from foo import *’, as this will pull in everything defined in foo, but also all modules imported in foo, and maybe even their imports, etc. Try to ‘import foo’ and use foo.whatever, or use ‘from foo import whatever, somethingelse’. Explicit imports make code much more readable, and make it much easier to figure out in which module a name you’re using is defined, which is hard when it could have come from any of your many star imports.
  • Use Pythonesque coding patterns
    This is closely related to the previous items, obviously. Python has some well-known constructs to handle some situations. Get to know and understand them.
    An example: as you might know, Python has no switch/case construct. There’s a very neat way to implement this though by simply using a dict and function objects (or lambda functions). An example:

    def handle_one():
        return 'one'
    def handle_two():
        return 'two'
    def handle_default():
        return 'unknown'
    cases = {
        'one': handle_one,
        'two': handle_two,
        'three': lambda: 'three',
    }
    for i in ('one', 'two', 'three', 'four', ):
        handler = cases.get(i, handle_default)
        print handler()

    We’re using the dict.get method here, which can take an optional ‘default’ argument. Pretty neat, huh?

  • Don’t reinvent the wheel
    Related to #1. An example? Python contains a great ‘logging’ module, which includes advanced functionality like logging over the network, over HTTP, defining multiple logging targets, target trees,… No need to reimplement this yourself!
  • Document your code
    Python has this great language feature called docstrings. Sprinkle them throughout your code rigorously. Do this while writing your functions/classes, not afterwards. Everyone knows that’s extremely boring and depressing.
  • Write tests
    Write testing code. Python includes at least 2 ways to write tests: using standard unit tests, or using doctests, test code snippets included in your docstrings, both useful and illustrative. There’s no way to know some code refactoring went well if you can’t test the result.
  • Use error reporting wisely
    Python includes exception handling. Use this wisely: when something goes wrong in a function which would normally return a string to be displayed to the user, don’t just return a normal string with some error message inside, as if everything went well. Raise an exception carrying the message instead, so the calling code will know something went wrong (and maybe act on this information), whilst still being able to display the error message to the user.
    Next to this, subclass Exception (or a more specific Exception child class); don’t just raise bare Exceptions, except in some basic circumstances. An exception class doesn’t need to be huge: ‘class FooException(Exception): pass’ does the job.
  • Don’t turn off error reporting during development
    In some cases it’s useful to make sure your application keeps on running, no matter what happens (this is e.g. how Twisted handles server handler exceptions). Python provides some ways to achieve this, so in case you need it you can use it, but make sure you provide a way to disable this, so you can tell your application to crash hard on exceptions during development. This way you’ll certainly notice the issue and you’ll be able to fix it early.
  • Search the web!
    Lots of great people wrote thousands of Python modules for lots of things. Many of them use the very liberal Python license, which allows you to re-use this code even in a closed-source environment. PyPI can be a great place to start.
  • Use Python basic built-in functions
    A basic example: to check whether a function parameter is of a certain type, don’t use something like ‘arg.__class__ == MyClass’, use ‘isinstance(arg, MyClass)’. Did you know isinstance’s second argument can be a tuple of types? If it is, arg will be checked against all types in this tuple, so there’s no need to do several ‘isinstance’ calls. Other useful built-ins are getattr/setattr/hasattr (obviously), issubclass,…
  • Use non-instance-specific class methods where useful
    Just like many other programming languages, Python allows you to add static methods to a class. Just decorate your method using the ‘staticmethod’ decorator!
    Next to static methods, Python knows the concept of class methods, which get the class as argument. You most likely won’t need these often.
  • Learn ‘functional programming’ basics
    At first it can be hard to wrap your head around functional programming patterns, but they allow a very convenient and clean way to handle several situations.
  • Don’t mess with sys.path
    If you need to import ‘external’ modules, try not to mess with sys.path. Use distutils functions to discover modules, ship them as eggs,… If you want to alter sys.path anyway, try not to hardcode any ‘base’ paths: generic paths are a major plus, and removing hardcoded path stuff can be a PITA.
  • Use an interactive shell
    A Python shell like IPython is a must-have. I’m completely addicted to the tab-completion and documentation shortcuts it provides.
  • Use a code metrics tool
    I personally use PyLint (with some rules disabled). This tool will check your code for various things: missing imports, typos, wrong variable/function/class/module naming, syntax errors,… which could be in your code even if your test suite runs fine. Maybe you can even add a hook to the VCS you’re using, which doesn’t allow you to check in code unless it got a PyLint score of eg. 7. Extremely useful!
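Two of the hints above, raising a small custom exception instead of returning an error string and passing a tuple of types to isinstance, fit in one short sketch. It uses Python 3 syntax, and the ‘describe’ function is a made-up example, not code from any real project:

```python
class FooException(Exception):
    """Raised when a value cannot be described."""


def describe(value):
    """Returns a short description of an int, float or str.

    Raises FooException for anything else, instead of returning an
    error message disguised as a normal result.
    """
    # One isinstance call checks against several types at once.
    if isinstance(value, (int, float)):
        return 'number: %s' % value
    if isinstance(value, str):
        return 'text: %s' % value
    raise FooException('Cannot describe %r' % (value,))


if __name__ == '__main__':
    for item in (42, 'spam', None):
        try:
            print(describe(item))
        except FooException as exc:
            # The caller knows something went wrong, and can still show
            # the message to the user.
            print('Error: %s' % exc)
```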

Some days ago RealNitro pointed me at this list of essential Python readings. “Idiomatic Python” is a must-read, even for experienced Python developers.

That’s about it for now, maybe I’ll add some more items to this list later on. If you have some other hints, leave a comment!

django-validation now includes inheritance support
http://eikke.com/django-validation-now-includes-inheritance-support/
Fri, 18 Jan 2008 03:33:39 +0000, Nicolas

I’m happy to announce django-validation gained field type inheritance support a couple of minutes ago. This means your form fields will be validated starting from the most basic field type (django.newforms.Field) up to the actual field type (multiple inheritance is not supported though).

In the example I wrote yesterday, when using a TestField field, this field will be validated as a django.newforms.Field (a “required” check will be done), then as a django.newforms.CharField (“min_length” and “max_length” checks), and finally as a TestField. A normal CharField would be validated as a Field first, then as a CharField, etc.

The returned errors will be a list of all errors found, starting with the most basic one (the ones found by the most general class, Field).
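The ordering described here, base class first and most specific class last, can be sketched in plain Python by walking a class’s method resolution order in reverse. This is not django-validation’s code (which generates Javascript): the classes below are simplified stand-ins for the Django field types, and the ‘_check’ and ‘validate’ names are invented for the illustration.

```python
class Field:
    def __init__(self, required=True):
        self.required = required

    def _check(self, value):
        if self.required and value == '':
            return ['This field is required.']
        return []


class CharField(Field):
    def __init__(self, min_length=None, max_length=None, **kwargs):
        super().__init__(**kwargs)
        self.min_length = min_length
        self.max_length = max_length

    def _check(self, value):
        errors = []
        if self.min_length is not None and len(value) < self.min_length:
            errors.append('Ensure this value has at least %d characters.'
                          % self.min_length)
        if self.max_length is not None and len(value) > self.max_length:
            errors.append('Ensure this value has at most %d characters.'
                          % self.max_length)
        return errors


class TestField(CharField):
    def __init__(self, required_value, **kwargs):
        super().__init__(**kwargs)
        self.required_value = required_value

    def _check(self, value):
        if value != self.required_value:
            return ['Field value should be %s.' % self.required_value]
        return []


def validate(field, value):
    """Runs every class's own check, from the most general class down."""
    errors = []
    for klass in reversed(type(field).__mro__):
        # Look in the class's own namespace, so an inherited check is
        # not run a second time for the subclass.
        check = klass.__dict__.get('_check')
        if check is not None:
            errors.extend(check(field, value))
    return errors


if __name__ == '__main__':
    field = TestField('I like django-validation', max_length=5)
    for error in validate(field, 'too long'):
        print(error)
```

With single inheritance the reversed MRO is simply the chain object, Field, CharField, TestField, so the errors come out in the order described above.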

Next to this, all generated Javascript code should be namespaced now (based on Python module and class names), although there might be some bad things left, I’m no Javascript guru. The generated code might be somewhat messy.

Current Python code is most certainly ugly and will need more rewrites. Next to this, other field types should be added, and some tests would be nice too.

I made a snapshot of yesterday’s sample (with some changes, the ClientValidator API slightly changed), you can try it here.

django-validation: an introduction
http://eikke.com/django-validation-an-introduction/
Tue, 15 Jan 2008 23:59:58 +0000, Nicolas

Some time ago I wrote this generic AJAX Django form validation code. Some people didn’t like this, as AJAX should not be used to perform form validation, which is sometimes true, sometimes not, as I pointed out before.

So I’ve been thinking for some time about creating a Django templatetag which allows one to generate client-side Javascript form validation code without writing any code oneself (unless using custom widgets). Today I got into it.

The resulting project is called django-validation. It basically allows one to write a Newforms form class, and generate client-side validation code for this form in a template. Currently only CharField validation is implemented; more should follow soon and is easy to add.

Next to validation of built-in field types, one can also add code to validate custom fields. This can be done inside an inner class of the field class.

The current release is very alpha-grade software, a lot of enhancements could be done, and most certainly more standard field type validators should be written. Next to this, field type inheritance isn’t supported for now (so if your field type A inherits CharField, no CharField validation will be done), this should change soon.

Patches are, obviously, very welcome!

Here’s a sample of how to use it. First we define a very basic form:

class TestForm(forms.Form):
    first_name = forms.CharField(max_length=128)
    test = TestField(required_value=u'I like django-validation')

This form uses a custom class (just for demonstration purposes). This class only performs client-side validation, no clean() function is provided, although in real field types this should obviously be added. More information can be found in the inline comments:

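# newforms is imported as 'forms' on Django 0.96-era code;
# on Django 1.0 and later this is simply 'from django import forms'.
from django import newforms as forms
from validation.templatetags.validation import add_error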
class TestField(forms.CharField):
    def __init__(self, *args, **kwargs):
        if 'required_value' not in kwargs:
            raise Exception('required_value should be provided')
        # Remove our own argument before CharField sees it
        self.required_value = kwargs.pop('required_value')
        super(TestField, self).__init__(*args, **kwargs)

    class ClientValidator:
        '''
        This inner class knows how to generate Javascript validation code for this field type.
        The code will be pasted inside a function block. There is at least one assigned variable, 'field',
        which is the DOM element we got to validate.

        More parameters can be defined in the 'parameters' attribute. These parameters will be added
        to the Javascript function prototype, and the value of the form field parameter value will be
        assigned.

        We need to define the field class name, so the django-validation code can generate suitable
        function names.
        '''
        parameters = ('required_value', )
        field_class = 'TestField'

        def render(self, output):
            '''
            The render function should output Javascript code to check the value of the 'field' element.
            It is called inside a Javascript function scope.

            output is a StringIO object. Normally only write or writelines calls should be used.
            '''
            output.write('value = field.value;\n')
            output.write('if(value != required_value) {\n')
            # This error message should be internationalized.
            # See the add_error documentation.
            add_error(output, 'error_msg', 'Field value should be %(value)d.', (('%(value)d', 'required_value'), ))
            output.write('    return [error_msg];\n')
            output.write('}\n')
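
To get a feel for what such a render method produces, it can be called by hand with a StringIO buffer. This is a standalone sketch in modern Python without Django: `DemoValidator` is a hypothetical cut-down copy of the inner class above, and the `add_error` call is left out because its output isn't shown in this post.

```python
import io


class DemoValidator(object):
    """Hypothetical, Django-free copy of the ClientValidator idea."""
    parameters = ('required_value', )
    field_class = 'TestField'

    def render(self, output):
        # Emit the body of the Javascript check; 'field' and
        # 'required_value' are expected to be function arguments.
        output.write('value = field.value;\n')
        output.write('if(value != required_value) {\n')
        # The real code calls add_error() here to define error_msg.
        output.write('    return [error_msg];\n')
        output.write('}\n')


buffer = io.StringIO()
DemoValidator().render(buffer)
print(buffer.getvalue())
```

The printed text is only the function body; wrapping it in a Javascript function and filling in the parameters is the template tag's job.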

Finally, here’s how to use it inside a template (this must be one of the worst HTML/JavaScript snippets I ever wrote):

{% load validation %}
<html>
<body>
    <form>
        {{ form.as_p }}
        <p><input type="submit" onclick="return testform(this.form);" />
    </form>
    <div id="errors"></div>
    {% validation_js form 'validate_testform' %}
    <script type="text/javascript">
        function testform(form) {
            var valid = true;
            var errors = validate_testform(form);
            var err = "<ul>";
            for(var field in errors) {
                if(errors[field] != null) {
                    err += '<li>' + field + ': ' + errors[field][0] + '</li>';
                    valid = false;
                }
            }
            err += '</ul>';
            if(valid)
                err = 'Form is valid';
            document.getElementById('errors').innerHTML = err;
            return false;
        }
    </script>
</body>
</html>

The current code is available in a git repository. Enjoy!

]]>
http://eikke.com/django-validation-an-introduction/feed/ 6
Filesystem issues and django-couchdb work http://eikke.com/filesystem-issues-and-django-couchdb-work/ http://eikke.com/filesystem-issues-and-django-couchdb-work/#comments Sun, 30 Dec 2007 01:47:22 +0000 Nicolas http://eikke.com/filesystem-issues-and-django-couchdb-work/ Last night, when shutting down my laptop (which had been up for quite a long time because of suspend/resume niceness), it crashed. I don’t know exactly what happened: I pressed GNOME’s logout button and applications were closed until only my background was visible, then the system locked up. I suspect some part of my X server; the GPU driver (fglrx) might be the bad guy. I was able to sysrq s u o, so I thought everything would be relatively fine.

This morning I powered on my system, and while booting, fsck of some partitions was taking a rather long time. It’s pretty normal for fsck to take somewhat longer after a crash, but not thát long… I’m using JFS on most logical volumes.

When the consistency check of my /home partition was done, a whole load of invalid files was displayed and later on moved to lost+found: 34068 files. Once booted, I scanned my filesystems again, rebooted, logged in, started X. Everything started fine, until I launched Evolution: it presented me the ‘initial run’ wizard. Other issues (at first sight): all Firefox cookies were gone, and Pidgin’s blist.xml was corrupted. When using my old computer (which had frequent lockups on heavy IO usage) these last 2 issues happened a lot too, which is highly annoying, especially the blist.xml thing, as I can’t see any reason to keep this file open for long periods.

Luckily I was able to get my Evolution up and running again by restoring its GConf settings and ~/.evolution from an old backup (15/10/07). I guess I should back up more regularly… Next to this I hope I won’t find any other corrupted files, so the ones in lost+found are just Evolution email files and Firefox caches.

Anyway, here’s a screenshot displaying some of the initial and hackish work I’ve done this evening on integrating Django and CouchDB as I wrote about yesterday:

Django and CouchDB first shot

As you can see, currently I’m able to edit fields of an object. There’s one major condition: an object with the given ID should already exist in the ‘database’, which makes the current code rather useless, but hey ;-) I’ll add object creation functionality later tonight or tomorrow.

Current code is very expensive too, doing way too many queries to CouchDB, mainly in client.py. This most certainly needs work.

Upgraded my WordPress installation to the latest release, 2.3.2, in about 5 seconds. Got to love svn switch (although maybe I should start using git-svn for this installation too and git-pull the release branch in my local copy).

]]>
http://eikke.com/filesystem-issues-and-django-couchdb-work/feed/ 9
Django domain redirect middleware http://eikke.com/django-domain-redirect-middleware/ http://eikke.com/django-domain-redirect-middleware/#comments Wed, 26 Dec 2007 15:02:12 +0000 Nicolas http://eikke.com/2007/12/26/django-domain-redirect-middleware/ Most web developers know it’s possible to serve one site on several domains or subdomains, using e.g. the ServerAlias Apache directive. Sometimes this behaviour is not wanted: you want your main domain to be “foo.tld”, although “www.foo.tld” should work too, and maybe even some completely different domains.

This way it’s possible to have permalinks, and you won’t get bad points from search engine spiders who don’t like the same content to be available on several URIs.

In Django there’s the PREPEND_WWW setting, which will force all requests coming in on foo.tld to be redirected to www.foo.tld. This functionality is rather limited though. As I wanted to be able to have one unique main domain in my new application, I wrote a middleware which accomplishes this in a very generic way, using the django.contrib.sites framework. You need to add this middleware before all others in your settings file, even before CommonMiddleware.
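
In settings terms that ordering would look roughly like this. The module path `myproject.middleware` is a placeholder for wherever you save the class, and `SITE_ID = 1` assumes the canonical domain lives on the default site record; the other entries are the stock Django middleware of that era.

```python
# settings.py (fragment)

# The sites framework needs to know which Site row is "current".
SITE_ID = 1  # assumption: the canonical domain is stored on site 1

MIDDLEWARE_CLASSES = (
    # Hypothetical path; must come first so the redirect happens before
    # any other middleware touches the request.
    'myproject.middleware.DomainRedirectMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
)
```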

Here’s the code:

from django.contrib.sites.models import Site
from django.http import HttpResponsePermanentRedirect
from django.core.urlresolvers import resolve
from django.core import urlresolvers
from django.utils.http import urlquote

class DomainRedirectMiddleware(object):
    def process_request(self, request):
        host = request.get_host()
        site = Site.objects.get_current()

        if host == site.domain:
            return None

        # Only redirect if the request is a valid path
        try:
            # One issue here: won't work when using django.contrib.flatpages
            # TODO: Make this work with flatpages :-) 
            resolve(request.path)
        except urlresolvers.Resolver404:
            return None

        new_uri = '%s://%s%s%s' % (
                request.is_secure() and 'https' or 'http',
                site.domain,
                urlquote(request.path),
                (request.method == 'GET' and len(request.GET) > 0) and '?%s' % request.GET.urlencode() or ''
            )

        return HttpResponsePermanentRedirect(new_uri)

Make sure you have the sites framework set up correctly when using this! As noted in the code, this doesn’t work with the Flatpages framework yet; that is still a TODO.
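
The URL-building step is easy to try outside Django. Below is a standalone sketch in modern Python; `build_redirect_uri` and its parameters are names invented for this example, and `urllib.parse.quote` stands in for Django's `urlquote`.

```python
from urllib.parse import quote


def build_redirect_uri(secure, domain, path, query_string=''):
    """Build the canonical URI for a request that came in on another host.

    secure       -- True if the original request used HTTPS
    domain       -- the canonical domain (site.domain in the middleware)
    path         -- the request path, not yet percent-encoded
    query_string -- already-encoded query string, without the leading '?'
    """
    scheme = 'https' if secure else 'http'
    query = '?' + query_string if query_string else ''
    return '%s://%s%s%s' % (scheme, domain, quote(path), query)


print(build_redirect_uri(False, 'foo.tld', '/blog/my post/', 'page=2'))
# http://foo.tld/blog/my%20post/?page=2
print(build_redirect_uri(True, 'foo.tld', '/'))
# https://foo.tld/
```

Unlike the middleware, this helper does not look at the request method; the caller decides whether a query string should be passed along.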

]]>
http://eikke.com/django-domain-redirect-middleware/feed/ 7