Metalinguistic Abstraction

Computer Languages, Programming, and Free Software

Archive for the ‘python’ Category

emacs develock customization for Python

This has been annoying me for some time: develock mode doesn’t support Python out of the box.  I had a hacked-up develock where I simply changed all references to Ruby (via string-replace) to Python, and things worked pretty well…but I had to constantly load my hacked-up develock.

Well, I finally got around to customizing develock properly post-facto, I think.  Rejoice, fellow whitespace pedants.  Here’s the snippet:

;; develock-py.el
;; Made by Daniel Farina
;; Login   <>
;; Started on  Sun Feb 14 09:21:21 2010 Daniel Farina
;; Last update Sun Feb 14 09:27:12 2010 Daniel Farina

(require 'develock)

(defcustom develock-python-font-lock-keywords
 '(;; a long line
 (1 'develock-long-line-1 t)
 (2 'develock-long-line-2 t))
 ;; long spaces
 (1 'develock-whitespace-2)
 (2 'develock-whitespace-3 nil t))
 ;; trailing whitespace
 ("[^\t\n ]\\([\t ]+\\)$"
 (1 'develock-whitespace-1 t))
 ;; spaces before tabs
 ("\\( +\\)\\(\t+\\)"
 (1 'develock-whitespace-1 t)
 (2 'develock-whitespace-2 t))
 ;; tab space tab
 ("\\(\t\\) \t"
 (1 'develock-whitespace-2 append))
 ;; only tabs or spaces in the line
 ("^[\t ]+$"
 (0 'develock-whitespace-2 append))
 ;; reachable E-mail addresses
 (0 'develock-reachable-mail-address t))
 ;; things to be paid attention
 (0 'develock-attention t)))
 "Extraordinary level highlighting for the Python mode."
 :type develock-keywords-custom-type
 :set 'develock-keywords-custom-set
 :group 'develock
 :group 'font-lock)

(defvar python-font-lock-keywords-x nil
 "Extraordinary level font-lock keywords for the Python mode.")

(setq develock-keywords-alist
 (cons '(python-mode
 python-font-lock-keywords-x
 develock-python-font-lock-keywords)
 develock-keywords-alist))
(plist-put develock-max-column-plist 'python-mode 79)
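For readers who don't use Emacs, here is a rough Python sketch of what three of the regexps above flag, plus the 79-column limit. The helper name `check_line` and the problem labels are made up for illustration, and the length check counts a tab as one character, which is a simplification.

```python
import re

# Python translations of three regexps from the elisp snippet.
TRAILING_WHITESPACE = re.compile(r"[^\t\n ]([\t ]+)$")
SPACES_BEFORE_TABS = re.compile(r"( +)(\t+)")
WHITESPACE_ONLY = re.compile(r"^[\t ]+$")

MAX_COLUMN = 79  # same limit the snippet sets for python-mode


def check_line(line, max_column=MAX_COLUMN):
    """Return labels for the whitespace problems found in one line.

    Hypothetical helper, not part of develock.
    """
    line = line.rstrip("\n")
    problems = []
    # Simplification: a tab counts as a single character here.
    if len(line) > max_column:
        problems.append("long line")
    if TRAILING_WHITESPACE.search(line):
        problems.append("trailing whitespace")
    if SPACES_BEFORE_TABS.search(line):
        problems.append("spaces before tabs")
    if WHITESPACE_ONLY.search(line):
        problems.append("whitespace only")
    return problems
```

Calling `check_line` on each line of a file gives a crude, editor-independent version of the same checks.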

This text is made available under the license CC0 1.0 Universal (CC0 1.0).

Written by fdr

February 14, 2010 at 9:53 am

Posted in lisp, python

A different way to main()

import sys
import getopt

class Usage(Exception):
    def __init__(self, msg):
        self.msg = msg

def main(argv=None):
    if argv is None:
        argv = sys.argv
    try:
        try:
            opts, args = getopt.getopt(argv[1:], "h", ["help"])
        except getopt.error, msg:
             raise Usage(msg)
        # more code, unchanged
    except Usage, err:
        print >>sys.stderr, err.msg
        print >>sys.stderr, "for help use --help"
        return 2

if __name__ == "__main__":
    sys.exit(main())

What is this, you wonder? As it turns out, it’s Guido van Rossum’s preferred way to enter a Python program. It’s a sensible departure from the classic variant. Even though his post is from 2003, I am discovering it for the first time; perhaps I will adopt this idiom in some of my programs since it has been blessed by the lord of Python.
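One practical payoff of the idiom is that main() takes its arguments as a parameter and returns an exit status, so it can be called from an interactive prompt or a test without touching sys.argv. The sketch below is a Python 3 adaptation of the listing above (the original is Python 2); the `return 0` standing in for the program's real work is an assumption for illustration.

```python
import getopt
import sys


class Usage(Exception):
    def __init__(self, msg):
        self.msg = msg


def main(argv=None):
    if argv is None:
        argv = sys.argv
    try:
        try:
            opts, args = getopt.getopt(argv[1:], "h", ["help"])
        except getopt.GetoptError as msg:
            raise Usage(str(msg))
        # Placeholder for the program's real work (assumed for this sketch).
        return 0
    except Usage as err:
        print(err.msg, file=sys.stderr)
        print("for help use --help", file=sys.stderr)
        return 2


# Calling main() directly with an explicit argument list:
bad_status = main(["prog", "--bogus"])  # unknown option, handled via Usage
good_status = main(["prog", "-h"])      # recognized option
```

In a real script the module would still end with the `if __name__ == "__main__":` guard calling `sys.exit(main())`, so the return value becomes the process exit status.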

Written by fdr

February 13, 2008 at 1:54 am

Posted in python

Django file and stream serving performance Gotcha

Recently I’ve been doing a little bit of work with the Django web framework for Python. Part of this project involves reasonably efficient binary file streaming to and from the server. There is currently a patch in Trac (#2070) slated for acceptance, so I applied it and tried copying some files in and out through the web server. I have some problems with the particulars of this patch and I intend to amend my complaints, but that’s for another post. What I discovered was an annoying performance gotcha in simply reading back binary files to be served to the user.

The gotcha is simple to expose:

In a Django view, use the documented functionality of passing a file-like object to the response object from the view; preferably a big, binary one. So you do something like this:

return HttpResponse(open('/path/to/big/file.bin', 'rb'))

And then you surf on over to localhost and try grabbing this file. Your hard drive whirs and you notice your CPU usage is at 100% while serving the file slowly. Most people then rationalize it away saying “well, of course, Python is slow, so it makes sense that it would suck at this. Set up a dedicated static file serving server written in C and use some URL routing incantations.”

The crucial information that I had to dig for is how Django emits bytes to users. Django calls iter() on the input object and then uses calls to .next() to grab more bytes to write out to the stream. Once you factor in that the default iter() behavior for an open file in Python is to read lines, you realize that there’s an enormous amount of time and unnecessarily evil buffering going on just to emit chunks of the file separated by (in the case of binary files) completely arbitrarily spaced newline bytes. The result is lots of heap abuse as well as lots of burned CPU time looking for these needles in the haystack.
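The effect is easy to see without Django at all. The sketch below (Python 3, with an in-memory io.BytesIO object standing in for a real file and a 16 KiB block size chosen arbitrarily) compares iterating a binary stream line by line with reading it in fixed-size blocks.

```python
import io
import os

# 64 KiB of random bytes standing in for a binary file.
payload = os.urandom(64 * 1024)

# Iterating a file object splits on b"\n", wherever those bytes happen to fall.
line_chunks = list(io.BytesIO(payload))

# Reading fixed-size blocks gives a predictable number of chunks.
stream = io.BytesIO(payload)
block_chunks = list(iter(lambda: stream.read(16 * 1024), b""))

print(len(line_chunks), "newline-delimited chunks")
print(len(block_chunks), "fixed-size chunks")
```

With random data the first count varies from run to run, since it depends on where newline bytes land, while the second is always the payload size divided by the block size.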

The hack to address this is very simple: we write a tiny iterator wrapper that simply uses the read(size) call. It can look something like this:

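class FileIterWrapper(object):
  def __init__(self, flo, chunk_size = 1024**2):
    # flo is any file-like object with a read(size) method
    self.flo = flo
    self.chunk_size = chunk_size

  def next(self):
    # read one fixed-size chunk instead of scanning for a newline
    data = self.flo.read(self.chunk_size)
    if data:
      return data
    else:
      # an empty read means end of file
      raise StopIteration

  def __iter__(self):
    return self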

1024 ** 2 bytes is one megabyte per chunk. When using this iterator the logic is simple and the result is that Python consumes very little CPU time and memory to rip through a file stream. It can be applied to the previous example like so:

return HttpResponse(FileIterWrapper(open('/path/to/big/file.bin', 'rb')))

Now everything is fast and happy and running as it should.
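The wrapper class above targets Python 2's iterator protocol, where the method is called next(); in Python 3 it is `__next__`. A generator function sidesteps the difference. This is a minimal sketch of the same read(size) technique, with `read_in_chunks` as a made-up name:

```python
import io


def read_in_chunks(flo, chunk_size=1024 ** 2):
    """Yield successive blocks of up to chunk_size from a file-like object."""
    while True:
        data = flo.read(chunk_size)
        if not data:
            # An empty read means end of file.
            return
        yield data


# Usage with an in-memory stream and a tiny chunk size for illustration.
chunks = list(read_in_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4))
```

The generator object can be handed to anything that expects an iterable of byte strings, the same way the FileIterWrapper instance is.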

So what should Django do about this? It could be just written off as an idiosyncrasy of the framework, but I think that the case is strong that Django should inspect for file-like objects and use more aggressive calls to .read() to prevent such unpredictable behavior. One problem with such large (1MB) read()s is that they may block for too long instead of trickling bytes to the user, so some asynchronous I/O strategy would be better.
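To make the suggestion concrete, here is a rough sketch of what "inspect for file-like objects" could look like. It is not Django's code; `iter_response_body` is a hypothetical helper, and the 64 KiB default is an arbitrary choice meant to keep each read() short.

```python
import io


def iter_response_body(content, chunk_size=64 * 1024):
    """Hypothetical helper: choose an iteration strategy for a response body."""
    read = getattr(content, "read", None)
    if callable(read):
        # File-like object: pull fixed-size blocks until read() comes back empty.
        while True:
            data = read(chunk_size)
            if not data:
                break
            yield data
    else:
        # Anything else: fall back to ordinary iteration.
        for piece in content:
            yield piece


# A file-like object is read in blocks; newlines no longer decide the chunking.
from_file = list(iter_response_body(io.BytesIO(b"a\nb\nc\nd"), chunk_size=3))
# A plain list of byte strings is passed through unchanged.
from_list = list(iter_response_body([b"x", b"y"]))
```

Duck-typing on a callable `read` attribute keeps the check cheap, though it does nothing about the blocking concern beyond shrinking the chunk size.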

There’s no reason why a small to moderate sized site should get hosed performance-wise because several people are downloading binary files from a Django server via mod_python or WSGI.

Finally, proper error handling for closing the file descriptor in the above examples is left as an exercise for the reader. I suggest using the “with” statement, which can currently be imported from __future__.
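One way to do that exercise, sketched in Python 3 where “with” needs no import: put the open() inside a generator so the file is closed when iteration finishes. The name `stream_file` and the temporary file used in the demonstration are assumptions for illustration.

```python
import os
import tempfile


def stream_file(path, chunk_size=1024 ** 2):
    """Yield a file's contents in blocks, closing the file when iteration ends."""
    with open(path, "rb") as flo:
        while True:
            data = flo.read(chunk_size)
            if not data:
                break
            yield data


# Usage: write a small temporary file and stream it back in 4-byte blocks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
chunks = list(stream_file(tmp.name, chunk_size=4))
os.remove(tmp.name)
```

If the consumer stops early, the file is closed when the generator itself is closed or garbage collected, so it is worth checking how the serving code disposes of the iterator.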

Written by fdr

February 12, 2008 at 1:51 pm

Posted in django, projects, python
