<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Django file and stream serving performance Gotcha</title>
	<atom:link href="http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/feed/" rel="self" type="application/rss+xml" />
	<link>http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/</link>
	<description>Computer Languages, Programming, and Free Software</description>
	<lastBuildDate>Mon, 06 Apr 2009 02:59:08 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: fdr</title>
		<link>http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/#comment-603</link>
		<dc:creator>fdr</dc:creator>
		<pubDate>Mon, 06 Apr 2009 02:59:08 +0000</pubDate>
		<guid isPermaLink="false">http://metalinguist.wordpress.com/?p=18#comment-603</guid>
		<description>@bambam
Just beware of any exceptions... &#039;with&#039; is a nice way to avoid a try/except/finally block and still clean up correctly.</description>
		<content:encoded><![CDATA[<p>@bambam<br />
Just beware of any exceptions&#8230; &#8216;with&#8217; is a nice way to avoid a try/except/finally block and still clean up correctly.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: bambam</title>
		<link>http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/#comment-602</link>
		<dc:creator>bambam</dc:creator>
		<pubDate>Wed, 11 Mar 2009 11:37:28 +0000</pubDate>
		<guid isPermaLink="false">http://metalinguist.wordpress.com/?p=18#comment-602</guid>
		<description>Hey! Saw this one coming and found your post via Google. Splendid work, totally agree. I didn&#039;t need to implement &#039;with&#039;; I just called &#039;close&#039; on self.flo right before raising StopIteration. Thanks for the help!</description>
		<content:encoded><![CDATA[<p>Hey! Saw this one coming and found your post via Google. Splendid work, totally agree. I didn&#8217;t need to implement &#8216;with&#8217;; I just called &#8216;close&#8217; on self.flo right before raising StopIteration. Thanks for the help!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: fdr</title>
		<link>http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/#comment-599</link>
		<dc:creator>fdr</dc:creator>
		<pubDate>Thu, 07 Aug 2008 20:10:12 +0000</pubDate>
		<guid isPermaLink="false">http://metalinguist.wordpress.com/?p=18#comment-599</guid>
		<description>@Gavin Panella

Large read() calls may be problematic -- for reasons besides memory usage -- if they block for an uncomfortable period of time before returning, especially if your disk is highly contended for or slow (or over a network).

What I&#039;m really trying to approximate here is asynchronous I/O where you are regularly given chances to empty a buffer that is filled as quickly as possible to allow more responsive streaming of bytes. A basic consumer/producer issue.

This problem arises any time read() may block for an extended period while you could be delivering chunks of the stream that you&#039;ve already received.</description>
		<content:encoded><![CDATA[<p>@Gavin Panella</p>
<p>Large read() calls may be problematic &#8212; for reasons besides memory usage &#8212; if they block for an uncomfortable period of time before returning, especially if your disk is highly contended for or slow (or over a network).</p>
<p>What I&#8217;m really trying to approximate here is asynchronous I/O where you are regularly given chances to empty a buffer that is filled as quickly as possible to allow more responsive streaming of bytes. A basic consumer/producer issue.</p>
<p>This problem arises any time read() may block for an extended period while you could be delivering chunks of the stream that you&#8217;ve already received.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gavin Panella</title>
		<link>http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/#comment-598</link>
		<dc:creator>Gavin Panella</dc:creator>
		<pubDate>Thu, 07 Aug 2008 13:16:52 +0000</pubDate>
		<guid isPermaLink="false">http://metalinguist.wordpress.com/?p=18#comment-598</guid>
		<description>Perhaps the built-in &lt;code&gt;file&lt;/code&gt; class could be subclassed to return 16k chunks instead of lines for files that have been opened as binary:

&lt;code&gt;
class file(file):
    def __iter__(self):
        if &#039;b&#039; in self.mode:
            return iter(lambda: self.read(1024 * 16), &#039;&#039;)
        else:
            return self
&lt;/code&gt;

Or Django could incorporate similar logic, or just always serve open files as chunks. There&#039;s no point seeking for lines when it&#039;s not doing anything with them.

For big chunk sizes, I don&#039;t think using alarms or more threads is going to help you. AFAIK, during a request, one thread is dedicated to servicing that request, right until it completes. Breaking out of a long write is not going to free that thread to service other requests unless you stop servicing the request entirely. The chunk size is probably only a concern if you&#039;re planning on servicing a lot of concurrent downloads, when you might need to reduce memory consumption.
</description>
		<content:encoded><![CDATA[<p>Perhaps the built-in <code>file</code> class could be subclassed to return 16k chunks instead of lines for files that have been opened as binary:</p>
<p><code><br />
class file(file):<br />
    def __iter__(self):<br />
        if 'b' in self.mode:<br />
            return iter(lambda: self.read(1024 * 16), '')<br />
        else:<br />
            return self<br />
</code></p>
<p>Or Django could incorporate similar logic, or just always serve open files as chunks. There&#8217;s no point seeking for lines when it&#8217;s not doing anything with them.</p>
<p>For big chunk sizes, I don&#8217;t think using alarms or more threads is going to help you. AFAIK, during a request, one thread is dedicated to servicing that request, right until it completes. Breaking out of a long write is not going to free that thread to service other requests unless you stop servicing the request entirely. The chunk size is probably only a concern if you&#8217;re planning on servicing a lot of concurrent downloads, when you might need to reduce memory consumption.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: djc</title>
		<link>http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/#comment-597</link>
		<dc:creator>djc</dc:creator>
		<pubDate>Thu, 07 Aug 2008 09:24:32 +0000</pubDate>
		<guid isPermaLink="false">http://metalinguist.wordpress.com/?p=18#comment-597</guid>
		<description>In my opinion, Django should at least try to make use of the optional file_wrapper stuff in WSGI (http://www.python.org/dev/peps/pep-0333/#optional-platform-specific-file-handling) if it exists, and provide a fast implementation itself for that specific case if the server doesn&#039;t expose one.</description>
		<content:encoded><![CDATA[<p>In my opinion, Django should at least try to make use of the optional file_wrapper stuff in WSGI (<a href="http://www.python.org/dev/peps/pep-0333/#optional-platform-specific-file-handling" rel="nofollow">http://www.python.org/dev/peps/pep-0333/#optional-platform-specific-file-handling</a>) if it exists, and provide a fast implementation itself for that specific case if the server doesn&#8217;t expose one.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: fdr</title>
		<link>http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/#comment-591</link>
		<dc:creator>fdr</dc:creator>
		<pubDate>Mon, 07 Apr 2008 21:59:43 +0000</pubDate>
		<guid isPermaLink="false">http://metalinguist.wordpress.com/?p=18#comment-591</guid>
		<description>@wilsoniya

It is true, and I don&#039;t ask it to be a *good* static file server, but the current performance penalty is atrocious...probably a factor of 50 to 100 in terms of CPU usage.

Also, client-side interpretation is neither here nor there with regard to this issue. The issue is that it&#039;s extremely easy for a naive developer to type something like &quot;open(&#039;file&#039;)&quot; and then hand the file handle to Django to be sent. Django takes the very sensible action of making an iterator of the file object, and by default Python&#039;s file iterators seek to return one line at a time. Or, more to the point, any series of bytes terminated by a newline byte.

When one wants to simply send something...such as open(&#039;movie.mkv&#039;)...this seeking of newline bytes is a huge waste of time. Instead, something like the FileIterWrapper documented in the post will make the process much, much more efficient.

My only beef with this is that it hits someone who makes this honest, common mistake far too hard. It also may reflect badly on Django&#039;s performance, whereas it&#039;s not really Django&#039;s fault.</description>
		<content:encoded><![CDATA[<p>@wilsoniya</p>
<p>It is true, and I don&#8217;t ask it to be a *good* static file server, but the current performance penalty is atrocious&#8230;probably a factor of 50 to 100 in terms of CPU usage.</p>
<p>Also, client-side interpretation is neither here nor there with regard to this issue. The issue is that it&#8217;s extremely easy for a naive developer to type something like &#8220;open(&#8216;file&#8217;)&#8221; and then hand the file handle to Django to be sent. Django takes the very sensible action of making an iterator of the file object, and by default Python&#8217;s file iterators seek to return one line at a time. Or, more to the point, any series of bytes terminated by a newline byte.</p>
<p>When one wants to simply send something&#8230;such as open(&#8216;movie.mkv&#8217;)&#8230;this seeking of newline bytes is a huge waste of time. Instead, something like the FileIterWrapper documented in the post will make the process much, much more efficient.</p>
<p>My only beef with this is that it hits someone who makes this honest, common mistake far too hard. It also may reflect badly on Django&#8217;s performance, whereas it&#8217;s not really Django&#8217;s fault.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: wilsoniya</title>
		<link>http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/#comment-589</link>
		<dc:creator>wilsoniya</dc:creator>
		<pubDate>Mon, 24 Mar 2008 23:37:49 +0000</pubDate>
		<guid isPermaLink="false">http://metalinguist.wordpress.com/?p=18#comment-589</guid>
		<description>Thanks for this.  Also, including the `mimetype&#039; named param may help the client-side application interpret whatever file you&#039;re dealing with properly.  Not a shocking fact, but useful.

I sympathise with your complaints about Django&#039;s static file skittishness.  It&#039;s obnoxious during development, but I doubt the Django team wants to accept the responsibility for creating a robust and reliable static web server. Besides, you can enable static files w/o too much trouble.</description>
		<content:encoded><![CDATA[<p>Thanks for this.  Also, including the `mimetype&#8217; named param may help the client-side application interpret whatever file you&#8217;re dealing with properly.  Not a shocking fact, but useful.</p>
<p>I sympathise with your complaints about Django&#8217;s static file skittishness.  It&#8217;s obnoxious during development, but I doubt the Django team wants to accept the responsibility for creating a robust and reliable static web server. Besides, you can enable static files w/o too much trouble.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adomas</title>
		<link>http://metalinguist.wordpress.com/2008/02/12/django-file-and-stream-serving-performance-gotcha/#comment-576</link>
		<dc:creator>Adomas</dc:creator>
		<pubDate>Tue, 12 Feb 2008 22:33:34 +0000</pubDate>
		<guid isPermaLink="false">http://metalinguist.wordpress.com/?p=18#comment-576</guid>
		<description>Would be nice to just make use of the sendfile syscall, as present on Linux etc.</description>
		<content:encoded><![CDATA[<p>Would be nice to just make use of the sendfile syscall, as present on Linux etc.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
