Django Blog Archive

Red Hat SELinux policy for mod_wsgi

Using SELinux, you can safely grant a process only the permissions it needs to perform its function, and no more. Linux distributions provide policies to enforce these limits on most software they package, but many packages aren't covered. We've made allowances for mod_wsgi on RHEL and CentOS 5 by extending Apache httpd's SELinux policy.

It seems the SELinux policy for Apache httpd is twice as large as any other package's. The folks at Red Hat have put a lot of work into making sure that attackers who manage to exploit httpd can't break out to the rest of your system, while still allowing the flexibility to serve most applications. Consult the httpd_selinux man page if messages in audit.log coincide with your error.

File Contexts

If you've created files and/or directories in /etc/httpd, make sure they have the proper file contexts so the daemon can read them:

  # restorecon -vR /etc/httpd

httpd can only serve files with an explicitly allowed file context. Configure the context of files and directories within your production code base using the semanage command:

  # semanage fcontext --add --ftype -- --type httpd_sys_content_t "/home/projectname/live(/.*)?"
  # semanage fcontext --add --ftype -d --type httpd_sys_content_t "/home/projectname/live(/.*)?"
  # restorecon -vR /home/projectname/live

View file contexts with ls -Z. Changes should generally be made with semanage and then applied with restorecon -vR.
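To see which paths the pattern in the semanage commands above is meant to cover, here is a rough sketch in Python. It assumes the file-context pattern is matched against the entire path, which re.fullmatch approximates; SELinux uses its own regular expression engine, so treat this as an illustration rather than an exact model. The example paths are hypothetical.

```python
import re

# Pattern copied from the semanage commands above.
PATTERN = r"/home/projectname/live(/.*)?"


def covered(path):
    """Return True if the whole path matches the file-context pattern."""
    # fullmatch approximates anchoring the pattern to the entire path (an assumption).
    return re.fullmatch(PATTERN, path) is not None


if __name__ == "__main__":
    # Hypothetical paths, chosen only to show what is and is not covered.
    for path in (
        "/home/projectname/live",
        "/home/projectname/live/media/logo.png",
        "/home/projectname/live-old",
        "/home/projectname",
    ):
        print(path, covered(path))
```

The optional (/.*)? group is what lets one pattern cover both the live directory itself and everything beneath it, without also matching sibling directories that merely share the prefix.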

Booleans

The httpd policy provides several boolean options for easy run-time configuration:

  • httpd_can_network_connect - Allows httpd to make network connections, including the local ones you'll be making to a database
  • httpd_enable_homedirs - Allows httpd to access /home/

Booleans are persistently set using the setsebool command with the -P flag:

  # setsebool -P httpd_can_network_connect on

WSGI Socket

When running in daemon mode, httpd and the mod_wsgi daemon communicate via a UNIX socket file. This should usually have a context of httpd_var_run_t. The standard Red Hat SELinux policy includes an entry for /var/run/wsgi.* to use this context, so it makes sense to put the socket there using the WSGISocketPrefix directive within your httpd configuration:

  WSGISocketPrefix run/wsgi

(Note that run/wsgi translates to /etc/httpd/run/wsgi, which is symlinked to /var/run/wsgi.)

If socket communication fails, httpd returns a 503 "Temporarily Unavailable" error response.

SELinux Policy Module

In the course of our testing, SELinux denials like the following appeared:

  host=example.com type=AVC msg=audit(1262803154.315:1851): avc:  denied  { execmem } for  pid=5337 comm="httpd" scontext=root:system_r:httpd_t:s0 tcontext=root:system_r:httpd_t:s0 tclass=process

Unusual behavior like this is usually best allowed by creating application-specific SELinux policy modules. If you cannot resolve these AVC errors by manipulating file contexts or booleans, collect all the errors into a single file and feed that into the audit2allow utility:

  # yum install policycoreutils
  # mkdir ~/tmp  # if this doesn't exist already
  # audit2allow --module wsgi < ~/tmp/pile_of_auditd_output > ~/tmp/wsgi.te
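Gathering the denials into that input file can be scripted. The following is a minimal sketch in Python, not part of our original procedure: the function names and the script name are hypothetical, and it assumes each denial is logged on a single line containing type=AVC, denied, and comm="httpd", as in the sample record above.

```python
import sys


def is_denial_for(line, comm="httpd"):
    """Return True for an AVC denial record logged for the given command name."""
    return (
        "type=AVC" in line
        and "denied" in line
        and 'comm="%s"' % comm in line
    )


def collect_denials(lines, comm="httpd"):
    """Return the matching denial lines, in their original order."""
    return [line for line in lines if is_denial_for(line, comm)]


if __name__ == "__main__":
    # Each argument is a log file to scan; matching records go to stdout.
    for log_path in sys.argv[1:]:
        with open(log_path) as log:
            sys.stdout.writelines(collect_denials(log))
```

Saved as, say, collect_denials.py, it could be run as python collect_denials.py /var/log/audit/audit.log > ~/tmp/pile_of_auditd_output (the log location may differ on your system), producing the file that is fed to audit2allow above.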

The audit2allow command outputs source for a new policy module. Review the .te file before compiling, so you know exactly what you are about to allow. Ours looks like this:

module wsgi 1.0;

require {
      type httpd_t;
      class process execmem;
}

#============= httpd_t ==============
allow httpd_t self:process execmem;

Compile this source into a new policy module and package it:

  # checkmodule -M -m -o ~/tmp/wsgi.mod ~/tmp/wsgi.te
  # semodule_package --outfile ~/tmp/wsgi.pp --module ~/tmp/wsgi.mod

Once created, the module may be installed permanently into any compatible system's SELinux configuration:

  # semodule --install ~/tmp/wsgi.pp

There's plenty of room for improvement here. The file contexts we assigned with semanage should be defined in a .fc source file and included within the policy module. And creating a new context just for the WSGI daemon to transition into would restrict it further, allowing only a subset of Apache httpd's abilities. Writing your own policy like this allows you much finer tuning of your processes' limits, while allowing their needed functionality.

Safari 4 Top Sites feature skews analytics

Safari version 4 has a new "Top Sites" feature that shows thumbnail images of the sites the user most frequently visits (or, until enough history is collected, just generally popular sites).

Martin Sutherland describes this feature in detail and shows how to detect these requests, which set the X-Purpose HTTP header to "preview".

The reason this matters is that Safari uses its normal browsing engine to fetch not just the HTML, but all embedded JavaScript and images, and runs in-page client JavaScript code. And these preview thumbnails are refreshed fairly frequently -- possibly several times per day per user.

Thus every preview request looks just like a regular user visit. This skews analytics, which see a much higher than average number of views from Safari 4 users, with lower time-on-site averages and higher bounce rates, since no subsequent visits are registered (at least as part of the preview function).

The solution is to simply not output any analytics code when the X-Purpose header is set to "preview". In Interchange this is easily done if you have an include file for your analytics code, by wrapping the file with an [if] block such as this:

[tmp x_purpose][env HTTP_X_PURPOSE][/tmp]
[if scratch x_purpose eq 'preview']
<!-- skip analytics for browser previews -->
[else]
(normal Google Analytics, Omniture SiteCatalyst, or other analytics code)
[/else]
[/if]

In Ruby on Rails you'd check request.env["HTTP_X_PURPOSE"].

In PHP you'd check $_SERVER["HTTP_X_PURPOSE"].

In Django you'd check request.META.get("HTTP_X_PURPOSE") (from the HttpRequest class). Plain request.META["HTTP_X_PURPOSE"] also works, but raises KeyError when the header is absent, which is the normal case.

And so on.
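For the Django case, a minimal sketch might look like the following. The function names and the include_analytics template variable are hypothetical, chosen only for illustration; the one thing taken from above is the X-Purpose header with the value "preview".

```python
def is_preview_request(meta):
    """Return True when the request headers mark a browser preview fetch."""
    # .get() avoids a KeyError in the normal case where the header is absent.
    return meta.get("HTTP_X_PURPOSE") == "preview"


def analytics(request):
    """Context processor: expose a flag templates can use to skip analytics code."""
    return {"include_analytics": not is_preview_request(request.META)}
```

With analytics registered as a context processor in the project's settings, the analytics include in a template can be wrapped in {% if include_analytics %} ... {% endif %}, much like the Interchange [if] block above.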

I confirmed the analytics tracking code was omitted by waiting for Safari to make its preview request and inspecting the response with the Fiddler proxy, on Windows. The same can be done for Safari on Mac OS X with a suitable Mac OS X HTTP proxy.

DjangoCon 2009: Portland, Ponies, and Presentations

I attended DjangoCon this year for the first time, and found it very informative and enjoyable. I hoped to round out my knowledge of the state of the Django art, and the conference atmosphere made that easy to do.

Presentations

Avi Bryant's opening keynote was on the state of web application development, and what Django must do to remain relevant. In the past, web application frameworks did things in certain ways due to the constraints of CGI. Now they're structured around relational databases. In the future, they'll be arranged around Ajax and other asynchronous patterns to deliver just content to browsers, not presentation. To wit, "HTML templates should die", meaning we'll see more Gmail-style browser applications where the HTML and CSS are the same for each user, and JavaScript fetches all content and provides all functionality. During Q&A, he clarified that most of what he said applies to web applications, not content-driven sites, which must be SEO friendly and so are arranged much differently. Many of these themes also appeared, serendipitously, in Jacob Kaplan-Moss' "Snakes on the Web" talk, which he gave at PyCon Argentina the same week as DjangoCon.

Ian Bicking's keynote was on open-source as a philosophy; very abstract and philosophical but also interesting. It has been described as “a free software programmer’s midlife crisis”. Frank Wiles of Revolution Systems gave a barn-burner talk on how Django converted him from Perl to Python, followed by another on Postgres optimization. The latter reflected a theme that all web developers are now expected to do Operations as well, with several talks devoted to simple systems administration concepts.

Deployment

While working on Django projects this year, I've been watching developments around deployment of Python web applications, particularly with Apache. The overwhelming consensus: without active development, mod_python is going the way of the dodo. Although WSGI is architecturally similar to CGI, the performance difference can be striking. mod_wsgi's daemon mode running as a separate user is more secure and flexible than mod_python processes running as the Apache user. Given mod_wsgi's momentum, it makes sense to use it and avoid mod_python for new projects.

Several other tools kept re-appearing in presenters' demonstrations. Fabric is a remote webapp deployment tool similar to Ruby's Capistrano. Python's package index, formerly named "The Cheeseshop", has been renamed PyPI, the Python Package Index. Though easy_install is the standard tool to install PyPI packages, pip is gaining momentum as its successor. virtualenv is a tool to create isolated Python environments, separate from the system environment. Since the conference I've been exploring how these tools may be leveraged for our own development, and how they may be integrated into our DevCamps multiple-environments system.

Pinax

The Eldarion folks gave three talks on Pinax, and the project came up a lot in conversations. If anything could be said to have "buzz" at the conference, this is it. Pinax is a suite of re-usable Django applications, encompassing functionality that is often desired but not common enough to be included in django.contrib. It may be compared to Drupal and Plone. Its popularity also spurred discussions on what should or should not be included in the Django core, and how all Django developers should make their apps re-usable (some of James Bennett's favorite topics).

Community

Among others, I was fortunate to spend time with Kevin Fricovsky and the others who launched the new community site DjangoDose during the conference. DjangoDose is a spiritual successor to the now-defunct This Week in Django podcast, and was visible on many laptops in the conference room, aggregating #djangocon tweets.

That's all I have time to relate now. There was plenty more there and I look forward to following up with people & projects.

Learn more about End Point's Django and Python development.