code

Code which I've written

Ibid finally released!

After over a year of development, we have finally released Ibid. Ibid is a general purpose chat bot written in Python. We've suffered from a bit of feature creep, so despite being a 0.1 release it can talk 7 messaging protocols and has over 100 features provided by plugins. I think we also have an excellent architecture and very developer friendly plugin API. The 0.1.0 release can be downloaded from Launchpad or installed from our PPA.

Content negotiation with Lighttpd and Lua

Following on from yesterday's post, I decided to try implement proper content negotiation. After a fair amount of time spent getting to grips with Lua, I got a script which works very nicely. It implements server driven content negotiation for media types.

The basic idea of content negotiation is that a resource (e.g., this graph) exists in multiple formats (in this case, SVG, PNG and GIF). When a user agent requests the resource, it indicates which formats it understands by listing them in the Accept header. The server compares these to the available formats and sends the best one. So a browser which can display SVG will receive the diagram in SVG format, while a browser which can't will receive it in PNG (or GIF) format.

(The following description assumes knowledge of the Accept header format.)

The script works by searching the directory for files with the requested name but with an additional extension (each of which is a variant). The media type is inferred from the extension using /etc/mime.types, and the quality of the type is set by a hardcoded table in the script. Each variant is checked against the acceptable types sent by the user agent, and the overall quality calculated by multiplying the quality with the q parameter in the Accept header. The variant with the highest overall quality is then chosen.

Some browsers include wildcard entries such as image/* and */* in the Accept header without specifying a q parameter. This parameter defaults to 1 (the highest value), which means that no preference is actually indicated. The script implements the same hack that Apache does in order to compensate for this. It also handles directory index files by defaulting to files named "index".

To install the script, download and save it somewhere (such as /etc/lighttpd/). Then add the following to the site definition.

magnet.attract-physical-path-to = ("/etc/lighttpd/negotiate.lua")

Serving static files without file extensions using Lighttpd and Lua

URLs shouldn't really contain file extensions (like .html, .png) since they are supposed to identify a resource and not a particular representation/format thereof. The format is indicated by the Content-Type header sent in the response. Modern CMSs do this already (for example, the URL of this page doesn't include .html).

Doing the same for static files (i.e. files served directly by the webserver) isn't straightforward because most webservers use the file extension to determine the MIME type to send in the Content-Type header. This means that simply removing the file extension from the filename (or even creating a symlink without a file extension) will cause the webserver to send the wrong Content-Type header.

I decided to try find a solution to this for my webserver of choice, Lighttpd. Lighttpd has a module which embeds a Lua interpreter and allows you to write scripts which modify (or even handle) requests. So I wrote a script which searches the directory for files with the same name as requested but with an extension. This means that any file can be accessed with the file extension removed from the URL while still having the correct Content-Type.

The script currently chooses the first matching file, which means that having multiple files with the same name but different extensions doesn't do anything useful. The proper method however is to actually do content negotiation, which chooses the format based on the preferences indicated by the HTTP client in the Accept header.

To use this script, download it and save it somewhere (I use /etc/lighttpd/). Enable mod_magnet, and add the following line to the site definition.

magnet.attract-physical-path-to = ("/etc/lighttpd/extension.lua")

Sharing links from Konqueror, including to IRC

I follow the main feeds of a couple social news sites (namely Digg, Reddit and Muti). When I find an article which I like, I go back and vote it up on the site. However, when I come across good articles via other sources, I don't submit them to these news sites (or try to find out if they've already been submitted) simply because it's too much effort.

When I started aggregating my activity on these sites on my blog and on FriendFeed, I needed a way to share pages that I didn't get to via one of these social news sites. I ended up setting up Delicious because I found a plugin for Konqueror which made it easy to bookmark pages.

I still wanted to solve the original problem though, and so started looking for an easy way to submit links to these sites from Konqueror. Konqueror has a feature called service menus which allows you to add entries to the context menu of files. I then needed to work out how to submit links to these services, which turned out to simply involve loading a URL with a query parameter specifying the link you want to share.

I created entries for Reddit, Digg, Muti, Delicious, Facebook and Google Bookmarks. These take you to the submission page of the service where you can fill in the title1. Digg and Reddit will show existing submissions if the link has already been submitted.

I often share links on IRC, and wondered if I could integrate that with my menu. It turns out that WeeChat has a control socket, and I could send messages by piping them to the socket. I therefore wrote a script which prompted me for a headline or excerpt using kdialog, and then sent the link to the specified channel. My menu now looks like this:

sharemenu.png

If you want to set this up yourself, download share.desktop and put it in ~/.kde/share/apps/konqueror/servicemenus. If you want the icons, download shareicons.tar.gz, extract them somewhere, and fix the paths in social.desktop2. To setup the IRC feature (assuming you're using WeeChat), download postirc.sh and save it in ~/bin/. You will need to change the commands in social.desktop depending on the servers and channels you wish to use.


  1. One shortcoming is that the title of the page is not automatically filled in. 

  2. I couldn't work out how to use relative paths, or ~. 

CLUG Political Compass graph

A couple people on #clug were updating their Political Compass scores, which prompted me to jump on the bandwagon and do the test. I came out with the following scores.

Economic Left/Right: -4.38
Social Libertarian/Authoritarian: -2.21

I then thought that it would be interesting to compare everyone's scores on a graph, so I wrote a Python script to get the scores from Spinach and a Gnuplot script to plot them.

CLUG Political Compass

To add yourself to the graph, tell Spinach your score in the following format. The graph is regenerated every hour.

cocooncrash.political_compass is -4.38 / -2.21 (2008/09/14)

Downloading Google Talk logs

I used Google Apps to host mail for this domain for a while, and wanted to close down the account since I don't use it anymore. Before I did that I wanted to move all the data onto my server. Transferring the emails was fairly straightforward using POP3, but I couldn't find a way to download the Google Talk logs. Gmail handles the logs as emails, but they aren't accessible using either POP3 or IMAP.

I therefore wrote a Python script which downloads the logs via the web interface. On Jeremy's suggestion I used BeautifulSoup to parse the HTML this time, which worked very well. The script works with both Google Apps and normal Gmail, although my account got locked twice while trying to download the 3500 logs in my account.

Cisco un-Clean Access

The CHPC installed a new network this past weekend as part of the SANReN project. The new network consists of Cisco equipment, including their NAC (or "Clean Access") system. This requires all clients to authenticate before they are allowed access to the network, and can also enforce a configured security policy (such as requiring operating system updates and anti-virus).

The system works as follows. By default, the ports on the switch are in an "unauthenticated" VLAN. When a client is connected, it is provided with an IP address (via DHCP) in an "unauthenticated" subnet. The system then presents a captive portal which requires the user to authenticate with a username and password using their browser. If the authentication is successful, the port is moved to a different VLAN (depending on the user's access level), and the switch briefly disconnects the link which causes the client to negotiate a new IP address (in a different subnet).

Before the portal presents the login page it requires that a Java applet be run on the client. The applet gathers various bits of information about the client (including the operating system) and submits this information to the portal. (I assume that the portal uses this information to determine what policies must be enforced. In our setup, Windows machines must have the Clean Access Client installed, while Linux and Mac OS X machines are simply allowed access.) The portal then presents the login page.

Being a geek, I wasn't very happy to go through this rigmarole everytime I connected to the network. (I also couldn't use my normal browser since the applet didn't work in it.) So I set out to automate the process. Initially I tried to script everything (including the Java applet) but then I noticed that the output of the applet wasn't sent with the login form submission. The only other information the form contained was a session key and random string, both of which were present on the HTML page which contained the applet. A manual test confirmed that the login page could be submitted successfully as long as the session key and random string were correct — the applet could be bypassed.

I quickly scripted the login process using a bash script and wget. I then installed it in /etc/network/if-up.d after adding some logic to only execute if the current IP address was on the unauthenticated network. The result is that I can plug in the cable, and my machine automatically authenticates to the system.

While searching for information about the Clean Access system, I came across this Slashdot article about a guy who was suspended from university for bypassing the Clean Access checks. I only realised last night that this is exactly what my script does!1 I haven't tested it on Windows yet, but the only possible change I can think of is to change the user agent. Seriously Cisco, the fact that I managed to bypass the applet simply by submitting the login form programmatically is ridiculous.

I have attached my script to this post. The way in which I have parsed the HTML page is rather ugly and likely to only work on this specific version of Clean Access. I plan to rewrite it in Python sometime.

Update: I have rewritten the script in Python, which should be a bit more solid since it parses the HTML using a DOM. The script requires libxml2dom and ipy. After configuring the parameters it can be dropped in /etc/network/if-up.d2 where it should run automatically.


  1. Note that it doesn't bypass the authentication: you still need a valid account in order to gain access. 

  2. Make sure not to use a dot in the filename though. 

Mobile interface to Vodacom4me and MyMTN

Vodacom4me and MyMTN allow you to send free SMSs from a computer. Unfortunately those sites are not accessible from a cellphone. I came across a site which provides a mobile interface for Vodacom4me and MyMTN1. This means that you can send SMSs from your cellphone for the cost of the GPRS/UMTS data required to access the site. I having been using this for quite a while, and it works fairly well.

However, there are a few aspects of the site which I don't like, and so I wrote my own version which performs the same function with the following extra features:

  1. Uses cookies to store login data instead of a URL with parameters which needs to be bookmarked (although it will fall back to this method if the user agent doesn't support cookies).
  2. Submits forms using POST instead of GET (but will fall back to GET if the user agent doesn't support POST).
  3. Allows multiple recipients (although only Vodacom4me supports this).
  4. Specifies the maximum message length in the textarea so that phones which support it can show how many characters are left.2
  5. Automatically logs into Vodacom4me/MyMTN again if session has expired.3
  6. Cleaner, less cluttered interface (mainly optimised for my phone ;-) ).
  7. Accessible over HTTPS for extra security.

The site is available at http://m.mene.za.net/ (or with HTTPS). Obviously the restrictions enforced by Vodacom4me and MyMTN still apply. Vodacom4me allows 20 SMSs per day to Vodacom numbers for Vodacom subscribers only. MyMTN allows 5 SMSs per day to MTN numbers for anyone. The source code is available for anyone who is interested (and brave enough).


  1. There is also an interface for CellC's site, but mine does not implement this. 

  2. This is technically not allowed by the HTML specification, but it works on my phone. 

  3. This allows the message composition page to be saved on phones which support this (like my Nokia E70) instead of reloading it every time a message is composed. 

Syndicate content