August 2008

Downloading Google Talk logs

I used Google Apps to host mail for this domain for a while, and wanted to close down the account since I don't use it anymore. Before I did that I wanted to move all the data onto my server. Transferring the emails was fairly straightforward using [POP3][], but I couldn't find a way to download the [Google Talk][] logs. [Gmail][] handles the logs as emails, but they aren't accessible using either POP3 or [IMAP][].

I therefore wrote a [Python][] script which downloads the logs via the web interface. On [Jeremy's][] [suggestion][] I used [BeautifulSoup][] to parse the [HTML][] this time, which worked very well. The script works with both Google Apps and normal Gmail, although my account got locked twice while trying to download the 3500 logs in my account.

Routing by port number

Due to a very restrictive firewall at the CHPC, I need to run a VPN to get access to things like email, Jabber and SSH. This however degrades my web browsing experience, since that gets tunnelled as well. I therefore wanted a setup where only ports which are blocked get tunnelled through the VPN, while everything else goes out normally.

The routing part was fairly straightforward, which consists of an iptables rule to mark certain packets, and an alternate routing table for these marked packets. I first created a name for the new table by adding the following to /etc/iproute2/rt_tables.

10  vpn

I then added a default route to the new table specifying the IP address of the VPN server and the VPN interface, and a rule to use this table for packets marked by iptables.

ip route add default via 10.8.0.3 dev tun0 table vpn
ip rule add fwmark 0x1 table vpn

The following iptables rule will mark packets destined to the listed port numbers. Note that this is for packets originating from the firewall host — if you want this to apply to packets forwarded for other hosts it must be in the PREROUTING chain.

iptables -t mangle -A OUTPUT -p tcp -m multiport --dports 22,995,587,5223 -j MARK --set-mark 0x1

The actual routing worked, but packets were being sent with the wrong source IP. I therefore needed to NAT packets going out on the VPN interface (the IP address is the local IP of the VPN connection).

iptables -t nat -A POSTROUTING -o tun0 -j SNAT --to 10.8.0.4

I could then see packets going out on the VPN interface with the correct source IP as well as the replies, but it still wasn't working. I eventually discovered that rp_filter must be disabled in order for this to work.

echo 0 > /proc/sys/net/ipv4/conf/tun0/rp_filter

Photo blogging via email with Drupal

The one thing missing from my posting by email setup was support for images. The Mailsave module has finally been updated for Drupal 6, and so I can now submit attachments with email posts. The one shortcoming is that files are simply added to posts as normal attachments, and so images aren't automatically displayed. I therefore have to manually insert images in the body of the post, but I actually prefer this since it's a simpler system and gives me more control.

I also needed a way of resizing images on my phone since they are too big. I found the Nokia Image Editor1 which seems to work fairly well, although it only allows resizing to specific resolutions.


  1. It does work on my phone even though it's supposedly only for the Nokia 3250. 

Cisco un-Clean Access

The [CHPC][] installed a new network this past weekend as part of the [SANReN][] project. The new network consists of [Cisco][] equipment, including their [NAC][] (or "Clean Access") system. This requires all clients to authenticate before they are allowed access to the network, and can also enforce a configured security policy (such as requiring operating system updates and anti-virus).

The system works as follows. By default, the ports on the switch are in an "unauthenticated" [VLAN][]. When a client is connected, it is provided with an IP address (via [DHCP][]) in an "unauthenticated" subnet. The system then presents a captive portal which requires the user to authenticate with a username and password using their browser. If the authentication is successful, the port is moved to a different VLAN (depending on the user's access level), and the switch briefly disconnects the link which causes the client to negotiate a new IP address (in a different subnet).

Before the portal presents the login page it requires that a [Java applet][] be run on the client. The applet gathers various bits of information about the client (including the operating system) and submits this information to the portal. (I assume that the portal uses this information to determine what policies must be enforced. In our setup, Windows machines must have the Clean Access Client installed, while Linux and Mac OS X machines are simply allowed access.) The portal then presents the login page.

Being a geek, I wasn't very happy to go through this rigmarole everytime I connected to the network. (I also couldn't use my [normal browser][konq] since the applet didn't work in it.) So I set out to automate the process. Initially I tried to script everything (including the Java applet) but then I noticed that the output of the applet wasn't sent with the login form submission. The only other information the form contained was a session key and random string, both of which were present on the [HTML][] page which contained the applet. A manual test confirmed that the login page could be submitted successfully as long as the session key and random string were correct — the applet could be bypassed.

I quickly scripted the login process using a

[] script and [wget][]. I then installed it in <code>/etc/network/if-up.d</code> after adding some logic to only execute if the current IP address was on the unauthenticated network. The result is that I can plug in the cable, and my machine automatically authenticates to the system.
 
While searching for information about the Clean Access system, I came across this [Slashdot article][] about a guy who was suspended from university for bypassing the Clean Access checks. I only realised last night that this is exactly what my script does![^1] I haven't tested it on Windows yet, but the only possible change I can think of is to change the [user agent][]. Seriously Cisco, the fact that I managed to bypass the applet simply by submitting the login form programmatically is ridiculous.
 
I have attached my script to this post. The way in which I have parsed the HTML page is rather ugly and likely to only work on this specific version of Clean Access. I plan to rewrite it in [Python][] sometime.
 
<strong>Update:</strong> I have rewritten the script in Python, which should be a bit more solid since it parses the HTML using a [DOM][]. The script requires [libxml2dom][] and [ipy][]. After configuring the parameters it can be dropped in <code>/etc/network/if-up.d</code>[^2] where it should run automatically.
 
[^1]: Note that it doesn't bypass the authentication: you still need a valid account in order to gain access.
 
[^2]: Make sure not to use a dot in the filename though.
 
<em>[CHPC]: Centre for High Performance Computing
</em>[SANReN]: South African National Research Network
<em>[NAC]: Network Admission Control
</em>[VLAN]: Virtual LAN
<em>[DHCP]: Dynamic Host Configuration Protocol
</em>[HTML]: HyperText Markup Language
*[DOM]: Document Object Model
 
[chpc]: http://www.chpc.ac.za/
[sanren]: http://www.meraka.org.za/sanren.htm
[cisco]: http://en.wikipedia.org/wiki/Cisco_Systems
[vlan]: http://en.wikipedia.org/wiki/Virtual_LAN
[dhcp]: http://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol
[nac]: http://en.wikipedia.org/wiki/Cisco_NAC_Appliance
[java applet]: http://en.wikipedia.org/wiki/Java_applet
[konq]: http://en.wikipedia.org/wiki/Konqueror
[Slashdot article]: http://it.slashdot.org/article.pl?sid=07/04/27/203232
[user agent]: http://en.wikipedia.org/wiki/User_agent
[python]: http://en.wikipedia.org/wiki/Python_(programming_language)
[html]: http://en.wikipedia.org/wiki/HTML
[wget]: http://en.wikipedia.org/wiki/Wget
[bash]: http://en.wikipedia.org/wiki/Bash
[dom]: http://en.wikipedia.org/wiki/Document_Object_Model
[libxml2dom]: http://www.boddie.org.uk/python/libxml2dom.html
[ipy]: http://russell.rucus.net/2008/ipy/

July GeekDinner

I attended my very first GeekDinner last night, for which I managed to get a place at the very last minute (although not everyone made it and I could have gatecrashed anyway ;-) ). It was hosted by Da Capo in Greenmarket Square, which was an awesome venue. (I had the Tomato and Basil Soup for starters, and the Polo Piccata, both of which were delicious.)

Andy spoke about his experiences in Nigeria, which was quite an eye opener. We often complain about Eskom and Telkom here in South Africa, but we actually aren't that bad off. Donna gave a talk about her various activities aimed at encouraging and educating women about IT, and Kerry-Anne spoke about running a daily photo blog.

Mandy then presented a slideshow on personal finance (prepared by Tim) which she had never seen before. The result was hilarious due to the dubious advice in the slideshow, and more humour was provided by the questions afterwards (involving emails from Nigerian princes and the best way to launder money).

All in all it was a fantastic evening which I enjoyed thoroughly. Thanks to Perdeberg for sponsoring the wine (even though I don't drink, it did make things more interesting ;-) ), and thanks to everyone involved with organising the evening. I look forward to the next one.