July 2008

Slipstreaming Windows XP SP3 in Linux

Unfortunately Windows is still a necessary evil sometimes: I keep a Windows virtual machine for the times it can't be avoided, and I still give my friends Windows tech support. I like to do things properly, and so I wanted to create a Windows XP install CD with Service Pack 3 slipstreamed in1. I had two CDs to do; I slipstreamed the first one using a Windows VM, but then got curious and wondered if I could do it without Windows.

The answer is that it is possible, using Wine to run the service pack installer. I followed this blog post (which was interesting, since it's in French), but then found another blog post which explains it in English. The steps are as follows:

  1. Copy the contents of the original CD to the hard drive.
  2. Extract the service pack using cabextract.
  3. Use Wine to run the service pack installer.

    wine ~/sp3/i386/update/update.exe /integrate:~/xp/
    
  4. Use geteltorito to extract the bootloader from the original CD.

  5. Make sure that all the filenames are upper case.

    convmv -r --upper --notest ~/xp/*
    
  6. Create the new CD image. I did this in K3b with the following settings.

    • Boot emulation: none
    • Boot load segment: 0x7c0
    • Boot load size: 0x4
    • Generate Joliet extensions
    • Omit version numbers in ISO9660 filenames (nothing else enabled under "ISO9660 Settings")
    • ISO Level 1
  7. Test in a virtual machine.

The Windows boot process seems to be quite particular about the ISO9660 settings and the upper case filenames, so if the CD doesn't boot, recheck those settings. The non-GUI steps are sketched below for reference.
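
Here is a rough command line sketch of steps 1, 2 and 4, plus a genisoimage invocation which should correspond to the K3b settings above. The service pack filename, mount point and paths are assumptions, and I haven't verified the genisoimage flags against K3b's output, so treat this as a starting point rather than a recipe.

# 1. Copy the contents of the original CD (assuming it's mounted at /media/cdrom)
mkdir -p ~/xp
cp -a /media/cdrom/. ~/xp/
# 2. Unpack the service pack installer (exact filename may differ)
cabextract -d ~/sp3/ WindowsXP-KB936929-SP3-x86-ENU.exe
# 4. Extract the El Torito boot image from the original CD
geteltorito -o ~/boot.img /dev/cdrom
# 6. Copy the boot image into the tree (after the convmv step, so that it
#    keeps its lower case name) and build the image
cp ~/boot.img ~/xp/
genisoimage -o ~/xpsp3.iso -b boot.img -no-emul-boot \
    -boot-load-seg 0x7c0 -boot-load-size 4 \
    -J -N -iso-level 1 ~/xp/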


  1. This integrates the service pack into the install CD so that a fresh installation is already updated. 

Mobile interface to Vodacom4me and MyMTN

Vodacom4me and MyMTN allow you to send free SMSs from a computer. Unfortunately those sites are not accessible from a cellphone. I came across a site which provides a mobile interface for Vodacom4me and MyMTN1. This means that you can send SMSs from your cellphone for the cost of the GPRS/UMTS data required to access the site. I have been using this for quite a while, and it works fairly well.

However, there are a few aspects of the site which I don't like, and so I wrote my own version which performs the same function with the following extra features:

  1. Uses cookies to store login data instead of a URL with parameters which needs to be bookmarked (although it will fall back to this method if the user agent doesn't support cookies).
  2. Submits forms using POST instead of GET (but will fall back to GET if the user agent doesn't support POST).
  3. Allows multiple recipients (although only Vodacom4me supports this).
  4. Specifies the maximum message length in the textarea so that phones which support it can show how many characters are left (see the snippet after this list).2
  5. Automatically logs into Vodacom4me/MyMTN again if the session has expired.3
  6. Cleaner, less cluttered interface (mainly optimised for my phone ;-) ).
  7. Accessible over HTTPS for extra security.
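
The length hint in item 4 is just a maxlength attribute on the textarea, which is non-standard (as footnote 2 says) but honoured by some phones. A minimal sketch, assuming a 160 character single-SMS limit and a hypothetical field name:

<textarea name="message" rows="5" cols="20" maxlength="160"></textarea>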

The site is available at http://m.mene.za.net/ (or with HTTPS). Obviously the restrictions enforced by Vodacom4me and MyMTN still apply. Vodacom4me allows 20 SMSs per day to Vodacom numbers for Vodacom subscribers only. MyMTN allows 5 SMSs per day to MTN numbers for anyone. The source code is available for anyone who is interested (and brave enough).


  1. There is also an interface for CellC's site, but mine does not implement this. 

  2. This is technically not allowed by the HTML specification, but it works on my phone. 

  3. This allows the message composition page to be saved on phones which support this (like my Nokia E70) instead of reloading it every time a message is composed. 

Publishing SSH and GPG keys using DNS

I was looking through a list of DNS record types, and noticed the SSHFP and CERT records. I then proceeded to implement these in my domain... just because I can ;-)

SSH Host Keys

The SSHFP record is used to publish the fingerprint of a host's SSH key. When an SSH client connects to a server for the first time, it can verify the host's key by checking for this DNS record. The format of the record is specified in RFC 4255, but there is also a tool which will generate the records for you.

$ sshfp -s mammon.gorven.za.net
mammon.gorven.za.net IN SSHFP 1 1 5e6772b6962f3328a0d73f7765097b7622f21447
mammon.gorven.za.net IN SSHFP 2 1 00e59b1843421f13d75e21abb06bf032a6e60b8b

The SSH client needs to be configured to check these records. Specifying "VerifyHostKeyDNS ask" in ~/.ssh/config will make the client look for SSHFP records, but will still prompt you to accept the key. (It will output a message saying that it found a matching key.) Specifying "VerifyHostKeyDNS yes" will skip the prompt if the record exists and matches the key presented by the server.
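
For example, a minimal ~/.ssh/config stanza enabling the check (the host pattern is an assumption; adjust to taste):

Host *.gorven.za.net
    VerifyHostKeyDNS ask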

GPG Keys

The CERT record is used to publish public keys or fingerprints. It can be used for PGP, X.509 or SPKI keys. It is specified in RFC 4398, but there is very little mention of it other than this blog post I found. To generate records you need the make-dns-cert tool which is part of GnuPG. It isn't included in the Ubuntu package, however, so I had to compile GnuPG from source.

To determine the name of the record to use, convert your email address into a domain name by replacing the at sign with a dot1. To publish your entire public key, run the tool as follows.

$ make-dns-cert -k ~/pubkey -n michael

The first parameter specifies the file containing your public key in binary format, and the second parameter specifies the domain name to use. To publish a reference to your public key, run the tool as follows.

$ make-dns-cert -f BF6FD06EA9DAABB6649F60743BD496BD6612FE85 -u http://michael.gorven.za.net/files/mgorven.asc -n michael

The first parameter specifies the fingerprint of your key, and the second parameter the URL at which the public key can be found. It is also possible to only publish the fingerprint or only the URL. Simply add the record which the tool outputs to your zone file2.
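
The tool emits the record in the generic TYPE37 syntax (which is why footnote 2 mentions mnemonics). Purely as an illustrative sketch, with the hex payload abbreviated, the fingerprint-plus-URL variant produces something along these lines:

michael.gorven.za.net. IN TYPE37 \# 72 0006 0000 00 14 bf6fd0...12fe85 687474...617363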

There is also another method to publish GPG keys called PKA. The only documentation I can find is a specification in German linked from the blog post mentioned above. I still managed to set it up though. This method uses a TXT record (similar to SPF). Here is my record.

michael._pka.gorven.za.net. TXT "v=pka1\;fpr=BF6FD06EA9DAABB6649F60743BD496BD6612FE85\;uri=http://michael.gorven.za.net/files/mgorven.asc"

This specifies the fingerprint and URL, just as with the second CERT method above. In order to get gpg to check DNS for keys, you need to specify "--auto-key-locate cert,pka" on the command line or in the configuration file.
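
As a usage sketch, encrypting to an address whose key isn't in the local keyring should then fetch the key via DNS:

$ gpg --auto-key-locate cert,pka -r michael@gorven.za.net -e file.txt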


  1. So john@example.com becomes john.example.com

  2. It should be possible to clean the record up by using mnemonics, but I couldn't get nsd to accept it and so just left it as is. 

OpenVPN through an HTTP proxy server

I discovered that OpenVPN supports connections through an HTTP proxy server. This makes it possible to establish a VPN from a completely firewalled network where the only external access is through a proxy server1. It takes advantage of the fact that SSL connections are simply tunnelled through the proxy and aren't interfered with like unencrypted connections.

The server setup is almost identical to a normal configuration, except that the tunnel must use TCP instead of UDP (since the proxy server will establish a TCP connection). Since most proxy servers only allow SSL connections to certain ports, you will also need to change the port number that the server listens on. The best choice is 443, since that is used for HTTPS, but if the machine is already running a web server on port 443, then 563 is probably the next best option. This port is assigned to NNTPS, and is allowed by the default Squid configuration. The following two lines enable TCP connections and change the port number.

proto tcp-server
port 563

The client configuration is also very similar. It simply needs to enable TCP connections, set the correct port number, and specify the proxy server.

remote vpn.example.com 563
http-proxy cache.saix.net 8080
proto tcp-client

OpenVPN can also authenticate to the proxy server using either Basic or NTLM authentication. To enable this, add "stdin basic" or "stdin ntlm" to the http-proxy line, as shown below. This will prompt for the username and password when the VPN is started. For more details see the OpenVPN documentation.
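
For example, with Basic authentication the http-proxy line from the client configuration above becomes:

http-proxy cache.saix.net 8080 stdin basic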


  1. I am not commenting on the ethics of this. If you need to resort to this method, you probably shouldn't be doing it. 

Python Decorators

For my Masters project I need a method by which the user can specify which functions should be run on an SPE1. This method should be simple, clear and easy to turn on and off. I stumbled upon a blog post a little while ago (I think it was this one) which explained decorators in Python; they are the perfect tool for the job. Decorators are used to transform functions, but without changing the function itself or the calls to it.

def spe(func):
    # Wrap func so that calling it compiles and runs it on an SPE
    # instead of executing it directly. (compile() here refers to my
    # SPE backend, not Python's builtin.)
    def run(*args):
        return compile(func, *args)
    return run

@spe
def sub(a, b):
    return a - b

print sub(2, 4)

The spe function is the actual decorator. The @spe line applies the decorator to the sub function. Implicitly, the following declaration is made:

sub = spe(sub)

The sub function is being wrapped by the spe function, and so all calls to sub (such as the print line) will use the wrapped function instead. The decorator creates and returns a new function called run which will (eventually) cause the original function to be compiled and executed on an SPE. This means that running a function on an SPE will be as simple as adding @spe before the function declaration2, without having to change the way in which the function is called. Turning it off is as simple as commenting out this line, and it is fairly clear as to what is happening.


  1. Trying to make this decision automatically would be a massive project in itself and would probably be worse than a human decision. 

  2. There will probably be some restrictions on what the function may contain, but that's a different matter. 

Masters project overview

Since I might be posting entries regarding my Masters project, I thought that I would provide a brief overview of the project to put it in perspective. I am doing my MSc in Electrical Engineering at UCT as part of the ACE group headed by Prof. Inggs. The group is based at the CHPC, which is part of the Meraka Institute, which in turn is part of the CSIR. The group's research is focused on developing new platforms and methods for HPC.

My project is to investigate the suitability of the Cell processor for HPC. The Cell processor is found in the PlayStation 3 and in BladeCenters, and is a very powerful processor. It achieves this performance by using two different types of cores. The one type (PPU) is a general purpose core capable of running an operating system, while the other type (SPU) is designed specifically to crunch numbers.

The disadvantage of this architecture is that it is very difficult to program for. When using the IBM Cell SDK, the user needs to write separate programs for each type of core, and needs to manage the SPEs manually as well as take care of all memory transfers. This requires a good knowledge of the architecture, and results in a lengthy development process and unportable code.

For the Cell processor to be a successful platform in HPC, the development process must be made easier while still making efficient use of the Cell's capabilities. There are a number of commercial and non-commercial tools which aim to do this using a variety of methods. I have looked into these tools and have not found one which is both effective and open.

I therefore aim to create my own platform with which to program the Cell processor. The idea is to use Python as the end user language, and to make a backend which transparently runs certain functions on the SPEs. This will involve converting the functions into C or C++, adding code to manage execution on an SPE and perform the required memory transfers, compiling the result with the GCC compiler, and then executing it.

It is quite an ambitious plan, and there are a lot of potential pitfalls. If it succeeds however, I think that it will be a very easy way to develop for the Cell processor while still having portable code.

Disappointed with CSI

I don't watch very many TV series at all. I have watched all the House episodes, which I think are absolutely fantastic. It is a show I enjoy immensely and which is simply awesome. (If you haven't seen House I can strongly recommend it.) I then watched the first season of Heroes, but lost interest partway through the second season. (I found it too disjointed and drawn out.)

The only other series I have enjoyed is CSI. I have watched entire seasons of all three sub-series (Las Vegas, Miami and New York) and have always enjoyed them. The reasons I like it are as follows.

  1. Each episode is independent and doesn't require knowledge of other episodes. This means that I can watch them whenever, and don't need to worry about watching seasons in order.
  2. The content is interesting and well thought out. The plots are intriguing with clever twists, and often provide interesting insights into society.
  3. The emotional content is focused mainly on the victims and people related to the individual cases, as opposed to the main characters who appear in every episode. I tend to form fairly strong attachments to the main characters1, and this means that I don't take the show too seriously, since the hectic stuff happens to characters that I don't know well.

However, I have just finished watching the eighth season of CSI Las Vegas, and I am very disappointed with the show. It has continually been breaking points 1 and 3 above, two of the prime reasons that I liked it so much. There have been (significant) recurring stories throughout the season, and the main characters have become personally involved in the cases, which has caused me to take it very seriously.

This has taken the enjoyment out of it for me, and since I watch movies and series to relax and unwind2 it's no longer worth my while to watch CSI. I will continue to watch previous seasons, but I probably won't continue with future Las Vegas seasons.


  1. I probably shouldn't admit this, but hey. 

  2. Yes, it's shallow, and I don't care. 

My Drupal setup

Seeing that I've spent countless hours setting up my Drupal installation, I thought that I would share this with others and document it for future reference. Drupal is an extremely powerful CMS which can be used to create a wide variety of sites. The disadvantage of this is that it requires a fair amount of work to set up a straightforward blog, which involves installing and configuring numerous modules.

Installation

Since there is no Ubuntu package for Drupal 6, I created my own package based on the drupal5 one. I set it up as a virtual host in Lighttpd by simply symlinking the domain name to /usr/share/drupal6. I created a MySQL database for the site and went through the Drupal install process. Since I'm using a multi-site installation, I also needed to alias the /files directory for each site.

$HTTP["host"] == "michael.gorven.za.net" {
    alias.url = ( "/files/" => "/home/mgorven/public_html/" )
}
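
The symlink itself is just something like the following, assuming lighttpd's simple-vhost module with document roots under /var/www (the paths here are hypothetical):

$ sudo ln -s /usr/share/drupal6 /var/www/michael.gorven.za.net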

Clean URLs

Clean URLs allow one to have URLs like /about instead of /index.php?q=about. This however requires the web server to rewrite URLs from the former to the latter. Drupal includes an htaccess file containing settings for Apache, but not for Lighttpd. Lighttpd does have a rewrite module, but it doesn't support the conditions that Drupal needs (such as checking whether a file exists).

Lighttpd does however have a module which allows one to add scripts, written in Lua, to the request process. A script has already been developed which implements the required rewriting for Drupal. The following lines in lighttpd.conf enable this for a specified site (after enabling mod_magnet and downloading the script).

$HTTP["host"] == "michael.gorven.za.net" {
    index-file.names = ( "index.php" )
    magnet.attract-physical-path-to = ( "/etc/lighttpd/drupal.lua" )
}

Blog

When Drupal is first installed, there is no mention of blogging as such. The first step is to enable the Blog core module1. This creates a blog content type and enables a blog for each user. (The module is designed for a multi-user blog, but can be used for a single user as well.) However, this doesn't give you all the functionality you expect from a blog engine.

Tagging is handled by the Taxonomy core module. You first need to create a vocabulary though, and enable it for blog posts. (This took me ages to discover.) In order to get nice URLs (including the date and post title, for example) you need to install the Pathauto module and configure a pattern for blog posts. You may also want to define a pattern for tags.

There is also no archive functionality. The best solution I could find is the Views module. It includes a pre-defined "archive" view which can display posts from a specific month, and links to the monthly pages. Even after much investigation I couldn't get the archive to behave like typical blog archives (i.e. /blog/2008/07/07, /blog/2008/07 and /blog/2008 for daily, monthly and yearly archives respectively).

Other Blog Features

The Trackback and Pingback modules implement automatic linking with other blogs. (I haven't actually tested these yet.) The Blog API core module allows the blog to be managed with external clients. The Markdown module allows you to write posts using Markdown syntax instead of HTML.

Comments

Drupal enables comments for blog posts by default. The Akismet module implements spam filtering using the Akismet service. The CAPTCHA and reCAPTCHA modules allow you to require users to answer a reCAPTCHA when submitting comments. (I haven't actually enabled CAPTCHAs since I haven't gotten any comment spam yet. Or real comments for that matter...)

Posting by email

The Mailhandler module allows you to submit posts via email. The configuration is fairly straightforward, except for the available commands, which are documented in the Mailhandler documentation. These can be specified at the beginning of emails and set as defaults in the configuration. I use the following commands.

type: blog
taxonomy: [mail]
promote: 1
status: 1
comment: 2

This creates blog posts and tags them with the "mail" tag. Posts are published and promoted to the front page, and comments are enabled.

The one thing it doesn't handle is attachments (such as images). There are a couple of modules2 which support this, but they aren't available for Drupal 6 yet. (Vhata has also hacked together a photo blogging system, but this isn't implemented as a Drupal module.) I don't really need this feature, so I'm going to wait until these modules are updated.

OpenID

The OpenID module allows you to log into your site using OpenID. The OpenID URL module allows you to delegate your Drupal site as an OpenID by specifying your provider and/or your Yadis document.

Yadis Advertisement

Yadis documents are advertised with a meta header in the HTML document, but this isn't the ideal method, since the relying party needs to download the entire HTML file. The preferred methods are to insert an X-XRDS-Location header in the HTTP response, or to serve the Yadis document directly if the user agent specifies application/xrds+xml in its Accept header.

The former method can be accomplished with the setenv module for Lighttpd. The second is essentially a conditional rewrite, and so requires some Lua scripting again. The following script will do the job.

-- Serve the Yadis document when it is explicitly requested.
-- Note that this is an exact match on the Accept header; a user
-- agent sending a list of media types won't be matched.
if lighty.request["Accept"] == "application/xrds+xml" then
    lighty.env["uri.path"] = "/files/yadis.xrdf"
end

The following lines in lighttpd.conf will announce the Yadis document for the root URL.

$HTTP["url"] == "/" {
    magnet.attract-raw-url-to = ( "/etc/lighttpd/yadis.lua" )
    setenv.add-response-header = ( "X-XRDS-Location" => "http://michael.gorven.za.net/files/yadis.xrdf" )
}

Random Stuff

The tag block is generated by the Tagadelic module. The "Recent Tracks" block is generated from my LastFM feed by the Aggregator core module, and the list of networks is simply a custom block. The Atom feed is generated by the Atom module. The contact form, search and file upload are all core modules.

Missing Stuff

The one thing I haven't sorted out is image handling. There are a couple of ways to handle images in Drupal, but none of them appeal to me (they're too complicated). I will probably just upload images as attachments and insert them manually in the body.


  1. It is however possible to run a blog without the Blog module. 

  2. Mailsave and Mobile Media Blog. 

Gmail-like mail setup

I have been using Gmail for a while now, and really think that it's about the best email provider out there. I recently moved my mail over from Google Apps to my own server, but I wanted to keep the major features that I liked. I've always used a desktop mail client, with POP3 and SMTP to receive and send mail.

These are the features I particularly like:

  1. Secure access with TLS/SSL
  2. Outgoing SMTP with authentication
  3. Messages sent via SMTP are automatically stored in the mailbox
  4. Messages downloaded via POP3 are still stored on the server
  5. IMAP and Web access

I therefore set out to recreate this setup as closely as possible. The first two are satisfied by a standard Postfix setup with TLS and SMTP AUTH. The last one is done with Dovecot and Roundcube.

To automatically store sent messages on the server, I used Postfix's sender_bcc_maps to BCC messages I send to myself, and the following Procmail recipe to move these copies to the Sent folder.

:0
* ^Return-Path.*me@example.com
.Sent/
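
The BCC itself is a one-line Postfix setting plus a lookup table, roughly as follows (addresses hypothetical; remember to run postmap on the table):

# /etc/postfix/main.cf
sender_bcc_maps = hash:/etc/postfix/sender_bcc

# /etc/postfix/sender_bcc
me@example.com    me@example.com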

To make POP3 access independent from IMAP, I first configured Dovecot to use a different mail location for each as follows.

protocol imap {
    mail_location = maildir:~/Maildir
}
protocol pop3 {
    mail_location = /var/mail/%u
}

I then used the following Procmail recipe to send incoming messages to both locations.

DEFAULT=$HOME/Maildir/
:0c:
/var/mail/mgorven

At the moment this is only set up for my user, but it should be possible to do it for all users by creating a global procmailrc and telling Postfix to deliver all mail using Procmail, as sketched below. This is working fairly well. The only part missing is that Gmail can archive or mark messages as read when they are downloaded via POP3, whereas in my setup POP3 and IMAP are completely independent.
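
A minimal sketch of that global setup, assuming stock Ubuntu paths (untested, since I only run the per-user version):

# /etc/postfix/main.cf: deliver local mail through Procmail
mailbox_command = /usr/bin/procmail

# /etc/procmailrc: the same recipe as above, for all users
DEFAULT=$HOME/Maildir/
:0c:
/var/mail/$LOGNAME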

Postfix with SPF and DKIM

As most people know, email is horribly insecure. It is trivial to forge the From address in emails since there is no authentication when sending1. This means that one cannot trust the From address, and also that people can forge messages from your address. In order to address this, a number of new schemes have been developed. These include SPF, DomainKeys, DKIM and SenderID. All of these aim to verify that mail is actually from the address it appears to be from. SPF and SenderID do so by restricting which hosts are allowed to send messages from a certain domain, while DomainKeys and DKIM use cryptographic signatures.

Unfortunately all of these schemes have problems due to the fact that they are additions to the existing mail system. SPF and SenderID prevent plain forwarding (requiring additional schemes like SRS or whitelisting of forwarders), and MTAs and mailing lists which modify messages break DomainKeys and DKIM signatures. Despite these issues, email forgery is a problem which needs to be addressed, and we cannot wait for a perfect solution before adopting anything. Some major mail providers (including Gmail and Yahoo) are already implementing these schemes.

I have therefore configured SPF and DKIM in my Postfix mail setup. My SPF policy allows mail from my server and SOFTFAILs all other hosts, and all outgoing mail is signed with DKIM. Incoming mail is checked for SPF and DKIM, but isn't discarded even if the checks fail. I will be keeping an eye on things and will revise my policy when I think it safe.

SPF Configuration

To create an SPF policy, add a TXT record to your DNS records according to the SPF syntax. The policy should authorise all hosts from which you send mail. (Mine simply authorises my mail server since I send all mail through it.) You also need a policy for the hostname presented by your mail server in its HELO/EHLO command. You should also create policies for all subdomains which aren't used for mail.
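
As a sketch (using my domain, but with the details assumed rather than copied from my zone), the three kinds of policy might look like this:

; mail domain: the mail server may send, all other hosts SOFTFAIL
gorven.za.net.         IN TXT "v=spf1 mx ~all"
; HELO hostname presented by the mail server
mammon.gorven.za.net.  IN TXT "v=spf1 a -all"
; subdomain which never sends mail
www.gorven.za.net.     IN TXT "v=spf1 -all"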

To check SPF records for incoming mail, I used the SPF policy daemon for Postfix. It is packaged for Ubuntu as postfix-policyd-spf-python. Simply follow the instructions in /usr/share/doc/postfix-policyd-spf-python/README.Debian2, and set defaultSeedOnly = 0 in the configuration file if you don't wish to reject mail which fails the test. Remember to whitelist any servers which forward mail to you (i.e. you have another address which gets forwarded to your mail server), unless they implement SRS3.

DKIM Configuration

To sign and check DKIM I use DKIMproxy. There isn't an Ubuntu package so I installed it from source. The instructions on the site are good, and include details for Postfix. You will need to generate a key to sign with and publish it in DNS, and then configure Postfix to sign outgoing messages and validate incoming messages. DKIMproxy won't discard messages with invalid signatures by default.
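
Key generation and publication work roughly as follows (the selector name, key size and record details are my assumptions; the DKIMproxy documentation has the authoritative version):

$ openssl genrsa -out private.key 1024
$ openssl rsa -in private.key -pubout -out public.key

; public key published in DNS (p= holds the base64 key from public.key)
mail._domainkey.gorven.za.net. IN TXT "v=DKIM1; k=rsa; p=MIGfMA0G..."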

DKIM includes a component called ADSP which allows domains to publish their signing policy. The strongest policy states that all messages are signed with DKIM and any messages without signatures should be discarded. This will allow mail servers to reject messages not sent through your mail server. However, the standard is not finalised yet, and issues regarding mailing lists still need to be addressed.
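
For reference, the current draft defines a record along these lines for the strongest ("discardable") policy; since the standard isn't finalised, the exact syntax may still change:

_adsp._domainkey.gorven.za.net. IN TXT "dkim=discardable"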


  1. Yes, I know about SMTP authentication, but people can simply use a relay. 

  2. Just watch out for the location of the configuration file -- the README uses a different location to the package. 

  3. Gmail doesn't implement SRS as such, but does use a compatible rewriting scheme.