February 7, 2013

Railway Data Revisited

I mentioned Network Rail’s feeds briefly last time but hadn’t yet looked into them much. Thanks to Samuel Littley for letting me know a bit more about them. Sign up and your access will be activated within an hour or so. Given that mine was granted at 3am GMT I suspect it’s automated.

My interest in this data comes down to how much I take trains to London, Manchester and elsewhere. Ever since I built TrainTrackr in 2011 I’ve wanted to get a better idea of how late trains really are. These open data firehoses offer me that chance.

Using the data

After getting access I discovered that the Data Feeds service as a whole is about as close to undocumented as you can get. Thankfully I’d signed up to a talk at the Open Data Institute on this topic, given by Jonathan Raper.

The honest truth is that it seems to be quite a pain. A few immediate problems:

You then face the fact that there are half a dozen different datasets: the train timetable, the live updates, changes to train journeys - and when a train has to divert you can lose track of where it’s going altogether.

Learning more

Unsurprisingly the train community got to this before I did, so there’s quite a bit of useful material out there. I’ve not managed to find any overview/introduction that’s up to date, so I might end up writing one next month.

If you want train data in a format that’s ready for building a mobile API, try Placr’s Transport API. It’s free for fewer than 1,000 hits a day and paid above that. Placr’s founder gave the ODI talk I mentioned, and the slides are available online.

For an example of what can be done in terms of end-user applications, see Real Time Trains by Tom Cairns. There’s also plenty of information on his blog regarding the different data formats - and better, lots of relevant code on Github.

Peter Hicks also has some useful code available, although seemingly focused on timetables rather than the live data. He gave a relevant talk last September in Helsinki that dives a bit more into licensing issues and so forth.

My next step

I’ve got a much better idea of what I’m doing than I had a couple of weeks ago, so I’m going to go out and start getting the data. The next post on this is probably a few weeks away. Thanks a lot to all mentioned for their help.