Cleanweb, February 2015

Last night, I gave a talk on Open Rail Data at Cleanweb.

I wanted to stay longer – there were plenty of discussions to be had, but after a busy Open Data Day on Saturday, bed won over the pub.

Missed the presentation? I’ve uploaded the slides and they’re available in PDF format.

If you want to continue the discussion, join the openraildata-talk mailing list and come chat to like-minded people!

The end is near…

It doesn’t seem like five years, but it is. Five years since I wanted the API to National Rail Enquiries’ Live Departure Board web service to be available for everyone so they can innovate and do great things.

We’ve come a heck of a long way in those five years – as from this week, you can sign up for the Open Live Departure Boards Web Services. A round of applause, please!

So, is that the end? Unfortunately not – there’s even more data to unlock, even more value to be created and stories to be told – but I think it’s been demonstrated that open and permissive trumps closed and expensive.

I get the feeling it’s going to be a smoother ride from here on.

Open Rail Data – Two Years On

After a brief, but really interesting visit to the former Bletchley PSB (or signalbox, if you’re less of a railway geek), I popped in to OpenTech 2013 to present an update to the presentation I gave two years ago.

In some ways, we’ve come a long way – in others, maybe not. Regardless, there’s scope for opening up more data to make us all more aware of what’s going on – suggestions immediately afterwards included getting data on cable theft incidents, counts of people going through ticket barriers at stations in real-time, plus passenger counts from trains.

My presentation is available if you missed it, or if you want to cut out and keep. Exciting times 🙂

Opening Great Britain’s Rail Data

It’s my first time in Helsinki, and the weather is much the same as a September day in London – wet.
I finished preparing for my talk at OKFestival a little under 24 hours ago, and it went without a hitch. These things are normally OK once you’ve finished worrying about them.
Anyway, the slides and video (which no longer works as of February 2020) of my presentation on Open Train Times and Opening Great Britain’s Rail Data are now online. Enjoy!

The Chancellor's Autumn Statement

Many people reading my blog are interested in Open Data – here are the three important paragraphs from the Chancellor’s Autumn Statement earlier, as they relate to Open Data:
“1.125 Making more public sector information available will help catalyse new markets and innovative products and services as well as improving standards and transparency in public services. The Government will open up access to core public datasets on transport, weather and health, including giving individuals access to their online GP records by the end of this Parliament. The Government will provide up to £10 million over five years to establish an Open Data Institute to help industry exploit the opportunities created through release of this data.”
“A.146 Open Data Institute – The Government will provide up to £10 million over five years, with match-funding from industry and academia, to establish the world’s first Open Data Institute to help business exploit the opportunities created by release of public data”
“A.140 Rail fares data – The Government will consult in early 2012, through the Fares and Ticketing Review, on providing open access to rail fares data, giving passengers and business better information and enabling them to make the most cost-effective travel choices.”
The Cabinet Office website has further details in a PDF here.
I’ll leave it at that for the moment – other people will doubtless be writing their take on it, but I’ll leave you with one word from me: positive.

Crunching rail timetables

For those of you new to this blog, I’ve been doing some work with timetable data for a few months now, and I presented my work at OpenTech with Jonathan Raper earlier this year. I’m working with some other people to bring more information about the rail network out from behind the scenes and into the hands of the public so people can innovate and analyse the data, and ultimately to increase transparency and accountability. Importantly, I am also pro-rail and looking to improve on what we have.

So – it’s taken a while, but TSDB Explorer can now load an entire ~500 MB CIF-format timetable in around an hour on an average machine. Whilst I can undoubtedly improve this, it’s a lot better than my previous attempt, a multi-gigabyte monstrosity that took three days.

Several people are interested in the format of the CIF file, and I’m going to put a set of slides together soon to explain it. Hopefully David Cameron’s recent letter on open data will help make Network Rail-source CIF timetable data more prevalent, and my “How To” guide will lower the barrier for other people to write timetable analysers, produce train frequency graphs, generate pocket timetables, etc.
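In the meantime, here is a rough sketch of the kind of first pass a timetable analyser might make. It is not how TSDB Explorer works, and it rests on an assumption this post doesn't spell out: that a CIF file is a series of fixed-width lines, each starting with a two-character record identity (for example HD for the header, BS for a basic schedule, LO, LI and LT for origin, intermediate and terminating locations, and ZZ for the trailer). The function name and the sample records are made up for illustration; the sample lines carry only the prefix, padded with spaces, not real field data.

```python
from collections import Counter
from typing import Iterable


def count_record_types(lines: Iterable[str]) -> Counter:
    """Tally records by their two-character record identity.

    Assumes one record per line, identified by its first two characters.
    """
    counts: Counter = Counter()
    for line in lines:
        record = line.rstrip("\r\n")
        if len(record) < 2:
            continue  # skip blank or truncated lines
        counts[record[:2]] += 1
    return counts


# Hypothetical, content-free records: only the two-character prefix is
# meaningful here, the rest is padding to 80 characters.
sample = [
    "HD".ljust(80),
    "BS".ljust(80),
    "LO".ljust(80),
    "LI".ljust(80),
    "LI".ljust(80),
    "LT".ljust(80),
    "BS".ljust(80),
    "LO".ljust(80),
    "LI".ljust(80),
    "LT".ljust(80),
    "ZZ".ljust(80),
]

counts = count_record_types(sample)
for record_type, total in sorted(counts.items()):
    print(record_type, total)
```

Because the function accepts any iterable of lines, passing it an open file handle reads one line at a time rather than pulling the whole timetable into memory, which matters at this file size.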

Watch this space – these are very exciting times.

Open Source, Open Data

I’ve had a rethink about source code hosting. CVS is dead in the water, Subversion requires online connectivity, and I’m starting to use git with vigour. Hey, offline commits are perfect for coding on the train! (As an aside, I gave up trying to get WiFi access on a train to Leicester on Saturday, and didn’t even bother trying on Sunday coming back). Github is where it’s at – although despite today being World IPv6 Day, they don’t appear to have access over IPv6 natively.

The code for TSDB Explorer is up and out there and being actively worked on, as is TubeHorus, which is in a lesser working state. I anticipate getting around to putting TransportHacker’s code on Github in the next week or so.

On another note, I’d like to thank the people at Network Rail who’ve been so helpful in talking to me about some of the data sets they hold. Whilst I’m not in a position to let the cat out of the bag yet, I am pretty excited about what’s coming in the next few weeks. Time to investigate Amazon EC2 I think… this may take some horsepower.

Open Rail Data

Jonathan Raper and I gave presentations on Open Rail Data – Jonathan from a more political angle, and me from a decidedly technical angle.

The material went down really well – there’s plenty of scope for us to show what can be done if timetable, real-time running and fares data is made openly available. I thoroughly enjoyed delivering the presentation – I haven’t done that since Berlin in 2006, and I’d forgotten how easily I slip into “presenter mode”.

Here is a copy of my OpenTech 2011 presentation in PDF format if you’re interested. Or, if you simply want to get in touch, peter.hicks@opentraintimes.com.

I’m celebrating this evening with a curry.

Update: Jonathan’s presentation is also available.

Google Maps' Data Quality

Harry Wood pointed out that Google Maps has removed Camden Town tube station from its map. Whilst I doubt Google have done this intentionally, it has set me thinking about data quality.
When developing TransportHacker (which isn’t live yet, there aren’t enough hours in the day!), I noticed the M25 was named “Autoroute Britannique M25”. It’s been corrected now, but how on earth did that one slip by?
More data quality issues (which may have been fixed by the time you read this):

  • Upper Holloway station has three icons – the Underground roundel, the Overground roundel, and the National Rail symbol. Click the Underground/Overground (Wombling Free?) icon, and you see it’s actually from the bus stop outside the station
  • Hop down to Highbury Corner, and you can see that Highbury and Islington station has the Underground and Overground roundels, but no National Rail symbol. Click on the roundels, and you’ll see that – yes – National Rail trains do serve the station
  • Examine, if you will, The Famous Cock. On Google Maps, it’s between Starbucks and Flight Centre. Google Streetview shows no Famous Cock there – in fact, it’s right next to Highbury and Islington station
  • Finally, what is White Stadt? I think it should be White City…

Here lies the danger with processing large sets of data – do you know they’re correct?