Tweaking the site a little

I've made some updates to the site over the past couple of days, tweaking some of the overall styling (and the styling on my CV), updating my photo to a more recent snap, and also bringing back the old photo galleries that have been missing for a while.

The photos in the photography section of the site are from around 2000-2005. Another 3,695 shots since then currently just reside in my Flickr account, but maybe they'll make it onto this site one day too. It's nice to be able to easily browse through my old photos again (especially places like Niarbyl on the Isle of Man and various places I've visited over the years), but unfortunately the images aren't of the best quality given the improvements in camera, scanner and screen technology over recent years.

Presenting Bomb Sight at geomob

Last week I presented a talk on the Bomb Sight project at the first geomob meetup of the year. I posted the slides on SlideShare afterwards, but I'll add some more narrative here in case you missed the presentation and want to find out more about the project.

Bomb Sight was a year-long academic project to map the London Blitz Bomb Census, funded by JISC with the support of the National Archives and the University of Portsmouth. The project team was led by Dr Catherine (Kate) Jones of the University of Portsmouth, with detailed knowledge of the archive information provided by Andrew Janes from the National Archives. The web app was built by Patrick Weber of Location Insight Ltd, the mobile app by me (Geobits Ltd), and both were designed by Jasia Warren.

The Blitz was a period of London's history that saw prolonged bombing with high explosive bombs during 1940 and 1941. A million houses were damaged or destroyed in London alone, and many other UK cities suffered a similar fate. More bombing continued later in World War II with the arrival of self-propelled V1 and V2 bombs.

The Bomb Sight project used maps from the Blitz Bomb Census Survey which were compiled by the Ministry of Home Security to give an overall picture of where bombs fell across the UK. There are a lot of maps held by the National Archives that relate to WWII bombing of London and other cities, and because we only had a limited amount of time, the project is only covering the area around London (referred to then as Civil Defence Region 5) and for the period of 7 October 1940 - 6 June 1941. The Blitz started in London a month before this, on 7 September, but mapping of bomb locations by the Ministry of Home Security only began on 7 October. We have also included data from the first night of the Blitz, which comes from London Fire Brigade records. The maps were previously only available to access in their original form in the Reading Room at the National Archives, but the Bomb Sight project has now made them available to citizen researchers, students and academics who want to explore the maps.

The main features of the project have been creating digital maps of the Bomb Census, digitising bomb location data from these, analysing the information within various geographical boundaries, combining the information with geolocated stories and photos from the time, creating a web mapping application to explore the information, and creating a mobile application to help people explore bomb locations while they are out and about in London.

The maps for the project came from two collections held at the National Archives. The first set of maps (HO193/13) shows bombs that fell overnight during the eight-month period from 7 October 1940 - 6 June 1941, without any additional information to show when each bomb fell during that period. We used 35 of these map sheets to cover the London area. The second set of maps (HO193/01) details the bombs that fell on a day-by-day basis, using colour to depict the day of the week and different symbols to show the type of bomb that fell. Due to the large number of these map sheets (over 500), we selected just nine of them, which show bombing for the week of 7 - 14 October 1940. Each of these maps was then photographed, loaded into a GIS package and georectified to match its geography up with a modern map.
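To give a feel for what georectification involves, here is a minimal Python sketch that fits an affine transform from three control points (pixel positions on the photographed map matched to modern map coordinates) and then converts any pixel to map coordinates. This is an illustration rather than what our GIS package did: the function names and the control point values are made up, and real tools typically use many more control points with a least-squares or higher-order fit.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, v):
    """Solve the 3x3 linear system m * x = v using Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = v[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(pixel_pts, map_pts):
    """Fit x = a*px + b*py + c and y = d*px + e*py + f from exactly 3 point pairs."""
    if len(pixel_pts) != 3 or len(map_pts) != 3:
        raise ValueError("this sketch needs exactly three control points")
    m = [[px, py, 1.0] for px, py in pixel_pts]
    a, b, c = solve3(m, [x for x, _ in map_pts])
    d, e, f = solve3(m, [y for _, y in map_pts])
    return (a, b, c, d, e, f)


def pixel_to_map(coeffs, px, py):
    """Apply the fitted transform to one pixel position."""
    a, b, c, d, e, f = coeffs
    return (a * px + b * py + c, d * px + e * py + f)


# Made-up control points: pixel (column, row) -> map (easting, northing) in metres.
pixel_pts = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
map_pts = [(530000.0, 182000.0), (531000.0, 182000.0), (530000.0, 181000.0)]

coeffs = fit_affine(pixel_pts, map_pts)
easting, northing = pixel_to_map(coeffs, 500.0, 500.0)
```

Note that the northing decreases as the pixel row increases, because image rows count downwards from the top of the sheet.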

Manually digitising all of the bomb locations from the aggregate maps provided the landing sites of 31,373 bombs, 28,325 of which fell within the current day boundaries of Greater London.
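Working out how many points fall within a boundary is a point-in-polygon test. The sketch below shows the idea in Python using the ray casting method. It is not the project's own code, and the square boundary and the points are invented for illustration; in practice a spatial database or GIS package does this for you against the real boundary.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray heading right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_inside(points, polygon):
    """Number of points that fall inside the polygon."""
    return sum(1 for x, y in points if point_in_polygon(x, y, polygon))


# Invented boundary (a 10 x 10 square) and invented bomb locations.
boundary = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
bombs = [(5.0, 5.0), (15.0, 5.0), (1.0, 9.0), (-2.0, 3.0)]

inside_count = count_inside(bombs, boundary)
```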

Visualising all of these points on a zoomed out map shows a mass of red dots covering the London area. We investigated some alternative ways of visualising the data at this scale (such as a heat map), but we decided this method helped to convey the impact of The Blitz on London as a whole. The image was one that was widely used by the media when reporting the project.

The website brings together the various datasets (first day of the Blitz, first week of mapping, 8 months of mapping) overlaid on modern OpenStreetMap data or on the original map images (see the layers icon in the top right of the map and click '1940s Bomb Maps' to enable them). You can explore the map by panning around, search for a location you're interested in, or browse information within various geographical boundaries using the 'Explore London' menu. When you are zoomed in to the map you can click on any of the markers to view more information about the bomb as well as some contextual information.

Each digitised bomb has its own information page (e.g. this one near Bank Station) showing the type of bomb, a map of its location and others around it, a current day address for the location, stories from locals mentioning nearby places, and a selection of nearby photos from the time. The photos came from the Imperial War Museum archives, where they are provided in an easily searchable format and with a licence to use them free for non-commercial purposes.

The example bomb information page shown in the presentation was for a bomb that landed near Bank station. I think this bomb is probably the one that The Register claimed was missing from the map in their article London Blitz bomb web map a hit-and-miss affair, but it's difficult to be certain.

There are a number of considerations around accuracy that should be taken into account when dealing with these maps. The original maps were an aggregation of data that was coming in from various sources, presumably of varying levels of accuracy and precision due to the sheer amount of damage that each attack would cause. The base maps would potentially have had accuracy issues as well, especially around the edges, where one sheet meets another, and over the years the paper may have warped slightly. There is then the digitisation process, which can introduce errors and inaccuracies both during the georectification of the maps and during the digitisation of each point, which is converted from the original pen mark to a pair of coordinates near the centre of the mark.

As part of the digitisation process, the location of each bomb was reverse-geocoded to determine the nearest street address, to help give some context to the data, and to assist with searching. Looking up a historic location using a modern dataset can give some incorrect results at times, whether it's because the bomb didn't actually fall where we think it did (see the note about accuracy above) or because the road has changed name over time, for example where old buildings have been knocked down to make way for new developments (e.g. the bomb at More London Place).
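At its simplest, reverse geocoding means finding the closest known address to a coordinate. Here is a rough Python sketch of that idea, assuming a tiny in-memory gazetteer; the street labels and coordinates are invented, and the function names are mine rather than anything from the project, which would have relied on a proper geocoding dataset.

```python
import math

# Invented gazetteer entries: (label, latitude, longitude). Illustrative only.
GAZETTEER = [
    ("Example Street A", 51.5130, -0.0890),
    ("Example Street B", 51.5050, -0.0800),
    ("Example Street C", 51.5200, -0.1000),
]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    radius_m = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def nearest_address(lat, lon, gazetteer=GAZETTEER):
    """Return (label, distance in metres) for the closest gazetteer entry."""
    distance, label = min(
        (haversine_m(lat, lon, g_lat, g_lon), label)
        for label, g_lat, g_lon in gazetteer
    )
    return label, distance


label, distance_m = nearest_address(51.5132, -0.0888)
```

Returning the distance alongside the label is useful here: a large distance is a hint that the nearest modern address may not say much about the 1940s location.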

The technology behind the website is all open source, with PostgreSQL/PostGIS as the database, GeoServer to provide WFS and tiled WMS interfaces to the data, Leaflet and OpenStreetMap for the mapping, Django/GeoDjango to build the dynamic website and Bootstrap to provide a responsive user interface framework suitable for desktop, tablet and mobile devices. The website sits behind the CloudFlare content delivery network (CDN), which caches all of the pages and map data from the site and serves them from their own servers located closest to the end user, meaning that the load on our server is much reduced.
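The effect of a caching layer on server load can be shown with a toy model. The Python sketch below is a simplified illustration of the general idea, not how CloudFlare is implemented: the `EdgeCache` class, its five-minute lifetime and the `render_page` stand-in for our server are all invented for the example.

```python
import time


class EdgeCache:
    """Toy cache that sits in front of an origin server and remembers pages for a while."""

    def __init__(self, fetch_from_origin, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # path -> (expiry time, page body)
        self.origin_requests = 0

    def get(self, path):
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and cached[0] > now:
            return cached[1]  # served from the cache, origin not touched
        # Cache miss or expired entry: ask the origin and remember the answer.
        self.origin_requests += 1
        body = self._fetch(path)
        self._store[path] = (now + self._ttl, body)
        return body


def render_page(path):
    """Stand-in for the real web server doing expensive work."""
    return "page for " + path


cache = EdgeCache(render_page)
for _ in range(1000):
    cache.get("/")

# 1,000 visitors asked for the front page, but the origin only rendered it once.
origin_hits = cache.origin_requests
```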

As well as the mobile-friendly website, we built a mobile application (currently available for Android and hopefully for iOS in the near future) which replicated some of the mapping features and also added an Augmented Reality interface to help explore the data in the context of your own surroundings. The application was built using the Android SDK, PhoneGap to provide a framework for cross-platform development, Leaflet and OpenStreetMap for mapping, and the Wikitude SDK for Augmented Reality functionality that allows you to see bomb locations overlaid on what your camera is seeing in front of you. If you want to find out more, you may also be interested in my post on choosing an AR library for Android, a technical overview, or a walkthrough of the prototype. There are some screenshots posted here too.

The outcomes of the Bomb Sight project were: the georeferenced bomb maps, providing a digital record of an important event in the history of London and the country; the sharing of all of the data with the National Archives, to help reduce use of the original paper maps and aid with their preservation; the opening up of maps that were previously only accessible if you were able to visit the National Archives, making the information available to a wider audience; and a geographic framework for the study of the impact of bombing on the social and economic climate.

The media response to the project was phenomenal, and the initial response completely overwhelming, as we struggled to keep the site up and running under the immense traffic (see my post on scaling from 40 visitors a day to 6 every second).

The project is officially at an end, but there are a few things that are still to come as we find the time to work on them. In the next couple of months, we will be providing data downloads for non-commercial purposes, allowing researchers to investigate the data in more detail, and after that we will be adding some tutorials to the site to help people find their way around. I'm also planning to port the Android app to iPhone when I get a chance.

All of us on the project team would quite like to work more on the project as and when possible, on these planned updates, and also on expanding the project further if we can...

The data we are using is only a small subset of the data available in the National Archives. There is potential - if we can find both time and funding - to expand the temporal coverage of bomb data to cover the entire period of the Blitz and the rest of the war, including mapping the V1 and V2 bombs that were launched into London later in the war. We could add more detailed reports (known as BC4 reports) to the site, which were written reports that described many of the bomb sites. We could also include more contextual information around each bomb location, such as more photos, user-submitted stories and comments. And of course, London wasn't the only city that was bombed during the war - there are many other British cities that were bombed, and many European cities that were hit even worse by Allied bombing.

If you would like to find out more about the Bomb Sight project, have a read of the project blog, or tweet the project team at @BombSightUK.

On scaling from 40 visitors a day to 6 every second

When we launched the Bomb Sight website at the end of November, none of the project team had imagined quite how much interest there would be in the project.

We were pleased when we saw some people starting to tweet about it, and saw the number of visitors rise to 40 a day this time last week.

When I spotted at lunchtime on Wednesday that we were seeing an increasing level of traffic from Twitter, I sent a tongue-in-cheek email to the team saying I thought it might be starting to go viral. At that point we were seeing about one visitor a minute, and by the end of the day we were seeing about one every ten seconds, and had served 2,235 visitors over the course of the day. The web site was starting to feel a bit slow at times, but we think it was serving most visitors who tried to access the site.

A tweet from @qikipedia to their 357,000 followers suggests we were struggling by Thursday morning, but we kept serving visitors at about the same level of demand as we were on Wednesday... until 4pm on Thursday when we saw the first media article about the project.

That's when it became clear that the server couldn't keep up with the demand of 8,000 visitors an hour - and you would likely have been seeing errors quite often - so Patrick and I started to look at the server to see what we could do to help it scale with the resources we had available. I think we managed to squeeze a bit more capacity out of it, so some more people were able to view the site if they were lucky enough to get through. We definitely weren't able to make it available to everyone who was trying to access the site though.

Thankfully we spotted a tweet that evening from CloudFlare, a company that specialises in delivering content faster and more reliably by sitting between the users accessing the site and the website itself. They shelter your server from intense levels of traffic by keeping a copy of your pages on their own servers and serving that out to users on your behalf.

Thursday evening was spent trying to get this set up and working, and by 1am we were seeing about 30 users a minute again. It wasn't quite the 5 minute setup they advertise - as we waited for DNS changeovers, tried to get to grips with the configuration needed, competed with everyone else to access our own pages, and tried to keep our server online enough for CloudFlare to cache copies of the pages - but by the end of the night I was feeling a bit more comfortable that we'd be able to keep the website running a bit better for the next day. The front page of the site (the main map) was being cached by CloudFlare, and that was the most important part to keep working.

We had already served 18,459 visitors on Thursday, but that still wasn't the busiest day we'd see. By far.

By Friday morning, Patrick noticed that CloudFlare had started serving out a 404 Not Found error message instead of the homepage, which we battled with for a while, but managed to get rid of in the end. At 11am we were serving about 40 visitors a minute, until the BBC published their article around noon and we hit 380 visitors a minute, or 6 a second. And probably quite a few error pages too, unfortunately, as we worked to get rid of the remaining ones.

Friday saw around 185,000 visitors coming to the site, which just would not have been possible if we didn't have the support of CloudFlare to serve the vast majority of that traffic for us.

Up to this point on Tuesday, we have already had somewhere between 300,000 - 500,000 visitors* to the Bomb Sight website, and are currently serving about 900 visitors an hour. According to CloudFlare, these visitors have viewed upwards of 1.8 million pages between them.

We have shifted around 3 terabytes of data, of which CloudFlare has served about 2.7TB for us. 1.2TB of that was in the first 24 hours, and at the peak we were seeing a throughput of about 150GB an hour. That's a lot of data, especially when it wasn't planned for in advance.
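To put the figures from this post on a common per-second footing, here is a quick back-of-the-envelope calculation in Python. The input numbers are the ones quoted in the post; the variable names are mine, and I'm assuming decimal units (1GB = 1,000,000,000 bytes), so treat the results as rough.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 3600

# Visitor rates quoted in the post.
peak_visitors_per_second = 380 / SECONDS_PER_MINUTE       # about 6.3 a second
thursday_visitors_per_second = 8000 / SECONDS_PER_HOUR    # about 2.2 a second

# Peak throughput: 150GB an hour, assuming 1GB = 10**9 bytes.
peak_bytes_per_second = 150 * 10**9 / SECONDS_PER_HOUR
peak_megabits_per_second = peak_bytes_per_second * 8 / 10**6  # about 333 Mbit/s

# Share of the roughly 3TB total that CloudFlare served for us.
cloudflare_share = 2.7 / 3.0                               # about 90%

print(f"Peak visitors per second: {peak_visitors_per_second:.1f}")
print(f"Peak throughput: {peak_megabits_per_second:.0f} Mbit/s")
print(f"Served by CloudFlare: {cloudflare_share:.0%}")
```

In other words, at the peak the site was pushing out the equivalent of a sustained connection of around a third of a gigabit per second, and roughly nine tenths of all the data never touched our own server.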

The moral of the story - if you think there's even the slightest possibility your project may go viral, plan in advance to add in a service such as CloudFlare to take some of the traffic away from your servers and give you some breathing space. Especially considering it's free to use their service!

As we posted on Thursday, we're sorry if you've had troubles accessing the site so far, but we hope that you were eventually able to access the site, and will be able to explore it further now that we're better equipped.

We are noticing from the statistics that people are starting to come back and explore the site more - with almost 25% of visitors today being people who have viewed the site more than once - which is a good sign that the data is of value and interest to people who want to learn more about the history of London. We're also starting to see more people finding the site from Google searches about the Blitz, and about particular parts of London.

* it's difficult to accurately measure the number of visitors to the site, partly because some users will have seen error messages, others may not have been counted if they viewed the site embedded in a media article, and different services give statistics calculated in different ways. The visitor statistics in this post are all from Google Analytics while the data statistics are from CloudFlare.

This post is cross-posted from the Bomb Sight project blog
