October 15, 2009

Tiling Kibera

The upcoming Map Kibera project acquired some imagery recently, and I got hold of it yesterday to set up a quick TileCache preview. There have actually been quite a few requests here recently for getting some tiles up quickly from various sets of source imagery, so I thought I'd write a few blog posts on some different ways to go about it.

First, I'm assuming the end user will be requesting tiles, and that these tiles will be projected in Spherical Mercator for viewing on the web in a map viewer like OpenLayers or Google Maps (so I'm skipping over the bits for creating tiles that might be used in a client like Google Earth). With that in mind, there are a few ways to get your tiles. Note that the Kibera imagery is a nice simple example, because the area of the imagery is not that large (about 25 square km), and the source file is only a couple hundred megs as an uncompressed TIF.

Option A: Pre-generate all your tiles in advance

The easiest way to generate all your tiles in advance is probably to use the newish MapTiler software, which is a nice graphical interface for the gdal2tiles project. After installing MapTiler, I just selected my projection, selected my single TIF file, selected my zoom levels and other options, and hit Render. Note that your source files do not have to match the output projection: because my data was a GeoTIFF with appropriate metadata, MapTiler figured out the projection info and the appropriate transformation by itself. Because I wanted tiles all the way up to zoom level 18, it took just under 15 minutes. The output of MapTiler is just awesome - it creates not only the tiles but also sample Google Maps and OpenLayers html, each of which is full of nice features. I'm impressed (though I'd like to see a CloudMade tile layer or two in the OpenLayers example).

If for any reason MapTiler isn't working out for you, you can also use gdal2tiles directly. Mano Marks recently wrote a nice tutorial on using gdal2tiles for creating KML superoverlays. The concept is the same for creating spherical mercator tiles - you just need to change the warping projection (to use EPSG:3785 instead of EPSG:4326) and remove the geodetic option from gdal2tiles and you should be good to go. Note that EPSG:900913 is equivalent to EPSG:3785, and if you do not have one of them in your epsg file, you may need to add it manually.
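As a rough sketch of that two-step recipe, the Python snippet below only builds the two command lines (warp, then tile) and leaves running them to the caller. The function names, the output file names and the "10-18" zoom range are illustrative assumptions, not taken from the tutorial; "-p mercator" is gdal2tiles' spherical mercator profile, used here in place of "-p geodetic".

```python
import subprocess


def mercator_tile_commands(source_tif, warped_tif, out_dir, zoom_range="10-18"):
    """Return the two command lines: warp to spherical mercator, then tile.

    zoom_range is an assumed example value; pick whatever suits your imagery.
    """
    # Reproject the source imagery into spherical mercator first.
    warp = ["gdalwarp", "-t_srs", "EPSG:3785", source_tif, warped_tif]
    # Tile with the mercator profile rather than the geodetic one.
    tile = ["gdal2tiles.py", "-p", "mercator", "-z", zoom_range,
            warped_tif, out_dir]
    return [warp, tile]


def run_all(commands, runner=subprocess.check_call):
    """Run each command in order; runner is swappable for a dry run."""
    for argv in commands:
        runner(argv)


if __name__ == "__main__":
    # Dry run: just print what would be executed.
    for argv in mercator_tile_commands("09FEB19_BOOST.tif", "kibera.tif", "tiles"):
        print(" ".join(argv))
```

Passing runner=print to run_all gives a dry run, which is handy for checking the paths before kicking off a long render.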

The source imagery was 0.6 meters/pixel, and because we're near the equator, tiles at level 18 are close to the scale of the original image. Going up to zoom 19 added a little viewing clarity, but it took about 4x the space required for my level 18 tiles, not to mention the time to render them. In this case, rendering zoom level 19 alone took over 45 minutes.
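The match between 0.6 meters/pixel and zoom 18 can be checked with a little arithmetic. The sketch below assumes the usual 256-pixel tiles and the 6378137 m sphere used by spherical mercator, and takes Kibera's latitude to be roughly -1.31 degrees (an assumption for illustration; the function names are made up).

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by spherical mercator
TILE_SIZE_PX = 256          # assumed tile size


def ground_resolution(zoom, latitude_deg=0.0):
    """Approximate ground meters per pixel at a zoom level and latitude."""
    equator_res = 2 * math.pi * EARTH_RADIUS_M / (TILE_SIZE_PX * 2 ** zoom)
    return equator_res * math.cos(math.radians(latitude_deg))


def native_zoom(source_res_m, latitude_deg=0.0, max_zoom=22):
    """Lowest zoom whose pixels are at least as fine as the source pixels."""
    for zoom in range(max_zoom + 1):
        if ground_resolution(zoom, latitude_deg) <= source_res_m:
            return zoom
    return max_zoom


if __name__ == "__main__":
    for zoom in (17, 18, 19):
        print(zoom, round(ground_resolution(zoom, -1.31), 3))
    print("native zoom:", native_zoom(0.6, -1.31))
```

At the equator zoom 18 works out to just under 0.6 meters/pixel. Each extra zoom level halves the pixel size in both directions, so it needs four times as many tiles, which is where the roughly 4x jump in disk space for zoom 19 comes from.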

Option B:  Generate tiles on demand

Often, you are dealing with a larger dataset than Kibera, and rendering all the tiles might take many hundreds of gigabytes (or much more). In addition, it's very likely that the vast majority of your tiles will never be requested by any user -- rendering the middle of a 'boring' area up to zoom level 20 is basically a waste of space. But since you can't be exactly sure which tiles will be requested, you may want to render them on demand, and then cache those requested tiles under the assumption that if they were requested once, they're more likely to be requested again. Another reason to do this is time: it only took 15 minutes to render Kibera up to zoom level 18, but what if you just got imagery for Afghanistan, and you'd like to start looking at the tiles _right now_ instead of waiting overnight (or longer) for the pre-rendering to finish? One answer is TileCache.
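The render-once-then-cache idea is simple enough to sketch in a few lines of Python. This is not how TileCache itself is implemented, just a minimal illustration of the pattern; the class name, the z/x/y.png directory layout and the render_tile callable are all hypothetical.

```python
import os


class OnDemandTileCache:
    """Render a tile the first time it is requested, then serve it from disk."""

    def __init__(self, cache_dir, render_tile):
        self.cache_dir = cache_dir
        # render_tile is any callable (z, x, y) -> PNG bytes (hypothetical).
        self.render_tile = render_tile

    def _path(self, z, x, y):
        # Assumed on-disk layout: <cache_dir>/<z>/<x>/<y>.png
        return os.path.join(self.cache_dir, str(z), str(x), "%d.png" % y)

    def get(self, z, x, y):
        path = self._path(z, x, y)
        if os.path.exists(path):
            # Cache hit: no rendering needed.
            with open(path, "rb") as handle:
                return handle.read()
        # Cache miss: render once and store the result for next time.
        data = self.render_tile(z, x, y)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(data)
        return data
```

Only tiles somebody actually asks for ever reach the disk, so the 'boring' areas cost nothing until a user zooms into them.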

A common use-case for TileCache is to put it as middleware between an existing WMS server and the end users. This works great, but requires that you already have a WMS server configured. However, TileCache can also read GDAL data formats directly, and then spit out the tiles. To use this, it's important you have both PIL and NumPy installed (along with GDAL and TileCache, of course). Here's a simple TileCache configuration for creating Google Maps-compatible tiles:
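As a rough sketch of what such a configuration can look like, the Python snippet below writes a one-layer tilecache.cfg. The layer name and paths are hypothetical, and the option names (type=GDAL, file, spherical_mercator, tms_type) are assumptions based on TileCache's documented layer options, so check them against the version you have installed.

```python
import configparser
import io


def gdal_layer_config(layer_name, tif_path, cache_dir):
    """Build the text of a minimal tilecache.cfg with one GDAL-backed layer.

    All option names here are assumptions to verify against your TileCache.
    """
    config = configparser.ConfigParser()
    # Where rendered tiles are cached on disk.
    config["cache"] = {"type": "Disk", "base": cache_dir}
    config[layer_name] = {
        "type": "GDAL",                # read the raster directly via GDAL
        "file": tif_path,              # source already in spherical mercator
        "spherical_mercator": "true",  # assumed shortcut for bbox/resolutions
        "tms_type": "google",          # assumed: Google-style tile numbering
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(gdal_layer_config("kibera", "/path/to/kibera.tif", "/tmp/tilecache"))
```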


In addition, however, you need to make sure your source data is in the matching spherical mercator projection.  To reproject (or transform) the Kibera imagery, I used this command:

gdalwarp -t_srs epsg:3785 09FEB19_BOOST.tif kibera.tif

Finally, you can also use tilecache_seed to pre-render some or all of the tiles using TileCache itself. It can be useful, for example, to seed all but the last couple of zoom levels (the lower levels take relatively little disk space), so the first users of the map won't have to wait for tiles to render until they zoom way in to see some detail.
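To see why the lower zoom levels are cheap to seed, here is a back-of-the-envelope tile count for a square area. It treats the roughly 25 square km of Kibera imagery as a 5 km square and, since the area is near the equator, treats projected meters as ground meters; both are simplifying assumptions, and the function names are made up for illustration.

```python
import math

WORLD_WIDTH_M = 2 * math.pi * 6378137.0  # spherical mercator world width


def tile_span_m(zoom):
    """Width of one tile in projected meters at the given zoom."""
    return WORLD_WIDTH_M / 2 ** zoom


def max_tiles(side_m, zoom):
    """Upper bound on tiles touching a square of side side_m at this zoom.

    The +1 allows for the square not lining up with the tile grid.
    """
    per_side = math.ceil(side_m / tile_span_m(zoom)) + 1
    return per_side ** 2


if __name__ == "__main__":
    for zoom in range(10, 20):
        print(zoom, max_tiles(5000, zoom))
```

Under these assumptions, every zoom level up to 16 added together comes to fewer tiles than zoom 18 alone, so seeding the low levels costs very little.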

Tips and Tricks

There are a few things you can do to speed up tile generation and lessen the load on your server. With a small dataset like this, it's not a big deal - but when dealing with bigger data sources, speeding up your render time can mean hours or days of computer time saved.

Transforming your Source Data: Making sure your source data is in the same projection as your output tiles means more than creating a VRT with the metadata for the projection transformation - it means actually transforming the raw data so it doesn't have to be transformed on the fly during tile creation. This has to be done for the TileCache option above, but if using MapTiler or gdal2tiles, you may wish to use gdalwarp as noted at the end of the TileCache section above to actually output a new TIFF file to use as your source. The disadvantage of this is that you end up using extra space for the source data while you render, but if your plan is to pre-render all the tiles then disk space is probably not your concern.

Creating Overviews: In the Kibera example above, only zoom levels 18 and 19 were near the source dataset resolution. All of the lower zoom levels could have been rendered more quickly if we had them reading from a coarser (downsampled) data source. Fortunately GDAL ships with a utility to let us create these downsampled "overviews", which will in turn be used by any of the above rendering methods. To create overviews of my GDAL data source I run:

gdaladdo -r average kibera.tif 2 4 8 16

I can also add the "-ro" parameter to the gdaladdo command, which opens the file read-only and so writes the overviews to a separate overview file rather than incorporating them directly into my source TIFF. Either way, this can potentially speed up rendering time for all but your most detailed zoom levels.
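To pick overview factors, it helps to see which zoom level each one roughly lines up with. The small sketch below assumes the source resolution matches zoom 18, as in the Kibera example, and that the factors are powers of two; the function name is made up for illustration.

```python
def zooms_served_by_overviews(native_zoom, factors):
    """Map each power-of-two overview factor to the zoom level it matches."""
    served = {}
    for factor in factors:
        # For a power of two, bit_length() - 1 is its base-2 logarithm.
        levels_down = factor.bit_length() - 1
        if factor < 1 or 2 ** levels_down != factor:
            raise ValueError("not a power of two: %r" % factor)
        # Each halving of resolution corresponds to one zoom level down.
        served[factor] = native_zoom - levels_down
    return served


if __name__ == "__main__":
    for factor, zoom in zooms_served_by_overviews(18, [2, 4, 8, 16]).items():
        print("overview 1/%d ~ zoom %d" % (factor, zoom))
```

With factors 2 4 8 16 the coarsest overview lines up with about zoom 14, so anything below that still reads from the 1/16 overview; adding 32, 64 and so on would cover the lower zooms too.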

Post Processing: As mentioned by MapTiler during the tile creation process, you can save half your disk space or more by shrinking the output tiles with pngnq. There's a thread here discussing ways to recurse through all your PNG files on Windows or Linux.
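One way to do that recursion that works on either platform is a short Python script. This is a minimal sketch, assuming pngnq is on your PATH and that it writes its result next to the input with a "-nq8.png" suffix (my understanding of its default behaviour; check your build). The function names are illustrative.

```python
import os
import subprocess


def find_pngs(root, skip_suffix="-nq8.png"):
    """Return every .png under root, skipping already-quantized outputs."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            lower = name.lower()
            if lower.endswith(".png") and not lower.endswith(skip_suffix):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def quantize_all(root, runner=subprocess.check_call):
    """Run pngnq on each tile found; returns how many files were processed."""
    paths = find_pngs(root)
    for path in paths:
        # Assumed invocation: pngnq <file>, using pngnq's default options.
        runner(["pngnq", path])
    return len(paths)


if __name__ == "__main__":
    print(len(find_pngs(".")), "png files found under the current directory")
```

Calling quantize_all("tiles", runner=print) shows what would be run without touching any files.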