location at paulcarvill.com, the home of Paul Carvill on the web


Hi, I'm Paul Carvill and I'm a web developer. I am Head of Interface Development at LBi, Europe's largest digital agency.

I also like walking, cooking, Bollywood and rock 'n' roll.


Geocoding location data in a Google spreadsheet

Wednesday, June 3rd, 2009

The problem: I have a spreadsheet full of locations, addresses and place names that I want to publish, along with a map, for at least tens of thousands of people to view.

A solution: Easy — I can put it in a Google spreadsheet, publish it, add a Google map to a page, download the data, geocode the locations and display them on the map.

Another problem: While this is ok in most cases, with a large spreadsheet the geocoding can take a very long time, making my page appear unresponsive and slow. In addition, I have no way of checking that the location data is good enough to map with.

Another solution: Download the data, geocode it using Yahoo!’s Placemaker service, generate a new spreadsheet containing accurate latitude/longitude data and use that in place of the original. The client then does no geocoding on their side; it’s all supplied along with the data. Everybody’s happy!

— Go straight to the spreadsheet geocoder! —

I’ve done just that with this PHP script. It takes a Google spreadsheet key, and you must tell it what columns your location data is in. It will download the spreadsheet data, concatenate those location columns, make a request to Placemaker to geocode each location, and return a new CSV file with the geodata columns appended on the end.

I’ve detailed here the various bits that make up the script. The workflow is as follows:

Capture spreadsheet data from user > Load in spreadsheet from Google > For each line in spreadsheet make a Placemaker request > Append geolocation data columns to spreadsheet > Output all results into a CSV file
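
As a rough illustration of the middle steps of that workflow, here is a minimal sketch, not the script itself. It assumes the first row is a header row, and it takes the geocoder as a callable so it can run without a network connection. The names `appendGeodata` and `$geocode` are made up for this example, and the coordinates in the usage part are placeholders.

```php
<?php
// Sketch only: $geocode is a hypothetical callable standing in for the
// Placemaker request. It takes a location string and returns
// [name, latitude, longitude], or an empty array when nothing matched.
function appendGeodata(array $rows, array $locationColumns, callable $geocode): array
{
    $out = [];
    foreach ($rows as $i => $row) {
        if ($i === 0) {
            // Assumed header row: label the three new columns, don't geocode.
            $out[] = array_merge($row, ['place', 'latitude', 'longitude']);
            continue;
        }
        // Join the chosen (zero-indexed) columns into one location string.
        $parts = [];
        foreach ($locationColumns as $col) {
            if (isset($row[$col]) && $row[$col] !== '') {
                $parts[] = $row[$col];
            }
        }
        $geo = $parts ? $geocode(implode(' ', $parts)) : [];
        // Pad so every data row gains exactly three extra columns.
        $out[] = array_merge($row, array_pad($geo, 3, ''));
    }
    return $out;
}

// Usage with a stub geocoder (placeholder coordinates, no network call):
$rows = [
    ['name', 'town', 'country'],
    ['Office', 'London', 'UK'],
];
$stub = function (string $location): array {
    return [$location, '51.5', '-0.1'];
};
$result = appendGeodata($rows, [1, 2], $stub);
```

Passing the geocoder in as a parameter is purely a convenience for trying the loop out; the real script calls Placemaker directly.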

The script is set not to autodisambiguate, meaning that if it’s not sure what location you’ve supplied, it will return all likely candidates, in order of likelihood. I should mention that Yahoo!’s Placemaker is utterly awesome at finding out the ‘whereness’ of things.

To build your own version of my script you will need a Placemaker API key. Other than that, please feel free to copy and paste the code, fix it, amend it and let me know if it’s useful, or if it needs more commenting, or how I could improve it. I wrote this code to fix a particular problem I was encountering, but I’m sure it could work in a few more cases too.

Something to note before I start: the script doesn’t much like having commas in the location data in your spreadsheet. Because Google only outputs CSV with a comma delimiter, this upsets my CSV parsing. Any suggestions welcome.
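
One possible way around the comma problem, assuming the exported CSV wraps any field containing a comma in double quotes (the usual CSV convention), is to let PHP's `fgetcsv()` do the parsing instead of splitting on commas by hand. The sketch below is only a suggestion. `parseCsvString` is a made-up helper name, and the empty escape argument needs PHP 7.4 or later.

```php
<?php
// Parse a whole CSV string into rows, honouring double-quoted fields.
function parseCsvString(string $csv): array
{
    // Write the string to a temporary stream so fgetcsv() can read it,
    // which also copes with line breaks inside quoted fields.
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $csv);
    rewind($stream);

    $rows = [];
    // Explicit separator and enclosure; '' disables PHP's escape character.
    while (($row = fgetcsv($stream, 0, ',', '"', '')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() reports a blank line as [null]
        }
        $rows[] = $row;
    }
    fclose($stream);
    return $rows;
}

// A quoted field keeps its comma and stays in one column.
$rows = parseCsvString("name,address\n\"Smith, J\",\"10 Downing St, London\"\n");
```

With this approach a location such as "10 Downing St, London" stays in a single column rather than being split in two.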

This function gets some CSV data from a published Google spreadsheet using a supplied key:


<?php

function getCsvDataFromGoogle($spreadsheetKey) {
	$key = $spreadsheetKey;
	$output = 'csv';
	$apiendpoint = 'http://spreadsheets.google.com/pub?key='.$key.'&output='.$output;
	$ch = curl_init();
	$options = array(CURLOPT_URL => $apiendpoint,
	                 CURLOPT_HEADER => false,
	                 CURLOPT_RETURNTRANSFER => true
	                );
	curl_setopt_array($ch, $options);
	$r = curl_exec($ch);
	curl_close($ch);
	return $r;
}

This function makes a Placemaker geocode request:


function getPlacemakerGeodata ($location) {
	$key = 'MY_PLACEMAKER_API_KEY';
	$apiendpoint = 'http://wherein.yahooapis.com/v1/document';
	$inputType = 'text/plain';
	$outputType = 'xml';
	$focus = '28298150'; // sets focus to Great Britain, not sure how effective this is yet
	$autoDisambiguate = 'false'; // 'true' returns only the single most likely place; 'false' returns all likely places
	// urlencode the values so spaces, ampersands etc. in the location don't corrupt the POST body
	$post = 'appid='.urlencode($key).'&documentContent='.urlencode($location).'&documentType='.urlencode($inputType).'&outputType='.$outputType.'&focusWoeid='.$focus.'&autoDisambiguate='.$autoDisambiguate;
	$ch = curl_init($apiendpoint);
	curl_setopt($ch, CURLOPT_POST, 1);
	curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	$results = curl_exec($ch);
	curl_close($ch);
	return $results;
}
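
One detail worth watching in a request like this is the encoding of the POST body: a location containing an ampersand or an equals sign would otherwise be read as extra form fields. A minimal sketch of building the body with PHP's `http_build_query()`, which URL-encodes every value, might look like this. The helper name `buildPlacemakerPostBody` is made up, and the field names are simply those used in the function above.

```php
<?php
// Build a form-encoded POST body; http_build_query() URL-encodes each value.
function buildPlacemakerPostBody(string $appId, string $location): string
{
    $fields = [
        'appid'            => $appId,
        'documentContent'  => $location,
        'documentType'     => 'text/plain',
        'outputType'       => 'xml',
        'autoDisambiguate' => 'false',
    ];
    // Pass '&' explicitly so the result doesn't depend on php.ini settings.
    return http_build_query($fields, '', '&');
}

// The ampersand in the location is encoded as %26, spaces as +.
$body = buildPlacemakerPostBody('MY_PLACEMAKER_API_KEY', 'Newcastle & Gateshead');
```

The resulting string can be handed to `CURLOPT_POSTFIELDS` in the same way as the hand-built one.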

This function does the bulk of the work, and makes calls to all the other functions:


function parseCsvData($googleSpreadsheetKey) {
	// split() was removed in PHP 7; explode() does the same job for a plain-string delimiter
	$lines = explode( "\n", getCsvDataFromGoogle($googleSpreadsheetKey) );
	if($_POST['format'] == 'csv') {
		if($_POST['locationColumns'] == '' || $_POST['key'] == '') {
			echo "please go back and specify both your google spreadsheet key and which columns contain your location data (in comma separated format, zero-indexed e.g. 0,1,9)";
			exit();
		}
		else {
			// get location columns from the submitted form
			$locations = $_POST['locationColumns'];
			$splitLocations = explode(',', $locations);
			// set headers to 'csv'
			header("Content-Type: text/csv");
			header("Content-Disposition: attachment; filename=yourgeodata.csv");
			$out = fopen('php://output', 'w');
			for($i=1;$i

This function parses the XML which gets returned from Yahoo! Placemaker:


function parsePlacemakerXML($results, $delineator) {
	if($delineator == 'comma') { $delStart = ''; $delEnd = ','; }
	else { $delStart = '<td>'; $delEnd = '</td>'; }

	$places = simplexml_load_string($results, 'SimpleXMLElement', LIBXML_NOCDATA);
	$locarr = array();
	if($places->document->placeDetails) {
		foreach($places->document->placeDetails as $p) {
			if($delineator == 'comma') {
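				// cast the SimpleXML nodes to plain strings for the CSV output
				$locarr[] = (string) $p->place->name;
				$locarr[] = (string) $p->place->centroid->latitude;
				$locarr[] = (string) $p->place->centroid->longitude;
				return $locarr; // CSV mode: only the first (most likely) place is used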
			}
			else {
				echo $delStart.$p->place->name.$delEnd;
				echo $delStart.$p->place->centroid->latitude.$delEnd;
				echo $delStart.$p->place->centroid->longitude.$delEnd;
			}
		}
	}
}
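
If you would rather keep every candidate that comes back instead of printing them or returning only the first, a variation along these lines might work. It is a sketch under assumptions: the element names are the ones the function above reads, the `<response>` wrapper and the sample values are hand-written for illustration rather than taken from a real Placemaker reply, and `collectCandidates` is a made-up name.

```php
<?php
// Collect every candidate place from a Placemaker-style XML response.
function collectCandidates(string $xml): array
{
    $doc = simplexml_load_string($xml);
    $candidates = [];
    if ($doc === false || !isset($doc->document->placeDetails)) {
        return $candidates; // unparseable XML or no places found
    }
    foreach ($doc->document->placeDetails as $p) {
        $candidates[] = [
            'name'      => (string) $p->place->name,
            'latitude'  => (float) (string) $p->place->centroid->latitude,
            'longitude' => (float) (string) $p->place->centroid->longitude,
        ];
    }
    return $candidates;
}

// Hand-written sample with placeholder values, not a captured API response.
$sample = '<response><document>'
    . '<placeDetails><place><name>Place A</name>'
    . '<centroid><latitude>1.5</latitude><longitude>2.5</longitude></centroid>'
    . '</place></placeDetails>'
    . '<placeDetails><place><name>Place B</name>'
    . '<centroid><latitude>3.5</latitude><longitude>4.5</longitude></centroid>'
    . '</place></placeDetails>'
    . '</document></response>';

$candidates = collectCandidates($sample);
```

Each candidate comes back as a small associative array, in the order the service listed them.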

This bit runs when you load the page and works out if you're submitting some data or just viewing the page. If you've submitted data, it runs the main function:

if(isset($_POST['submit'])) {
	parseCsvData($_POST['key']);
}

Or if you're viewing the page for the first time, you get a form to fill out:

else {
	echo "<html><head><title></title></head>";
	echo "<body>";
	echo "<p>Please enter your spreadsheet key and specify which columns contain your location data (use comma separated list e.g. 9,10,11):</p>";
	echo "<form method=POST><p><label>Key:<input type='text' name='key' /></label></p><p><label>Location columns: <input type='text' name='locationColumns' /></label></p><p><label>Format: <select name='format'><option value='csv'>csv</option><option value='table'>table</option></select></label></p><p><input type='submit' name='submit' /></p></form>";
	echo "</body></html>";
}
?>