Computing at paulcarvill.com, the home of Paul Carvill on the web

Hi, I'm Paul Carvill, I'm a web developer. I'm currently working as Technical Lead at LBi, Europe's largest digital agency.

I also like walking, cooking, Bollywood and rock 'n' roll.

Archive for the ‘Computing’ Category

Increasing memory limit for PHP in Rackspace Cloud Sites

Sunday, February 28th, 2010

If you install more than a few modules in your Drupal implementation the chances are that it’ll run out of memory and you’ll start seeing blank pages and odd behaviour. The fix for this is to increase the amount of memory allocated to PHP, which you can usually do in your php.ini file. But if your hosting is Rackspace Cloud Sites you don’t have access to the php.ini file. You must instead put your PHP settings in a .htaccess file in the root directory of your hosting space.

Here are some example settings:

php_value memory_limit 96M
php_value upload_max_filesize 50M
php_value post_max_size 50M

Now, here’s the important bit: once you’ve put those lines in your .htaccess file and FTP’d it to your webspace, the changes might not appear to have taken effect. I had to delete the original .htaccess file and upload a fresh one for my changes to be picked up. Hopefully this will help somebody else in the same situation.
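One way to check whether the new values have actually been picked up is to drop a small PHP script into the same webspace and load it in a browser. This is only a sketch: the shorthandToBytes helper is hypothetical, and the expected values are simply the example settings above, so adjust them to whatever you put in your own .htaccess file.

```php
<?php
// Hypothetical helper: convert PHP shorthand such as "96M" into bytes.
function shorthandToBytes($value) {
    $value = trim($value);
    $number = (float) $value;
    $unit = strtoupper(substr($value, -1));
    switch ($unit) {
        case 'G': return (int) ($number * 1024 * 1024 * 1024);
        case 'M': return (int) ($number * 1024 * 1024);
        case 'K': return (int) ($number * 1024);
        default:  return (int) $number;
    }
}

// The values requested in .htaccess (assumed to match the example settings).
$expected = array(
    'memory_limit'        => '96M',
    'upload_max_filesize' => '50M',
    'post_max_size'       => '50M',
);

// ini_get() reports the value PHP is really running with for this request.
foreach ($expected as $setting => $wanted) {
    $actual = ini_get($setting);
    $ok = shorthandToBytes($actual) === shorthandToBytes($wanted);
    echo $setting, ': ', $actual, ' (wanted ', $wanted, ') ', $ok ? 'OK' : 'NOT APPLIED', "\n";
}
```

If any line reports NOT APPLIED, the .htaccess file is probably not being read, which is the point at which deleting it and uploading a fresh copy is worth trying.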

N.B. Rackspace’s support had next to no idea what I was talking about. Their Livechat support service was as useful as telephoning someone in a library and waiting while they went off to find the information. They obviously don’t support individual applications hosted by them, but I’d have expected a little more help.

The appliance of science, or, how rumours of Flash’s death are greatly exaggerated

Tuesday, February 2nd, 2010

There was some excellent work by Dr. Aleks Krotoski in Sunday night’s BBC documentary Virtual Revolution, especially the interview with Tim Berners-Lee where he reiterated the importance of freedom of information, and freedom of access. Aleks made the point that the federated structure of the internet resists authority. This documentary went out at prime time and did a fantastic job of explaining the absolutely world-changing importance of the web, without patronizing or over-simplifying the issue. Watching it, even after having worked on the web for the last 13 years, almost brought a tear to my eye. They really should do something very, very special with Tim Berners-Lee. Maybe put him on the fourth plinth in Trafalgar Square?

Also, there was yet another polished product launch by Apple this week with the announcement of the iPad.

These two events caused me to think back to a 2008 talk by Jonathan Zittrain, professor of Internet law at Harvard, to employees of The Guardian, as part of their Future of Journalism series. His talk was based on his book, the then about to be published The Future Of The Internet, And How To Stop It.

He talked at length about how we are in danger of adopting a top-down, tightly controlled model for the web, run overwhelmingly in the interests of large corporations; technology’s inexorable move towards locked-down digital units and tethered appliances; and ‘walled garden’ internet access. The beauty of the iPhone, when it launched, was that for what seemed like the first time we had a real web browser on a real mobile device which freed us from the tyranny of telco executives who wanted to control what we used our high-priced WAP data access plans to look at. We could go anywhere we wanted. It felt truly free. Now it seems, as Dr. Aleks pointed out with a useful proportional representation model of the web, we are increasingly moving to a future Zittrain warns about, one with a narrow marketplace controlled by a handful of powerful providers, where we go to iTunes for our music, Amazon, or perhaps iTunes, for our books, to eBay to sell our old stuff and to Wikipedia, run by a sinister cabal of administrators headed by the despotic Jimmy Wales, for our raw factual information. We can’t even view Flash content on an iPhone or Blu-Ray on a Mac due to Apple’s strict control over what can and can’t be installed on these systems. Whether it’s political or, as Steve Jobs supposedly says, because Flash is so buggy, I’m sure we’ll find out when the dust has settled. For now it appears to be a mobile device manufacturer — with something approaching a monopoly — trying to throw their weight around. The documentary raised some fascinating points about power structures on the web, and it certainly seems that we are only really beginning to understand how any of this will work.

Zittrain, in his book (which is itself licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License), says,

“A shift to tethered appliances and locked-down PCs will have a ripple effect on long-standing cyberlaw problems, many of which are tugs-of-war between individuals with a real or perceived injury from online activity and those who wish to operate as freely as possible in cyberspace. The capacity for the types of disruptive innovation discussed in the previous chapter will not be the only casualty. A shift to tethered appliances also entails a sea change in the regulability of the Internet. With tethered appliances, the dangers of excess come not from rogue third-party code, but from the much more predictable interventions by regulators into the devices themselves, and in turn into the ways that people can use the appliances.”

The iPad is certainly a continuation of the theme that was started with the iPod: access to app installation is through the App Store only. Jailbreaking your iPhone is possible but verboten. People are already complaining: it doesn’t multitask! There’s no camera! For all that technology has advanced our personal feeling of freedom, we feel simultaneously liberated and emasculated as a result. I read another quote by someone, but I’ve misplaced the link, who said,

“…in the “applianced” world we are threatened by monopolists and potential dictators,”

for whom we could easily substitute Adobe, Microsoft, Google, Oracle, and Mr Steve Jobs.

Recently, Mark Pilgrim, a writer and developer advocate at Google, published a blog post lamenting the demise of the tinkerer: that breed of person who found out how a computer works by poking around the innards of its operating system, or its hardware. Such activity is seen very rarely these days, because now, of course, it will at the very least get your warranty voided, and at worst get you arrested.

Pilgrim continues this theme in an interview at a great new (to me) blog, The Setup, which asks techies and nerds of all descriptions to describe the technology they use to get the job done. In describing how he wants to still be using his current desktop computer in 20 years, he says

“Commercial vendors have a vested interest in upgrading you to the latest and greatest; supporting the old stuff is unglamorous and expensive. Commercial open source vendors aren’t really much better than commercial proprietary vendors in this regard, but community-led Linux distributions can afford to have different priorities.”

So, does the black box of user-friendliness and usability necessitate a top down, authoritarian attitude to technology, or can the interests of individuals and the market not happily co-exist? There’s certainly an argument for the former when you look at some of the abysmal user experiences offered by open-source software that’s available — Ubuntu; the GIMP. With their vast number of contributors you would expect quality and consistency to improve. But perhaps in the vastly ambiguous area of usability and design a greater number of contributing authors dilutes the quality of a product or an experience. Maybe a lone, dictatorial voice is the only answer here, as in the case of Apple’s justly famous and evangelised user interface. But at what cost comes the power to control every user’s experience, even against their will?

LATE UPDATE: Two other quotes that caught my eye. As an interesting counterpoint, Dion Almaer, erstwhile Mozilla developer and now Developer Advocate at Palm, mulls over profits-based corporations versus goal-based organisations, and passes over the hyperbole about Flash’s rumoured death to express thanks that the Open Web (i.e. the web)

“…is amazing in that there is NO SINGLE VENDOR. If we are able to keep a decent balance between browsers (and thus the platform as we know it) then we have a balance of powers. Sure, in some ways you can’t move as fast as a dictatorship, but there is a reason we don’t want dictatorships in our government (even if the trains run on time!)”

And a former colleague of mine, Daniel Vydra, makes the succinct point,

“Commenters [on this Guardian article] need to decide if apple is a restrictive dangerous monopoly, or a 5% market share joke. They can’t be both.”

Apache Continuum error: Provider message: No such provider: ’s’.

Tuesday, January 26th, 2010

If you’re trying to use Apache Continuum and get the error “Provider message: No such provider: ’s’.” then you probably haven’t provided the SCM URL in the correct format in your project information. Importantly, it needs to begin with something like “scm:svn”, e.g.

scm:svn:https://example.com/svn/project/trunk

Tree command for Mac OS X

Saturday, January 16th, 2010

This outputs a structured view of the files and folders in the directory you execute it in:

find . -print | sed -e 's;[^/]*/;|____;g;s;____|; |;g'

e.g.

.
|____.git
| |____branches
| |____config
| |____description
| |____HEAD
| |____hooks
| | |____applypatch-msg.sample
| | |____commit-msg.sample
| | |____post-commit.sample
| | |____post-receive.sample
| | |____post-update.sample
| | |____pre-applypatch.sample
| | |____pre-commit.sample
| | |____pre-rebase.sample
| | |____prepare-commit-msg.sample
| | |____update.sample
| |____index
| |____info
| | |____exclude
| |____objects
| | |____info
| | |____pack
| |____refs
| | |____heads
| | |____tags
|____CHANGELOG.txt
|____COPYRIGHT.txt
|____cron.php
|____includes
| |____actions.inc
| |____batch.inc
| |____bootstrap.inc
| |____cache-install.inc
| |____cache.inc
| |____common.inc
| |____database.inc
| |____database.mysql-common.inc
| |____database.mysql.inc
| |____database.mysqli.inc
| |____database.pgsql.inc
| |____file.inc
| |____form.inc
| |____image.gd.inc
| |____image.inc
| |____install.inc
| |____install.mysql.inc
| |____install.mysqli.inc
| |____install.pgsql.inc
...

ls -R does something similar, but structured considerably differently.

HTML5 text semantics granularity

Thursday, June 18th, 2009

Wow, HTML5’s semantic text description options are so granular that I had to spend several minutes pondering whether my previous code snippet about relaxing Apache permissions warranted <code>, <kbd>, or <samp> elements, or a combination of all three.

In the end I settled on a <kbd> element for the command that opens a file in vi from the command line, as that’s the bit you’re going to type. For the contents of the file I chose a <samp> element, as the text shown is a sample of the contents of my file, rather than a chunk of code you need to enter.

The difference between <code> and <samp> is very small, but it’s great that we actually now have this level of specificity, which should help ensure that HTML5 is robust enough to last well into the future.

Geocoding location data in a Google spreadsheet

Wednesday, June 3rd, 2009

The problem: I have a spreadsheet full of locations, addresses and place names that I want to publish, along with a map, for at least tens of thousands of people to view.

A solution: Easy — I can put it in a Google spreadsheet, publish it, add a Google map to a page, download the data, geocode the locations and display them on the map.

Another problem: While this is ok in most cases, with a large spreadsheet the geocoding can take a very long time, making my page appear unresponsive and slow. In addition, I have no way of checking that the location data is good enough to map with.

Another solution: Download the data, geocode it using Yahoo!’s Placemaker service, generate a new spreadsheet containing accurate latitude/longitude data and use that in place of the original. The client then does no geocoding on its side; it’s all supplied along with the data. Everybody’s happy!

— Go straight to the spreadsheet geocoder! —

I’ve done just that with this PHP script. It takes a Google spreadsheet key, and you must tell it what columns your location data is in. It will download the spreadsheet data, concatenate those location columns, make a request to Placemaker to geocode each location, and return a new CSV file with the geodata columns appended on the end.

I’ve detailed here the various bits that make up the script. The workflow is as follows:

Capture spreadsheet data from user > Load in spreadsheet from Google > For each line in spreadsheet make a Placemaker request > Append geolocation data columns to spreadsheet > Output all results into a CSV file
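As a rough sketch of the middle of that workflow (concatenate the location columns, geocode, append the results, write CSV), the per-row step might look something like this. It is not the script itself: appendGeodata, the sample rows and the fake geocoder are all hypothetical, and the geocoder is passed in as a callable so the flow can be followed without a Placemaker API key.

```php
<?php
// Hypothetical sketch of the per-row step. $geocode is any callable that takes
// a location string and returns array(name, latitude, longitude); it stands in
// for the real Placemaker request.
function appendGeodata(array $rows, array $locationColumns, $geocode) {
    $out = array();
    foreach ($rows as $row) {
        // Concatenate the chosen (zero-indexed) location columns into one string.
        $parts = array();
        foreach ($locationColumns as $c) {
            if (isset($row[$c])) {
                $parts[] = $row[$c];
            }
        }
        $location = implode(' ', $parts);
        // Append the geodata columns on the end of the original row.
        $out[] = array_merge($row, call_user_func($geocode, $location));
    }
    return $out;
}

// Made-up data and a fake geocoder, for illustration only.
$rows = array(
    array('Town Hall', 'Leeds'),
    array('Pier', 'Brighton'),
);
$fakeGeocode = function ($location) {
    return array($location, '0.0', '0.0');
};

$result = appendGeodata($rows, array(0, 1), $fakeGeocode);

// Write the result as CSV; fputcsv() adds quoting where a field needs it.
$fh = fopen('php://output', 'w');
foreach ($result as $row) {
    fputcsv($fh, $row, ',', '"', '\\');
}
fclose($fh);
```

Passing the geocoder in as a callable is just a convenience for the sketch; in the real script that role is played by the Placemaker request and XML parsing functions.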

The script is set to not autodisambiguate, meaning that if it’s not sure what location you’ve supplied, it will return all likely candidates, in order of likelihood. I should mention that Yahoo!’s Placemaker is utterly awesome at finding out the ‘whereness‘ of things.

To build your own version of my script you will need a Placemaker API key. Other than that, please feel free to copy and paste the code, fix it, amend it and let me know if it’s useful, or if it needs more commenting, or how I could improve it. I wrote this code to fix a particular problem I was encountering, but I’m sure it could work in a few more cases too.

Something to note before I start: the script doesn’t much like having commas in the location data in your spreadsheet. Because Google only output CSV with a comma delimiter, this upsets my CSV parsing. Any suggestions welcome.
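One possible way around the comma problem is to parse each line with PHP’s str_getcsv() instead of splitting on commas. This is a sketch resting on an assumption: that the exported CSV wraps any field containing a comma in double quotes, which is the usual CSV convention. The sample row and the buildLocation helper are hypothetical.

```php
<?php
// Hypothetical sample row: the second field contains a comma, so it is quoted.
$line = 'The Guardian,"Kings Place, 90 York Way",London';

// Naive splitting breaks the quoted field in two.
$naive = explode(',', $line);

// str_getcsv() understands the quoting and keeps the field whole.
$fields = str_getcsv($line, ',', '"', '\\');

// Hypothetical helper: join the chosen (zero-indexed) columns into one location string.
function buildLocation(array $fields, array $columns) {
    $parts = array();
    foreach ($columns as $c) {
        if (isset($fields[$c]) && $fields[$c] !== '') {
            $parts[] = $fields[$c];
        }
    }
    return implode(', ', $parts);
}

echo count($naive), " fields when split on commas\n";
echo count($fields), " fields when parsed as CSV\n";
echo buildLocation($fields, array(1, 2)), "\n";
```

Fields that contain line breaks would still be broken by splitting the download on "\n" first; reading the data through fgetcsv() on a stream is one way around that.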

This function gets some CSV data from a published Google spreadsheet using a supplied key:


<?php

function getCsvDataFromGoogle($spreadsheetKey) {
	$key = $spreadsheetKey;
	$output = 'csv';
	$apiendpoint = 'http://spreadsheets.google.com/pub?key='.$key.'&output='.$output;
	$ch = curl_init();
	$options = array(CURLOPT_URL => $apiendpoint,
	                 CURLOPT_HEADER => false,
	                 CURLOPT_RETURNTRANSFER => true
	                );
	curl_setopt_array($ch, $options);
	$r = curl_exec($ch);
	curl_close($ch);
	return $r;
}

This function makes a Placemaker geocode request:


function getPlacemakerGeodata ($location) {
	$key = 'MY_PLACEMAKER_API_KEY';
	$apiendpoint = 'http://wherein.yahooapis.com/v1/document';
	$inputType = 'text/plain';
	$outputType = 'xml';
	$focus = '28298150'; // sets focus to Great Britain, not sure how effective this is yet
	$autoDisambiguate = 'false'; // 'false' returns all likely candidates rather than only the most likely place
	// http_build_query() URL-encodes each value, so a location containing '&' or '=' can't corrupt the POST body
	$post = http_build_query(array(
		'appid' => $key,
		'documentContent' => $location,
		'documentType' => $inputType,
		'outputType' => $outputType,
		'focusWoeid' => $focus,
		'autoDisambiguate' => $autoDisambiguate
	));
	$ch = curl_init($apiendpoint);
	curl_setopt($ch, CURLOPT_POST, 1);
	curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	$results = curl_exec($ch);
	curl_close($ch);
	return $results;
}

This function does the bulk of the work, and makes calls to all the other functions:


function parseCsvData($googleSpreadsheetKey) {
	// explode() replaces split(), which was deprecated in PHP 5.3 and removed in PHP 7
	$lines = explode( "\n", getCsvDataFromGoogle($googleSpreadsheetKey) );
	if($_POST['format'] == 'csv') {
		if($_POST['locationColumns'] == '' || $_POST['key'] == '') {
			echo "please go back and specify both your google spreadsheet key and which columns contain your location data (in comma separated format, zero-indexed e.g. 0,1,9)";
			exit();
		}
		else {
			// get location columns from the submitted form
			$locations = $_POST['locationColumns'];
			$splitLocations = explode(',', $locations);
			// set headers to 'csv'
			header("Content-Type: text/csv");
			header("Content-Disposition: attachment; filename=yourgeodata.csv");
			$out = fopen('php://output', 'w');
			for($i=1;$i

This function parses the XML which gets returned from Yahoo! Placemaker:


function parsePlacemakerXML($results, $delineator) {
	if($delineator == 'comma') { $delStart = ''; $delEnd = ','; }
	else { $delStart = '<td>'; $delEnd = '</td>'; }

	$places = simplexml_load_string($results, 'SimpleXMLElement', LIBXML_NOCDATA);
	$locarr = array();
	if($places->document->placeDetails) {
		foreach($places->document->placeDetails as $p) {
			if($delineator == 'comma') {
				$locarr[] = $p->place->name;
				$locarr[] = $p->place->centroid->latitude;
				$locarr[] = $p->place->centroid->longitude;
				return $locarr;
			}
			else {
				echo $delStart.$p->place->name.$delEnd;
				echo $delStart.$p->place->centroid->latitude.$delEnd;
				echo $delStart.$p->place->centroid->longitude.$delEnd;
			}
		}
	}
}

This bit runs when you load the page and works out if you're submitting some data or just viewing the page. If you've submitted data, it runs the main function:

if(isset($_POST['submit'])) {
	parseCsvData($_POST['key']);
}

Or if you're viewing the page for the first time, you get a form to fill out:

else {
	echo "<html><head><title></title></head>";
	echo "<body>";
	echo "<p>Please enter your spreadsheet key and specify which columns contain your location data (use comma separated list e.g. 9,10,11):</p>";
	echo "<form method=POST><p><label>Key:<input type='text' name='key' /></label></p><p><label>Location columns: <input type='text' name='locationColumns' /></label></p><p><label>Format: <select name='format'><option value='csv'>csv</option><option value='table'>table</option></select></label></p><p><input type='submit' name='submit' /></p></form>";
	echo "</body></html>";
}
?>

You’ll miss me when I’m gone: IE6, cross-browser consistency and device independence

Saturday, February 14th, 2009

A flurry of IE6 related activity on the web this week coincided with a discussion we are having at The Guardian on the same subject. We have been talking about the relative benefits of keeping website performance in IE6 consistent with that of other browsers, and the disproportionate amount of work this requires on the parts of developers and the QA team. We’ve been trying to figure out better processes to reduce the number of styling bugs in IE6, while not compromising the user experience or the hard work put in by our design team.

It turns out people have surprisingly strong views on cross-browser consistency. For some, IE6 represents much more than just ‘a browser’. It also represents, variously: a large market share; an important group of corporate users; a user’s freedom to choose whichever device she wishes to browse the web. Once you start dropping a browser for technological reasons, the argument goes, you might as well arbitrarily drop support for anything which you consider below par – mobile browsers, text browsers, people with small monitors.

The opposing view says that IE6 is many years old and two versions out of date, a huge security risk and a drain on resources. We shouldn’t be pandering to slow or paranoid IT departments who refuse to upgrade their systems. Anyway no one chooses to use IE6, it is forced upon them by said IT departments.

I’m loath to branch the code to produce a separate version of the site for any reason, be it a device or a browser. But I also see the amount of pain IE6 causes developers, especially when they’re trying to do something fancy with JavaScript, and even more especially trying to do so without using a standard library which might easily provide you with cross-browser methods for doing stuff.

I support IE because I have to. But I do also believe strongly in wide accessibility, through as many devices as possible. We should assume nothing, absolutely nothing, about how our users access the web. But I don’t think this is the point here. The point here is adhering to web standards, which apply to both code and content. Remember, the content itself (the information) usually isn’t broken. It’s what you’re trying to do with it that’s broken. The CSS and the JavaScript. Go back to Tim Berners-Lee’s 2002 document on universality and device independence for a lesson in what putting stuff on the web is all about. Work with the web, not against it. It’s really good at presenting and sharing text and pictures. But it’s not a magazine layout. Berners-Lee once said,

“Anyone who slaps a ‘this page is best viewed with Browser X’ label on a Web page appears to be yearning for the bad old days, before the Web, when you had very little chance of reading a document written on another computer, another word processor, or another network.”

We can infer from this that a site isn’t ‘best viewed in’ anything: it’s just ‘viewed’, however it might end up. So, yes, your site might look lovely, but if getting it there is so complex that it breaks browsers, or takes up 50% of your development time, then you’re plainly doing something wrong.

Try taking your page back to basics, get rid of the awful advertisement JavaScript and the three different kinds of page tracking, and start paying more than just lip-service to web standards and accessibility. That XHTML doctype declaration you’re using? Try adhering to it. There, it probably works a lot better now, yes?

But ultimately, and as usual, I think the whole issue comes down to a business decision: how much time/money are we spending on development versus how much money that development brings in. It’s a brave person who decides to cut off 25% of their users.

Some points that came up as part of our ongoing discussion:

  1. Should the design be 100% consistent across all browsers, or would our designers be happy to sacrifice certain style elements? We currently stop a code release if something looks bad in IE6, although we have already made one or two decisions to remove an element from IE6 in order to expedite a code release. In both cases we ran things past the Guardian’s Creative Editor, Mark Porter, before doing so.
  2. If you want to drop support for IE6, you have to completely and utterly drop support for it. And in all likelihood never look at it again. Because the next time you do, it will be horrifically broken. Stopping development on that browser doesn’t just mean it won’t get cool new features. It still gets the features, but they won’t be tailored to it, and will break it. That smart JavaScript widget you just wrote? That breaks the page in IE6. Some new element you put in with a fixed width and margins? That breaks the page in IE6. You have to cut the cord. Be strong, give it a firm handshake and say goodbye.
  3. Turns out Microsoft haven’t quite cut the cord yet, though. Microsoft support Windows XP Service Pack 3 as a current product (it shipped in April 2008), and will retire support for it 2 years after the next service pack is released, or at the end of the Windows XP product lifecycle, whichever comes first. IE6, which shipped as a component of XPSP3, continues to have Mainstream Support as part of that product:
  4. Our current browser usage figures look like this:
    • IE 7: 35%
    • IE 6: 25%
    • Firefox 3: 25%
    • Safari: 7%
    • Firefox 2: 3%
    • Google Chrome: 1.5%
    • Opera: 0.5%
  5. We currently have a problem even testing in IE6, because the corporate build on the PCs we use doesn’t contain it; it has IE7 as standard. And you can’t run IE7 and IE6 concurrently. Ironically, our technical infrastructure is sufficiently advanced that we have difficulty supporting old technology.

That flurry of activity in full:

Forking hell! OSX, PHP, GD, Freetype problems? Read this…

Sunday, November 16th, 2008

So, I was trying to make a set of Moo cards, using the MOO API, as part of The Guardian’s first ever Hack Day. It’s very easy and fun to use, and I enjoyed the learning process of formatting the images and data and submitting the constructed XML to MOO to print the cards. But…

But, the formats available from MOO are quite restrictive. This is understandable, as they want to retain some control over quality and their own branding, which is held in high esteem. For example, you can only ever put an image on the front of the card, and text on the back. I wanted image and text on the front.

I was using PHP to create the XML to post to MOO, so now I needed to learn how to use ImageMagick to merge some text into the image I was using. Unfortunately I’m not a command line geek, so I tend to get stuck when someone tells me to compile PHP. Luckily, someone was on hand to help me install GD, which is considerably easier to use.

I used GD to merge text into the image using imagestring. But I wasn’t able to successfully specify the fonts to use – every image was rendered with the default system font. GD wouldn’t work. Then I tried using imagettftext. This resulted in a blank page. I was using GD 2.3.5 on PHP 5. Eventually I found a link which explains a problem with Apple’s default implementation of Freetype in GD that crashes GD if you try and specify a font.

The result? I installed MacPorts and updated PHP and Apache that way, resulting in a new install with GD 2.3.7.

And it’s all working now! Now I’ve gone over the problems, I’ll post a bit more about actually creating the cards next.