There was exciting news yesterday morning, when we announced the next stage of the Guardian’s stated strategy to be the world’s leading liberal voice. The Guardian is opening out — making our content available for other people to use — and also opening in — allowing developers to build on our platform and deploy applications which extend its functionality.
So, the headlines from the announcement are:
- Open Platform API
  - search, query, filter and discover content, keywords and tags from the Guardian’s archive
  - contains the full textual content of all Guardian articles going back to 1999
  - currently in private beta (apply for a key)
  - free for the first 5000 queries per day
  - can be used for commercial purposes (you can make money by running ads with it)
  - at some point in the future, it will be ad-supported on pages that use the full content
- Data Store
  - a curated collection of data sets
  - researched, verified and attributed to its source
  - hosted on Google Docs and free to use
  - covering subjects as diverse as US economic data, environmental statistics, crime figures and religious information
- Data Blog
  - accompanies the Data Store
  - will provide information around the raw data: how we sourced it, why we use that particular data set, what the information might mean
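To give a flavour of what searching and filtering the archive might look like from a developer's side, here is a minimal sketch in Python that only builds a query URL and makes no network call. The base URL, the parameter names (`q`, `tag`, `api-key`), the example tag value and the helper `build_search_url` are all assumptions for illustration, not details from the announcement, so check the Open Platform documentation for the real endpoint and parameters.

```python
from urllib.parse import urlencode

# Assumed base URL, for illustration only; the real endpoint may differ.
BASE_URL = "https://content.guardianapis.com/search"


def build_search_url(query, api_key, tag=None, base_url=BASE_URL):
    """Build a search URL for a free-text query, optionally filtered by tag.

    The parameter names used here are assumptions, not confirmed API details.
    """
    params = {"q": query, "api-key": api_key}
    if tag is not None:
        params["tag"] = tag
    # urlencode escapes spaces, slashes and other reserved characters.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # "YOUR-KEY" is a placeholder for the key issued during the private beta.
    print(build_search_url("climate change", "YOUR-KEY", tag="environment/environment"))
```

Since the free tier covers the first 5000 queries per day, it is probably worth caching responses on your side rather than repeating the same request.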
This constitutes a wealth of information to announce in one go, and it may take people some time to digest it all. The really exciting thing about this move is that we’re putting the full content of our articles out there for people to use. The implications for data mining, linguistic research and deep textual comparisons are endless, and I’m really looking forward to seeing what people come up with. Having the full text, and not just a headline or an excerpt, is really important, because it lets people do much, much more than link back to our site.
The Data Store is also a really bold move. Simon Rogers, one of our News Editors, and the journalists here put amazing amounts of effort into research, and here we are returning the fruit of their labour to the community. Of course, we use that research to report and editorialise, but here we give you the opportunity to derive your own patterns and meaning from the same data. The fact that this data has been manually sourced, collated and published makes it mean so much more, and I’m sure people, including other journalists, will find it an increasingly useful source of information for years to come.
I’ve collected some useful links here which are specifically related to the Open Platform:
- Open Platform FAQ
- Open Platform T&Cs
- Open Platform API talk group
- Data Store talk group
- Open Platform blog
I’ve also collected some of the news coverage and blogposts about the announcement here: