Museum Web Developers

March 11, 2011

‘Things’ and our collections data

Filed under: Uncategorized — mia @ 12:58 pm

Frankie Roberto has made a web app based on the object records from the collections of the Science Museum, the National Media Museum and the National Railway Museum released yesterday.  In his words:

I thought I’d have a quick play with the data last night, and so managed to import them into a database and built a quick web app called ‘Things’:
http://what-is-this.heroku.com/

The main thing I wanted out of the data was to be able to browse by type-of-thing (eg ‘steam engines’). Given that this information isn’t easily accessible from the existing data, the first thing that ‘Things’ does is ask people to help classify the objects.

It’s sort of like tagging. But easier. :-)

If I get enough things classified I may have a go at seeing if an algorithm can learn from the data and classify the rest.

Let me know what you think.

Source code is here: https://github.com/frankieroberto/things -  patches welcome!

Given the number of crowdsourcing projects around*, the next step for the museum may be working out how to manage and make the most of user-created data we get back from projects like this.  This would be an excellent problem to have.

* I’ve also got lots of data to hand over, based on tags and facts added by people playing with the astronomy collections on Museum Metadata Games, which was again only possible because the Powerhouse Museum has an API and the Science Museum made an earlier, XML-based API.

March 10, 2011

Update on collections data and geocoded NRM data

Filed under: collections,data,requestforcomment — mia @ 6:05 pm

I’m glad to see the news about the release of objects from the collections of the Science Museum, the National Media Museum and the National Railway Museum has spread so far and wide already.

A few people have commented on the licence (Creative Commons Attribution-NonCommercial-ShareAlike, CC BY-NC-SA) and on the format (CSV).  As tomorrow is my last day, I can’t really speak for the museum but the intention is to learn from how people use the data – the things they make, the barriers they face, etc – and iterate (as resources allow) until we get to an optimal solution (or solutions). So please get in touch if you’ve got requests or think you can help clear up some of the issues these kinds of projects face, because there’s a good chance you’ll help make a difference.

The licence is a pragmatic solution – it’s clarification of existing terms rather than a change to our terms, because this avoided a need for legal advice, policy review, etc, that would have added several months to the process.

And yes, I know CSV is quick and dirty, but it’s effective. The museum sector is still working out how to match the resources available with the needs of mash-up type developers who work best with JSON and those who are aiming for linked open data; my hope is that your feedback on this will help museums figure out how to support people using open data in various forms. A simple solution like this also means it’s easy for the museum to re-run the export to update the data as time goes on, and that anyone, geek or not, can open the files without being startled by angle brackets and acronyms. Also, did I mention it was quick?

Finally, we’ve already had some useful feedback and even some improved files. Richard Light sent us a geocoded version of records from the National Railway Museum (NRM): an index of locations at http://api.sciencemuseum.org.uk/collections/updates_from_other_people/Richard_Light/nrm-geo-sort.xml (63 KB) and the full file at http://api.sciencemuseum.org.uk/collections/updates_from_other_people/Richard_Light/nrm-geo.xml (20 MB, browser-beware).

I’ll let Richard explain in his own words:

I converted the source CSV to XML using my CSV Converter program, which is a home-made program I wrote to do a “mail-merge” on CSV data, with the aim of easily generating other formats such as XML.
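As a rough illustration of the CSV-to-XML step (this is not Richard's CSV Converter, whose internals aren't described here), a minimal Python sketch might look like the following. The `records` and `record` element names and the `TITLE` column are assumptions made for the example; `PLACE_MADE` is the field name used in the transform further down.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="records", row_tag="record"):
    """Turn CSV text into an XML string, one child element per column.

    Assumes every header in the first row is a valid XML element name
    (e.g. PLACE_MADE). root_tag and row_tag are illustrative choices.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            field = ET.SubElement(record, column)
            field.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample row, not taken from the released files.
sample = 'TITLE,PLACE_MADE\nLocomotive,"Swindon, Wiltshire, England"\n'
xml_output = csv_to_xml(sample)
```

A real export would also need to cope with headers that aren't valid element names, which this sketch ignores.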

The geocoding was carried out by calls to my place URL-ifier program. This uses the standard Geonames query API, but splits a place description into its component place names (e.g. “Swindon, Wiltshire, England” becomes three place names) and searches for a “Swindon” contained within places “Wiltshire” and “England”.
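The splitting-and-containment idea can be sketched in Python as below. This is only a guess at the general approach, not the URL-ifier's actual code: the candidate records, their `within` lists and their `id` values are invented for the example, and the call to Geonames that would produce real candidates is left out.

```python
def split_place(description):
    """Split 'Swindon, Wiltshire, England' into its component place names."""
    return [part.strip() for part in description.split(",") if part.strip()]


def pick_contained(description, candidates):
    """Return the first candidate matching the most specific place name
    that also sits inside every broader place named in the description.

    Each candidate is a dict with 'name' and 'within' (names of the
    places that contain it). This shape is hypothetical.
    """
    parts = split_place(description)
    if not parts:
        return None
    target, containers = parts[0], parts[1:]
    for candidate in candidates:
        if candidate["name"] == target and all(
            place in candidate["within"] for place in containers
        ):
            return candidate
    return None


# Invented candidates standing in for search results.
candidates = [
    {"id": "A", "name": "Swindon", "within": ["Staffordshire", "England"]},
    {"id": "B", "name": "Swindon", "within": ["Wiltshire", "England"]},
]
place_parts = split_place("Swindon, Wiltshire, England")
match = pick_contained("Swindon, Wiltshire, England", candidates)
```

Matching on exact names is the simplest possible rule; imperfect source data would probably need looser comparison.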

I wrote an XSLT transform which copied the source document, and each time it found a place field, it called out to my URL-ifier using the document() function:

<xsl:template match="PLACE_MADE[text()!='']">
<!-- Ask the URL-ifier for a Geonames identifier for this place description -->
<xsl:variable name="geonames"
select="document(concat('http://light.demon.co.uk/scripts/getPlaceURL.exe?q=', text()))/*/text()"/>
<xsl:copy>
<!-- Only add the attribute when an identifier came back -->
<xsl:if test="$geonames!=''">
<xsl:attribute name="geonamesId"><xsl:value-of select="$geonames"/></xsl:attribute>
</xsl:if>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>

Where this was successful in inferring a Geonames identifier, it added a “geonamesId” attribute to the PLACE_MADE field. So the result is a copy of the source data, with added geocoding.

All of the NRM data was geocoded in a single XSLT operation, but this operation had to call my URL-ifier, and hence the Geonames API, many times. There are limits on how hard you can hit this service, so care needs to be exercised! (You can get your own Geonames identifier for free, and then have your own allocation of API calls, if you want to use this service in a serious way.)
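One way to go easy on a rate-limited service is to cache repeated place names and pause between real calls. The Python sketch below shows that pattern under stated assumptions: `ThrottledLookup` and `fake_lookup` are hypothetical names, and the one-second default interval is a placeholder rather than Geonames' actual limit.

```python
import time


class ThrottledLookup:
    """Wrap a lookup function with a cache and a minimum delay between calls."""

    def __init__(self, lookup, min_interval=1.0,
                 sleep=time.sleep, clock=time.monotonic):
        self.lookup = lookup
        self.min_interval = min_interval  # seconds; placeholder value
        self.sleep = sleep
        self.clock = clock
        self.cache = {}
        self.last_call = None

    def __call__(self, place):
        if place in self.cache:
            return self.cache[place]  # repeated places cost no API call
        if self.last_call is not None:
            wait = self.min_interval - (self.clock() - self.last_call)
            if wait > 0:
                self.sleep(wait)
        self.last_call = self.clock()
        result = self.lookup(place)
        self.cache[place] = result
        return result


# Stand-in for the real geocoding call, so the sketch runs offline.
calls = []


def fake_lookup(place):
    calls.append(place)
    return "id-for-" + place


geocode = ThrottledLookup(fake_lookup, min_interval=0.0)
results = [geocode(place) for place in ["York", "Swindon", "York"]]
```

Since many records share the same place of manufacture, the cache alone may remove a large share of the calls.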

Now that the data contains Geonames URLs, you have access to all the background information about each place. All Geonames entries have lat/long co-ordinates (which is what you need to stick a pin on a map in your browser, using e.g. KML markup), but in addition will often have info such as population. You just need to make an HTTP request for the Geonames URL, specifying that you want RDF back, e.g.: http://light.demon.co.uk/scripts/cgiforwarder.exe?url=http://sws.geonames.org/2633352/&accept=rdf and process the RDF/XML which comes back.
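Processing the RDF/XML might look something like this Python sketch. It assumes the response contains a `gn:Feature` element with `wgs84_pos:lat`, `wgs84_pos:long` and, optionally, `gn:population` children; the sample document and all its values are invented so that the example runs without a network call.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "gn": "http://www.geonames.org/ontology#",
    "wgs84_pos": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def extract_place_info(rdf_xml):
    """Pull name, co-ordinates and population out of a Geonames-style RDF document."""
    root = ET.fromstring(rdf_xml)
    feature = root.find("gn:Feature", NS)
    if feature is None:
        return None

    def text_of(tag):
        node = feature.find(tag, NS)
        return node.text if node is not None else None

    lat = text_of("wgs84_pos:lat")
    lon = text_of("wgs84_pos:long")
    population = text_of("gn:population")
    return {
        "name": text_of("gn:name"),
        "lat": float(lat) if lat is not None else None,
        "long": float(lon) if lon is not None else None,
        "population": int(population) if population is not None else None,
    }


# Invented sample; a real response would come from the Geonames URL.
SAMPLE_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:gn="http://www.geonames.org/ontology#"
    xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <gn:Feature rdf:about="http://sws.geonames.org/0/">
    <gn:name>Example Town</gn:name>
    <gn:population>1000</gn:population>
    <wgs84_pos:lat>51.5</wgs84_pos:lat>
    <wgs84_pos:long>-1.75</wgs84_pos:long>
  </gn:Feature>
</rdf:RDF>"""

info = extract_place_info(SAMPLE_RDF)
```

The `lat` and `long` values are what a map pin needs; the other fields are optional extras.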

Personally, this kind of thing makes it all worthwhile – we can’t easily export our entire geographical hierarchy, so being able to geocode the imperfect data we have is really useful.

If you’ve done something interesting with our data we’d love to feature it. We’re also curious to know who’s having a look at it, even if you’re not at the point of having something to share.

Finally, I’d almost forgotten to thank the many wonderful people who’d contributed to the Museums and the machine-processable web site or come along to #linkingmuseums meetups to work out how to get to re-usable museum data. I’ll be keeping up the wiki in future, and can be contacted @mia_out.

March 9, 2011

Collections data published

Filed under: collections,data — mia @ 8:10 pm

I’m very excited about sharing this with you – we’ve just released 218,822 records about objects from the collections of the Science Museum, the National Media Museum and the National Railway Museum.

The collections include objects relating to aeronautics, agriculture, astronomy, cinematography, medicine, materials, space, television, time measurement, transport and more. They range in size from contact lenses to Concorde 002.

We’ve released the files as a lightweight experiment – we’d like to understand whether, and if so, how, people would use our data. We’d also like to explore the benefits for the museum and for programmers using our data – your feedback will inform decisions about future investment in more structured data as well as helping shape our understanding of the requirements of those users. The files are in CSV format – because it’s a really simple format, viewable in a text editor, we hope that it will be usable by most people.
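For anyone who would rather poke at the files from a script than a text editor, a minimal Python sketch along these lines should be enough to get started. The column names and sample rows are assumptions made for the example; the documentation page lists the real fields.

```python
import csv
import io
from collections import Counter


def count_by_field(csv_file, field):
    """Count how many records share each non-empty value of `field`."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        value = (row.get(field) or "").strip()
        if value:
            counts[value] += 1
    return counts


# Made-up rows standing in for one of the released files.
sample = io.StringIO(
    "TITLE,PLACE_MADE\n"
    "Example object A,York\n"
    "Example object B,York\n"
    "Example object C,\n"
)
place_counts = count_by_field(sample, "PLACE_MADE")
```

With a downloaded file, the `io.StringIO` sample would be replaced by an opened file object, for example `open(path, newline="")` with whatever encoding the file turns out to use.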

We’ve published three data sets:

  • 218,822 object records
  • 40,596 media records
  • 173 event records

The files are released under the Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) licence. Please get in touch if you’ve got ideas that require a commercial licence.

The files are available at
Documentation for collections data from Science Museum, National Media Museum, National Railway Museum (NMSI) released as CSV. This page includes information about the fields available and the collections included.

The documentation page includes contact addresses, or you can leave a comment below.

April 4, 2010

Some thoughts on linked data at the Science Museum – thoughts in progress

Filed under: API,babbling — mia @ 3:25 pm

I’ve posted on twitter and my personal blog but forgot to post over here (tsk) – I’ve written some very-much-in-progress thoughts on how the Science Museum could work with linked data/APIs to improve our machine-readable data offerings at the museum data wiki.

I’m particularly interested in finding the balance between a solution we can achieve in the medium-term and something that works with standards as much as possible.

It’s nearly time for the Museums and the Web 2010 conference, where questions like this might be addressed in one of the unconference sessions so I’d love to hear your thoughts.

February 1, 2010

Cosmic Champions – winners of Cosmic Collections competition announced

Filed under: API,competition,cosmosandculture,mashups — mia @ 7:20 pm

In case you missed the announcement on twitter or elsewhere, the winners have been revealed.

Our thanks to everyone who participated, commented, critiqued or cheered the project along.

November 18, 2009

Cosmic Collections – do one thing and do it well

Filed under: API,competition,cosmosandculture,mashups — mia @ 7:03 pm

The Cosmic Collections competition has been running for a few weeks now, and while we’ve been sucked into a vortex of other projects, I’ve been keeping an eye out for feedback from the public.

As a result, I’ve realised that there may be some mismatch between the way mashups tend to work and the scope we’ve suggested for entries to our competition. The types of interfaces someone might produce with the API may lend themselves more to exploring one particular idea in depth than to producing something suitable for the broadest range of our audiences.

So I’m proposing to change the scope for entries to the competition, to make it more realistic and a better experience for entrants: I’d like to ask you to build a section of a site, rather than a whole site. The scope for entrants would then be: “create something that does one thing, and does it well”. Our criteria – use of collections data, creativity, accessibility, user experience and ease of deployment and maintenance – are still important but we’ll consider them alongside the type of mashup you submit.

This might mean producing a mashup for one particular way of exploring the objects, or exploring a sub-set of the objects. It’d then be up to us to combine the winning mashups into a larger site that works for our audiences.

What do you think? If there aren’t any huge objections, I’ll go ahead and update the criteria. Of course, if you’ve been working on something and feel it’s unfair to change the criteria at this stage, let me know and we’ll work something out.

As a reminder, here are the basic details for the Cosmic Collections competition:

How to take part

1. Check out the data here

2. Get some help:

Read our tips for entrants, check out these mashup resources, and get some info about our audiences. Check out the documentation and connect with other people who want to enter the competition. You can also join the Google group or use the hashtag #coscultcom in conversations on Twitter.

3. Get inspired

Visit the exhibition and check out these videos about the exhibition

4. Get creative and get mashing!

5. Send us a link to your entry.

Email us by midnight on November 28 (GMT) – you don’t need to pre-register.

(And the title? I’m a big fan of the Unix philosophy, “do one thing and do it well”.)

October 8, 2009

Join in our ‘cosmic collections’ mashup competition

Filed under: competition,cosmosandculture,mashups,Uncategorized — mia @ 2:20 pm

I won’t repeat the information already available on the Cosmic Collections launch event, and I’ll simply link to the page containing more information on judging criteria, prizes and timelines…  I just want to use this post to encourage you to sign up for the launch and head over to the competition wiki to find some team mates to turn your brilliant mashup idea into a working website.

September 10, 2009

Working out collections online – your questions?

Filed under: collections,design,requestforcomment — mia @ 8:59 pm

I’ve been slowly putting together a list of research questions to try and tackle as I re-work our collections online (with our very own blogger, the transport curator David Rooney) and the ‘Online Stuff’ section.  I’ll write up the process and my ideas as I go, but in the meantime – what’s your number one question about presenting museum collections online?  It could be ‘does x work better than y’, or ‘do people really want z’, or anything that’s been hanging around at the back of your brain. Leave a comment below or tweet @mia_out.

And speaking of collections online, check out the V&A’s Collections Search, just out in beta today.  There’s so much to explore and the interface is a delight.  Congratulations to all involved!

September 3, 2009

Save the date – Cosmos and Culture launch event

Filed under: competition,cosmosandculture,mashups — mia @ 6:02 pm

A very quick post to let you know that we’ve set a date for the launch of the Cosmos and Culture website competition I’ve mentioned earlier.

The launch event will be held at the Science Museum on Saturday, October 24, 2009.  We’re planning a curator’s talk about the Cosmos and Culture exhibition, and we’re looking at ways to help people meet potential team mates – some kind of ‘hack matching’ thing.  We’ve made it a Saturday so that most people should be able to come, and it’ll be open to anyone.  More details to follow as we work them out!

At this stage, the competition submission date will be November 28, 2009, and we’re aiming to have the winning website(s) live to the public by mid-December.

If you’ve got any questions, leave a comment on this post, or tweet with the hash tag #coscultcom.

April 24, 2009

Some on-going work on museum APIs

Filed under: API,design,requestforcomment — mia @ 10:20 pm

Just a quick note to say we haven’t abandoned this blog, but at the moment I’ve been concentrating on working out issues around schemas/formats, content, and functions for re-usable and interoperable cultural heritage data on a wiki.

There’s a list of things you can do if you work in a museum or are a developer interested in using museum data – jump in!
