[Crm-sig] CRM in Linked Data

Thomas Baker tbaker at tbaker.de
Fri Mar 5 19:48:30 EET 2010

Hi Martin,

On Fri, Mar 05, 2010 at 06:01:18PM +0200, Martin Doerr wrote:
> Here a test how CRM concepts could be published as Linked open Data.
> http://www.cidoc-crm.org/crm-concepts#E34 etc.
> The idea is, that the concept E34 is meant by "http://www.cidoc-crm.org/crm-concepts#E34",
> and not the document fragment.

If the hash part is actually an anchor in an HTML file (i.e., a
document fragment), it is hard to argue that it also stands for
a concept.

An alternative approach is to use the base URI for a document
and use the hash part to extend that URI into an identifier for
a concept.  For example, try clicking on:


The first is the identifier for the concept "Biography", so you
would use that whenever you want to reference the concept.
However, clicking on the first takes you to the second, which is
a reference to a Web page containing a description of the concept.

In my understanding, OAI-ORE uses this approach to distinguish a
Resource Map (documentation) from an Aggregation (as a
conceptual entity) [1].

[1] http://www.openarchives.org/ore/1.0/primer#remHashURIs
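
To make the mechanics concrete: an HTTP client strips the hash part
before it sends a request, so the server only ever sees the document
URI. A minimal Python sketch, using the URI from Martin's test (the
function name split_hash_uri is made up for illustration):

```python
from urllib.parse import urldefrag


def split_hash_uri(concept_uri):
    """Split a hash URI into (document URI, fragment)."""
    # urldefrag() separates the part a client requests over HTTP
    # from the fragment, which is never sent to the server.
    document_uri, fragment = urldefrag(concept_uri)
    return document_uri, fragment


document_uri, fragment = split_hash_uri("http://www.cidoc-crm.org/crm-concepts#E34")
print(document_uri)  # the URI that is actually dereferenced
print(fragment)      # the part that is resolved on the client side only
```

Whether the fragment then names an HTML anchor or a concept depends on
what the retrieved document says about it, which is why the two
readings can collide.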

> Should we better call it "http://www.cidoc-crm.org/crm-concepts/E34", raise an http 303 error
> and then redirect to http://www.cidoc-crm.org/crm-concepts#E34 ?

That is another valid approach.
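
A rough sketch of that server-side rule in Python, assuming the slash
URIs live under /crm-concepts/ and each one redirects to the matching
anchor in the existing page, as in Martin's question. The function
name and the set of known concepts are hypothetical. Note that 303 is
"See Other", a redirect rather than an error:

```python
from http import HTTPStatus

# Hypothetical: the concept identifiers this server knows about.
KNOWN_CONCEPTS = {"E34"}

DOCUMENT_URI = "http://www.cidoc-crm.org/crm-concepts"
CONCEPT_PREFIX = "/crm-concepts/"


def respond_to_concept_request(path):
    """Return (status code, headers) for a request to a slash concept URI."""
    if path.startswith(CONCEPT_PREFIX):
        concept_id = path[len(CONCEPT_PREFIX):]
        if concept_id in KNOWN_CONCEPTS:
            # 303 See Other: the concept itself cannot be sent over HTTP,
            # so point the client at a document that describes it.
            location = f"{DOCUMENT_URI}#{concept_id}"
            return int(HTTPStatus.SEE_OTHER), {"Location": location}
    return int(HTTPStatus.NOT_FOUND), {}


print(respond_to_concept_request("/crm-concepts/E34"))
```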

> Also completely unclear to me, how the Semantic Web community would recommend to have
> a persistent concept URI across multiple RDFS file versions.

DCMI (and some other vocabularies) use separate sets of URIs for the concepts
and for the documents.  For example, [1] for the concept dc:title and [2] for 
the latest RDF file where dc:title is defined.  If you check for yourself with

    curl -I http://purl.org/dc/elements/1.1/title

it will tell you:

    HTTP/1.1 302 Moved Temporarily
    Date: Fri, 05 Mar 2010 17:32:13 GMT
    Server: 1060 NetKernel v3.3 - Powered by Jetty
    Location: http://dublincore.org/2008/01/14/dcelements.rdf#title
    Content-Type: text/html; charset=iso-8859-1
    X-Purl: 2.0; http://localhost:8080
    Expires: Thu, 01 Jan 1970 00:00:00 GMT
    Content-Length: 286

This is actually not best practice because we set up these redirects
before OCLC hired Zepheira to re-implement the PURL software to support
303 response codes (which are more correct). It is on our list to correct
the response codes returned by these PURLs.
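
For anyone who wants to check a set of concept URIs for this, here is
a small Python sketch that reads the status line and headers from
output like the above (abridged here) and reports whether the redirect
is a 303. The function names are made up for illustration:

```python
# Abridged copy of the response shown above.
RESPONSE = """\
HTTP/1.1 302 Moved Temporarily
Location: http://dublincore.org/2008/01/14/dcelements.rdf#title
Content-Type: text/html; charset=iso-8859-1
"""


def parse_head_response(text):
    """Return (status code, headers) from a raw HTTP response head."""
    lines = text.strip().splitlines()
    # Status line looks like: HTTP/1.1 302 Moved Temporarily
    status_code = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so URLs in values stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def is_see_other(status_code):
    """True if the redirect uses 303 See Other."""
    return status_code == 303


status, headers = parse_head_response(RESPONSE)
print(status, headers["location"], is_see_other(status))
```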

Whenever we issue a new version of the RDF schema (e.g., with editorial
changes or additions), we log into the PURL server and change the PURLs
to redirect to the new file.

[1] http://purl.org/dc/elements/1.1/title
[2] http://dublincore.org/2008/01/14/dcelements.rdf
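
The idea can be modelled as a lookup table in which the concept URI
never changes and only its target does. The Python below is a toy
model, not the PURL server's actual interface; the class name and the
new-release URL are hypothetical:

```python
class PurlTable:
    """Toy model: persistent concept URIs mapped to current documents."""

    def __init__(self):
        self._targets = {}

    def set_target(self, purl, target):
        # Re-pointing a PURL replaces the target, never the PURL itself.
        self._targets[purl] = target

    def resolve(self, purl):
        return self._targets[purl]


TITLE = "http://purl.org/dc/elements/1.1/title"

table = PurlTable()
table.set_target(TITLE, "http://dublincore.org/2008/01/14/dcelements.rdf#title")

# Hypothetical later release of the schema file.
NEW_TARGET = "http://example.org/hypothetical-new-release/dcelements.rdf#title"
table.set_target(TITLE, NEW_TARGET)

print(table.resolve(TITLE))
```

Anything that cites the concept URI keeps working across releases,
because only the table entry is edited.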

For another approach, see how BBC has done it [1,2,3,4].

[1] http://www.bbc.co.uk/music/artists/a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432
[2] http://www.bbc.co.uk/music/artists/a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432#artist
[3] http://www.bbc.co.uk/music/artists/a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432.html
[4] http://www.bbc.co.uk/music/artists/a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432.rdf
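
Going only by the URL pattern in [1] to [4], and not by any knowledge
of how the BBC servers are implemented, the scheme can be sketched in
Python as one generic URI plus three derived ones. The function name
and dictionary keys are made up for illustration:

```python
def artist_uris(generic_uri):
    """Derive the related URIs from a generic artist URI."""
    return {
        "generic": generic_uri,            # [1]
        "thing": generic_uri + "#artist",  # [2] hash URI for the artist
        "html": generic_uri + ".html",     # [3] HTML document
        "rdf": generic_uri + ".rdf",       # [4] RDF document
    }


base = "http://www.bbc.co.uk/music/artists/a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432"
for name, uri in artist_uris(base).items():
    print(name, uri)
```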

The current documents describing the 303 redirect approach are [1]
and [2], but the RDFa approach used (redundantly) by Library of Congress
is gaining traction (e.g., view the page source for [3]).

[1] http://www.w3.org/TR/swbp-vocab-pub/ 
[2] http://www.w3.org/TR/cooluris/
[3] http://id.loc.gov/authorities/sh85014152#concept

As of two years ago, we believed that "slash" URIs provided more
flexibility for large-scale vocabularies and "hash" URIs were
simpler for vocabularies that fit easily into a small file that
could be downloaded without using a lot of bandwidth.

But as these different methods are deployed, our collective
understanding of the design tradeoffs is improving.

The best place to ask this question is on the mailing list
pedantic-web [1].  I also follow that list and will be eager to
hear what the people there currently have to say.


[1] http://groups.google.com/group/pedantic-web

Tom Baker <tbaker at tbaker.de>
