[Crm-sig] Qualifying predicates and objects (advice/criticism needed)

Vladimir Alexiev vladimir.alexiev at ontotext.com
Wed Jan 30 18:23:52 EET 2013

Hi Dan!

Advice: write more of your questions and examples in Turtle; that way they will be more precise and easier to understand.
I use Turtle below, the spec is at http://www.w3.org/TR/turtle/
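
For brevity the snippets below leave out prefix declarations and mostly write CRM names bare (E21_Person rather than crm:E21_Person). A minimal header might look like the following sketch. The Erlangen CRM namespace for crm: and the example.org base are assumptions, so substitute whatever your data uses:

```turtle
# Assumed namespaces; adjust to your own setup.
@prefix crm:  <http://erlangen-crm.org/current/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
# Hypothetical base, so that relative IRIs like <person/vladimir> resolve.
@base <http://example.org/> .

<person/vladimir> a crm:E21_Person .
```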

> •   object (string/number/date/XMLLiteral)
> •   P72-has_language (for the string-statements)

You can use P72_has_language only with a *node* that's E33_Linguistic_Object, not with a *literal*.
If you look at the class diagram http://personal.sirma.bg/vladimir/crm-graphical/
you may be misled into thinking that you can't use P72 with E41_Appellation and its subclasses.
But in fact you can, since RDF nodes can have *multiple* classes. Eg:

<person/vladimir> a E21_Person; 
  P131_is_identified_by <person/vladimir/name/1>, <person/vladimir/name/2>.
<person/vladimir/name/1> a E82_Actor_Appellation, E33_Linguistic_Object;
  P3_has_note "Vladimir Alexiev"; P72_has_language <thes/lang/en>.
<person/vladimir/name/2> a E82_Actor_Appellation, E33_Linguistic_Object;
  P3_has_note "Владимир Алексиев"; P72_has_language <thes/lang/bg>.

But another RDF practice is much more established: use literals with language:

<person/vladimir> a E21_Person; 
  P131_is_identified_by <person/vladimir/name>.
<person/vladimir/name> a E82_Actor_Appellation;
  P3_has_note "Vladimir Alexiev"@en, "Владимир Алексиев"@bg.

You may even decide to use other ontologies for common things like primary labels and names,
either instead of or together with the CRM constructs, eg:

  <person/vladimir> a E21_Person; skos:prefLabel "Vladimir Alexiev"@en, "Владимир Алексиев"@bg.

Other options besides skos:prefLabel are rdfs:label and foaf:name.

> number
> •   P91-has_unit (for the number-statements, in case they are dimensions :-)

- make sure you emit numbers with correct XSD types
- use an established units ontology, eg
@prefix unit: <http://qudt.org/vocab/unit#>.
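
On the first point, keep in mind that Turtle's bare numeric shorthand already implies a datatype, which may not be the one you intended. A small sketch of the difference follows; the crm: namespace, the example.org base and the width/depth nodes are assumptions made up for illustration:

```turtle
@prefix crm: <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@base <http://example.org/> .                        # hypothetical base

<obj/2926/height> crm:P90_has_value "47.2"^^xsd:double .  # explicit datatype
<obj/2926/width>  crm:P90_has_value 47.2 .                # shorthand for xsd:decimal
<obj/2926/depth>  crm:P90_has_value 4.72e1 .              # shorthand for xsd:double
<obj/2926> crm:P57_has_number_of_parts 2 .                # shorthand for xsd:integer
```

If you want all measurements to share one datatype, writing the datatype explicitly is the safer choice.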

Dimensions are handled like this:

  <obj/2926> crm:P43_has_dimension <obj/2926/height>.
  <obj/2926/height> a crm:E54_Dimension;
    crm:P2_has_type <thes/dim/height>;
    crm:P91_has_unit unit:Centimeter;
    crm:P90_has_value "47.2"^^xsd:double.

Unfortunately CRM has not standardized how to represent min/max ranges.

Integers would be used only for P57, eg
  <obj/2926> P57_has_number_of_parts 2;
    P46_is_composed_of <obj/2926/part/1>, <obj/2926/part/2>.
  <obj/2926/part/1> P2_has_type <thes/type/painting>.
  <obj/2926/part/2> P2_has_type <thes/type/frame>.

> dates

You have to emit dates like this (say the production date of an object):

  <object/production> P4_has_time-span <object/production/date>.
  <object/production/date> a E52_Time-Span;
    P82_at_some_time_within "1917-11-07"^^xsd:date.

If you have date ranges:

    P82a_begin_of_the_begin "1917-11-07"^^xsd:date;
    P82b_end_of_the_end "1917-11-27"^^xsd:date.

If you just know the year, it's better to do this:

     P82_at_some_time_within "1917"^^xsd:gYear.

instead of inventing fake dates like 1917-01-01 and 1917-12-31.

> XMLLiteral

Can you give a useful example? I think this is only useful to encode HTML or SVG (we use SVG in ResearchSpace).

Back to dates: use P82, P82a and P82b only for valid XSD dates (which means Gregorian).
For other date variants, use P78_is_identified_by:

  <object/production/date> a E52_Time-Span;
     P82_at_some_time_within "1917-11-07"^^xsd:date;
     P78_is_identified_by <object/production/date/name/1>, <object/production/date/name/2>.
  <object/production/date/name/1> a E50_Date;
    P2_has_type <thes/date/gregorian>;
    P3_has_note "07.11.1917".
  <object/production/date/name/2> a E50_Date;
    P2_has_type <thes/date/julian>;
    P3_has_note "26.10.1917".

> •   syntaxEncodingScheme (for the string/XMLLiteral-statements)

You don't need it, just use Unicode/UTF everywhere.

> https://docs.google.com/file/d/0B33eKhOVEE6-aEhMNFRIOTVDaWs/edit
> I show (as an example) 3 alternative ways of avoiding the specialisation of P78-is_identified_by.

It's a very good idea to avoid such proliferation, even though the CRM spec itself advises specialising properties this way.
See my paper http://www.ontotext.com/sites/default/files/publications/CRM-Properties.pdf
for motivation and examples.

- Example 1 is bad: date versions should be encoded in RDF not in an opaque XMLLiteral.
- Example 2 is what I illustrated above.
- Example 3 cannot be done directly in RDF, since a property instance (a triple) can't carry properties of its own.

My paper outlines 3-4 alternatives, and in ResearchSpace we go with "reification", 
using an extension of E13_Attribute_Assignment called bmo:EX_Association.
In many cases you don't need such complications, eg above you have a node E50_Date, so you can attach P2_has_type there.
But if you need to add a type/role/association code to a relation (eg P14_carried_out_by), you have to do it.
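
To give a rough idea of what such reification looks like, here is a sketch that uses plain E13_Attribute_Assignment with its standard properties P140_assigned_attribute_to and P141_assigned. It does not show the actual properties of bmo:EX_Association. The crm: namespace, the example.org base, the production and assoc nodes, and <thes/role/painter> are all assumptions made up for illustration:

```turtle
@prefix crm: <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@base <http://example.org/> .                        # hypothetical base

# The direct relation stays in place.
<obj/2926/production> a crm:E12_Production ;
    crm:P14_carried_out_by <person/vladimir> .

# A separate node describes that relation and carries the role.
<obj/2926/production/assoc/1> a crm:E13_Attribute_Assignment ;
    crm:P140_assigned_attribute_to <obj/2926/production> ;
    crm:P141_assigned <person/vladimir> ;
    crm:P2_has_type <thes/role/painter> .
```

Note that plain E13 does not record which property (here P14) is being qualified; closing that gap is one reason to use an extension of it.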

> What if I want to say that a string-object is a transliteration ?

Search the spec for "transliteration" and you'll find P139_has_alternative_form.
The example refers to "P139.1 has type" which would mean you need to attach P2 to the P139 through an EX_Association.
But the simpler way is to attach this P2 directly to the second E82_Actor_Appellation:

  <person/vladimir> a E21_Person; 
    P131_is_identified_by <person/vladimir/name/1>, <person/vladimir/name/2>.
  <person/vladimir/name/1> a E82_Actor_Appellation;
    P3_has_note "Владимир Алексиев"@bg;
    P2_has_type <thes/name/original>;
    P139_has_alternative_form <person/vladimir/name/2>.
  <person/vladimir/name/2> a E82_Actor_Appellation;
    P3_has_note "Vladimir Aleksiev"@en;
    P2_has_type <thes/name/transliteration>.

Note: Transliteration and Translation are handled differently in CRM. 
Search for a prev thread in this mlist (about a month ago).

Hope this helps! Vladimir
