[Crm-sig] reified association vs sub-event
vladimir.alexiev at ontotext.com
Thu Oct 16 13:43:45 EEST 2014
Richard> the famous "property of a property" issue, which has proved to be a challenge for the CRM in an RDF context.
This and similar issues are of course not unique to CRM, and various approaches have been adopted in the RDF community.
- domain-specific mechanisms, e.g.
-- skosxl:Label reifies a label in its own node
-- PROV has both direct roles, and "qualified" (i.e. reified) roles:
- statement reification (rdf:Statement),
- property reification vocabulary (which allows any domain-specific reification to be expressed in a machine-accessible way)
- or named graphs.
These methods allow you to add any extra data (attributes, provenance, etc) to relations.
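A minimal sketch of the statement-reification option, with triples modelled as plain Python tuples rather than a real RDF store. The ex: identifiers and the reify/annotations helpers are hypothetical, made up for illustration; only the rdf: terms come from the RDF vocabulary.

```python
# Triples are (subject, predicate, object) tuples; no RDF library is assumed.

def reify(triple, node_id):
    """Return the four rdf:Statement triples describing `triple` as node `node_id`."""
    s, p, o = triple
    return [
        (node_id, "rdf:type", "rdf:Statement"),
        (node_id, "rdf:subject", s),
        (node_id, "rdf:predicate", p),
        (node_id, "rdf:object", o),
    ]

def annotations(graph, node_id):
    """Collect the non-rdf: properties attached to a statement node."""
    return {p: o for s, p, o in graph if s == node_id and not p.startswith("rdf:")}

# Hypothetical base fact: a production was carried out by a person.
fact = ("ex:production1", "crm:P14_carried_out_by", "ex:person1")

graph = [fact]
graph += reify(fact, "ex:stmt1")
# Extra data hangs off the statement node, not off the relation itself.
graph.append(("ex:stmt1", "ex:role", "ex:engraver"))
graph.append(("ex:stmt1", "ex:assertedBy", "ex:curator1"))

print(annotations(graph, "ex:stmt1"))
```

The base triple stays in the graph next to its reification, so consumers that ignore rdf:Statement still see the plain relation.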
I would suggest that before we adopt any modeling decision, we should study these existing practices.
See this paper "Types and Annotations for CIDOC CRM Properties" at http://www.ontotext.com/research-publications/?yr=2012
where I analyze some of the approaches in relation to CRM.
The already quoted https://confluence.ontotext.com/display/ResearchSpace/BM+Association+Mapping+v2
shows the actual patterns used in the BM mapping.
E13_Attribute_Assignment is rather fundamental to CRM because it's the mother of all long-paths (e.g. has_dimension is a shortcut, Measurement is a long-path).
It almost matches RDF reification (rdf:Statement with rdf:subject, rdf:object and rdf:predicate),
but it doesn’t have a way to point to the property (rdf:predicate).
Ergo the need for its extension bmo:EX_Association and bmo:PX_property.
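To make the gap concrete, here is a rough sketch that maps an rdf:Statement description onto an attribute assignment. P140_assigned_attribute_to and P141_assigned are the E13 properties in CRM, and bmo:EX_Association and bmo:PX_property are the BM extension named above. The ex: identifiers, the slot table and the function are assumptions for illustration only, not the actual BM mapping code.

```python
# Correspondence between RDF reification slots and E13 properties.
# rdf:predicate has no counterpart in core CRM, hence None.
CORE_CRM_SLOTS = {
    "rdf:subject": "crm:P140_assigned_attribute_to",
    "rdf:object": "crm:P141_assigned",
    "rdf:predicate": None,
}

def to_attribute_assignment(node, statement, use_bm_extension=False):
    """Map {rdf:subject, rdf:predicate, rdf:object} to E13-style triples.

    Returns (triples, dropped); `dropped` lists slots that could not be expressed.
    """
    slots = dict(CORE_CRM_SLOTS)
    cls = "crm:E13_Attribute_Assignment"
    if use_bm_extension:
        # The extension supplies the missing pointer to the property.
        slots["rdf:predicate"] = "bmo:PX_property"
        cls = "bmo:EX_Association"
    triples = [(node, "rdf:type", cls)]
    dropped = []
    for slot, value in statement.items():
        target = slots[slot]
        if target is None:
            dropped.append(slot)
        else:
            triples.append((node, target, value))
    return triples, dropped

statement = {
    "rdf:subject": "ex:production1",
    "rdf:predicate": "crm:P14_carried_out_by",
    "rdf:object": "ex:person1",
}
core_triples, core_dropped = to_attribute_assignment("ex:assoc1", statement)
bm_triples, bm_dropped = to_attribute_assignment("ex:assoc1", statement, use_bm_extension=True)
print(core_dropped)
print(bm_dropped)
```

With core CRM alone the predicate slot is lost; with the extension nothing is dropped.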
Martin once made what I think is a productive remark, but it hasn't been taken seriously and investigated:
use E13_Attribute_Assignment.P2_has_type to hold the property.
Of course, this will make properties be E55_Type.
BTW, MARC Relators are defined in this way: they are both skos:Concept and owl:ObjectProperty.
E.g. see http://id.loc.gov/vocabulary/relators/abr.nt:
<http://id.loc.gov/vocabulary/relators/abr> a owl:ObjectProperty;
  rdfs:subPropertyOf <http://id.loc.gov/vocabulary/relators/role>, <http://purl.org/dc/elements/1.1/contributor>.
I still can't make up my mind whether that’s an awful transgression or a neat trick.
Simon, what do you think?
Simon> Using reified associations labelled with concepts from a version of SKOS supporting hierarchical relationships
> does not automatically entail that hierarchy for the associations
Yes of course, the consequent facts would have to be asserted separately.
But do you see more trouble with such an approach than that?
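A sketch of what "asserted separately" could look like: walk the skos:broader chain of each role concept and add one association per broader role. The thesaurus, the ex: identifiers and the (event, role, actor) shape of an association are all hypothetical simplifications.

```python
# Hypothetical role thesaurus: narrower concept -> broader concept (skos:broader).
BROADER = {
    "ex:engraver": "ex:printmaker",
    "ex:printmaker": "ex:maker",
}

def broader_closure(concept, broader):
    """Yield every ancestor of `concept` by following skos:broader links."""
    seen = set()
    while concept in broader and concept not in seen:
        seen.add(concept)  # guard against cycles in the thesaurus
        concept = broader[concept]
        yield concept

def materialize(associations, broader):
    """Add one association per broader role; nothing infers these automatically."""
    result = set(associations)
    for event, role, actor in associations:
        for ancestor in broader_closure(role, broader):
            result.add((event, ancestor, actor))
    return result

# Each association is simplified to (event, role concept, actor).
asserted = {("ex:production1", "ex:engraver", "ex:person1")}
expanded = materialize(asserted, BROADER)
for association in sorted(expanded):
    print(association)
```

The materialized facts have to be recomputed whenever the thesaurus changes, which is the cost of keeping the hierarchy in SKOS rather than in rdfs:subPropertyOf.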
> declaring the more specific property to be a subproperty of the original property.
> BM tried the "subproperty" strategy first, and found that it led to an explosion in the size of their data model,
> and didn't sit well with their actual data, e.g. their roles termlist
Yes, this has the following problems:
- huge ontology compared to the core it uses
- volatile ontology: every time the role terms are modified, the ontology has to be modified too
- you can't say more about the role instance (e.g. who asserted it and when; probability/uncertainty, etc) without turning again to Reification
- while role instances are reflected in CRM (through subproperty inference), Reified statements *are not*.
This means that a model using Reification over extension properties *is not* (fully) a CRM Extension as defined in the standard.
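The last point can be illustrated with a toy version of the RDFS subPropertyOf rule, run once over a direct extension triple and once over its reified form. The ex:engraved_by property and the other ex: identifiers are hypothetical; triples are plain tuples, not a real reasoner.

```python
# Hypothetical extension property declared as a subproperty of a CRM property.
SUBPROPERTY_OF = {"ex:engraved_by": "crm:P14_carried_out_by"}

def infer_superproperties(graph, subproperty_of):
    """Apply the RDFS subPropertyOf rule: (s p o) and p subPropertyOf q => (s q o)."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            q = subproperty_of.get(p)
            if q is not None and (s, q, o) not in inferred:
                inferred.add((s, q, o))
                changed = True
    return inferred

# The same fact, once as a direct triple and once only as a reified statement.
direct = {("ex:production1", "ex:engraved_by", "ex:person1")}
reified = {
    ("ex:stmt1", "rdf:type", "rdf:Statement"),
    ("ex:stmt1", "rdf:subject", "ex:production1"),
    ("ex:stmt1", "rdf:predicate", "ex:engraved_by"),
    ("ex:stmt1", "rdf:object", "ex:person1"),
}
crm_fact = ("ex:production1", "crm:P14_carried_out_by", "ex:person1")

direct_has_crm_fact = crm_fact in infer_superproperties(direct, SUBPROPERTY_OF)
reified_has_crm_fact = crm_fact in infer_superproperties(reified, SUBPROPERTY_OF)
print(direct_has_crm_fact, reified_has_crm_fact)
```

In the reified form the extension property appears only as the object of rdf:predicate, so the rule never fires and the core CRM fact is not derived.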
Richard> The trouble with the "sub-event" strategy is in my view two-fold:
> it is creating sub-events where there are none, simply to address a modelling problem with people having multiple roles;
> and it is falsely associating the role with the sub-event when that role is actually a property of the person involved in the sub-event.
That's not quite correct.
- The ability of CRM to always break down an event into subevents is quite powerful.
E.g. you may consider "engraver", "printer" and "publisher" as 3 roles of the same production event,
but I may very naturally consider them as 3 subevents of the production: "engraving", "printing" and "publishing".
Then the standard carried_out_by is enough: the person who carried out the "engraving" is clearly the "engraver" of that CHO (object).
No falsehood here.
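A small sketch of the sub-event reading, assuming the production is split with P9_consists_of and each sub-event is typed with P2_has_type. The ex: identifiers and the helper are hypothetical; triples are plain tuples.

```python
# Hypothetical production broken into three sub-events.
graph = [
    ("ex:production1", "crm:P9_consists_of", "ex:engraving1"),
    ("ex:production1", "crm:P9_consists_of", "ex:printing1"),
    ("ex:production1", "crm:P9_consists_of", "ex:publishing1"),
    ("ex:engraving1", "crm:P2_has_type", "ex:engraving"),
    ("ex:printing1", "crm:P2_has_type", "ex:printing"),
    ("ex:publishing1", "crm:P2_has_type", "ex:publishing"),
    ("ex:engraving1", "crm:P14_carried_out_by", "ex:person1"),
    ("ex:printing1", "crm:P14_carried_out_by", "ex:person2"),
    ("ex:publishing1", "crm:P14_carried_out_by", "ex:person2"),
]

def objects(graph, subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?)."""
    return [o for s, p, o in graph if s == subject and p == predicate]

def actors_by_activity(graph, event):
    """Group the actors of each sub-event under that sub-event's type."""
    result = {}
    for sub in objects(graph, event, "crm:P9_consists_of"):
        actors = objects(graph, sub, "crm:P14_carried_out_by")
        for activity_type in objects(graph, sub, "crm:P2_has_type"):
            result.setdefault(activity_type, []).extend(actors)
    return result

roles = actors_by_activity(graph, "ex:production1")
print(roles)
```

The role is never stated: whoever carried out the sub-event typed "engraving" is read as the engraver, using only the standard carried_out_by.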
- On the other hand, the BM has gone too far into breaking up events,
since unfortunately in their database there's no info correlating the various fields of events (e.g. facts about production).
So the BM mapping emits every fact in its own subevent: something that I don't think other museums should follow.
- And in some cases I agree the breaking up into subevents goes too far.
E.g. a Change of Ownership where one owner got money (sold) but the other did not (donated).
If that's modeled as subevents, shouldn't we also model fake (ideal) parts of the object that the two people owned before the transaction?
Martin> I think we should stop discussing half-hearted work-around as if the problem would not exist.
I agree, it's an important problem that has connections to one of the most important CRM constructs:
shortcuts vs long-paths.
Martin> introduce classes for all 3ary properties by a standard naming convention,
> such as "R14Node_carried_out_by"
Specific classes have both advantages and disadvantages compared to a general class like E13.
These should be studied carefully...
I tend more towards a generic solution (currently that is).
In "Solution 2", an important question is where we put the extra info: on "PC14 carried out by" or on "E7 Activity"?
Carlo> I perceive this general sense of distaste for RDF reification, but I must confess I do not understand it.
Me neither. I use it e.g. in Getty to represent historic info on relations:
Maybe because it's non-economical:
If one already has a domain-specific class that reifies the relation, one should use that.
But if there's no such class, I think rdf:Statement is ok.
Simon> In OWL 2 it is possible to add annotations to a property assertion axiom.
> These annotations are only about the particular act of assertion, rather than what is being asserted.
These are isomorphic to rdf:Statement and I still don't quite grok the difference.
Guess it's the same difference as between data/object properties, and annotation properties.
But rdfs:label is an annotation property, yet the whole world uses it for labels of objects.