[Crm-sig] New Issue "Appellations that ARE URIs" (HW Issue 363, 383)
Martin Doerr
martin at ics.forth.gr
Sat Nov 10 23:15:03 EET 2018
Dear All,
After completely rewriting the text about implementing the CRM in RDF, I
have temporarily abandoned Google Docs. It is more efficient to split
the topic and then recombine.
Here is my reformulation of the "punning" topic, the duality of Appellation, as
a discussion item. Please check the open questions I pose at the end!
"In the CRM, names are modelled as instances of E41 Appellation. This
class comprises any symbolic object used or created to name something
without requiring further meaning. The CIDOC CRM version 6.2 defines E41
Appellation, subclass of E90 Symbolic Object, as:
“This class comprises signs, either meaningful or not, or arrangements
of signs following a specific syntax, that are used or can be used to
refer to and identify a specific instance of some class or category
within a certain context.
Instances of E41 Appellation do not identify things by their meaning,
even if they happen to have one, but instead by convention, tradition,
or agreement. Instances of E41 Appellation are cultural constructs; as
such, they have a context, a history, and a use in time and space by
some group of users. A given instance of E41 Appellation can have
alternative forms, i.e., other instances of E41 Appellation that are
always regarded as equivalent independent from the thing it denotes.”
The CRM is an ontology in the proper sense. Therefore, instances of
physical things and phenomena of the physical world are regarded as
the things themselves, and not their machine representations, and any
identifier or name used for something from the material world is
different from the thing itself. For instance, I, Martin Doerr, am an
instance of E21 Person, and not any of the URIs or records that may
represent me in an information system. I am unique in this world, as is
any particular thing, in contrast to representations of me.
In the CRM, the property “P1 is identified by”, from “E1 CRM Entity” to
“E41 Appellation”, relates things to their names or identifiers.
In any knowledge representation schema, any item that cannot “reside” in
the machine itself due to its nature, must be represented by one
selected primary identifier, in the case of RDF by a URI. For an
information system to be consistent with the described reality, these
selected identifiers should map one-to-one to the ontological instances
they stand for. Therefore, any instance of a class represented by a URI
in RDF plays a dual role: it stands for the ontological instance and is
an identifier for it (see also Meghini et al. 2014).
For practical reasons, we do not represent this duality by a recursive
use of “P1 is identified by” from an instance to itself in its second
capacity as an identifier. However, all other names and identifiers are
related to the selected primary identifier via “P1 is identified by”. This
implies that the choice of which of multiple identifiers is the
primary one may be changed without changing the meaning. In contrast,
owl:sameAs relates two primary URIs of things as different
representations of the same real-world thing, aggregating the properties
of both representations as valid for the real-world thing.
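The contrast between the two mechanisms can be illustrated with a minimal Python sketch. It is not an OWL reasoner. The second URI, the name "Alexander III of Macedon" and the helper merged_description are hypothetical and serve only to show how statements made about two primary URIs are pooled once the URIs are declared to denote the same thing.

```python
# Statements about two primary URIs minted by different systems for the same
# real-world person. The second URI and its name are made up for illustration.
statements = {
    "http://example.com/person/alexander_the_great": {
        ("crm:P1_is_identified_by", "Alexander the Great"),
    },
    "http://example.org/agents/42": {
        ("crm:P1_is_identified_by", "Alexander III of Macedon"),
    },
}

# Assertions that two primary URIs stand for the same real-world thing.
same_as = [
    ("http://example.com/person/alexander_the_great", "http://example.org/agents/42"),
]


def merged_description(uri):
    """Collect the statements of every URI linked to `uri` by a same-as chain."""
    group = {uri}
    changed = True
    while changed:  # the relation is treated as symmetric and transitive
        changed = False
        for a, b in same_as:
            if (a in group) != (b in group):
                group |= {a, b}
                changed = True
    result = set()
    for member in group:
        result |= statements.get(member, set())
    return result


# Asking via either URI yields the names recorded under both of them.
names = sorted(value for _, value in merged_description("http://example.org/agents/42"))
print(names)
```

Swapping which URI is treated as the entry point does not change the merged result, whereas the names attached via “P1 is identified by” stay subordinate to whichever URI carries them.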
In practice, only URIs, literals and datatypes themselves “reside”
directly in a machine and need no additional identification, because they
are completely identified by their content.
We may distinguish four different kinds of Appellations: URIs,
identifiers from local application contexts, literally defined names
used in human written communication and names from oral communication
and tradition. Typically, URIs and local identifiers have a unique
representation as strings. However, the situation for names is more
complex.
For instance, 北京 is a literally defined name for the capital of China.
“Bei Jing” is meant to be a representation of the same name in Latin
characters (underspecified without accent marks), and is not meant to be
another name for the same city. “*Doerr* is a respelling of *Dörr*, a
German surname” [1]. The most elaborate and effective good
practice for registering proper names comes from the library community
(Doerr, Riva and Zumer 2012). The FRBR Review Group of IFLA decided for
practical reasons to identify a name (“Nomen” in their terminology) by
the identical sequence of characters in a given script, not by the
binary encoding.
For historical research however, in particular capturing oral tradition,
this definition is too narrow, and we are confronted in relevant CRM
applications with cases of names with spelling variants and even spoken
variants. All cases of names that cannot uniquely be identified with a
character sequence must be represented with a URI and *further
properties of description must be added, by preference the newly
proposed property “E90 Symbolic Object: has symbolic content”*. Also, if
someone wants to document facts about a name other than its spelling, a
URI must first be assigned, because a character string itself cannot be
referred to in RDF. This case must not be confused with documenting
facts about the relation between a name and a particular carrier of that
name, because that would be a reification of this relation, and not
talking about the name.
Summarizing, there are two cases:
a) A name or identifier is completely defined and identified by a
character sequence or any digitally, unambiguously encoded symbol.
b) A name or identifier is identified but not defined by a URI.
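The two cases can be sketched as two small Python types. This is only an illustration under stated assumptions: the class names Literal and AppellationNode, the field symbolic_content (standing in for the proposed “has symbolic content” property) and the helper case_of are hypothetical and not part of the CRM or of any RDF library.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Literal:
    """Case a): the name is defined and identified by its character sequence."""
    text: str


@dataclass(frozen=True)
class AppellationNode:
    """Case b): the name is identified, but not defined, by a URI."""
    uri: str
    symbolic_content: Optional[str] = None  # one written form, if any is known


Name = Union[Literal, AppellationNode]


def case_of(name: Name) -> str:
    """Return 'a' for a literal name and 'b' for a URI-identified name."""
    return "a" if isinstance(name, Literal) else "b"


# Two literals with the same characters are the same name by definition.
print(Literal("北京") == Literal("北京"))

# Two URI-identified names stay distinct even if their written form coincides,
# so further facts can be recorded about each of them separately.
first = AppellationNode("http://example.com/appellation/alexander_the_great", "Alexander the Great")
second = AppellationNode("http://example.com/appellation/another_name", "Alexander the Great")
print(first == second, case_of(first))
```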
As a matter of fact, RDFS provides the property rdfs:label, which
implements exactly case a) above, without the possibility of adding
descriptions of the name itself. SKOS specializes rdfs:label into
properties such as skos:prefLabel and skos:altLabel, which indeed define
the names by which things are called by people. We therefore take the
use of rdfs:label as existing good practice. Consequently, we have to
regard rdfs:label as a special case of “P1 is identified by”, and all
literals used as range instances of rdfs:label implicitly as instances
of E41 Appellation (see section “RDF implementation tests” item 1.).
Unfortunately, our KR languages have not foreseen the case that an
instance of a datatype is also an instance of a user-defined class. This
causes a range conflict, which can be overcome by “punning” the range of
“P1 is identified by” to be both rdfs:Literal and E41 Appellation (see
section “RDF implementation tests” item 2.).
This recommended implementation allows for using both models for
Appellations, via an additional URI or directly as literal, and
returning with one query all range instances of “P1 is identified by”
following this interpretation. The SPARQL query result separates URIs
from literals automatically. So, there is no ambiguity about the nature
of the result.
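How one result set can hold both kinds of Appellation without ambiguity can be sketched in plain Python. This is not a SPARQL engine. The classes URI and Literal and the helper objects are hypothetical stand-ins for the node kinds an RDF store distinguishes, and the data repeats the Alexander example used in the tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


P1 = URI("http://www.cidoc-crm.org/cidoc-crm/P1_is_identified_by")
ALEXANDER = URI("http://example.com/person/alexander_the_great")

# One name modelled via an additional URI, one modelled directly as a literal.
triples = [
    (ALEXANDER, P1, URI("http://example.com/appellation/alexander_the_great")),
    (ALEXANDER, P1, Literal("Alexander the Great")),
]


def objects(subject, predicate):
    """Return all range instances of `predicate` for `subject`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# A single lookup returns both kinds; the node kind tells them apart.
identifiers = objects(ALEXANDER, P1)
uri_names = [o.value for o in identifiers if isinstance(o, URI)]
literal_names = [o.value for o in identifiers if isinstance(o, Literal)]
print(uri_names, literal_names)
```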
Only if the same name is described both directly via rdfs:label and
indirectly via a URI would matching the two require another query.
So the frequently asked question remains: why not avoid this double
definition and describe every instance of E41 Appellation via another URI?
The answer is that the cases that actually require an explicit
representation of E41 Appellation are relevant but rare. On the other
hand, good practice requires every node in a semantic graph represented
by a URI to carry a human-readable label in addition. This means that
storage volume and query performance would be heavily hampered by
such a “pure-logic-driven” decision.
The only ambiguity that remains is the case in which the instance of
Appellation is literally the URI itself, and not a URI representing an
Appellation of a different form. There are two solutions to this problem:
*Either we classify this URI by the class of things it identifies and use
owl:sameAs, or we define a specific subclass of E41 Appellation, “URI”.*
*Another question is whether labels for the readability of the semantic graph
should be distinguished from names used in the world referred to.*
*Tests:*
*Asking for the superproperties of rdfs:label, given the following declaration:*
<rdf:Property rdf:about="http://www.w3.org/2000/01/rdf-schema#label">
<rdfs:domain rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
<rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
<rdfs:subPropertyOf rdf:resource="P1_is_identified_by"/>
</rdf:Property>
Query (give me all the superproperties of rdfs:label):
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select * where {
rdfs:label rdfs:subPropertyOf ?p
}
Result from Virtuoso:
p:
http://www.cidoc-crm.org/cidoc-crm/P1_is_identified_by
1. The Turtle data presented previously was loaded into Virtuoso:
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
<http://example.com/person/alexander_the_great>
crm:P1_is_identified_by
<http://example.com/appellation/alexander_the_great> .
<http://example.com/appellation/alexander_the_great>
rdfs:label "Alexander the Great" .
<http://example.com/person/alexander_the_great>
rdfs:label "Alexander the Great" .
<http://example.com/person/alexander_the_great>
crm:P1_is_identified_by "Alexander the Great" .
2. A query returning all the “identifiers” of Alexander the Great via
the “P1 is identified by” property was applied:
prefix crm: <http://www.cidoc-crm.org/cidoc-crm/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select * where {
<http://example.com/person/alexander_the_great> crm:P1_is_identified_by ?identifier
}
Result:
identifier:
http://example.com/appellation/alexander_the_great
Alexander the Great
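What the subproperty declaration would add to this data can be sketched in plain Python. This is a hedged illustration of RDFS subproperty entailment only. It does not reproduce Virtuoso, and whether inference was switched on in the test above is not stated. The helper with_subproperty_entailment and the string encoding of nodes are assumptions made for the example.

```python
RDFS_LABEL = "rdfs:label"
RDFS_SUBPROPERTY_OF = "rdfs:subPropertyOf"
P1 = "crm:P1_is_identified_by"

PERSON = "<http://example.com/person/alexander_the_great>"
APPELLATION = "<http://example.com/appellation/alexander_the_great>"

# URIs are written in angle brackets, literals in double quotes.
triples = {
    (RDFS_LABEL, RDFS_SUBPROPERTY_OF, P1),
    (PERSON, P1, APPELLATION),
    (APPELLATION, RDFS_LABEL, '"Alexander the Great"'),
    (PERSON, RDFS_LABEL, '"Alexander the Great"'),
    (PERSON, P1, '"Alexander the Great"'),
}


def with_subproperty_entailment(graph):
    """Add (s, q, o) for every (s, p, o) where p rdfs:subPropertyOf q."""
    closed = set(graph)
    changed = True
    while changed:
        changed = False
        supers = {(p, q) for p, pred, q in closed if pred == RDFS_SUBPROPERTY_OF}
        for s, p, o in list(closed):
            for sub, sup in supers:
                if p == sub and (s, sup, o) not in closed:
                    closed.add((s, sup, o))
                    changed = True
    return closed


closed = with_subproperty_entailment(triples)
person_ids = sorted(o for s, p, o in closed if s == PERSON and p == P1)
appellation_ids = sorted(o for s, p, o in closed if s == APPELLATION and p == P1)
print(person_ids)
print(appellation_ids)
```

For the person, the entailed literal coincides with the one already stated explicitly, so the set of identifiers is unchanged. Under this reading, the label of the appellation node also becomes an identifier of that node, a side effect that may be relevant to the open question on readability labels.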
------------------------------------------------------------------------
[1] https://en.wikipedia.org/wiki/Doerr
--
------------------------------------
Dr. Martin Doerr
Honorary Head of the
Center for Cultural Informatics
Information Systems Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas (FORTH)
N.Plastira 100, Vassilika Vouton,
GR70013 Heraklion,Crete,Greece
Vox:+30(2810)391625
Email: martin at ics.forth.gr
Web-site: http://www.ics.forth.gr/isl