[Crm-sig] Using multiple instantiation

Martin Doerr martin at ics.forth.gr
Wed Dec 5 18:05:13 EET 2018


Dear All,

I propose that the following paragraph be added to the implementation
guidelines for RDFS:

"*About implementing multiple instantiation*

Knowledge representation models, and semantic networks more generally,
differ fundamentally in one respect from data structures such as XML,
relational database schemata and the data structures of all programming
languages, including object-oriented ones:

· Knowledge representation starts with an item in the real world,
regardless of its nature, assigns an identifier to it in order to be
able to make assertions about it, and then accumulates statements
(assertions, propositions) about it.

· Data structures start with a set of templates, each a set of foreseen
kinds of statements dedicated to one particular category (class,
entity), to be filled in by a user.

Consequently, knowledge representation may assign multiple classes to a
given identifier without any problem. The associated processing software
will then allow all properties applicable to each assigned class to be
asserted for this identifier. This process is called “multiple
instantiation”. For instance, a “weapon” with all its characteristics
may also be a “ceremonial object”.
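
The weapon example can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the class names, the
property names and the helper applicable_properties are hypothetical
and belong neither to the CRM nor to any RDF library.

```python
# Hypothetical mini-schema: each class lists the properties it makes applicable.
CLASS_PROPERTIES = {
    "Weapon": {"has_blade_length", "has_calibre"},
    "CeremonialObject": {"used_in_ceremony"},
}

# Knowledge-representation style: one identifier, any number of classes.
item_classes = {"item:42": {"Weapon", "CeremonialObject"}}


def applicable_properties(item_id):
    """Return the union of the properties of every class assigned to the item."""
    props = set()
    for cls in item_classes.get(item_id, set()):
        props |= CLASS_PROPERTIES.get(cls, set())
    return props
```

Assigning a further class to "item:42" later extends the set of
applicable properties without creating a second record for the item.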

A system based on data structures must create a different instance of
the respective template for each class an item belongs to. It may later
link the different instances describing aspects of the same thing in
order to simulate the mechanism. In particular, the very successful
“encapsulation principle” of object-oriented programming languages
requires dedicated data structures and constitutes a fundamental
mismatch with the Open-World modelling of semantic relationships (see,
for instance, Schnase 1993). Superproperties, which data structures do
not provide either, are also fundamental to semantic data integration.

The CRM as an ontology relies heavily on multiple instantiation: classes
that tend to co-occur on the same things “incidentally”, without being
associated with properties applicable only to the combination of such
classes, are not modelled individually as subclasses of multiple parent
classes. The latter would be called “multiple IsA”. Avoiding multiple
IsA in such cases is an important normalization principle that keeps the
ontology compact and unambiguous.

Most implementations on top of RDF still use RDF as if it were a fixed
schema and repeat the whole schema in the UI code. Therefore, the
promise of RDF and other semantic models to accommodate new properties
dynamically is often not fulfilled; it is still as if they were using
relational systems. Generic XML editors already adapt to the schema, but
the rendering paradigms they usually employ, without additional
parameters, are too poor for good UI code. One can, however, write code
that reads the RDF schema in use at run-time and extends data entry and
display with the properties actually found. This functionality is
foreseen by SPARQL, but most programmers still do not appreciate the
utility of querying the schema. Even if fixed templates are used, the
data entry system should allow the same thing to be described by
multiple templates, relatively freely selectable by the user.
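
A minimal sketch of such schema-driven code, assuming the schema and the
data are available as plain (subject, predicate, object) tuples. The
"crm:" and "ex:" prefixes, the resource ex:name1 and the helper names
are assumptions made for this illustration, and the subclass statements
are abridged (intermediate classes between the listed ones are omitted).

```python
RDF_TYPE, SUBCLASS, DOMAIN = "rdf:type", "rdfs:subClassOf", "rdfs:domain"

# Abridged schema fragment plus one described resource (illustrative only).
TRIPLES = [
    ("crm:E41_Appellation", SUBCLASS, "crm:E1_CRM_Entity"),
    ("crm:E33_Linguistic_Object", SUBCLASS, "crm:E1_CRM_Entity"),
    ("crm:P1_is_identified_by", DOMAIN, "crm:E1_CRM_Entity"),
    ("crm:P139_has_alternative_form", DOMAIN, "crm:E41_Appellation"),
    ("crm:P72_has_language", DOMAIN, "crm:E33_Linguistic_Object"),
    ("ex:name1", RDF_TYPE, "crm:E41_Appellation"),
    ("ex:name1", RDF_TYPE, "crm:E33_Linguistic_Object"),
]


def objects(triples, subject, predicate):
    """All objects of the triples matching the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def class_closure(triples, classes):
    """The given classes together with all their (transitive) superclasses."""
    seen, todo = set(), list(classes)
    while todo:
        cls = todo.pop()
        if cls not in seen:
            seen.add(cls)
            todo.extend(objects(triples, cls, SUBCLASS))
    return seen


def form_fields(triples, resource):
    """Properties to offer for a resource, derived from the schema at run-time."""
    classes = class_closure(triples, objects(triples, resource, RDF_TYPE))
    return sorted(s for s, p, o in triples if p == DOMAIN and o in classes)
```

Because form_fields is computed from whatever schema triples are loaded,
a property added to the schema appears in the entry form without any
change to the UI code. In a triple store the same lookup would
typically be expressed as a query over rdfs:domain and rdfs:subClassOf.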

In the specification modules of mapping software used to transform data
into a CRM-compatible form, care must be taken to allow the user to
combine RDF classes systematically. It may be useful to develop tools
for specific guidance that show users how a valid path from a given
domain class to a certain range class can be created by using multiple
instantiation (and, incidentally, also by using subclasses of the domain
class), such as combining /E41 Appellation/ with /E33 Linguistic
Object/ in order to reach /E56 Language/ via /P72 has language/.
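
What such a guidance tool might compute can be sketched as follows.
This is a hypothetical helper, not an existing tool: the tables are a
small abridged fragment of the CRM, the function names are assumptions,
ranges are matched exactly (subclasses of the range are ignored), and
no check is made that the suggested class combination is meaningful for
the item at hand.

```python
# (property, domain, range) for an abridged fragment of the CRM.
PROPERTIES = [
    ("P72_has_language", "E33_Linguistic_Object", "E56_Language"),
    ("P139_has_alternative_form", "E41_Appellation", "E41_Appellation"),
]

# Direct superclasses, abridged.
SUPERCLASSES = {
    "E41_Appellation": {"E90_Symbolic_Object"},
    "E33_Linguistic_Object": {"E73_Information_Object"},
    "E73_Information_Object": {"E90_Symbolic_Object"},
}


def ancestors(cls):
    """The class itself plus all of its transitive superclasses."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(SUPERCLASSES.get(current, ()))
    return seen


def suggest_paths(start_class, target_class):
    """List (property, extra_class) pairs that lead to target_class.

    extra_class is None when the property already applies to start_class;
    otherwise it names the class to add by multiple instantiation.
    """
    suggestions = []
    for prop, domain, rng in PROPERTIES:
        if rng != target_class:
            continue
        extra = None if domain in ancestors(start_class) else domain
        suggestions.append((prop, extra))
    return suggestions
```

For the example in the text, suggest_paths("E41_Appellation",
"E56_Language") proposes P72_has_language together with the additional
class E33_Linguistic_Object.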

In a local system, another workaround for multiple instantiation can be
the creation of classes that replace all candidate cases for multiple
instantiation by subclasses using multiple IsA. For good reasons,
compatibility with the CRM is defined at the import/export/query level
and not at the level of system internals. Therefore, such internal
workarounds do not affect interoperability: whereas the query
compatibility of this solution with the standard is immediate, the
respective import/export system simply needs to make the trivial
replacements of the respective class combinations with their multiple
IsA counterparts and vice versa.
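
The replacement step can be sketched as a pair of lookups, one for
import and one for export. The local class name
"local:Linguistic_Appellation", the prefixes and the function names are
hypothetical; a real system would maintain one table entry per
multiple-IsA class it defines.

```python
# Hypothetical table: CRM class combination -> local multiple-IsA subclass.
COMBINED = {
    frozenset({"crm:E41_Appellation", "crm:E33_Linguistic_Object"}):
        "local:Linguistic_Appellation",
}
# Reverse table used on export.
EXPANDED = {local: classes for classes, local in COMBINED.items()}


def to_local(types):
    """Import: replace each known class combination by its local subclass."""
    types = set(types)
    for combo, local in COMBINED.items():
        if combo <= types:
            types = (types - combo) | {local}
    return types


def to_crm(types):
    """Export: replace each local subclass by the CRM classes it combines."""
    result = set()
    for cls in types:
        result |= EXPANDED.get(cls, {cls})
    return result
```

Classes that take part in no listed combination pass through both
functions unchanged, so the round trip from CRM classes to local
classes and back returns the original set of types.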

So, in part, problems with multiple instantiation are a question of
programming practice. On the other hand, they are also a question of
user training and extended good practice. Users may provide feedback
about frequent cases where multiple instantiation is used, so that other
users can be guided to these modelling cases. These could systematically
be entered into the CRM RDF implementation, without requiring the CRM
standard itself to repeat them."

John L. Schnase (1993). "Semantic Data Modelling of Hypermedia
Associations", in: ACM Transactions on Information Systems, Vol. 11,
No. 1, January 1993, p. 45.

Comments welcome!

Best,


Martin

-- 
------------------------------------
  Dr. Martin Doerr
               
  Honorary Head of the
  Center for Cultural Informatics
  
  Information Systems Laboratory
  Institute of Computer Science
  Foundation for Research and Technology - Hellas (FORTH)
                   
  N.Plastira 100, Vassilika Vouton,
  GR70013 Heraklion,Crete,Greece
  
  Vox:+30(2810)391625
  Email: martin at ics.forth.gr
  Web-site: http://www.ics.forth.gr/isl
