[Crm-sig] clarification needed

martin martin at ics.forth.gr
Tue Jul 1 13:38:17 EEST 2008


Dear Guenther,

Guenther Goerz wrote:
> Dear Christian-Emil,
> 
> it's pretty hard to get to your point because your text is truncated in
> the middle of a sentence at the end of the first paragraph.
> 
> In your first sentence, you are referring to my "first stroke
> paragraph" which is, I think, the one in the <quotation> starting with
> "The usual way to attach concepts..."  What I am talking about here is
> to generate an "extension" of the CRM --- if at all --- only in the
> sense of attaching a domain ontology, i.e., concepts of a domain
> ontology to the (reference ontology) CRM as it is common in "ontology
> engineering".  It's just in the same fashion as FRBR concepts are
> connected to CRM concepts.
> 
> For the following, there is no need to refer to OWL-DL at all --- it's
> only because the quotation was taken from a paper about the OWL-DL
> implementation of CRM.  Just think of First-Order Logic, or
> preferably, a decidable subset of it.
> 
> Now, for the "E55 type hierarchy": In my copy of the CRM document,
> v.4.2.4, it is mentioned in the introductory paragraph "On Types" ---
> the one we are discussing --- on pp. 17f. and furthermore on
> pp. 51 (in the section on E55), 65, 74, 75.  In none of these places did I
> find that a type hierarchy under E55 Type *SHOULD* have a subclass
> for each of the classes of the CRM.  Maybe I missed something, so
> please direct me to the proper text location.  It could have, of
> course, in particular cases.  Whether there are subclasses (for
> each??) of the classes of the CRM will in my opinion depend on the
> purpose of the particular modelling activity.  (Just a side remark
> w.r.t. infinite recursion: If this is serious, I think it is a
> conceptual bug and not a feature and it should be fixed as soon as
> possible.  This is another argument in favor of my remark that the
> text proposed to vote on is not technically mature.  And, by the way,
> referring to your earlier remarks about dissemination of the CRM: How
> would you explain to a practitioner that this makes sense, and if it
> does, what he could gain from it? I think my proposal would not allow
> infinite recursion in the sense that the "has-lexconcept" property
> mentioned below would have an inverse property pointing back to the
> one and only exemplar of the CRM.)
> 
> Representing domain classes as subclasses of CRM-classes was my first
> idea as well.  And it will work fine regarding subsumption and
> inheritance.  In fact, reasoning can be done intensionally (i.e., on
> the formal expressions by which the classes are defined) as well as
> extensionally (i.e., on the set of all individuals belonging to a
> class).  But we have to be careful with E55 Type because the E55 class
> has in my view been conceived to provide a weak form of reification.
> If we want to keep a decidable version of the CRM we may not allow
> full reification (as in unrestricted RDF), because otherwise we could
> generate paradoxes, which is due to its ability to express
> self-reference.  Now, for my mentioning of "contradiction" in the last
> sentence of the resp. paragraph which I think is what you refer to by
> "why this causes inconsistency": If we allow "artist" to be a subclass
> of E21 Person and at the same time to be a subclass of E55 Type, we
> are in trouble: E55 Type is a subclass of E28 Conceptual Object which
> is something immaterial, whereas an E21 Person is something material.
> So, we would have the artist Vincent, being material and immaterial at
> the same time.  This is in fact Martin's argument in the discussion of
> the last day of the Heraklion workshop in May (which you missed, if I
> remember correctly)
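
The contradiction argued in the quoted paragraph can be sketched in a few
lines of Python. This is only an illustration under assumptions: the class
table is a small hand-picked fragment, the intermediate classes above
E21 Person are collapsed, the labels "Material" and "Immaterial" are not CRM
classes, and their disjointness is simply the premise of the argument above.
The helper names are hypothetical.

```python
# Toy subclass table: class -> list of direct superclasses.
# Hand-picked fragment only; intermediate CRM classes are collapsed and
# "Material" / "Immaterial" are illustrative labels, not CRM classes.
SUPERCLASSES = {
    "E21 Person": ["Material"],
    "E28 Conceptual Object": ["Immaterial"],
    "E55 Type": ["E28 Conceptual Object"],
    # The problematic modelling choice: Artist placed under both branches.
    "Artist": ["E21 Person", "E55 Type"],
}

# Assumed premise: nothing is material and immaterial at the same time.
DISJOINT = [("Material", "Immaterial")]


def ancestors(cls):
    """Return cls together with all its direct and indirect superclasses."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUPERCLASSES.get(current, []))
    return seen


def violated_disjointness(cls):
    """List the disjoint pairs that every instance of cls would fall under."""
    supers = ancestors(cls)
    return [pair for pair in DISJOINT if pair[0] in supers and pair[1] in supers]


# An instance of Artist (e.g. Vincent) would be material and immaterial.
print(violated_disjointness("Artist"))
# E21 Person on its own raises no conflict.
print(violated_disjointness("E21 Person"))
```

Using "Artist" as a particular related by P2 has type, instead of as a
subclass, removes the second superclass link and with it the conflict.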

To my understanding, the solution you describe below is exactly the current state of
the CRM. I would rather not use the term "constant", because it does not exist
in the CRM and comes from an encoding perspective; I prefer the
term "particular".

Note that in a mediation system, the fact that an instance in a table "agent"
has the type "artist" may be used to decide that the instance is not only an
instance of Actor but also of E21 Person. In order to do so, we must express
a relationship in the respective thesaurus between "artist" and E21 Person.
How would you do that?
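
To make the question concrete, the following Python fragment is a minimal
sketch of one conceivable encoding, not a description of what the CRM
prescribes. Each thesaurus term may name the CRM class whose instances it
classifies. The field names, the sample thesaurus and the helper functions are
hypothetical, and the hierarchy fragment assumes only that E21 Person lies
below E39 Actor.

```python
# Hypothetical thesaurus: each term names its broader term and, optionally,
# the CRM class whose instances the term classifies.
THESAURUS = {
    "agent": {"broader": None, "classifies": "E39 Actor"},
    "artist": {"broader": "agent", "classifies": "E21 Person"},
    "painter": {"broader": "artist", "classifies": None},
}

# Fragment of the CRM class hierarchy needed here (child -> parent).
CRM_PARENT = {"E21 Person": "E39 Actor"}


def classified_class(term):
    """Walk up the broader-term chain until a term declares a CRM class."""
    while term is not None:
        entry = THESAURUS[term]
        if entry["classifies"] is not None:
            return entry["classifies"]
        term = entry["broader"]
    return None


def is_subclass(cls, ancestor):
    """True if cls equals ancestor or lies below it in CRM_PARENT."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = CRM_PARENT.get(cls)
    return False


def refine(declared_class, type_term):
    """Return the class implied by the type term if it narrows the declared one."""
    implied = classified_class(type_term)
    if implied is not None and is_subclass(implied, declared_class):
        return implied
    return declared_class


# A row of the "agent" table is mapped to E39 Actor; its type narrows it.
print(refine("E39 Actor", "artist"))
print(refine("E39 Actor", "painter"))
```

The link between term and class is kept as data in the thesaurus, so the
terms themselves stay particulars and never become classes.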

About the issue of decidability: To my understanding this issue occurs only
if a declarative language is used. Procedural implementations would not have
this problem. The CRM however does not make any assumptions about the implementation
method.

> 
> Therefore, another solution has to be provided, which I did in my
> second "stroke paragraph", beginning with "Instead, a constant
> "Artist" may be used..."  So we have the E21 Person Vincent which P2
> has type E55 Type "Artist" (a nominal, i.e. a constant, not a
> subclass).  In this representation, we can reason with and on terms
> without problems, using the term hierarchy (which may be called an
> "E55 Type hierarchy").
> 
> To avoid misunderstanding, let me point out what I would understand by
> an "E55 Type hierarchy": Take WordNet as a thesaurus --- cum grano
> salis, just for the sake of its easy availability --- , and take
> furthermore my "artist" example.  The hyperonymy in WordNet provides
> "creator" as a broader term to "artist" and "person" as a broader term
> to "creator".  Narrower terms to "artist" are "painter", "sculptor",
> etc.  In general, I would not claim that such a term hierarchy is a
> priori a class hierarchy in the sense of CRM and therefore I would
> hesitate to merge both.  I am aware that some people tried to turn
> WordNet itself into a formal ontology, but this is another story.
> Here I will hold up the claim that the super-/sub-class relation in
> the CRM is not *IDENTICAL* to the broader/narrower term relation in
> WordNet.  Now we could "navigate" in the CRM class hierarchy
> (i.e. perform subsumption inferences) and we could navigate in the
> WordNet hierarchy, but to mix both needs further justification.  First
> of all I think that a naive combination of both would cause problems.
> 
> But there is a more sophisticated way to combine both, namely the one
> I described in the third paragraph starting with "both representations
> are not mutually exclusive..."  Let me repeat that in this case the
> semantic integrity is within the user's responsibility.  In the last
> paragraph of my last mail I referred to a technical solution we found
> to do such a hybrid navigation by introducing a special property
> "has-lexconcept" (and its inverse).  With CRM, we would instead have
> E55 Type as the "interface" between CRM classes and terms from some
> thesaurus related by the property P2 has type (is type of).  Instances
> of CRM classes such as E21 Person represent (domain) objects, whereas the
> WordNet entries just represent words (terms) and their use.
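
The hybrid navigation described in the quoted paragraph can be sketched as
follows. The term chain (painter/sculptor, artist, creator, person) is the
one quoted above; the instance table and the helper names are hypothetical.
The two hierarchies are kept in separate tables and are connected only
through P2 has type.

```python
# CRM side: instances with their class and their P2 has type values.
INSTANCES = {
    "Vincent": {"class": "E21 Person", "has_type": ["artist"]},
}

# Thesaurus side: broader-term links between terms (not a class hierarchy).
BROADER = {
    "painter": "artist",
    "sculptor": "artist",
    "artist": "creator",
    "creator": "person",
}


def broader_chain(term):
    """Return all broader terms of term, nearest first."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def instances_typed_by(term):
    """Inverse direction (is type of): instances typed by term or a narrower term."""
    result = []
    for name, data in INSTANCES.items():
        for assigned in data["has_type"]:
            if assigned == term or term in broader_chain(assigned):
                result.append(name)
                break
    return result


# From the instance to its term, then upwards in the thesaurus only.
print(broader_chain(INSTANCES["Vincent"]["has_type"][0]))
# From a broader term back to the instances it indirectly types.
print(instances_typed_by("creator"))
```

Nothing here infers class membership from the term hierarchy; whether
"person" as a term corresponds to E21 Person stays a separate, explicit
decision.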
> 
> With this background, let me try to understand your T21 example: Let's
> connect E21 Person via P2 has-type E55 Type to the WordNet term
> "person".  Making the transition to WordNet, we could then navigate
> to the subconcepts I mentioned, like "artist" (or do you mean
> something different by a "person thesaurus"??).  Then we would find
> that the term ("lexical concept" as we usually call it, because in
> general it represents a synonym set which is an equivalence class)
> "artist"/WordNet is a narrower term of T21 "person"/WordNet. We might
> call the former T21-1 and we could use it to say by means of P2 that
> some E21 Person instance P2 has type E55 Type (T21-1)
> "artist"/WordNet.  Did I get you right???
> 
> Best regards,
> -- Guenther
> 
> 
> On 6/22/08, Christian-Emil Ore <c.e.s.ore at edd.uio.no> wrote:
>> Dear Günther,
>>  One does not need to extend the CRM to get your first stroke paragraph. The
>> CRM states that the class hierarchy under E55 type should have a sub class
>> for each of the classes in the CRM.  This, of course, allows an endless
>> recursion but that is not the point here. So for the class E21 Person there
>> exists a sub...subclass of E55 Type, let us call it T21. The intention, at
>> least according to my understanding, is that the terms in a Person thesaurus
>> should be mapped to a type-(in the CRM sense) hierarchy under this type,
>> T21, corresponding to the formal structure of the thesaurus. An instance of
>> E21 Person can then be connected via P2 to an instance of a subclass in this
>> hierarchy or to an instance of T21 itself. Due to the generalisation
>> mechanism in the CRM all P2 insta
>>
>>  The latter case corresponds to your artist case. Could you please explain
>> to me without referring to rdf/owl why this causes inconsistency? I just
>> don't understand your line of argument.
>>
>>  Regards,
>>  Christian-Emil
>>
>>
>>
>>  On 13.06.2008 15:24, Guenther Goerz wrote:
>>
>>> Dear all,
>>>
>>> As an attempt to clarify the problem of modelling alternatives with
>>> CRM-types --- this is the term I will use to distinguish it from other
>>> uses of "type", as e.g. in computer science --- let me start with
>>> quoting a section from the paper I submitted to the CIDOC 2008
>>> conference.  In order to avoid higher-order (logic) constructs which
>>> in my view are probably hard to comprehend for practitioners anyway,
>>> without excluding a weak form of reification completely, I suggested
>>> two ways of representation:
>>>
>>> <quotation>
>>> ``... E55 Type has been implemented as a class which ---
>>> for the purpose of reasoning on the conceptual level --- may serve as
>>> an interface to external concepts of formal domain ontologies (or
>>> thesauri) as subclasses or as constants.  In fact, at least two
>>> different representations are possible:
>>>
>>> - The usual way to attach concepts of a domain ontology to the CRM is
>>>  direct subclassing, e.g., the (application domain) class Artist as a
>>>  subclass of E21 Person.  So, ``Vincent van Gogh'' would be an
>>>  instance of Artist and inherit all properties of E21 Person.  In
>>>  that case to represent Artist also as a subclass of E55 Type would
>>>  lead to contradictions.
>>>
>>> - Instead, a constant ``Artist'' may be used; in general, it will be a
>>>  term of a domain-specific thesaurus.  Such constants
>>>  (``individuals'') are admitted in T-Boxes by means of the ``one-of''
>>>  OWL-DL language construct, i.e. an enumeration datatype.  They
>>>  correspond to classes with singleton extensions.  So, we could
>>>  represent ``Vincent van Gogh'' as an immediate instance of E21
>>>  Person and relate it by P2 has type to E55 Type with value
>>>  ``Artist''.  In this case, of course, the constants cannot have
>>>  instances in turn.
>>>
>>> Both representations are not mutually exclusive; in our example the
>>> name of the class Artist (case 1) could additionally be used as a
>>> constant which is assigned as a value to E55 Type (case 2), but then
>>> it is up to the user to guarantee for semantic integrity.  In the
>>> second case the intention expressed in the CRM document is supported
>>> that it shall be possible to deal with domain concepts --- such as
>>> Artist --- as objects of discourse.  Which of these representations
>>> will be chosen for a particular application will depend on the
>>> intended use of the domain model.''
>>> </quotation>
>>>
>>> What I am proposing here is to provide possibilities to argue with and
>>> about terms, i.e. use terms in reasoning in a ``de re'' and a ``de
>>> dicto'' mode.
>>>
>>> De re corresponds to the first way of representation: We introduce
>>> domain level classes as subclasses of CRM-classes.  The domain level
>>> classes may have subclasses in turn as, e.g., Painter and Sculptor as
>>> subclasses of Artist; so all of our instances from, e.g., a museum data
>>> base are instances of the domain ontology which in turn is connected
>>> to CRM as a reference ontology.  In Description Logic (OWL-DL),
>>> reasoning with classes (concepts) is possible
>>> - intensionally, i.e. in terms of their defining expressions (T-Box),
>>>  or
>>> - extensionally, i.e. in terms of their instances (A-Box = extension,
>>>  i.e. the set of instances).
>>>
>>> It is important to keep in mind that we have an ``open world''
>>> semantics, i.e. if we have as a necessary condition for a FATHER that
>>> he is a PERSON and that Exists some CHILD, which is a PERSON,
>>> (``existential restriction''), we can represent an instance of a
>>> PERSON who claims to be a father without representing explicitly the
>>> particular CHILD --- in open world semantics ``I don't know'' is a
>>> legitimate answer to the question for a child.  On the other hand,
>>> there may be two PERSONS claiming to be the father of some particular
>>> CHILD (probably a rare case in the real world...), if we do not
>>> combine the existential restriction with a cardinality restriction,
>>> i.e., that there must be exactly one father for any CHILD.
>>>
>>> De dicto corresponds to the second way of representation: We introduce
>>> a constant as the value of E55 Type.  In this case, we can reason
>>> about the term itself, and not about its denotation.  If we use a
>>> thesaurus where a broader term for ``Artist'' were ``Person'' and
>>> narrower terms were ``painter'', ``sculptor'', etc., we can reason in
>>> the narrower-term--broader-term thesaurus hierarchy as opposed to the
>>> class hierarchy in the domain.  Formally, in both cases we have a
>>> subsumption hierarchy; in our example on the one hand a class
>>> hierarchy which contains CRM classes with integrated domain classes
>>> which have a denotation in the domain (de re), in the second case in a
>>> thesaurus hierarchy of terms which don't have instances (de dicto).
>>> As mentioned above, there may be combinations where the terms are
>>> related to domain classes by a particular property such as (an
>>> extended version of) P2 has type.
>>>
>>> The latter situation reminds me of a technique we applied in our NLP
>>> work where we have a domain ontology representing a rich domain
>>> semantics on the one hand and a hierarchical lexicon --- WordNet in
>>> our case --- on the other hand.  The ``terms'' above would correspond
>>> to WordNet's ``synsets'', i.e. sets of synonymous words.  As synonymy
>>> is an equivalence relation, we have a typical case of an abstraction
>>> from word to (lexical) concepts: Each word in the synset may represent
>>> the equivalence class.  Then we introduced a property
>>> ``has-lexconcept'' into the ontology which relates domain concepts to
>>> the lexical concepts, i.e. words by which they are expressed.  But
>>> this relation has to be maintained by the system implementors and must
>>> be handled with care: Of course, it's up to them to care for semantic
>>> integrity.  We can reason within the domain class hierarchy, as well
>>> as within the lexical concept hierarchy (WordNet), but combined
>>> inferences are possible as well.  In the case of subsumption
>>> inferences, e.g., given a certain (domain) class and, by virtue of
>>> lex-concept, the corresponding lexical concept(s), we could ask for
>>> its superclass and the words it is related to.  Furthermore, we can
>>> stay in the WordNet hierarchy, look for the broader lexical concept
>>> (synset), and ask for the domain concept it corresponds --- if there
>>> is one --- to by virtue of the inverse relation to has-lexconcept.
>>>
>>> Best,
>>> -- Guenther
>>>
>>> On 6/9/08, Vladimir Ivanov <nomemm at gmail.com> wrote:
>>>
>>>> ---------- Forwarded message ----------
>>>>  From: Vladimir Ivanov <nomemm at gmail.com>
>>>>  Date: 2008/6/9
>>>>  Subject: Re: [Crm-sig] About Types: ISSUE PLEASE VOTE
>>>>  To: Guenther Goerz <guenther.goerz at gmail.com>
>>>>
>>>>
>>>>  Dear Guenther,
>>>>
>>>>  > Section 9: I don't understand it at all.  Could you please explain ---
>>>>  > and perhaps also the colleagues who already voted for the text as a
>>>>  > whole what they understand?  As a side remark, I cannot make any sense
>>>>  > out of the last sentence.
>>>>
>>>>  The only sense I could make of the last sentence was its correspondence
>>>>  to the OWL Full language.
>>>>  If one allows "E55.Types" to be treated both as classes and as instances,
>>>>  one may face problems with reasoning.
>>>>
>>>>  Excerpt from OWL spec. (http://www.w3.org/TR/owl-ref/):
>>>>  "... However, use of the OWL Full features means that one loses some
>>>>  guarantees that OWL DL and OWL Lite can provide for reasoning systems."
>>>>
>>>>  these "guarantees" are related to decidability of reasoning.
>>>>
>>>>  "Inference in OWL Full is clearly undecidable as OWL Full does not
>>>>  include restrictions on the use of transitive properties which are
>>>>  required in order to maintain decidability." from
>>>>  (http://www.cs.man.ac.uk/~horrocks/Publications/download/2005/Horr05c.pdf
>>>>  , p.2)
>>>>
>>>>  As for the sect. 9 as a whole,
>>>>  I think the main idea was that
>>>>  "you may implement a system of user-defined types (subclasses of E55
>>>>  and properties)
>>>>  at the level of granularity necessary in your application, but it should
>>>>  correspond to the CRM notion of type".
>>>>
>>>>  Best regards,
>>>>  Vladimir.
>>>>
>>>>  >
>>>>  > Best,
>>>>  > -- Guenther
>>>>  >
>>>>  >
>>>>  > On 6/4/08, martin <martin at ics.forth.gr> wrote:
>>>>  >> Dear All,
>>>>  >>
>>>>  >>  Following the decision in the last meeting, we have to decide via
>>>>  >>  e-mail vote on the updated attached text about types in the CRM
>>>>  >>  document. I have desperately tried to describe as exactly as possible
>>>>  >>  what the CRM does, and to avoid the metaclass question, since this is
>>>>  >>  a philosophical rather than an applied question in the current form
>>>>  >>  the CRM describes.
>>>>  >>
>>>>  >>  Please VOTE:
>>>>  >>
>>>>  >>  ACCEPT [ ]
>>>>  >>
>>>>  >>  REQUEST MODIFICATION: [....]
>>>>  >>
>>>>  >>  by June 12.
>>>>  >>
>>>>  >>  Best,
>>>>  >>
>>>>  >>  Martin
>>>>  >>  --
>>>>  >>
>>>>  >> --------------------------------------------------------------
>>>>  >>   Dr. Martin Doerr              |  Vox:+30(2810)391625        |
>>>>  >>   Principal Researcher          |  Fax:+30(2810)391638        |
>>>>  >>                                |  Email: martin at ics.forth.gr |
>>>>  >>                                                              |
>>>>  >>                Center for Cultural Informatics               |
>>>>  >>                Information Systems Laboratory                |
>>>>  >>                 Institute of Computer Science                |
>>>>  >>    Foundation for Research and Technology - Hellas (FORTH)   |
>>>>  >>                                                              |
>>>>  >>   Vassilika Vouton,P.O.Box1385,GR71110 Heraklion,Crete,Greece |
>>>>  >>                                                              |
>>>>  >>          Web-site: http://www.ics.forth.gr/isl               |
>>>>  >> --------------------------------------------------------------
>>>>  >>
>>>>  >>
>>>>  >> _______________________________________________
>>>>  >>  Crm-sig mailing list
>>>>  >>  Crm-sig at ics.forth.gr
>>>>  >>  http://lists.ics.forth.gr/mailman/listinfo/crm-sig
>>>>  >>
>>>>  >>
>>>>  >>


-- 

--------------------------------------------------------------
  Dr. Martin Doerr              |  Vox:+30(2810)391625        |
  Principal Researcher          |  Fax:+30(2810)391638        |
                                |  Email: martin at ics.forth.gr |
                                                              |
                Center for Cultural Informatics               |
                Information Systems Laboratory                |
                 Institute of Computer Science                |
    Foundation for Research and Technology - Hellas (FORTH)   |
                                                              |
  Vassilika Vouton,P.O.Box1385,GR71110 Heraklion,Crete,Greece |
                                                              |
          Web-site: http://www.ics.forth.gr/isl               |
--------------------------------------------------------------



