[Crm-sig] Issue 326 Resolving inconsistencies between E2, E4, E52 and E92

George Bruseker bruseker at ics.forth.gr
Tue Mar 12 23:23:52 EET 2019


Dear Martin et al.,

Although we are on a well-trodden path, it still seems to engender questions, so it is perhaps worth meandering along it a bit longer, not necessarily to belabour any previous decisions but at least to ponder them some more.

I wonder about the following:

1) Did STV not enter CRMbase when we still thought to include all extension top-level classes and relations in base? Now that this is no longer the case (we will specify all top-level classes and relations in a separate document, thereby relieving base of having to hold all possible top-level classes and properties), is there a case for reassessing whether STVs belong in base at all? I do not of course question their utility, but they did arise because of questions in CRMgeo. Are they a necessary feature of base?

> Secondly, the lay person will see a knowledge graph of instances, and not a theory. A knowledge graph with a lot of trivial links in my opinion makes the model less intuitive to use.

and

>> I believe we cannot avoid entering some complexity here in our discussions, and resolve it giving priority to the end-user schema.
>> I think the first arguments should be, if the final schema is confusing, and if the alternative is less confusing.

2) I think we all agree that understandability and teachability are not unimportant factors, especially for something that is supposed to be applied by domain specialists. If the concept cannot be communicated to the community it is supposed to serve, is it serving the community? The lay person doing mapping and modelling (using it) will need to understand these constructs and apply them correctly. We argue that it is this domain/lay person who should ultimately translate or validate their data. Therefore, the concept should be understandable and clear to them (without having to retrain in another discipline). They should, more often than not, create intended models.

3) For those people who want and know how to use an STV, the fact that they would populate the knowledge base distinctly would not be a disadvantage but an advantage, as you could query it distinct from the entity of which it was an STV. For everyone else, they simply would not instantiate STVs and so would not have to worry about how they work or what their properties are or if they cause an extra join.
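Point 3 can be illustrated with a minimal sketch. It is only a toy: a set of triples in plain Python, with hypothetical identifiers (ex:battle, ex:meeting) and a placeholder property Pxx_has_spacetime_volume standing in for an explicit link of the kind Christian-Emil suggests. None of these names are part of the CRM.

```python
# Toy triple set. All identifiers are hypothetical, and
# "Pxx_has_spacetime_volume" is a placeholder, not a CRM property.
triples = {
    ("ex:battle", "rdf:type", "E5_Event"),
    ("ex:battle", "Pxx_has_spacetime_volume", "ex:battle_stv"),
    ("ex:battle_stv", "rdf:type", "E92_Spacetime_Volume"),
    # A user who is not interested in STVs simply does not create one.
    ("ex:meeting", "rdf:type", "E5_Event"),
}


def instances_of(cls):
    """Return the subjects explicitly typed with the given class."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)


def stv_of(entity):
    """Return the STV nodes explicitly linked to an entity (possibly none)."""
    return sorted(
        o for s, p, o in triples
        if s == entity and p == "Pxx_has_spacetime_volume"
    )


print(instances_of("E92_Spacetime_Volume"))  # STVs queried on their own
print(stv_of("ex:meeting"))                  # no STV was instantiated
```

In this sketch the STV is a separate node that can be queried independently of the event, and the second event carries no additional triples at all.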

> A knowledge graph with a lot of trivial links in my opinion makes the model less intuitive to use. Moreover, all our RDF databases are still very bad at following links. Any additional join has high cost. Still, most CRM implementations materialize a huge number of paths to increase performance.


4) If we change the ontology based on the present technological capacities, are we not violating a modelling principle? 


>> The alternative you are advocating for is:
>> a) Fill the database with a very large number of necessary 1:1 links: events are some of the most frequent items we have.


5) In practice, I have never seen a knowledge base of CRM that automatically creates for example the STV to an E2 (or something similar). If someone (most CH users?) isn’t interested in STVs, they don’t use them.

I am most intrigued, though, by Rob’s example. It seems like a potentially reasonable way for a user of the ontology to interpret E18 when it is a subclass of E92. I also think we do not actually want to allow or encourage this modelling pattern. I believe we do not want to do so because an E18 and its E92 are actually distinct, and we cannot say the same things about the E18 and its E92 (but our model leads one to believe they are the same). We can’t say that an STV is married, but we can say Rob is married. If this doesn’t hold (Rob’s STV can be married), then it’s not clear to me why Rob’s modelling is wrong, although the result is of course a bevy of weird instances. I think the point here is that Rob’s interpretation is not out to lunch. It is one which one could assume would be an intended model. It is not, however, and why it is not cannot be clearly specified, possibly because we have said E18 isA E92.
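The two readings can be contrasted in a minimal Python sketch. The class hierarchy here is a simplified assumption (the intermediate classes between E21 Person and E18 Physical Thing are omitted), and the helper names are hypothetical. The sketch only shows what follows from the isA statement by subclass closure; it does not represent any actual CRM implementation.

```python
# Simplified, assumed hierarchy: intermediate classes between
# E21 Person and E18 Physical Thing are left out for brevity.
WITH_ISA = {
    "E21_Person": "E18_Physical_Thing",
    "E18_Physical_Thing": "E92_Spacetime_Volume",
}

# Alternative reading: E18 is not a subclass of E92; an STV would be
# a separate node reached through an explicit link instead.
WITHOUT_ISA = {
    "E21_Person": "E18_Physical_Thing",
}


def all_types(direct_type, subclass_of):
    """Follow the subclass chain upwards and collect every class on it."""
    types = [direct_type]
    while types[-1] in subclass_of:
        types.append(subclass_of[types[-1]])
    return types


def is_stv(direct_type, subclass_of):
    """True if an instance of direct_type is also an E92 by inheritance."""
    return "E92_Spacetime_Volume" in all_types(direct_type, subclass_of)


# Under isA, whatever is said about Rob (e.g. that he is married)
# is said about something that is also an E92.
rob_is_stv_under_isa = is_stv("E21_Person", WITH_ISA)

# With a separate STV node, statements about Rob do not touch the STV.
rob_is_stv_with_link = is_stv("E21_Person", WITHOUT_ISA)

print(rob_is_stv_under_isa, rob_is_stv_with_link)
```

Under the first mapping every statement about Rob is, by inheritance, a statement about an instance of E92; under the second it is not, which is the distinction the paragraph above turns on.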

Best,

George

------------------------------------------------------
Dr. George Bruseker
Coordinator

Centre for Cultural Informatics
Institute of Computer Science
Foundation for Research and Technology - Hellas (FORTH)
Science and Technology Park of Crete
Vassilika Vouton, P.O.Box 1385, GR-711 10 Heraklion, Crete, Greece

Tel.: +30 2810 391619   Fax: +30 2810 391638   E-mail: bruseker at ics.forth.gr
URL: http://www.ics.forth.gr/isl

> On Mar 12, 2019, at 11:18 AM, Martin Doerr <martin at ics.forth.gr> wrote:
> 
> Dear Christian-Emil, All,
> 
> Thank you for your explanations. I agree with you technically completely. 
> 
> I propose to have cardinality (1,1:0,1) for P4 and P160. 
> 
> The scope note of P4 must be modified, as you say. We have discussed already, that alternative opinions are questions of the knowledge base, and not of the ontology.
> 
> I only disagree :-) with:
> 
> "The model will .... and hopefully be more intuitive for the lay persons." with the additional link:
> 
> Firstly, I have made the argument, and got no response, that introducing a link between E4 and E92 does not solve the problem of equivalence of P4 and P160. It is still exactly the same, only more complex to formulate: we have to equate a path with a single link.
> 
> Secondly, the lay person will see a knowledge graph of instances, and not a theory. A knowledge graph with a lot of trivial links in my opinion makes the model less intuitive to use. Moreover, all our RDF databases are still very bad at following links. Any additional join has high cost. Still, most CRM implementations materialize a huge number of paths to increase performance.
> .......
> 
> I still do not see, why we should reduce performance because we find it difficult to explain the theory;-)
> 
> Basically, as you say, we repeat old arguments here, long before decided, without new insight. 
> 
> Best,
> 
> Martin
> 
> On 3/12/2019 5:04 PM, Christian-Emil Smith Ore wrote:
>> I resend the email, without the long tail of the previous emails due to length restrictions.
>> Best
>> Christian-Emil
>> From: Christian-Emil Smith Ore
>> Sent: 12 March 2019 15:24
>> To: Martin Doerr; steads at paveprime.com; 'George Bruseker'
>> Cc: 'crm-sig'
>> Subject: Re: [Crm-sig] Issue 326 Resolving inconsistencies between E2, E4, E52 and E92
>>  
>> Dear all,
>> 
>> The issue 326 is old. I made some slides (dated 31/3/2017) which can be found at 
>> http://www.edd.uio.no/download/cidoc_crm/issue-326-overview-and-thoughts-HW.pptx
>> 
>> The exchange of emails has two topics:
>> 1) E18 Physical Thing as a subclass of E92 Spacetime Volume
>> 2) the properties P4 and P160
>> 
>> **********
>> 1: In my opinion it is model-theoretically correct that E18 Physical Thing is a subclass of E92 Spacetime Volume. However, it may be confusing for persons not so interested in theory. Therefore we could introduce a property Pxx E18 Physical Thing <-> E92 Spacetime Volume with the cardinality (1,1:0,1), describing the (model-theoretical) fact that a part of E18 Physical Thing is in a 1-1 correspondence with a subset of E92 Spacetime Volume.
>> 
>> The model will still have the same explanatory power, and hopefully be more intuitive for the lay persons.
>> 
>> ***********
>> 2:
>> In the slides I give the following comment:
>>  
>> "The cardinality of P4 has time-span is (1,1:1,n), that is, two or more instances of E2 Temporal Entity can “share” an instance of E52 Time-span. This was introduced in an early stage to model simultaneity. 
>> This way of modeling simultaneity is considered obsolete and the cardinality of P4 should be (1,1:1,1):
>> E2 Temporal Entity and E52 Time-span in a one to one relation 
>> E2 Temporal Entity and E92 Spacetime Volume  in a one to one relation. "
>>  
>> Please note that P4's cardinality states that every instance of E2 Temporal Entity is connected to one and only one instance of E52 Time-span. Therefore, the number of instances of E52 Time-span will be less than or equal to the number of instances of E2 Temporal Entity.
>>  
>> The number of instances of E92 Spacetime Volume and E2 Temporal Entity will always be equal due to the cardinality (1,1:1,1) of P160 has temporal projection. E4 Period is a subclass of E92 Spacetime Volume and has a smaller or equal number of instances. P160, when restricted to
>> 
>> P160: E4 Period <-> E52 Time-span
>> 
>> must have the stricter cardinality (1,1:0,1), that is, it is an injection of E4 Period into E52 Time-span. There may exist instances of E52 Time-span which are not related to an instance of the subclass E4 Period.
>> Correspondingly:
>> 
>> P4: E4 Period <-> E52 Time-span
>> 
>> must have the cardinality constraint (1,1:0,n).
>> 
>> The scope note of P160:
>> “This property describes the temporal projection of an instance of an E92 Spacetime Volume. The property P4 has time-span is the same as P160 has temporal projection if it is used to document an instance of E4 Period or any subclass of it.”
>> So the formulation discussed in the emails is already there.
>> 
>> The scope note of P4:
>>  “This property describes the temporal confinement of an instance of an E2 Temporal Entity. The related E52 Time-Span is understood as the real Time-Span during which the phenomena were active, which make up the temporal entity instance. It does not convey any other meaning than a positioning on the “time-line” of chronology. The Time-Span in turn is approximated by a set of dates (E61 Time Primitive). A temporal entity can have in reality only one Time-Span, but there may exist alternative opinions about it, which we would express by assigning multiple Time-Spans. Related temporal entities may share a Time-Span. Time-Spans may have completely unknown dates but other descriptions by which we can infer knowledge.”
>> 
>> The formulation “A temporal entity can have in reality only one Time-Span, but there may exist alternative opinions about it, which we would express by assigning multiple Time-Spans.” should be deleted. Such multiple assignment due to uncertainties or alternative opinions is the case for many properties in CRM.
>> 
>> In my opinion “Related temporal entities may share a Time-Span.” should also be deleted and the cardinality of P4 (E2 Temporal Entity <-> E52 Time-span) made stricter to (1,1:1,1).
>> 
>> 
>> Best,
>> Christian-Emil
>> From: Crm-sig <crm-sig-bounces at ics.forth.gr> on behalf of Martin Doerr <martin at ics.forth.gr>
>> Sent: 12 March 2019 11:09
>> To: steads at paveprime.com; 'George Bruseker'
>> Cc: 'crm-sig'
>> Subject: Re: [Crm-sig] Issue 326 Resolving inconsistencies between E2, E4, E52 and E92
>>  
>> Dear Steve, George,
>> 
>> Your arguments well taken, I may remind you that the argument was not only a 1:1 relation.
>> 
>> It contained 4 elements:
>> 
>> a) a 1:1 relation
>> b) a common identity condition: The identity of the STV depends on the identity of the phenomenon
>> c) Their existence conditions are identical: the one exists where and as long as the other does
>> d) Properties do not interfere.
>> 
>> The condition d) becomes more tricky with the question of the time spans, as you have seen. Here, the question for me is not ontological, but of the logical formalism. As I have shown, it can be described in FOL. It is the only complication we have. We just declare two properties to be identical downwards. 
>> 
>> The alternative you are advocating for is:
>> a) Fill the database with a very large number of necessary 1:1 links: events are some of the most frequent items we have.
>> b) You have not solved anything wrt P160, because P4 is still the same as P160 in these cases, and the path of correspondence is even more confusing. 
>> 
>> So, we just buy into a much more confusing schema, in my opinion. The schema is what we use on a daily basis. Discussing CRM extensions is not in the end-user's interest, but is the task of the SIG.
>> 
>> I believe we cannot avoid entering some complexity here in our discussions, and resolve it giving priority to the end-user schema.
>> I think the first arguments should be, if the final schema is confusing, and if the alternative is less confusing.
>> 
>> I am not sure where to publish adequately the above reasoning. It should be somewhere buried in the minutes. But we tried very hard to make the things clear in the scope notes of E4, E18.
>> 
>> What do you think?
>> 
>> All the best,
>> 
>> Martin
>> 
>> 
>> 
>> 
>> _______________________________________________
>> Crm-sig mailing list
>> Crm-sig at ics.forth.gr
>> http://lists.ics.forth.gr/mailman/listinfo/crm-sig
> 
> -- 
> ------------------------------------
>  Dr. Martin Doerr
>               
>  Honorary Head of the                                                                   
>  Center for Cultural Informatics
>  
>  Information Systems Laboratory  
>  Institute of Computer Science             
>  Foundation for Research and Technology - Hellas (FORTH)   
>                   
>  N.Plastira 100, Vassilika Vouton,         
>  GR70013 Heraklion,Crete,Greece 
>  
>  Vox:+30(2810)391625  
>  Email: martin at ics.forth.gr
>  Web-site: http://www.ics.forth.gr/isl
