[Crm-sig] Issue: Solution for Dualism of E41 Appellation and rdfs:label

Martin Doerr martin at ics.forth.gr
Fri Sep 14 20:19:58 EEST 2018


Dear Richard,

I'll shorten now:

On 9/14/2018 7:54 PM, Richard Light wrote:
>
>
>>> My suggestion is that we define the "has symbolic content" property, 
>>> and then put our energy into agreeing one or more subproperties of 
>>> rdf:value which meet the known recording requirements for cultural 
>>> heritage information.  By doing this, I suggest that we will have 
>>> solved the main problem which confronts implementors who want to 
>>> express CRM in RDF.
>> Yep, subproperty of rdf:value is not bad.
>>
>>>> I think the polymorphism we describe here, well studied in 
>>>> object-oriented languages, is in the nature of Appellations. The 
>>>> problem for me is that the respective KR models have NOT 
>>>> THOUGHT of the case that such polymorphisms can occur. 
>>>> Nevertheless, RDFS is tolerant enough to accept the superproperty 
>>>> statement, but not to create a class which is either a URI or an 
>>>> *inline expanded* object.
>>>>
>>>> This polymorphism occurs EXCLUSIVELY for Symbolic Objects whose 
>>>> symbol sets a given machine supports. That is another reason not 
>>>> to use rdf:value: it does not reflect the fact that only 
>>>> Symbolic Objects can have such a "value".
>>> I'm afraid you have lost me here. It would be very helpful to me 
>>> (and might encourage others to join in the conversation) if you 
>>> could post one or two concrete examples of what you mean.
>> OK, in simple words: there are names which have an identity based on 
>> a certain sequence of characters. There are others, historically 
>> interesting, which have a phonetic identity, and even that may vary. 
>> We collaborate with historians who deal with family names in the 
>> Aegean area around 1800, which have no standard spelling at all, not 
>> even a preferred one. The different spelling variants later 
>> evolved into distinct family names. But in order to match instances 
>> in the documents, we need both concepts of identity.
> True, but any instance of the name in a document will only take one 
> concrete form, not all of them.  (For handwritten sources it may be a 
> matter of judgement what that form actually is.)  So you can record 
> the form of name it exhibits (as a string), and then assert that it is 
> (in your view) an attestation of the generic family name for which you 
> have a URI.
This is not true. We do have counterexamples. The name may take multiple 
forms in the same document.
>>
>> Even my ancestors used "Derr" instead of "Dörr". Since the local 
>> dialect does not distinguish "e" and "ö", it is unclear whether it 
>> is a spelling variant of the same phonetics or whether the "ö" is an 
>> etymological misinterpretation, because "Dörr" has a linguistic 
>> meaning and the "e" in "Derr" may have another semantic root, but 
>> this is not widely accepted.
>>
>> So, the names that are not identical to a Literal must be represented 
>> using a URI. That is what I mean by polymorphism. Also, if we want to 
>> talk about the name itself as a historical fact, we need a distinct 
>> identity. All these cases are needed but rare for names. 
> There are perfectly good reasons for considering names to be worthy of 
> study and recording in their own right.  I would argue that this is 
> equally true whether the name in question has one, or many, possible 
> forms.  So there is always an argument for minting a URI to represent 
> the name as a Symbolic Object. Doing this allows you to make 
> statements, for example, about its genesis, its meaning, its 
> historical distribution, etc., and means you can record specific 
> instances of the name as attestations of this Symbolic Object.
>
> However, I would still argue that /instances/ of the name should be 
> recorded as strings - the actual value found in the resource in question.
Sure, this is another issue. And there can be multiple of them...
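
As a rough illustration of this point (one name, treated as a Symbolic
Object with its own URI, attested under several string forms, possibly
within the same document), here is a minimal Python sketch. The ex:
identifiers, the Attestation record and the function name are all
invented for the example; only the "Derr"/"Dörr" variants come from
the discussion above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """One occurrence of a name in a source (hypothetical record type)."""
    name_uri: str   # URI minted for the name as a Symbolic Object
    document: str   # identifier of the source document
    form: str       # the string exactly as found in the document


# Invented sample data: the same name appears in two forms in document A.
attestations = [
    Attestation("ex:name_doerr", "ex:document_a", "Derr"),
    Attestation("ex:name_doerr", "ex:document_a", "Dörr"),
    Attestation("ex:name_doerr", "ex:document_b", "Dörr"),
]


def forms_by_document(records, name_uri):
    """Group the attested string forms of one name by document."""
    grouped = defaultdict(set)
    for record in records:
        if record.name_uri == name_uri:
            grouped[record.document].add(record.form)
    return {doc: sorted(forms) for doc, forms in grouped.items()}


forms = forms_by_document(attestations, "ex:name_doerr")
```

Each string stays a plain literal, while the identity of the name
itself is carried by the URI, so both concepts of identity remain
available for matching.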

best,

Martin
>
>> For texts, it is the opposite. They are more often in files than in 
>> literals.
>>
>> On the other hand, only Symbolic Objects can "reside" both on 
>> computers and outside them. Therefore the "punning" problem occurs 
>> only in connection with Symbolic Objects. Only these can have a 
>> "value" in the machine, whereas rdf:value may be about anything.
>>
> Thanks,
>
> Richard
>
>
>>
>> Best,
>>
>> Martin
>>>
>>> Best wishes,
>>>
>>> Richard
>>>>
>>>> I agree that we may over-think the point. As I mentioned, the 
>>>> superproperty statement I propose has no other effect than that I 
>>>> can get E41's and labels back by querying P1 only.
>>>>
>>>> Opinions?
>>>>
>>>> Best,
>>>>
>>>> Martin
>>>>
>>>> On 9/12/2018 9:56 AM, Richard Light wrote:
>>>>>
>>>>> On 11/09/2018 20:02, Martin Doerr wrote:
>>>>>> Dear All,
>>>>>>
>>>>>> Firstly, apologies: the RDF was wrong. It was intended to say 
>>>>>> that P1 is a superproperty of rdfs:label.
>>>>> I'm not sure that this is something we need to state at all, and I 
>>>>> worry that - if it is included in our RDFS Schema - it may bring 
>>>>> unwanted side-effects.  Isn't this saying that any instance of 
>>>>> rdfs:label is to be treated as an instance of P1?  Bear in mind 
>>>>> that CRM data may co-exist in triple stores in company with other 
>>>>> RDF data, which may well use rdfs:label for its own purposes.  
>>>>> This assertion that 'all rdfs:labels are P1 relationships' would 
>>>>> then be applied to this other data as well.  This might well 
>>>>> result in incorrect/spurious results when SPARQL queries are 
>>>>> applied to the data.
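
To make both the proposal and this concern concrete, here is a small
Python sketch of the RDFS subproperty rule applied to a toy triple
set. It is only an illustration under stated assumptions: the triples
are plain tuples rather than a real triple store, and the ex:
resources and their values are invented.

```python
# Toy triple store: each triple is a (subject, predicate, object) tuple.
# All ex: identifiers and sample values are invented for illustration.
RDFS_LABEL = "rdfs:label"
P1 = "crm:P1_is_identified_by"
SUBPROP = "rdfs:subPropertyOf"

triples = {
    # The proposed schema statement: rdfs:label is a subproperty of P1.
    (RDFS_LABEL, SUBPROP, P1),
    # CRM data: one name as a plain label, one as an E41 node with a URI.
    ("ex:person1", RDFS_LABEL, "Derr"),
    ("ex:person1", P1, "ex:appellation_doerr"),
    # Unrelated data that happens to share the same triple store.
    ("ex:invoice42", RDFS_LABEL, "Invoice no. 42"),
}


def entail_subproperties(graph):
    """Apply the rule (s p o), (p subPropertyOf q) => (s q o) to a fixpoint."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        supers = {(p, q) for (p, rel, q) in inferred if rel == SUBPROP}
        for (s, p, o) in list(inferred):
            for (sub, sup) in supers:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


def objects_of(graph, predicate):
    """Return all (subject, object) pairs linked by the given predicate."""
    return sorted((s, o) for (s, p, o) in graph if p == predicate)


closure = entail_subproperties(triples)
p1_results = objects_of(closure, P1)
```

Querying P1 over the closure returns the label "Derr" and the
appellation URI for ex:person1, which is the intended effect, but it
also returns the label of ex:invoice42, which is the side effect
described above.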
>>>>>
>>>>> In general, I suggest that we are ok to define 
>>>>> sub-classes/properties of standard RDFS types, but we shouldn't 
>>>>> define super-classes/properties of them.  (I would welcome 
>>>>> comments on the validity of this suggestion from someone who 
>>>>> understands RDF better than me.)
>>>>>
>>>>>> Semantically, the range of rdfs:label, when used, is 
>>>>>> ontologically an Appellation in the sense of the CRM.
>>>>> Agreed (see my reply from yesterday).  The conclusion I draw from 
>>>>> this is that we can validly say:
>>>>>
>>>>> E1 rdfs:label "string value" is a shortcut for the path 'E1 CRM 
>>>>> Entity' 'P1 is identified by' 'E41 Appellation' ...
>>>>>
>>>>> in exactly the same spirit as the similarly-worded note which we 
>>>>> find in the definition of P1 itself. (Obviously, by using this 
>>>>> shortcut, we lose the information that this string value is an 
>>>>> Appellation, but that's the nature of short-cuts.)
>>>>>
>>>>>> I agree with George, that all RDF nodes should have a human 
>>>>>> readable label. They name the thing, even if it is a technical node.
>>>>>> I would find it confusing to say that labels are not to be 
>>>>>> queried, only to be read, and that the "real" names must have a 
>>>>>> URI, regardless of whether I have more to say about them.
>>>>>>
>>>>>> I am really not a fan of punning, we definitely forbid it in the 
>>>>>> CRM.
>>>>>>
>>>>>> The point with Appellations is that some, the simple ones, can 
>>>>>> be represented directly in the machine, or be outside it. The 
>>>>>> solution of assigning a URI in all cases, and then a value or 
>>>>>> label, does not make the world easier. It performs extremely 
>>>>>> badly. We are talking here about implementation, not about 
>>>>>> ontology. You simply get a useless explosion of the graph for 
>>>>>> the sake of theoretical purity.
>>>>> Agreed. What we need to do is to propose a simple way of 
>>>>> expressing simple Appellations in RDF.  That is why my shortcut 
>>>>> definition above ends with '...': I don't think we have yet 
>>>>> decided how to do this.
>>>>>
>>>>> I've just been looking over the draft document we are trying to 
>>>>> write, and it currently says that a fully-worked-out path will use 
>>>>> 'P3 has note -> E62 string' to express the value of an E41 
>>>>> Appellation.  This (i.e. the suggestion to use P3) comes from the 
>>>>> definition of the superclass E90 Symbolic Object.  A comment in 
>>>>> our draft RDF document questions whether this is sufficiently 
>>>>> precise, since P3 is simply "a container for all informal 
>>>>> descriptions about an object that have not been expressed in terms 
>>>>> of CRM constructs".  I suggest that we need either to use 
>>>>> rdf:value to hold the string value, or (better) to define a 
>>>>> CRM-specific subproperty of rdf:value and use that.  (This 
>>>>> subproperty could be part of the published CRM, or it could just 
>>>>> form part of the 'RDF implementation' guidelines.)  I don't think 
>>>>> that we should use rdfs:label here.
>>>>>
>>>>> I don't think we should concern ourselves with URLs in our RDF 
>>>>> guidance document.  Any implementer of our RDF solutions can 
>>>>> choose to assign a URL to represent any node in the structure, but 
>>>>> it won't change the logic of the resulting RDF, or how it responds 
>>>>> to SPARQL queries.
>>>>>
>>>>>>
>>>>>> Those claiming confusion should be more precise. Has someone 
>>>>>> looked at query benchmarks? Has someone looked at graphical 
>>>>>> representations of RDF graphs? Do they really look better?
>>>>>>
>>>>>> So either we ignore the issue, and write queries that collect 
>>>>>> names either via P1, a URI and a value/label, or via a label, 
>>>>>> because this is where names appear in RDF. We make no punning, 
>>>>>> but our queries implement exactly this meaning. So we are no 
>>>>>> better off, but act as if we didn't know.
>>>>>>
>>>>>> Or we describe the fact by punning, have one superproperty for 
>>>>>> all cases, which we can query, and thereby stop the discussion 
>>>>>> of whether labels are allowed or not, and how they relate to 
>>>>>> appellations. The punning comes in because the range of the 
>>>>>> superproperty must comprise the ranges of the subproperties. We 
>>>>>> can go a step further and do the punning with a superproperty of 
>>>>>> P1, with both P1 and rdfs:label as subproperties of it, if this 
>>>>>> is preferred.
>>>>>> The solution I describe is just a logical representation of the 
>>>>>> situation, not creating a different situation. It just says that 
>>>>>> names can be complex objects or simple literals.
>>>>> As I said yesterday, I don't see how any punning strategy can make 
>>>>> differently-structured RDF equivalent for the purposes of 
>>>>> querying. Therefore, I think we will have to accept that if we 
>>>>> allow more than one way of representing a given statement in CRM 
>>>>> RDF, we will have to construct queries which look explicitly for 
>>>>> each of the possible patterns.
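
A query that looks explicitly for each pattern could be sketched as
follows in Python. This is a hedged illustration, not agreed CRM
practice: the graph is a plain list of tuples, the ex: data is
invented, and ex:has_symbolic_content is a placeholder identifier for
the "has symbolic content" property proposed in this thread, which has
not been defined yet.

```python
# Toy graph as a list of (subject, predicate, object) tuples.
LABEL = "rdfs:label"
P1 = "crm:P1_is_identified_by"
HAS_CONTENT = "ex:has_symbolic_content"  # hypothetical placeholder name

graph = [
    # Pattern 1: the name is given directly as a literal label.
    ("ex:person1", LABEL, "Derr"),
    # Pattern 2: the name is an E41 node with a URI and a string value.
    ("ex:person1", P1, "ex:name_doerr"),
    ("ex:name_doerr", HAS_CONTENT, "Dörr"),
    ("ex:person2", P1, "ex:name_light"),
    ("ex:name_light", HAS_CONTENT, "Light"),
]


def names_of(graph, entity):
    """Collect name strings for an entity via both patterns."""
    names = set()
    for s, p, o in graph:
        if s != entity:
            continue
        if p == LABEL:
            # Direct literal: the label itself is the name string.
            names.add(o)
        elif p == P1:
            # Indirect: follow the appellation node to its string value.
            names.update(
                value
                for (node, prop, value) in graph
                if node == o and prop == HAS_CONTENT
            )
    return sorted(names)


person1_names = names_of(graph, "ex:person1")
person2_names = names_of(graph, "ex:person2")
```

The function has one branch per allowed representation, so every
additional pattern admitted in CRM RDF would add another branch to
each such query.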
>>>>>
>>>>>> The problem is that RDF literals do have meaning beyond 
>>>>>> being symbol sequences.
>>>>> Insofar as they have such meaning, I would argue that we define it 
>>>>> (i.e. that meaning) by the CRM context in which we place the 
>>>>> string/literal value.  I think there is a danger that we could 
>>>>> over-think this problem.
>>>>>
>>>>> Richard
>>>>>>
>>>>>> The punning does not introduce the problem. With or without, the 
>>>>>> queries have to cope with names in either form.
>>>>>> This holds similarly for space primitives and large geometry 
>>>>>> files, for short texts and equivalent files etc.
>>>>>>
>>>>>> Opinions?
>>>>>>
>>>>>> Best
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>
>>>>> -- 
>>>>> *Richard Light*
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Crm-sig mailing list
>>>>> Crm-sig at ics.forth.gr
>>>>> http://lists.ics.forth.gr/mailman/listinfo/crm-sig
>>>>
>>>>
>>>> -- 
>>>> --------------------------------------------------------------
>>>>   Dr. Martin Doerr              |  Vox:+30(2810)391625        |
>>>>   Research Director             |  Fax:+30(2810)391638        |
>>>>                                 |  Email:martin at ics.forth.gr  |
>>>>                                                               |
>>>>                 Center for Cultural Informatics               |
>>>>                 Information Systems Laboratory                |
>>>>                  Institute of Computer Science                |
>>>>     Foundation for Research and Technology - Hellas (FORTH)   |
>>>>                                                               |
>>>>                 N.Plastira 100, Vassilika Vouton,             |
>>>>                  GR70013 Heraklion,Crete,Greece               |
>>>>                                                               |
>>>>               Web-site:http://www.ics.forth.gr/isl            |
>>>> --------------------------------------------------------------


