[Crm-sig] String values

Richard Light richard at light.demon.co.uk
Mon Mar 5 20:51:09 EET 2018


On 01/03/2018 15:19, Martin Doerr wrote:
> Dear Rob,
>
> On 3/1/2018 3:54 PM, Robert Sanderson wrote:
>>
>>  
>>
>> Let me try and summarize your position, to see if I understand correctly…
>>
>>     It is theoretically impossible to fulfil all of the complex
>> possibilities,
>>
> Yes.
>>
>> so the vast majority of pragmatic cases that just need a value
>> associated with the resource also cannot be fulfilled.
>>
> Whatever the vast majority is, if rdf:value does the job, I have no
> objections to its use.
> Just define precisely what you use it for. We can add that to our
> guidelines. It is already standard RDF.
This discussion started from a consideration of P90, but has moved away
from that, hence the change of subject.

One obvious area to address is that of Appellations. The examples for
this class are all simple strings, possibly with an associated type or
system ("ISSN", "ISRC", "Shelf mark").  The CRM is silent on how this
string value should be associated with the appellation.  I suggest that
rdf:value would be a pragmatic encoding of primitive values which
respects the CRM class itself (which would still be there, and could
have an identifier associated with it if required) and offers the chance
to encode a type, or system, or indeed custom units as per the example
in the RDFS spec [1]:

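        @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
        @prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
        # each blank node is an instance of the class, not the class itself
        [ a crm:E44_Place_Appellation ;
            rdf:value "Vienna" ] .
        [ a crm:E42_Identifier ;
            rdf:value "0041-5278" ;
            crm:P2_has_type <http://loc.gov/systems/ISSN> ] .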
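
For anyone generating such data programmatically, the pattern can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the helper names (escape_literal, appellation_node,
turtle_document) are hypothetical, the crm: namespace URI is assumed to
be http://www.cidoc-crm.org/cidoc-crm/, and the use of rdf:value here is
a suggestion, not an agreed CRM-SIG encoding.

```python
from typing import Optional

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CRM = "http://www.cidoc-crm.org/cidoc-crm/"  # assumed namespace URI


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def appellation_node(crm_class: str, value: str,
                     type_uri: Optional[str] = None) -> str:
    """Return a Turtle blank node typed as crm_class and carrying rdf:value.

    type_uri, if given, is attached with crm:P2_has_type.
    """
    parts = [f"a crm:{crm_class}", f'rdf:value "{escape_literal(value)}"']
    if type_uri is not None:
        parts.append(f"crm:P2_has_type <{type_uri}>")
    return "[ " + " ; ".join(parts) + " ] ."


def turtle_document(*nodes: str) -> str:
    """Prepend the two prefix declarations to the given node statements."""
    header = [f"@prefix rdf: <{RDF}> .", f"@prefix crm: <{CRM}> ."]
    return "\n".join(header + list(nodes))


turtle = turtle_document(
    appellation_node("E44_Place_Appellation", "Vienna"),
    appellation_node("E42_Identifier", "0041-5278",
                     "http://loc.gov/systems/ISSN"),
)
print(turtle)
```

The appellation stays a resource in its own right, so a type or an
identifier can be added later without changing how the string itself is
attached.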

> If it is about persons, a good practice is for instance to use a URI
> of a resource such as ULAN, which defines all the name values. Having
> rdf:value to fill in a name, is equally good as rdfs:label.
A URI for a person would be an E42 Identifier.  This would be separate
from, and probably additional to, their name(s) being recorded as E41
Appellation (or E82 Actor Appellation, though IIRC we are deprecating
that class).  Obviously, identifying a person via a Linked Data URI
leads to a whole set of associated data via the RDF payload of that URI,
but that is no concern of ours (unless that resource also happens to be
encoded using the CRM - but there is no reason why it need be).


> If it is the content of a digital object, we should make this an
> ISSUE, because it has a certain complexity and needs harmonization
> with FRBRoo.
Another potential area of complexity I would flag up is that of
dimensions with units.  The CRM guidelines are very firm about the
special nature of pure numbers (as I would expect from a mathematician
:-)) but they are pretty silent on the properties of real-world
dimensions.  Examples:

        3ft 5 3/4"
        6°5’29”N 45°12’13”W
        31.5-32.5 cm

If each of these is recorded as a single string the simple rdf:value
approach can be used, though the results will be of little value for
search or analysis.  If we want to analyse them into their component
sub-values then more structure is needed.  The issue of imprecise
physical measurements is analogous to that of imprecise dates, and may
need an analogous solution within the CRM.
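
As a rough illustration of what "more structure" could mean, the
following Python sketch splits two of the example strings into numeric
bounds plus a unit. The Measurement tuple, the function names and the
choice of inches as the target unit are assumptions made for
illustration; the coordinate example is deliberately left out, and none
of this is a proposed CRM construct.

```python
import re
from fractions import Fraction
from typing import NamedTuple


class Measurement(NamedTuple):
    """A value with explicit lower and upper bounds and a unit label."""
    lower: Fraction
    upper: Fraction
    unit: str


# e.g. "31.5-32.5 cm": two decimal numbers, a hyphen, then a unit word
RANGE = re.compile(
    r'^\s*(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$')
# e.g. 3ft 5 3/4": whole feet, whole inches, optional fraction of an inch
IMPERIAL = re.compile(r'^\s*(\d+)\s*ft\s+(\d+)(?:\s+(\d+)/(\d+))?\s*"\s*$')


def parse_range(text: str) -> Measurement:
    """Parse an imprecise measurement given as 'low-high unit'."""
    match = RANGE.match(text)
    if match is None:
        raise ValueError(f"not a range measurement: {text!r}")
    low, high, unit = match.groups()
    return Measurement(Fraction(low), Fraction(high), unit)


def parse_feet_inches(text: str) -> Measurement:
    """Parse feet, inches and an optional fraction into total inches."""
    match = IMPERIAL.match(text)
    if match is None:
        raise ValueError(f"not a feet-and-inches measurement: {text!r}")
    feet, inches, num, den = match.groups()
    total = Fraction(int(feet) * 12 + int(inches))
    if num is not None:
        total += Fraction(int(num), int(den))
    # an exact reading is stored with identical lower and upper bounds
    return Measurement(total, total, "in")


print(parse_range("31.5-32.5 cm"))
print(float(parse_feet_inches('3ft 5 3/4"').lower))  # 41.75
```

Storing a lower and an upper bound even for exact readings mirrors the
way imprecise dates are usually handled, which is why the analogy with
dates seems worth pursuing.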

Best wishes,

Richard

[1]
https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-schema/index.html#ch_value

> Best,
>
> Martin
>>
>>  
>>
>> For the rest of us, I think we should agree in this community to use
>> rdf:value, per Conal’s email.
>>
>>  
>>
>> Rob
>>
>>  
>>
>>  
>>
>> *From: *Crm-sig <crm-sig-bounces at ics.forth.gr> on behalf of Martin
>> Doerr <martin at ics.forth.gr>
>> *Date: *Thursday, March 1, 2018 at 7:19 AM
>> *To: *"crm-sig at ics.forth.gr" <crm-sig at ics.forth.gr>
>> *Subject: *Re: [Crm-sig] Domain and range of P90
>>
>>  
>>
>> Dear Conal Tuohy,
>>
>> Your comments are well taken. First, again, a general note: for 22
>> years we have tried carefully to make standards where standards are
>> possible, and to take care that the most relevant semantics for
>> integrating data are modelled, as long as they can be modelled at all
>> with these means. Standards making in the first place means following
>> good practice around the world, understanding where a consensus
>> appears, and interfacing with, rather than redoing the work of, those
>> communities that have effective competence in certain fields.
>>
>> So: "these gaps could be filled in a manner which is clear and simple
>> and interoperable" .... How much we would love to do that! The point
>> is that the encoding of names has an immense complexity. The most
>> comprehensive and experienced standards come from the library world.
>> If you believe there is a clear and simple solution, please try to
>> extract it from the library cataloguing rules (AACR2, RDA
>> https://www.oclc.org/en/rda/about.html), but there is also EAC-CPF
>> (http://eac.staatsbibliothek-berlin.de/) and FOAF. FOAF works badly
>> for historical data, as I was informed.
>>
>> The names of people are indeed an issue of interoperability. However,
>> if we have a particular person described with events etc., the exact
>> name itself has no further links to other kinds of facts than
>> instance matching with other occurrences of the same person (not
>> talking about families here). Therefore the damage to global
>> reasoning of having representation problems is relatively limited.
>> ULAN, for instance, registers an average of two names per known
>> artist. All practice of instance matching shows that, even when
>> names are encoded well, the identity question is not settled.
>> Instance matching is a science and says everything about the
>> reasoning needed and the effectiveness of name standards.
>>
>> In each case in which an unambiguous formulation of some properties
>> cannot be achieved because the world is more complex, we can only
>> rely on mapping.
>>
>> More comments below:
>>
>> On 3/1/2018 5:02 AM, Conal Tuohy wrote:
>>
>>     One of the "gaps" which puzzles me most is the example you give
>>     of encoding the string value of an Appellation. I understand the
>>     recommended practice is to attach the string value of a person's
>>     name using P3_has_note, or actually, using a custom subproperty
>>     of P3_has_note. The semantics of P3_has_note itself are weak; a
>>     note is simply an "informal description" of something, so if I
>>     have a particular name (an RDF resource) which P3_has_note the
>>     literal string "Conal Tuohy", then I should really define
>>     subproperties so as to be able to distinguish that string value
>>     from a note which really is nothing more than an "informal
>>     description" of that name e.g. "A very uncommon name of Irish
>>     origin". What puzzles me most about this "gap" in the RDFS
>>     specification is that the distinction between a note ABOUT a
>>     name, and the actual textual representation OF a name is somehow
>>     considered out of scope of the CRM in RDFS. It's puzzling,
>>     because the string value of a name is something which really must
>>     be encoded in a standard fashion, to achieve interoperability (as
>>     an aside, my personal view is that the string literal "Conal
>>     Tuohy" could be attached to an Appellation using the rdf:value or
>>     rdfs:label predicate defined in the RDFS spec).
>>
>> I can only repeat that the instruction for using the CRM, and even
>> the RDFS, is to make your own extensions. The weak semantics of P3
>> ensures that information is reached, but not that it is specifically
>> interpreted. Since there are no comprehensive world-wide encoding
>> schemes for personal names, you can reuse, for instance, FOAF
>> properties as subproperties of CRM properties, or reuse MARC encoded
>> strings. Both represent a good practice and are well defined. As long
>> as an LOD system has this information, it can map between them, run
>> instance matching algorithms, and display the information.
>>
>> Using rdfs:label can be a solution, as can rdf:value; both should be
>> discussed as possible recommendations. Also, instantiating
>> Appellation with a URI in its own right is not necessary if
>> rdfs:label is sufficient. The problem is that rdfs:label creates
>> overlapping semantics with any ontology dealing with names. We can
>> only register this fact by admitting that there is more than one
>> representation, depending on the case.
>>
>>     But the important thing is that the RDFS schema should stipulate
>>     how to attach this literal data rather than leave it as an open
>>     question. In general these are the kinds of issues which puzzle
>>     many people who approach the CRM from a position of having
>>     already worked with other RDF ontologies in the cultural heritage
>>     space, and find themselves wondering how they are supposed to
>>     make these details of the CRM work in RDF in an interoperable way,
>>     without having to pick and choose from a variety of techniques
>>     for "finessing" the gaps.
>>
>> Yes, but only if it is feasible at all, see above.
>>
>>      
>>
>>     These kinds of gaps are serious barriers to interoperability in
>>     the Linked Open Data cloud, and they need to be addressed by
>>     agreeing on some encoding procedures that can be used
>>     consistently by different projects on the web. It would be
>>     helpful to CRM adopters in the Linked Data community if these
>>     gaps could be filled in a manner which is clear and simple and
>>     interoperable. I am not in favour of just offering a menu of
>>     possible approaches, especially where individual projects would
>>     have to make local customisations to their schema. If there is
>>     some particular value in multiple approaches, then they could be
>>     published as different "profiles" that encoders could simply
>>     adopt, as a whole. I think the recent effort by Richard Light
>>     (and other contributors) to collate guidelines on RDF encoding is
>>     a great initiative!
>>     <https://docs.google.com/document/d/1zCGZ4iBzekcEYo4Dy0hI8CrZ7dTkMD2rJaxavtEOET0/edit>
>>     It deserves more input and I hope it will continue to be
>>     discussed on the list. I also think the Linked Art
>>     project http://linked.art/ with its "profile" of the CRM is
>>     another really good way forward.
>>
>> Every profile activity is well received and encouraged. If I say
>> "define your own extensions", this is of course not meant for
>> individuals :-D, but for communities of practice: the larger, the
>> better.
>>
>> But in general, I ask all CRM-SIG members to develop an understanding
>> of the fact that perfect interoperability by perfect cataloguing
>> instructions is simply impossible, and has never been achieved. The
>> CRM approach is therefore to create a hierarchy of more and less
>> important semantics to be agreed on, not to have a perfect format,
>> but to make mappings *possible* and to *minimize* the need for
>> mappings.
>>
>> I usually describe the dilemma in these terms:
>>
>> */Making Standards/*
>>
>> The good with standards is there are so many!
>>
>> */When you have a standard, /*
>>
>> */You need to transform to the standard/*
>>
>> */You need to renew and adapt the standard/*
>>
>> */You need to transform to the renewed standards/*
>>
>> */Why not just transform data?/*
>>
>> */There are too many transformations, you need a standard/*
>>
>>
>> Therefore I ask for your understanding that my sometimes defensive
>> answers are not directed against the requirement raised, but are
>> meant to keep the layering of relevance intact. Otherwise any
>> standard (like many others) gets completely out of control in the
>> attempt to close all gaps with a "clear and simple solution".
>>
>> Since this is an open forum, you are all encouraged to form active
>> working groups and come back with viable solutions for the gaps.
>> These "gap fillers" can be additional RDFS modules. They need *NOT*
>> be an integral part of the official CRM version (even though they
>> may become one!!). RDFS is definitely designed to be *modular*. To
>> become a CRM-SIG recommendation, they can be in a separate module.
>>
>> All the best,
>>
>> Martin
>>
>>      
>>
>>     Regards
>>
>>      
>>
>>     Conal
>>
>>      
>>
>>      
>>
>>     On 22 February 2018 at 19:46, George Bruseker
>>     <bruseker at ics.forth.gr> wrote:
>>
>>         Dear Phil et al.,
>>
>>         I think this is a case of interpreting the label of the
>>         property rather than its intention. CRM ‘has value’ isn’t
>>         supposed to cover all possible meanings of the natural
>>         language interpretation of ‘has value’. Rather it has a very
>>         restricted use. It is meant to give the quantitative number
>>         value associated with a dimension. Dimension is a class that
>>         should be used to store information that results from a
>>         measurement activity. The measurement activity is specified
>>         as some procedural event that has the intentional objective
>>         of producing quantitative data. It is an activity of
>>         interacting with the world with the intention of producing a
>>         quantitative result.
>>
>>         So it would be nonsensical to say ‘this paragraph (E73)
>>         has dimension (E54, defined as a quantitative result from a
>>         measuring procedure) has value “the characters in this
>>         paragraph” (E59 primitive value)’. The definition of E54
>>         forbids it because a string is not a quantity (though of
>>         course it may have a quantity… that would have to be
>>         measured).
>>
>>         That of course sounds irritating. It would be nice to have a
>>         property that could store all values. But then of course that
>>         property would mean everything and nothing and the ontology
>>         wouldn’t work for getting specific information, like the
>>         quantitative results of measurement activities separate from
>>         any other value ‘good’ ‘bad’ ‘ugly’ ‘monogamy’ ‘world peace’
>>         ‘all the characters in this present string’.
>>
>>         That’s the ontological argument. The practical question is
>>         why you are looking to expand the scope. I’m guessing that
>>         the reason is because you want a unique place to store a data
>>         value (this is a guess, so please do correct my presumption
>>         if I’m wrong).
>>
>>         This seems to me to get back to the encoding issue and having
>>         a standard strategy. I think that a usual suggestion could be
>>         to put it into a string via P3 has note. Another suggestion
>>         would be to put it in rdfs:label and, as I recall, there is
>>         rdf:value, which could hold the actual data points. You will
>>         note, in retort, that P3 handles different kinds of
>>         information so is not a good solution. Point taken.
>>
>>         In any case, I would argue that widening the domain and range
>>         of the existing property to E1 and E59 clearly cannot work,
>>         because that would give the property a completely different
>>         meaning and would cause all sorts of backwards
>>         incompatibilities and data problems. It would really be an
>>         undoing of good information structure. That being said, some
>>         sort of solution, either in the ontology or as an encoding
>>         formalization of where to put the actual ‘values’ of an
>>         entity, ought to be found.
>>
>>         I think the right direction might already have been found
>>         with CRMsci, which generalizes the notion of measurement to
>>         observation. Observation is a class that documents events of
>>         systematic observing (which need not be measuring, a
>>         clearly distinct and different real life human activity with
>>         different parameters of interest) and allows the tracing of
>>         observing a value (here the range is even more radical, set
>>         at E1) and setting the property type. (see the definitions
>>         http://www.cidoc-crm.org/crmsci/sites/default/files/2017-03-22%23CRMsci1.2.3_esIP.pdf)
>>         This has a great deal of flexibility since we need to know
>>         not just the value of any random thing that someone has
>>         assigned to some object, but at the very least, of what type
>>         it is.
>>
>>         Consider one of Rob’s examples:
>>
>>         ‘linguistic objects have values’
>>
>>         Linguistic Object: here do we mean the characters themselves?
>>         The propositional content? The darkness of the font, the font
>>         type, the style of encoding? These are all potential values
>>         of the linguistic object. Obviously we don’t want to let our
>>         ontology toss all this in the same bucket, right? I think the
>>         same argument would go for appellation.
>>
>>         Not to mention how one could irritatingly misinterpret the
>>         sentence ‘linguistic objects have values’ to imply their
>>         adherence to a dogma, a political party, a certain sense of
>>         taste in dress.
>>
>>         Digital Image, I am not sure we would have a problem with, as
>>         it is a mathematical object and as such I guess its
>>         properties are quantitative and therefore just good old
>>         fashioned dimension.
>>
>>         All this being said, obviously you raise the issue because
>>         there are things that you need to document in the real world
>>         and are presently unable to encode as you would need using
>>         CRM. Obviously, something like a property with the natural
>>         language interpretation of ‘has value’  has an intuitive
>>         appeal. Would you give a few examples of the problem areas
>>         (I would certainly not assert that they do not exist), so we
>>         can think together of a solution that is ontologically sound
>>         and pragmatically applicable?
>>
>>         Cheers,
>>
>>         George
>>
>>
>>
>>         > On Feb 21, 2018, at 7:30 PM, Franco Niccolucci
>>         <franco.niccolucci at gmail.com> wrote:
>>         >
>>         > Hm.
>>         >
>>         > The current way of representing something similar (but
>>         different) to what you propose is:
>>         >
>>         > E70 Thing -> P43 has dimension -> E54 Dimension -> P90 has
>>         value -> E60 Number
>>         >
>>         > The path starts from Things (and not CRM Entities) and ends
>>         at Numbers (and not Primitive Values, i.e. also Strings, Time
>>         Primitives and whatever we can invent in the future): it
>>         gives a numeric value to a thing.
>>         >
>>         > The proposed change would allow giving, through the "new"
>>         P90, a generic value defined as E59 Primitive Value, i.e.
>>         anything, also to E2 Temporal Entities, E53 Places etc., all
>>         subclasses of E1.
>>         >
>>         > What can be an example of the Primitive Value of a Temporal
>>         Entity or of a Place?
>>         >
>>         > For example “Bronze Age”, an instance of E4 Period, cannot
>>         have a primitive value whatsoever; it may have a Time Span and
>>         take place somewhere in a Place. Time spans may P83/84 have
>>         durations, instances of E54.
>>         >
>>         > Dimensions would need to be considered as more than
>>         something that can be measured with numbers only: for example
>>         “poor - fair - good - excellent” would be acceptable for the
>>         space of Values, same for “strings of UTF-8 characters”. It is
>>         not necessary to specify what the value is, as by
>>         definition it could be anything.
>>         >
>>         > So I would rather suggest to leave the domain of P43 as is,
>>         i.e. Things only; and the range of P90, as you propose, could
>>         become E59, i.e. strings or anything else to be created as
>>         subclass of E59, without short-cutting the above.
>>         >
>>         > This allows specifying what aspect of the Thing we are
>>         talking about (its length, its social value, its ranking on
>>         its Facebook page, its translation into Estonian), i.e. the
>>         dimension; and, if desired, how we measure it: E58
>>         Measurement Unit.
>>         >
>>         > Best
>>         >
>>         > Franco
>>         >
>>         > PS This discussion reminds me of a commercial advertising a
>>         credit card. It showed somebody buying a ring for the beloved
>>         one, paying for dinner with her, buying flowers, and ended by
>>         saying that one can buy everything with the card, but romance
>>         has no price.
>>         >
>>         > Prof. Franco Niccolucci
>>         > Director, VAST-LAB
>>         > PIN - U. of Florence
>>         > Scientific Coordinator
>>         > ARIADNE - PARTHENOS
>>         >
>>         > Piazza Ciardi 25
>>         > 59100 Prato, Italy
>>         >
>>         >
>>         >> On 21 Feb 2018, at 17:13, Robert Sanderson
>>         <RSanderson at getty.edu> wrote:
>>         >>
>>         >>
>>         >>
>>         >> Definitely in favor of this.  Linguistic Objects can have
>>         values. Appellations have values. Digital Images have values.
>>         Etc.
>>         >>
>>         >>
>>         >>
>>         >> Rob
>>         >>
>>         >>
>>         >>
>>         >>
>>         >>
>>         >> From: Crm-sig <crm-sig-bounces at ics.forth.gr> on behalf of
>>         "Carlisle, Philip" <Philip.Carlisle at HistoricEngland.org.uk>
>>         >> Date: Wednesday, February 21, 2018 at 4:04 PM
>>         >> To: "crm-sig (Crm-sig at ics.forth.gr)" <Crm-sig at ics.forth.gr>
>>         >> Subject: [Crm-sig] Domain and range of P90
>>         >>
>>         >>
>>         >>
>>         >> Dear all,
>>         >> Naïve question.
>>         >>
>>         >>
>>         >>
>>         >> Is there any reason why P90 has value could not/should not
>>         change its domain and range from:
>>         >>
>>         >>
>>         >>
>>         >> Domain:                        Range
>>         >>
>>         >> E54 Dimension              E60 Number
>>         >>
>>         >>
>>         >>
>>         >> to
>>         >>
>>         >>
>>         >>
>>         >> E1 CRM Entity              E59 Primitive Value
>>         >>
>>         >>
>>         >>
>>         >> I look forward to your answers
>>         >>
>>         >>
>>         >>
>>         >> Phil
>>         >>
>>         >>
>>         >>
>>         >>
>>         >>
>>         >>
>>         >>
>>         >> Phil Carlisle
>>         >>
>>         >> Knowledge Organization Specialist
>>         >>
>>         >> Listing Group, Historic England
>>         >>
>>         >> Direct Dial: +44 (0)1793 414824
>>         >>
>>         >>
>>         >>
>>         >> http://thesaurus.historicengland.org.uk/
>>         >>
>>         >> http://www.heritagedata.org/blog/
>>         >>
>>         >>
>>         >>
>>         >>
>>         >>
>>         >> _______________________________________________
>>         >> Crm-sig mailing list
>>         >> Crm-sig at ics.forth.gr <mailto:Crm-sig at ics.forth.gr>
>>         >> http://lists.ics.forth.gr/mailman/listinfo/crm-sig
>>         >
>>         >
>>
>>
>>
>>      
>>
>>     -- 
>>
>>     Conal Tuohy
>>
>>     http://conaltuohy.com/
>>
>>     @conal_tuohy
>>     +61-466-324297
>>
>>
>>
>>
>>
>>  
>>
>> -- 
>> --------------------------------------------------------------
>>  Dr. Martin Doerr              |  Vox:+30(2810)391625        |
>>  Research Director             |  Fax:+30(2810)391638        |
>>                                |  Email: martin at ics.forth.gr <mailto:martin at ics.forth.gr> |
>>                                                              |        
>>                Center for Cultural Informatics               |
>>                Information Systems Laboratory                |
>>                 Institute of Computer Science                |
>>    Foundation for Research and Technology - Hellas (FORTH)   |
>>                                                              |
>>                N.Plastira 100, Vassilika Vouton,             |
>>                 GR70013 Heraklion,Crete,Greece               |
>>                                                              |
>>              Web-site: http://www.ics.forth.gr/isl           |
>> --------------------------------------------------------------
>
>

-- 
*Richard Light*