[Crm-sig] P72 has Language

George Bruseker george.bruseker at gmail.com
Tue Oct 15 12:43:35 EEST 2019


Dear all,

I think that the turn to the data is the right move here.

So some examples at hand:

Viaf - widely used ref

http://viaf.org/viaf/27251336

Emu - widely adopted collections management system


CHIN Makers in Canada model - national standards body, Canada

It is in the requirements by researchers for building their persons model

Finally, I would come back to the initial example. The project in question,
as far as I follow it, looks to extract analytic data typically encoded by
researchers in archival formats that HAVE NOT ALLOWED the precise recording
of information that they would have liked to record analytically and
formally. I.e., they are precisely trying to find good structures where there
were none. The solution is certainly not to document it as an information
object.

I am glad to keep finding examples. I hope we can use this practice
generally and could even think of formalizing a list of schemas of
reference.

Best,

George

On Tue, 15 Oct 2019 at 3:50 AM Franco Niccolucci <
franco.niccolucci at gmail.com> wrote:

> Dear all,
>
>
> having somehow started this discussion in a hot August evening, let me
> remind you that the initial question was:
>
> "When describing biographical information [in an archive] it’s common to
> state that some person was fluent in some language, or languages, apart
> from his/her native one. Using current archival descriptions standards
> [ISAD(G) 3.2.2; EAD <bioghist>] this is represented within a text, usually
> a very long text string with information of distinct natures. So far we
> have been able to decompose the different elements and represent them
> adequately as instances of CIDOC-CRM classes and link them through the
> suitable properties.
> We cannot link a Person (E21) to a language (E56) and neither use multiple
> instantiation, as it has been suggested in other cases (
> http://www.cidoc-crm.org/Issue/ID-258-p72-quantification), because Person
> (E21) and Linguistic Object (E33) are disjoint.”
>
> I understand these bios consist in a text, and metadata are added to it as
> instances of various CIDOC-CRM classes. The question was how to indicate in
> such metadata the knowledge of a language as reported in the bio: so not a
> real quality of the person, but a fact documented. My suggestion was to use
> E74 Group. I always prefer to use what is already available and avoid the
> unnecessary proliferation of classes and properties, in my opinion there
> are already (more than) enough. But in doing so I try to maximize
> expressiveness, as otherwise one class (E1 CRM Entity) and one property (P2
> has type) would be sufficient for the whole world: P2 is not a
> jack-of-all-trades.
>
> Reportedly, the Group solution seemed to please the person who asked the
> question.
>
> I don’t know if the "language spoken" is information usually taken into
> account in CH; but in this case it was by the archivist, otherwise no
> question would have been asked.
>
> Best regards
>
> Prof. Franco Niccolucci
> Director, VAST-LAB
> PIN - U. of Florence
> Scientific Coordinator
> ARIADNEplus - PARTHENOS
>
> Editor-in-Chief
> ACM Journal of Computing and Cultural Heritage (JOCCH)
>
> Piazza Ciardi 25
> 59100 Prato, Italy
>
>
> > On 14 Oct 2019, at 22:39, Detlev Balzer <db at balilabs.de> wrote:
> >
> > Dear George, Martin,
> >
> > this discussion made me curious whether or not I can confirm George's
> > assertion that such statements are common in the cultural heritage field.
> >
> > EAC-CPF does have a language element, which is, however, only used to
> > indicate in which language the name of a person or corporation is
> > expressed.
> >
> > GND, the authority file for libraries in German-speaking countries, has
> > a Language entity which is used for making statements about the "field of
> > study" of a person. Other predicates for the person-language pair of
> > entities do occur, but these are obvious data entry errors.
> >
> > Having extracted person-related data from a dozen or more cultural
> > heritage projects, I don't remember any example where languages spoken or
> > known by somebody have been considered in any other sense than relating to
> > the documented activity, rather than to the (possibly un-instantiated)
> > capacity of the person.
> >
> > Of course, this is just an observation that doesn't prove anything.
> > Personally, I would tend towards Martin's view that there is little, if
> > anything, to be gained by defining such kind of statement in a reference
> > model such as the CIDOC CRM.
> >
> > Best wishes,
> > Detlev
> >
> >> George Bruseker <george.bruseker at gmail.com> wrote on 14 October 2019 at 19:45:
> >>
> >>
> >> Dear Martin,
> >>
> >> The conversation began with a use case from an archive. I would just add
> >> that this is also found in all the projects I work on for memory
> >> institutions. They find it in scope, so looking further afield for what
> >> anthropologists do doesn't seem like a necessary step? Though highly
> >> fascinating!
> >>
> >> Best
> >>
> >> George
> >>
> >>
> >>
> >> On Mon, Oct 14, 2019, 6:58 PM Martin Doerr <martin at ics.forth.gr> wrote:
> >>
> >>> Dear George, All,
> >>>
> >>> As a second thought:
> >>>
> >>> I think documentation formats such as LIDO are an adequate place to add
> >>> such useful properties to characterize items in a more detailed way,
> >>> which we would not put in the CRM analytically. Shapes, colors etc. are
> >>> typical examples.
> >>>
> >>> Question: Are there formats from the archival world that are used to
> >>> describe the languages people speak? EAC-CPF?
> >>> Libraries are interested in the languages someone publishes in, not
> >>> the ones they speak.
> >>>
> >>> What are the anthropologists registering? Would they be interested in
> >>> languages learned at school, or rather in the language used for
> >>> communication in a typical group? Would they document people being
> >>> incapable of communicating in that group?
> >>> Or just infer language via group?
> >>>
> >>> How to distinguish native speakers from non-native?
> >>>
> >>> Would historians make cases of people that could not communicate in a
> >>> given language, with societal effects?
> >>>
> >>> What about illiterate people? Speaking, not writing...? Maintaining
> oral
> >>> history with great precision, etc.
> >>>
> >>> What about creoles ?
> >>>
> >>> Best,
> >>>
> >>> Martin
> >>>
> >>> On 10/14/2019 7:33 PM, Martin Doerr wrote:
> >>>
> >>>
> >>> Dear George,
> >>>
> >>> The first principle of all is: are there relevant queries that need that
> >>> property for integrating disparate sources which indeed provide such
> >>> data, and is that research one we would like to support with the CRM?
> >>>
> >>> Second, using P2 on E21 does the job, doesn't it? What is the added
> >>> value of "knows language"?
> >>>
> >>> Next principle, keep the ontology small. Querying 1000 properties is
> >>> already more than anybody can keep in mind. Each additional property
> >>> has an implementation cost. We need strong arguments for relevance.
> >>>
> >>> It has been the most important success factor of the CRM to keep the
> >>> ontology small and still expressive enough. If we lose this
> >>> discipline, we will lose the whole project.
> >>>
> >>> Finally, in the CRM we are not repeating the way information systems
> >>> typically document, but have always tried to find a more fundamental
> >>> representation. With that argument, we could never have introduced
> >>> events. They did NOT appear in any of the typical systems at that time.
> >>> It is a principle *not* to model all the valuable description elements
> >>> which are relevant to characterize an item but do not create
> >>> interesting links across resources.
> >>>
> >>> I did not say that it is a personal opinion that someone speaks a
> >>> language. I said, this is observable. I document: Franco has spoken
> >>> Latin, repeatedly? But talking about skills is another level; it
> >>> introduces a quality which is hard to objectify, as Franco has pointed
> >>> out. Actually, it is a typical classification problem, with all its
> >>> boundary case questions, and the CRM is about relations between
> >>> particulars.
> >>>
> >>> So, what is the *added value* against P2, and what are the typical
> >>> research data and typical research questions for *integrating* such
> >>> data that cannot be answered with P2?
> >>>
> >>> Best,
> >>>
> >>> martin
> >>>
> >>>
> >>>
> >>>
> >>> On 10/14/2019 4:24 PM, George Bruseker wrote:
> >>>
> >>> Dear Martin,
> >>>
> >>> Which proposition of CEO’s is it that you support? It gets lost in the
> >>> thread. Do you mean that a) a person speaking a language means being
> >>> part of a group, or b) using P2 on E21 and then making types for
> >>> ’Speakers of...'?
> >>>
> >>> I am (still and very much) a supporter of a new property ‘knows
> >>> language'. I do not think that the group solution works because of the
> >>> identity criteria of groups. I also don’t think the event solution is
> >>> necessary (another suggestion that has floated in this conversation).
> >>> It is often the case that for a person we do not know the events of
> >>> their acquisition or use of a language or a skill, but we do have the
> >>> proposition that they had that language or skill! I also don’t support
> >>> the ‘English Speakers’ type solution since it provides a different URI
> >>> than the URI for ‘English’ and forces more, obscure, modelling.
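
The objection to the ‘English Speakers’ type solution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is a plain list of tuples, every URI and identifier is invented, and the hand-maintained bridge table stands in for whatever extra modelling a real system would need.

```python
# Illustrative sketch only: all URIs and identifiers below are hypothetical.
LANG_EN = "http://example.org/language/english"
TYPE_EN_SPEAKER = "http://example.org/type/english-speaker"

# The person is typed via P2 has type; the document points at the language
# itself via P72 has language.
triples = [
    ("person:george", "P2_has_type", TYPE_EN_SPEAKER),
    ("doc:report_1", "P72_has_language", LANG_EN),
]


def direct_join(graph):
    """Persons whose P2 type URI is also used as a document language URI."""
    langs = {o for s, p, o in graph if p == "P72_has_language"}
    return sorted({s for s, p, o in graph if p == "P2_has_type" and o in langs})


# Extra mapping from speaker types to languages (hypothetical); without it
# the two URIs never meet.
speaker_type_to_language = {TYPE_EN_SPEAKER: LANG_EN}


def bridged_join(graph, bridge):
    """Same join, but going through the additional type-to-language mapping."""
    langs = {o for s, p, o in graph if p == "P72_has_language"}
    return sorted(
        {s for s, p, o in graph if p == "P2_has_type" and bridge.get(o) in langs}
    )


print(direct_join(triples))                             # []
print(bridged_join(triples, speaker_type_to_language))  # ['person:george']
```

The direct join finds nothing because the type URI and the language URI differ; the match only appears once the additional mapping is supplied.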
> >>>
> >>> Another CIDOC CRM principle is to model at the level of knowledge that
> >>> is typically present in information systems. Again, I think the present
> >>> case (people know languages) is identical to the case of
> >>>
> >>> E22 consists of E57 Material
> >>>
> >>> This is a typical piece of knowledge held about an object. It would be
> >>> obtuse to insist that one should create an event node to indicate the
> >>> manner of this material becoming the constituting material of the
> object
> >>> when we don’t know this fact. This is why CRM represents such binary
> >>> relations, because they are real, they are a level of knowledge and
> they
> >>> are observable.
> >>>
> >>> If someone has entered into an information system George: English, Pot
> >>> Making, it is unlikely that what they want to reconstruct are
> >>> instances of me using English or performing pot making. Rather, they are
> >>> interested that there is an individual who has a particular formation,
> >>> which means that he knows language x, knows skill x. This information is
> >>> probably used in an actual integration to connect an instance of E21 via
> >>> an instance of E56 Language to, for example, instances of E33 that use
> >>> the same E56.
> >>>
> >>> It would seem we need some sort of hierarchy in the principles which
> can
> >>> also be conflicting.
> >>>
> >>>
> >>> My approach is not documenting skills. My approach is documenting
> >>> facts, rather than potentials. I take notice and may document that you
> >>> spoke Latin, as I last did at school. I have a document stating my
> >>> grade in Latin at high school. My grade at high school confirms a set
> >>> of years of continued successful lessons, not that I could understand
> >>> much Latin now ;-).
> >>> Speaking a language can be documented as an extended (observed)
> activity,
> >>> as in FRBRoo.
> >>>
> >>>
> >>> It may be, but is it typically? I have never seen an information
> >>> system, especially in a museum context, that would.
> >>>
> >>> For instance, someone writing books in a particular language. This falls
> >>> under any kind of extended activity not further specified, such as an
> >>> artist using a technique for some time, and avoids transforming actual
> >>> activities into potentials.
> >>>
> >>> We can document someone's documented opinion about a potential of a
> >>> person, as an information object.
> >>>
> >>>
> >>> That would make this information mostly unusable however. If our goal
> is
> >>> to functionally use the observation person x speaks language y, then it
> >>> needs to be semantically represented and not made a string.
> >>>
> >>>
> >>> In the "Principles for Modelling Ontologies" we refer:
> >>> "7.2 Avoid concepts depending on a personal/ spectator perspective"
> >>>
> >>> This could be elaborated more. In the CRM, we do not model concepts
> >>> "because people use them", but because they can be used to integrate
> >>> information related to them with URIs. Therefore, your arguments and
> >>> what I wanted to say is, "skill" is a bad concept for integration. What
> >>> should be instantiated are the observable activities, which may or may
> >>> not indicate skills.
> >>>
> >>>
> >>> I don’t see that this principle applies. It is not a personal
> >>> perspective that someone speaks a language, any more than it is a
> >>> personal perspective that an object is constituted of a material. This
> >>> fact can be documented and observed. Someone else can come and do the
> >>> same. Don’t believe Franco can speak Latin? Watch him and see if he can.
> >>> When someone writes in an information system, they probably typically
> >>> mean: some evidence leads me to assert that person x knows language y.
> >>> They do not mean to say that at some point in the past he learned it, or
> >>> at some point he performed it.
> >>>
> >>> In the case of documenting that someone knows a language this can be
> >>> used practically to integrate using URIs just in case we use the same
> >>> URI for English that we use to describe a document and that we use to
> >>> describe the knowledge of the individual
> >>>
> >>> E21 knows language E56 Language URI:AA
> >>> E33 has language E56 Language URI:AA
> >>>
> >>> answers the query, who in this graph knew the language this document
> was
> >>> written in.
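
A minimal sketch of that query, in Python, assuming the graph is held as plain (subject, property, object) tuples. "knows_language" stands for the property proposed in this thread, which is not part of the CRM; the URIs, identifiers and the helper who_could_read are invented for illustration.

```python
# Illustrative sketch only: URIs, identifiers and the "knows_language"
# property name are hypothetical.
LANG_EN = "http://example.org/language/english"
LANG_LA = "http://example.org/language/latin"  # plays the role of "URI:AA"

triples = [
    ("person:george", "knows_language", LANG_EN),
    ("person:franco", "knows_language", LANG_LA),
    ("person:franco", "knows_language", LANG_EN),
    ("doc:letter_1", "has_language", LANG_LA),
]


def who_could_read(document, graph):
    """Return the persons who know at least one language of the document."""
    doc_langs = {
        o for s, p, o in graph if s == document and p == "has_language"
    }
    return sorted(
        {s for s, p, o in graph if p == "knows_language" and o in doc_langs}
    )


print(who_could_read("doc:letter_1", triples))  # ['person:franco']
```

The join works only because the person and the document point at the very same language URI, which is the point being argued.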
> >>>
> >>> Functionally, the issue for me is: is there a good reason against
> >>> adding a binary property off of person which can indicate their
> >>> knowledge or ability and connect to a well-known URI for a language?
> >>>
> >>> Best,
> >>>
> >>> George
> >>>
> >>>
> >>> --
> >>> ------------------------------------
> >>> Dr. Martin Doerr
> >>>
> >>> Honorary Head of the
> >>> Center for Cultural Informatics
> >>>
> >>> Information Systems Laboratory
> >>> Institute of Computer Science
> >>> Foundation for Research and Technology - Hellas (FORTH)
> >>>
> >>> N.Plastira 100, Vassilika Vouton,
> >>> GR70013 Heraklion,Crete,Greece
> >>>
> >>> Vox:+30(2810)391625
> >>> Email: martin at ics.forth.gr
> >>> Web-site: http://www.ics.forth.gr/isl
> >>>
> >>>
> >>> _______________________________________________
> >>> Crm-sig mailing list
> >>> Crm-sig at ics.forth.gr
> >>> http://lists.ics.forth.gr/mailman/listinfo/crm-sig
> >>>
> >>>
>
>