[Crm-sig] ISSUE: representing compound name strings

Christian-Emil Smith Ore c.e.s.ore at iln.uio.no
Thu Nov 22 11:54:39 EET 2018

Dear all,

As Richard and Øyvind show, it is always possible to encode information into a string, both in MARC and in XML. It is of course also possible to use JSON, for those preferring that formalism. MARC is old, that is correct, and it is not very readable. I would dare to say that MARC and JSON share the unreadability property. MARC and TEI represent predefined but flexible encoding schemas. XML and JSON are general encoding formalisms.


The main point is that data encoded in XML, TEI-XML, MARC or JSON are strings, which can be represented as literals in RDF. However, if one wants to decode the string into structured information, one needs to know (and agree on) the encoding schema. As Øyvind points out, there are many ways to encode the same information in TEI as well as in MARC.
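
To make the point concrete, here is a minimal sketch in Python of how such an encoded string could be written out as a typed literal in N-Triples. The subject IRI (http://example.org/appellation/1) and the use of rdf:value as the property are assumptions for illustration only, not a CRM recommendation, and the sketch does not canonicalise the XML.

```python
# Sketch only: the subject IRI and the choice of rdf:value are illustrative
# assumptions, not an agreed CRM practice.
RDF_XML_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"


def escape_ntriples(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def literal_triple(subject_iri, predicate_iri, lexical_form, datatype_iri):
    """Build one N-Triples statement whose object is a typed literal."""
    return '<{}> <{}> "{}"^^<{}> .'.format(
        subject_iri, predicate_iri, escape_ntriples(lexical_form), datatype_iri
    )


# The compound name is just a string as far as RDF is concerned.
name_xml = "<persName><forename>Snoopy</forename> <surname>Miller</surname></persName>"
triple = literal_triple(
    "http://example.org/appellation/1", RDF_VALUE, name_xml, RDF_XML_LITERAL
)
print(triple)
```

The datatype IRI only says that the string is XML. It does not say which schema (TEI or anything else) is needed to interpret the elements, which is exactly the agreement mentioned above.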



From: Crm-sig <crm-sig-bounces at ics.forth.gr> on behalf of Øyvind Eide <lister at oeide.no>
Sent: 22 November 2018 06:52
To: Martin Doerr
Cc: crm-sig at ics.forth.gr
Subject: Re: [Crm-sig] ISSUE: representing compound name strings

Dear Martin,

this is how the TEI would do it: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html#NDPER

So something like:

<persName><roleName type="royal">His Majesty</roleName> <roleName type="academic">Dr.</roleName> <forename>Snoopy</forename> <addName type="nickname">Hickup</addName> <surname>Miller</surname> <genName>Jr</genName></persName>

But one would need guidelines, esp. for the type attribute. In TEI everything can be done, and in several ways...
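
As a rough illustration of what decoding such a string involves, the following Python sketch parses the persName example above with the standard library. It assumes the fragment carries no XML namespace, as written here; in a full TEI document the elements would be in the TEI namespace and the tag names would need to be qualified accordingly. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET

# The persName example from above, held as one plain string.
NAME_XML = (
    '<persName><roleName type="royal">His Majesty</roleName> '
    '<roleName type="academic">Dr.</roleName> '
    "<forename>Snoopy</forename> "
    '<addName type="nickname">Hickup</addName> '
    "<surname>Miller</surname> "
    "<genName>Jr</genName></persName>"
)


def decode_pers_name(xml_string):
    """Return (element name, type attribute, text) for each name part."""
    root = ET.fromstring(xml_string)
    return [
        (child.tag, child.get("type"), (child.text or "").strip())
        for child in root
    ]


def display_form(xml_string):
    """Strip the markup and normalise whitespace to get the plain name."""
    root = ET.fromstring(xml_string)
    return " ".join("".join(root.itertext()).split())


parts = decode_pers_name(NAME_XML)
for tag, type_attr, text in parts:
    print(tag, type_attr, text)
print(display_form(NAME_XML))
```

The decoder only works because it knows which element names to expect, and the values of the type attribute are exactly where guidelines would be needed.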



On 21.11.2018 at 23:11, Martin Doerr <martin at ics.forth.gr> wrote:

Dear Richard,

XML is even better. The distinction between XML tags and MARC subfield markers is not so substantial. An XML file is still a string. The question is about RDF, and about putting a compound value into an rdfs:Literal.
So, again, is there a good practice using XML elements?



On 11/21/2018 6:58 PM, Richard Light wrote:

On 15/11/2018 21:28, Martin Doerr wrote:
Dear All,

I would expect that the library or archival community has a good practice for "squeezing" a compound name, such as
"His Majesty Dr. Snoopy Hickup Miller Jr", with respective separators, into a machine-readable string that could be used as a custom datatype in an rdfs:Literal as one instance of Appellation, rather than defining all possible name constituents as individual RDF properties.

Could be a MARC string? XML? TEI?

This would be very helpful for our users.

I'm pretty sure that the most recent attempt at doing this will be the subfield markers ($a, etc.) in MARC, which date from the era of punched cards. The requirement that all of the name appears in a single string will rule out anything that might have been done in XML (where you might typically use attributes or subelements) or TEI (which is, after all, simply an XML application).

It's a nice idea, which follows the approach of encoding one 'compound' value as a single string, but I don't think we will find a ready-made standard for it.
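
For comparison, a MARC-style string with subfield markers can be split into its parts with very little code. The Python sketch below assumes the common display convention of writing the subfield delimiter as "$" followed by a one-character code. The example string and its assignment of name parts to $a and $c are illustrative only, not a checked cataloguing example.

```python
import re

# "$" plus one lowercase letter or digit starts a subfield; the value runs
# up to the next "$". This mirrors the display convention, not the raw
# MARC record format.
SUBFIELD = re.compile(r"\$([a-z0-9])([^$]*)")


def parse_subfields(field):
    """Split a MARC-style field string into (code, value) pairs."""
    return [(m.group(1), m.group(2).strip()) for m in SUBFIELD.finditer(field)]


# Illustrative string only; the subfield assignment is an assumption.
example = "$aMiller, Snoopy$cHis Majesty Dr.$cJr"
print(parse_subfields(example))
```

As with XML, splitting the string is the easy part. What each subfield code means still has to be agreed on separately.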




Richard Light

Crm-sig mailing list
Crm-sig at ics.forth.gr

 Dr. Martin Doerr

 Honorary Head of the
 Center for Cultural Informatics

 Information Systems Laboratory
 Institute of Computer Science
 Foundation for Research and Technology - Hellas (FORTH)

 N.Plastira 100, Vassilika Vouton,
 GR70013 Heraklion,Crete,Greece

 Email: martin at ics.forth.gr
 Web-site: http://www.ics.forth.gr/isl
