[Crm-sig] representing textual phenomena

Carlo Meghini carlo.meghini at cnr.it
Thu Apr 7 15:40:52 EEST 2022


I'm working on the representation of literary works (LWs) using the CRM 
core and the LRMoo. I've made every LW, or any part of it, an instance 
of both E33_Linguistic_Object and F2_Expression. I'm interested in 
describing the syntactic structure of a LW, so by "part" I do not mean 
just chapters and sections, but also sentences, clauses, syntagmata and 
words; and I have compositional properties of coordination and 

Now I need to describe the fact that any part of a LW may occur within the 
text in several segments (some sentences are in fact broken into 2, 3 or 
even 4 segments), and the position of each segment within the LW. This 
is very similar to the situation where there is a global event (such as 
the execution of a piano concerto) which has one sub-event (such as a 
specific instrument playing) occurring in several time intervals. The 
difference is that these occurrences are not in the relative time of the 
concerto, but in the relative space of a LW.

I've made these segments instances of the class TextFragment, a subclass of 
E33 but not of F2 because a fragment may be as small as a word or two. 
The property linking a part to the text fragments where it occurs is 
called "occursIn". Now my first question is (assuming the modeling is 
ok): where should I hook this property into the CRM property taxonomy?

The property linking a text fragment to the actual text is 
P190_has_symbolic_content. I guess this is rather uncontroversial.
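
To make the two properties above concrete, here is a minimal Python sketch 
of the structure, not an actual CRM or LRMoo encoding. Only TextFragment, 
occursIn and P190_has_symbolic_content come from the modeling described 
here; the class name Part, the attribute names and the sample sentence 
are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextFragment:
    """A contiguous segment of text (E33 but not F2 in the modeling above)."""
    # Stands in for P190_has_symbolic_content: the actual characters.
    symbolic_content: str


@dataclass
class Part:
    """Hypothetical class for any part of a LW (sentence, clause, word, ...)."""
    label: str
    # Stands in for the occursIn property: one part, possibly many fragments.
    occurs_in: List[TextFragment] = field(default_factory=list)


# Invented example: a sentence interrupted by other material,
# so that it occurs in two separate segments of the text.
sentence = Part(
    label="sentence-1",
    occurs_in=[TextFragment("The envoy,"), TextFragment("arrived at dawn.")],
)
segment_count = len(sentence.occurs_in)
```

The point of the sketch is only that occursIn is one-to-many, while each 
fragment carries exactly one string as its symbolic content.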

Finally, the position of each text fragment within the LW must be 
represented. This is done by using 2 properties, one for the beginning 
and one for the ending of the fragment. The value of each property is a 
position within the text. A position, in turn, is identified by 2 
coordinates: the content unit where it occurs (e.g., a chapter) and a 
number giving the offset within that content unit. Final question: where 
would position, beginning position and ending position belong in the CRM?
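
The position scheme just described can be sketched as follows in Python, 
again only as an illustration. It rests on two assumptions that are not 
part of the modeling above: content units are identified by an ordinal 
number (so that positions can be compared), and offsets are plain 
integers. All class, attribute and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Position:
    """A point in the text: (content unit, offset within that unit)."""
    content_unit: int  # assumption: ordinal of the unit, e.g. chapter number
    offset: int        # assumption: integer offset inside the unit


@dataclass(frozen=True)
class LocatedFragment:
    """A text fragment with its beginning and ending positions."""
    begins_at: Position
    ends_at: Position

    def __post_init__(self):
        # A fragment cannot end before it begins.
        if self.ends_at < self.begins_at:
            raise ValueError("ending position precedes beginning position")


def in_text_order(fragments):
    """Return the fragments sorted by where they begin in the text."""
    return sorted(fragments, key=lambda f: f.begins_at)


# Invented example: two segments of one sentence, both in content unit 3.
later = LocatedFragment(Position(3, 120), Position(3, 145))
earlier = LocatedFragment(Position(3, 10), Position(3, 42))
ordered = in_text_order([later, earlier])
```

Ordering positions lexicographically, first by content unit and then by 
offset, is what lets the segments of a broken sentence be read back in 
the sequence in which they appear in the LW.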

Thank you for your time.


Carlo Meghini
Istituto di Scienza e Tecnologie dell'Informazione [ISTI]
Consiglio Nazionale delle Ricerche [CNR]
Area della Ricerca di Pisa
Via G. Moruzzi 1, 56124 Pisa
