[Crm-sig] representing textual phenomena
Carlo Meghini
carlo.meghini at cnr.it
Thu Apr 7 15:40:52 EEST 2022
Hi,
I'm working on the representation of literary works (LWs) using the CRM
core and the LRMoo. I've made every LW, or any part of it, an instance
of both E33_Linguistic_Object and F2_Expression. I'm interested in
describing the syntactic structure of a LW, so by "part" I do not mean
just chapters and sections, but also sentences, clauses, syntagmata and
words; and I have compositional properties of coordination and
subordination.
Now I need to describe the fact that any part of a LW occurs within the
text in several segments (some sentences are in fact broken in 2, 3 or
even 4 segments), and the position of each segment within the LW. This
is very similar to the situation where there is a global event (such as
the execution of a piano concerto) which has one sub-event (such as a
specific instrument playing) occurring in several time intervals. The
difference is that these occurrences are not in the relative time of the
concerto, but in the relative space of a LW.
I've made these segments instances of the class TextFragment, a subclass of
E33 but not of F2 because a fragment may be as small as a word or two.
The property linking a part to the text fragments where it occurs is
called "occursIn". Now my first question is (assuming the modeling is
ok): where should I hook this property in the CRM property taxonomy?
The property linking a text fragment to the actual text is
P190_has_symbolic_content. I guess this is rather uncontroversial.
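
To make the intended structure concrete, here is a minimal sketch in
plain Python rather than RDF. Only TextFragment, occursIn and
P190_has_symbolic_content come from the description above; the Part
class, its label field, the surface_text helper and the example
sentence are illustrative assumptions, not part of the CRM or LRMoo.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextFragment:
    # Stands in for P190_has_symbolic_content: the actual text of the fragment.
    symbolic_content: str


@dataclass
class Part:
    # A syntactic part of a LW (sentence, clause, syntagm, word).
    # "label" is a hypothetical identifier used only in this sketch.
    label: str
    # Stands in for the "occursIn" property: the fragments, in reading order.
    occurs_in: List[TextFragment] = field(default_factory=list)

    def surface_text(self) -> str:
        # Join the fragments, skipping whatever text interrupts the part.
        return " ".join(f.symbolic_content for f in self.occurs_in)


# An invented sentence broken into 2 segments by an interposed clause.
sentence = Part(
    "sentence-1",
    [TextFragment("The poet,"), TextFragment("wrote on.")],
)
print(sentence.surface_text())
```

The point of the sketch is only that one part maps to an ordered list of
fragments, each carrying its own symbolic content.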
Finally, the position of each text fragment within the LW must be
represented. This is done by using 2 properties, one for the beginning
and one for the ending of the fragment. The value of each property is a
position within the text. A position, in turn, is identified by 2
coordinates: the content unit where it occurs (e.g., a chapter) and a
number giving the offset within the content unit. Final question: where
would positions, and their beginning and ending properties, belong in the CRM?
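
The position model can be sketched as follows, again in plain Python and
only as an illustration. It assumes that content units are numbered in
reading order, that offsets count characters, and that the ending
position is exclusive; none of these choices is fixed by the description
above, and the names Position, FragmentSpan and precedes are
hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Position:
    # Assumption: content units (e.g. chapters) are numbered in reading order.
    content_unit: int
    # Assumption: the offset counts characters from the start of the unit.
    offset: int


@dataclass(frozen=True)
class FragmentSpan:
    begin: Position  # value of the "beginning" property
    end: Position    # value of the "ending" property (assumed exclusive)

    def __post_init__(self):
        # order=True compares positions as (content_unit, offset) tuples.
        if self.begin > self.end:
            raise ValueError("begin must not follow end")

    def precedes(self, other: "FragmentSpan") -> bool:
        # True when this fragment ends before (or where) the other begins.
        return self.end <= other.begin


# Two invented fragments; the second one crosses a content-unit boundary.
first = FragmentSpan(Position(1, 10), Position(1, 25))
second = FragmentSpan(Position(1, 40), Position(2, 5))
print(first.precedes(second), second.precedes(first))
```

With positions ordered this way, the relative order of the segments of a
part follows directly from their beginning and ending positions.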
Thank you for your time.
Carlo
--
Carlo Meghini
Istituto di Scienza e Tecnologie dell'Informazione [ISTI]
Consiglio Nazionale delle Ricerche [CNR]
Area della Ricerca di Pisa
Via G. Moruzzi 1, 56124 Pisa