[Crm-sig] New Issue CRMtex: differentiate TX8 Grapheme definition

Martin Doerr martin at ics.forth.gr
Sun Oct 10 22:43:49 EEST 2021


Dear Achille, dear Francesca,

My apologies for using "grapheme" for both "glyph" and "grapheme". I 
understand the difference 😁

You write: "A grapheme is atomic by definition as it represents the 
minimum unit (i.e., a unit that cannot be further decomposed) of 
a writing system."

I agree, but you define:


      TX8 Grapheme

Subclass of: E90 <#_E90_Symbolic_Object> Symbolic Object

Superclass of:

Scope Note:         Subclass E90 <#_E90_Symbolic_Object> Symbolic Object 
used to represent the abstract units with distinctive value in a given 
writing system. *A grapheme is a character **or sequence of characters* 
[MD1] that functions as a distinct unit within an orthography. It
------------------------------------------------------------------------

I wonder about "sequence of characters".  I think this should be 
differentiated.

You define:

*TXP11 transcribed (was transcribed by)*
Domain: TX6 <#_TX6_Transcription> Transcription
Range: TX8 <#_TX8_Grapheme> Grapheme
Subproperty of: P16 <#_P16_used_specific> used specific object (was used 
for)
  Quantification: many to many (0,n:0,n)
  Scope note: This property highlights the specific way in which an 
activity of TX6 <#_TX6_Transcription> Transcription results in the 
rendering of the specific TX8 <#_TX8_Grapheme> Grapheme(s) of which an 
instance of TX1 <#_A1_Excavation_Process> Written Text is composed.

I think we are confusing glyphs, graphemes, and symbolic occurrences of 
graphemes here. My understanding is the following:

Reading, in the sense of observation, understands *each glyph as the 
materialization of a grapheme*. (As I pointed out in an earlier message, 
the candidate graphemes must be limited in some way.) A sequence of 
glyphs does NOT correspond to a sequence of graphemes, but to a sequence 
of grapheme occurrences.
"EEEEE" is a sequence of *5 occurrences* of one grapheme in one symbolic 
object.


Since /TXP11 transcribed/ is the only output property of TX6 
Transcription, you cannot use TXP11 to describe the result of 
transcribing a text. The sequence "EEEEE" uses only the grapheme "E", 
a single instance.
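
To illustrate the point, here is a minimal sketch in Python. It is only 
an illustration under stated assumptions, not part of CRMtex: the class 
names Grapheme and GraphemeOccurrence and both functions are 
hypothetical, and each character of a Python string is treated as one 
grapheme, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grapheme:
    """Abstract unit of a writing system (hypothetical stand-in for TX8)."""
    value: str


@dataclass(frozen=True)
class GraphemeOccurrence:
    """One positional occurrence of a grapheme in a symbolic text (hypothetical)."""
    grapheme: Grapheme
    position: int


def graphemes_used(text: str) -> set[Grapheme]:
    # What a plain link to Grapheme instances can record:
    # which graphemes appear, without order or repetition.
    return {Grapheme(ch) for ch in text}


def occurrence_sequence(text: str) -> list[GraphemeOccurrence]:
    # What a transcription result would need in addition:
    # one occurrence per position, preserving the sequence.
    return [GraphemeOccurrence(Grapheme(ch), i) for i, ch in enumerate(text)]


used = graphemes_used("EEEEE")
sequence = occurrence_sequence("EEEEE")
print(len(used), len(sequence))
```

In this sketch the set of graphemes used has a single member, while the 
occurrence sequence has five entries that all refer to that same 
grapheme. The set alone cannot reconstruct the text.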

TX5 Reading does not allow one to describe the glyph-grapheme 
association of a single glyph.

I miss the output of a reading process. Is it meant to be a 
transcription? Always? Either TX5 or TX6 should result in a Symbolic 
Object representing the written text, which is at least the sequence of 
grapheme occurrences corresponding to the sequence of glyphs, and 
possibly further structural features.

I propose not to mix observation and understanding because, as far as I 
understand, interpreting the intended meaning of a written text is only 
relevant for reading when the glyphs and their arrangements are 
ambiguous.

I think the scope note of TX5 is confusing, because it talks about 
"decoding signs" without relating this to the glyph-grapheme association:

"The reading activity, thus, is intended as a specific observation (S4) 
in which the decoding of the signs is performed, i.e. the linguistic 
value is recognised and the message is understood. Cases in which 
decoding does not happen (e.g., the observer is able to describe the 
signs but not to assign a specific linguist value to them), the S4 class 
could be used as it is..."

I further miss a property implementing the glyph-grapheme association.
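
As a rough sketch of what such an association could record (Python, 
purely illustrative): the names Glyph, Reading, associate and 
is_ambiguous are hypothetical and do not come from CRMtex, and graphemes 
are represented as plain strings for brevity. Each glyph is linked to 
its candidate graphemes; more than one candidate marks the case where 
the glyph is ambiguous.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Glyph:
    """A physical mark on a carrier, reduced to an identifier (hypothetical)."""
    glyph_id: str


@dataclass
class Reading:
    """Hypothetical record of a reading: candidate graphemes per glyph."""
    associations: dict = field(default_factory=dict)

    def associate(self, glyph: Glyph, *candidates: str) -> None:
        # Record one or more candidate graphemes for a single glyph.
        self.associations.setdefault(glyph, []).extend(candidates)

    def is_ambiguous(self, glyph: Glyph) -> bool:
        # A glyph is ambiguous when more than one grapheme is a candidate.
        return len(self.associations.get(glyph, [])) > 1


reading = Reading()
clear_glyph = Glyph("g1")
worn_glyph = Glyph("g2")
reading.associate(clear_glyph, "E")
reading.associate(worn_glyph, "E", "F")
print(reading.is_ambiguous(clear_glyph), reading.is_ambiguous(worn_glyph))
```

Only for a glyph like the second one would interpretation of the 
intended meaning be needed to settle the reading.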

If we interpret a text in symbolic form as a sequence of grapheme 
occurrences in the symbolic text, we may need another property stating 
which graphemes occurred. I would like to have your expert opinion on 
whether the Writing System used for a Written Text is different from 
that used for a symbolic text (IsA Expression).

I hope this makes my concerns clearer 😁

All the best,

Martin

On 10/10/2021 4:30 PM, Achille Felicetti wrote:
> Dear Martin,
>
> Our HW for issue 545.
>
>
>
> Furthermore, the grapheme is a conceptual and non-concrete unit and is 
> made manifest by the individual act (performance) of writing. For this 
> reason, a grapheme of an actual text cannot exist. Instead, in actual 
> texts we find glyphs that are precisely the physical manifestation of 
> graphemes.
>
> We hope this clarifies :-)
>
> Ciao,
> Achille & Francesca
>
>> Il giorno 17 giu 2021, alle ore 15:02, Martin Doerr via Crm-sig 
>> <crm-sig at ics.forth.gr <mailto:crm-sig at ics.forth.gr>> ha scritto:
>>
>> Dear All,
>>
>> I think we need to distinguish the set of all possible atomic 
>> graphemes of a writing system, from the atomic grapheme, a grapheme 
>> sequence (or arrangement) of an actual text, and the grapheme set 
>> appearing in an actual text.
>> -- 
>> ------------------------------------
>>   Dr. Martin Doerr
>>                
>>   Honorary Head of the
>>   Center for Cultural Informatics
>>   
>>   Information Systems Laboratory
>>   Institute of Computer Science
>>   Foundation for Research and Technology - Hellas (FORTH)
>>                    
>>   N.Plastira 100, Vassilika Vouton,
>>   GR70013 Heraklion,Crete,Greece
>>   
>>   Vox:+30(2810)391625
>>   Email:martin at ics.forth.gr   
>>   Web-site:http://www.ics.forth.gr/isl
>


