[Crm-sig] Compatibility document V4

martin martin at ics.forth.gr
Mon Nov 3 18:40:52 EET 2008


Dear Bernhard,

Thank you very much for your rich input!


Some quick answers:

Bernhard Schiemann wrote:
> Dear all,
> Regarding the compatibility document (V4), we have the following questions:
> -I found no time slot for this issue during the London meeting. Shouldn't
> the SIG discuss and/or vote on that issue in London?

See: http://cidoc.ics.forth.gr/agentas/18th_sig_agenda+13th_frbr_crm.htm
Thursday November 6, 11:30-13:00, "Final text for compatibility"
> 
> -The term "Compatibility" is used for this document and most of the
> contents is about the compatibility between technical systems. (first
> sentence: "of their data structures"). Maybe this could be formulated
> precisely by changing the headline to e.g.
> "CRM compatibility of technical systems and data structures" instead of
> "Compatibility"

This is an "Issue". Is there any other compatibility you think of?
> 
> A general remark: As the goal seems to be to assert
> compatibility with the CRM as defined in the CRM document, it is hard
> to impossible to achieve this goal as long as we refer to a text which
> is open for interpretation.  Compatibility in a rigid sense can only
> be proved with a formal definition.  One could propose a weak concept
> of compliance if there is a test suite against which a system claiming
> that can be tested.  However, what is easier is to check for
> incompatibility, i.e. it is far easier to say if something is
> incompatible, i.e. in contradiction, with the CRM.

The CRM definition is and will remain text-based, for good reasons we have
discussed in the past. The notions of subclass, superclass etc. are clear
enough to be transferred to a formal language.
I vote against any change to this.

The details of spelling this out in a formal language are a matter for good practice
and certification authorities.

> 
> 
> -"In other words, it does not aim to provide more structure
> than users have previously provided." (Page 1, section 1.1,
> 3. paragraph) Does this address the workflow to produce compatible data?
> If you have a structure you can translate it to CRM structures, and if
> you just have unstructured information, you have to structure it first?

Yes, it addresses the workflow of transforming data from a legacy format into a
CRM-compatible form. Many users had the impression that they should structure
their texts in order to do so. This is not intended.
> 
> 
> -"exhaustively in terms of CRM concepts" (Page 2, section
> 1.2, use case no. 4), All instances of (a queried) CRM concept?
> 
> -***Nick*** means that Nick provides an appropriate section about use
> cases? Or Examples?
Obsolete. Ignore.

> 
> -The next paragraph is the central point of this document (Page 2, 2nd
> before 1.3, beginning with: "In the context"): the definition of
> "without loss of meaning". Regarding the email Vladimir Ivanov already
> sent, we want to add:
> +"By virtue of this classification" Do you mean: "Using this
> classification"?
What would the word "use" clarify? The fact that there is this classification,
once it has been used, lets the user understand.

> +"expert conversant" How do you define an expert conversant? Is a field
> expert e.g. an  art historian an expert for the classification of a
> painting?

Sure. Obvious to our audience.
> 
> 
> -The first paragraph of 1.3: "A CRM compatible form should
> not implement the quantifiers ..." If the CRM contains
> quantifiers, why shouldn't they be implemented by cardinality
> restrictions? What do you mean by "form"? Do you mean a "profile"?
> I try this at an example from the scope note:
> "Quantification: many to one, necessary, dependent (1,1:1,n)
> Scope note: ... A temporal entity can have in reality only one
> Time-Span, but there may exist alternative opinions about it,
> which we would express by assigning multiple Time-Spans.
> Related temporal entities may share a Time-Span..." coded as
> e.g. E2.Temporal_Entity P4.has_timespan minimal one
> E52.Time-Span. Where is the problem?

I propose to change the 2nd phrase in 1.3 to:

"We call any encoding of such CRM instances in a formal language that preserves the relations between the CRM classes, properties and 
inheritance rules a “CRM-compatible form”".

The CRM suggests using a monotonic schema for information integration.
Cardinality constraints violate monotonicity. That is the problem, and not
implementing them is the solution we have so far agreed on. Are you proposing,
as an issue, to drop monotonicity?
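
To make the monotonicity concern concrete, here is a minimal sketch in Python.
It assumes a toy set of (subject, property, object) triples and a hypothetical
checker `violates_max_one`; none of these names come from the CRM or from the
compatibility document. It follows the scope note quoted above: two sources
each record one Time-Span for the same temporal entity, and only the merged
data breaks a "maximum one" constraint.

```python
# Illustrative sketch only: a toy fact store, not a CRM implementation.
# Facts are (subject, property, object) triples; all identifiers are made up.

def violates_max_one(facts, prop):
    """Return the subjects that have more than one value for `prop`."""
    seen = {}
    for s, p, o in facts:
        if p == prop:
            seen.setdefault(s, set()).add(o)
    return sorted(s for s, objs in seen.items() if len(objs) > 1)

# Source A records one opinion about the time-span of an event.
source_a = {("battle_1", "P4_has_time-span", "timespan_1250")}
# Source B records an alternative opinion about the same event.
source_b = {("battle_1", "P4_has_time-span", "timespan_1251")}

# Integration is a plain union of the facts from both sources.
merged = source_a | source_b

# Each source on its own satisfies a "max one time-span" constraint ...
assert violates_max_one(source_a, "P4_has_time-span") == []
assert violates_max_one(source_b, "P4_has_time-span") == []
# ... but the merged data does not.
print(violates_max_one(merged, "P4_has_time-span"))  # prints ['battle_1']
```

Under a monotonic schema both opinions are simply kept. A system that enforced
the constraint would have to reject or discard one of them at integration time.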

> 
> -second paragraph of 1.3: A subset of a consistent set is consistent by
> definition.  

No. Cutting out IsA relations can create inconsistent models.

I propose to add the following 4th condition:
  •	any instance of the reduced CRM-compatible form is also a valid instance of a (full) CRM compatible form
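
One possible reading of this point, sketched in Python. The IsA fragment below
is simplified and assumed for illustration, and the helper names (`naive_cut`,
`safe_cut`, `preserves_isa`) are hypothetical. The sketch compares a reduction
that deletes a class together with all its IsA links against one that keeps
the inherited links among the remaining classes.

```python
# Simplified IsA fragment, assumed for illustration (child -> direct parents).
FULL_ISA = {
    "E21_Person": {"E39_Actor"},
    "E39_Actor": {"E77_Persistent_Item"},
    "E77_Persistent_Item": {"E1_CRM_Entity"},
    "E1_CRM_Entity": set(),
}

def ancestors(isa, cls):
    """All superclasses of `cls` reachable through IsA links."""
    found, stack = set(), list(isa.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(isa.get(parent, ()))
    return found

def naive_cut(isa, keep):
    """Drop classes outside `keep` together with every link touching them."""
    return {c: {p for p in ps if p in keep} for c, ps in isa.items() if c in keep}

def safe_cut(isa, keep):
    """Drop classes outside `keep` but retain inherited links among the rest."""
    return {c: ancestors(isa, c) & keep for c in isa if c in keep}

def preserves_isa(full, reduced):
    """True if every retained class keeps all of its retained superclasses."""
    keep = set(reduced)
    return all(ancestors(full, c) & keep == ancestors(reduced, c) for c in reduced)

# Reduced form that leaves out E39 Actor.
keep = {"E21_Person", "E77_Persistent_Item", "E1_CRM_Entity"}

print(preserves_isa(FULL_ISA, naive_cut(FULL_ISA, keep)))  # prints False
print(preserves_isa(FULL_ISA, safe_cut(FULL_ISA, keep)))   # prints True
```

In the naive cut a Person is no longer known to be a Persistent Item, so the
reduced model disagrees with the full one about what its instances are.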

> Is there an implicit claim that the superset is
> inconsistent? 
No.

> Or do you mean "proper subset" ???
Our audience are not mathematicians...

> 
> Section 1.3
> - The definition of single concepts that crm-compatible data structure
> (or something else) must contain is problematic.
> For example: Think of an address management system. You
> could build such a system compliant to CRM concepts like E21.Person,
> E53.Place etc. and with regard to the necessary properties and
> inheritance rules. There would be no need of a E12.Production Event, so such
> a system will never be crm-compliant, no matter how exact the other
> concepts, properties and restrictions are taken into account.
> Is this desirable?

The definition of an export-compatible system does not require any minimal concepts.
You confuse the sender system with the receiver data structure. The receiver data
structure is required to have a minimal set of elements.
> 
> 
> -Section 1.4 first paragraph, second sentence:
> +"in in"

Thank you!

> +What are "implicit concepts"? If "concepts" are present in elements of
> a data structure they are not implicit, but explicit. Maybe it would be
> better to understand, if one can provide an example for such a concept.
> The whole story about formal ontologies for automatic processing is
> that everything (concepts, properties) has to be defined explicitly.
> Of course, you may infer by inheritance that e.g. an instance has some
> (previously implicit) properties which are now made explicit by
> virtue of the reasoning step.  Nevertheless, the concepts and
> properties are defined explicitly.

The source data structures are not instances of a formal ontology. Nothing like that
has been said. So their concepts are not all made explicit. I do not see a particular
reason to provide an example HERE.

Neither has anyone said that the schema matching process would be automatic!
We require only automatic DATA transformation.
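
The distinction can be sketched as follows, under stated assumptions: the
legacy field names, the mapping table `FIELD_TO_PROPERTY` and the function
`transform` are hypothetical, and the CRM property labels are used only as
plain strings. Writing the table is the manual schema-matching step; applying
it to each data record is the automatic part.

```python
# Hand-written schema mapping (hypothetical legacy field names).
# Authoring this table is the manual step; applying it is automatic.
FIELD_TO_PROPERTY = {
    "inventory_no": "P1_is_identified_by",
    "object_kind": "P2_has_type",
    "description": "P3_has_note",  # free text stays free text
}

def transform(record_id, record):
    """Turn one legacy data record into a sorted list of triples."""
    triples = []
    for field, value in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            continue  # fields without a mapping are skipped in this sketch
        triples.append((record_id, prop, value))
    return sorted(triples)

legacy = {
    "inventory_no": "A-17",
    "object_kind": "vase",
    "description": "Red-figure vase, chipped rim.",
    "shelf": "12b",  # no CRM counterpart in the mapping above
}

for triple in transform("obj_1", legacy):
    print(triple)
```

The free-text description is carried over as a note without any attempt to
structure it further, in line with the remark on section 1.1 above.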

> 
> Section 1.4 / 1. Paragraph / 3. Sentence ("As long as these concepts can
>  be encoded as instances of E55 Type (i.e. as terminology) ..."):
> If a concept is encoded as instances of E55 Type in the sense of a term,
> it has the notion of a lexical concept. As such it can not have the same
> (suitable) properties as the original data item. Shouldn't we update
> this to the currently (in London) updated definition of E55?

You misinterpret the English text. The "suitable properties" are used to connect the
term to the data item (such as "has type", or "in the role of"). For me, the text
is unambiguous. Any other opinions?
> 
> Example: The term Artist (i.e. the propositional form Artist(x) in
> Cristian-Emil's proposal) in a thesaurus does not have a birth date,
> but instances of the subclass ARTIST of E21.Person (or whatever) of
> course inherit this property if the superclass has it.

See above.
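
The Artist case can be sketched in Python as a small illustration. The triple
layout, the identifiers and the helper `properties_of` are assumptions made for
this example, not an official serialisation. The term is an instance of E55
Type, the "suitable property" links the data item to the term, and the birth
stays attached to the data item.

```python
# Toy triples; identifiers are illustrative, not an official serialisation.
triples = {
    ("artist_term", "is_instance_of", "E55_Type"),
    ("person_1", "is_instance_of", "E21_Person"),
    ("person_1", "P2_has_type", "artist_term"),  # connects the item to the term
    ("person_1", "P98i_was_born", "birth_1"),    # stays on the data item
}

def properties_of(subject):
    """Property names used on `subject`, ignoring the class membership link."""
    return sorted(p for s, p, o in triples if s == subject and p != "is_instance_of")

print(properties_of("artist_term"))  # prints []
print(properties_of("person_1"))     # prints ['P2_has_type', 'P98i_was_born']
```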
> 
> 
> 
> -Section 1.4 second paragraph: The first sentence could be easily
> misinterpreted in a way that you have to consider at least only one
> CRM-concept to build an export-compatible data structure. A reference to
> the reduced CRM-compatible form defined in 1.3 should be added at this
> point.
No backward references to definitions are needed in such a text. The reader should not sleep...

I chose the term "represented" for "being mapped to". Of course, what sense would it have otherwise?

What about:

"Note that not all CRM concepts may need to be matched with some elements of an export-compatible data structure."

I fear that this is more arcane...

> 
> -Section 1.4 third paragraph: First occurrence of the term
> "record". Is this a "data record" from a database?
> Or just "record" in everyday language, i.e. a written account of
> something.  
Please let us know what reasonable alternative explanation could exist.

> This points to a major problem in the whole document: It is
> not clear whether the terms are used as in commonsense language or
> whether they are used terminologically, i.e. in  a scientific way.

OK, "data record". Any other ambiguities in the "whole document" ?
> 
> 
> -Section 1.4 5. paragraph: How does the reduction work, if
> we declare that a classification must be implemented "without
> loss of meaning"? (relation to Page 2, 2nd before 1.3, beginning with:
> "In the context")
Obviously, by a controlled loss as described here.
> 
> "Loss of meaning" can be stated only between formal systems --- if we
> refer to text, "meaning" and "loss of meaning" can be argued about,
> but there may be the situation that no agreement will be found.

This is no AI text. There exist scholars that understand natural language,
and their domain, and can say if a translation captures sufficiently
what they wanted to express.

We do not share your concept of meaning. For us, a formal system has
no meaning on its own. Only a human can associate meaning with formally
stated concepts. Constraints are not meaning.
> 
> 
> -Section 1.5 first sentence: "all user data". How does this information
> system work if "not all CRM concepts may be represented by elements of
> an export-compatible data structure"? (Section 1.4, 2nd paragraph)

I don't understand the association. These two sentences have nothing to
do with each other.

> 
> -Section 1.5 2. paragraph: A "partially import-compatible data
> structure" is not yet defined by this document.
This IS the definition. It says: "An information system is partially export compatible if ..."
> 
> -Section 1.5 3. paragraph: What are "generic data"?
The data that the system is designed for. The point is that someone may devise a
container, e.g. a flat file, to import and export CRM data in parallel to the
generic data.

> 
> -Section 1.5 5. paragraph: What is a "semantic reduction"?

I introduced "semantic" in 1.4, 5th paragraph:
"...been exported into a CRM compatible form by semantic reduction to CRM concepts"

to make the reference more clear.
> 
> -Section 1.5 6. paragraph, last sentence: Do they choose on CRM basis?
What should that mean?

I propose to change the phrase to:
Note that local information system providers may choose to make their systems import-compatible with the CRM
> 
> -Section 1.5 figure XXX: The meaning of figure XXX is not really clear
> to us. E.g. What does the data export arrow to the left side mean? The
> paragraph beneath the figure claims that it shows only *some* of the
> data flow patterns. It is not clear which patterns are shown and which
> not (and why). If we really like to have figures in this document, there
> should be one figure which shows all the data flows mentioned
> (export-compatible, import-compatible, access-compatible). This figure
> can easily become too complex, so that three figures, one for each
> defined pattern, would be necessary.

I regarded a single overview as far better. The figure is only a help, not a formal
part. Please provide an alternative we can vote on.
> 
> Section 1.6, For export-compatible, a.: "other than" Does this mean all
> other concepts except E1 and E77?

That is how I understand the English.
> 
> We additionally found some questions (sent by email) that are still left
> open in the V4 document. E.g.:
> -"Obviously there is also an implicit research issue: How to define a
> mapping that proves incompatibility. Help from the computer science
> community appreciated."
> -"Should we distinguish notions of intensional/extensional meaning?
> Should we introduce relationships that preserve meaning (equivalence,
> subsumption)?"
> -"Dear Nick,
> I think only you and Patrick can answer this question: What do other
> standards do about verification?"
> -Some questions raised at earlier stages of the document (V2) by Prof. Görz.

Sure, we deliberately do not want to resolve all the implementation
questions. The document must make clear the effect of the compatibility,
not all the possible means to show it.

Ultimately, the user decides if the data are correctly interpreted and
handled, not the programmer. The programmer has to find out how to
satisfy the user. The ISO text must describe the goal and effect, not the
how.
> 
> 
> As a whole there seem to be many points that need to be further
> clarified and discussed so that it is not ready to be forwarded
> to ISO. Nevertheless I see the benefit to have a standardized and
> detailed definition of crm-compatibility inside the iso-standard.
> A suggestion: Would it be possible to send the updated crm-definition
> with the old compatibility part to ISO this year and send a new
> compatibility part as an update of the ISO-standard next year? That
> would lower the deadline pressure of the discussion and could lead to a
> compatibility-definition we all agree on.

So far I have had overwhelmingly positive feedback on the document.
It does not help anybody to delay this text in favour of details that
could equally be added in the following years, or be part of a good
practice guide. Most of your comments pertain to rephrasing, which I hope I
have already resolved.

I do not see anything in your concerns that pertains
to the substance of the thing, except for the idea to define
"loss of meaning" by formal methods rather than by expert opinion,
which I vehemently advocate against, since this is not what the CRM was
made for.

I thank you very much for the scrutiny with which you read the text,
and for all your effort, which will lead to improvements.

Best,

Martin


> 
> 
> kind regards, on behalf of the authors
> Bernhard
> 
> _______________________________________________
> Crm-sig mailing list
> Crm-sig at ics.forth.gr
> http://lists.ics.forth.gr/mailman/listinfo/crm-sig


-- 

--------------------------------------------------------------
  Dr. Martin Doerr              |  Vox:+30(2810)391625        |
  Principal Researcher          |  Fax:+30(2810)391638        |
                                |  Email: martin at ics.forth.gr |
                                                              |
                Center for Cultural Informatics               |
                Information Systems Laboratory                |
                 Institute of Computer Science                |
    Foundation for Research and Technology - Hellas (FORTH)   |
                                                              |
  Vassilika Vouton,P.O.Box1385,GR71110 Heraklion,Crete,Greece |
                                                              |
          Web-site: http://www.ics.forth.gr/isl               |
--------------------------------------------------------------


