[Crm-sig] A simplification of E55 Type

Guenther Goerz guenther.goerz at gmail.com
Fri May 9 17:46:09 EEST 2008


Dear Martin,

a few days ago I tried to send the following message to the crm-sig list, but
I received a notice that it was not distributed (probably due to a large
attachment).  A day later I tried to send it without the attachment, but I have
not yet received it as a member of the list, so I don't know whether it has been
distributed.  I thought it would be useful if people, at least those who meet
in Heraklion, could read it in advance.  Although it may be too late now,
would you please distribute the text to the list members if the current CC:
fails again?

Thanks,
-- Guenther


Dear colleagues,

following up on my criticism of the presentation of E55 Type in the
current CRM document, as given in the paper I read at the last SIG
meeting in Nuremberg, let me present a reformulation of the sections
"About Types" and "Extensions" in the preamble and of the scope note
of E55 Type for discussion in Heraklion.

My intention was to change the existing text as little as necessary.
Certainly, there are opportunities for a much more radical
modification, but currently I don't see a real need for that.
Nevertheless, I would be grateful if some of the native speakers would
suggest stylistic improvements; many thanks in advance.

The idea behind the changes is a simplification of the text and a
demystification w.r.t. the use of metaclasses, for the need of
which nobody has been able to present a convincing example up to now.  My
claim is that in the field of documentation the idea of interfacing to
external classes as (sub-)classes provides sufficient expressiveness.
I kindly ask all colleagues to recognize that this suggestion introduces
a true simplification to the practice of documentation.

In particular, E55 Type was described up to now as a metaclass which
is a subclass of a normal CRM class and where at the same time
external, domain specific classes should be introduced as instances of
it.  Independent of the inconsistency of the definition in the CRM
document, such a use of metaclasses is undesirable not only because it
makes matters unnecessarily complicated, but also because it makes
the language undecidable (that is essentially the difference between
OWL Full and OWL-DL!); i.e. E55 Type cannot be used in any practical
implementation in the way the document describes it.

For further arguments, those who did not attend the Nuremberg meeting
may consult the attached manuscript of my talk.

(A remark in parentheses for specialists only: Take the well-known CRM
Core example "Almond Blossom" where E21 Person = "Vincent van Gogh" -->
P2 has type --> E55 Type "Artist".  Let "Artist" be defined in a
domain ontology of Fine Arts.  According to my suggestion "Artist"
would be an (external) subclass of E55 Type.  In any decent reasoning
system we can infer on the class level, i.e. intensionally, with the
concept "Artist" ("categorically", as Martin would say), as well as on
the instance level, i.e. extensionally, with the set of all
Artists. Full stop.  This seems to me to be methodologically clean and
obvious to USE for any practitioner.)
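
The two levels of inference in the remark above can be illustrated with
a small sketch.  It is only a toy model using plain Python dictionaries,
not an OWL reasoner; the function names and the dictionary layout are
assumptions made for illustration and do not come from the CRM document.

```python
# Toy knowledge base (illustrative only, no OWL reasoner involved).
# Subclass links: class -> direct superclass.
SUBCLASS_OF = {
    "E55 Type": "E28 Conceptual Object",  # from the E55 class declaration
    "Artist": "E55 Type",                 # external concept, as proposed
}

# P2 has type assertions: entity -> type.
P2_HAS_TYPE = {
    "Vincent van Gogh": "Artist",
}


def superclasses(cls):
    """Class level: all direct and indirect superclasses, nearest first."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_subsumed_by(cls, ancestor):
    """Class level: True if cls equals ancestor or is a subclass of it."""
    return cls == ancestor or ancestor in superclasses(cls)


def extension(type_name):
    """Instance level: entities typed with type_name or one of its subtypes."""
    return {
        entity
        for entity, assigned in P2_HAS_TYPE.items()
        if is_subsumed_by(assigned, type_name)
    }
```

Here is_subsumed_by("Artist", "E55 Type") works on the concept alone,
while extension("Artist") collects the individuals documented with that
type.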

Any place in the document where E55 Type is mentioned has to be
checked for compatibility with the proposed text, i.e. drop all
metaclass claims and provide compatibility with all properties, where
E55 appears as domain or range: P24, P32, P42, P71, P101, P103, P125,
P127 (see above), and P135, as well as all Pnnn.1 .

Best regards,
-- Guenther Goerz
------------------------------------------------------------------------
Prof. Dr. Guenther Goerz            Fon: (+49 9131) 852-8701; -8702
Univ. Erlangen-Nuernberg            Fax: (+49 9131) 852-8986
Institut f. Informatik 8/KI         goerz  AT informatik.uni-erlangen.de
Haberstrasse 2                      ggoerz AT csli.stanford.edu
D-91058 ERLANGEN
              http://www8.informatik.uni-erlangen.de/inf8/en/goerz.html



NEW TEXT Preamble
----------------------------------------------------------------------
About Types

Virtually all structured descriptions of museum objects begin with a
unique object identifier and information about the `type' (or `kind')
of the object, often in a set of fields with names like `Object Type,'
`Object Name,' `Category,' `Classification,' etc.  All these fields
are used for terms that declare that the object is a member of a
particular class or category of items in the particular domain of
interest.  As an interface to such domain-specific categories, the CRM
provides the generic class E55 Type, such that a connection between the
CRM and the particular domain category as its subclass can be
established.

The class E1 CRM Entity is the domain of the property P2 has type (is
type of), which has the range E55 Type.  Consequently, every class in
the CRM, with the exception of E59 Primitive Value, inherits the
property P2 has type (is type of).  This provides a general mechanism
for refining the classification of CRM instances to any level of
detail, by linking to external vocabulary sources, thesauri,
classification schema or ontologies that function as extensions to the
CRM class and property hierarchies.  The external vocabularies
themselves do not fall within the scope of the CRM.

The class E55 Type also serves as the range of properties that relate
to categorical knowledge commonly found in cultural documentation.
For example, the property P125 used object of type (was type of object
used in) enables the CRM to express statements such as `this casting
was produced using a mould', meaning that there has been an unknown or
unspecified instance of `mould' that was actually used.  This enables
the specific instance of the casting to be associated with the entire
type of manufacturing devices known as moulds.  Furthermore, the
objects of type `mould' would be related via P2 has type (is type of)
to this term.  This indirect relationship may actually help in
detecting the unknown object in an integrated environment.  On the
other side, some casting may refer directly to a known mould via P16
used specific object (was used for).  So a statistical question to how
many objects in a certain collection are made with moulds could be
answered correctly (following both paths through P16 used specific
object (was used for) - P2 has type (is type of) and P125 used object
of type (was type of object used in).  This consistent treatment of
categorical knowledge significantly enhances the CRM's ability to
integrate cultural knowledge.
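
The two query paths described above can be sketched as follows.  This is
a minimal illustration under stated assumptions: the production
activities, the objects "mould 17" and "chisel 3", and the function name
made_with are all hypothetical, and the properties are represented as
plain Python dictionaries rather than in any particular CRM
implementation.

```python
# Hypothetical sample data (not from the CRM document).
# P125 used object of type: activity -> types of object used.
P125_USED_OBJECT_OF_TYPE = {
    "production of casting A": {"mould"},
}
# P16 used specific object: activity -> known objects used.
P16_USED_SPECIFIC_OBJECT = {
    "production of casting B": {"mould 17"},
    "production of casting C": {"chisel 3"},
}
# P2 has type: object -> its types.
P2_HAS_TYPE = {
    "mould 17": {"mould"},
    "chisel 3": {"chisel"},
}


def made_with(type_name):
    """Return the activities that used an object of the given type."""
    found = set()
    # Path 1: P125 names the type directly.
    for activity, types in P125_USED_OBJECT_OF_TYPE.items():
        if type_name in types:
            found.add(activity)
    # Path 2: P16 names a specific object, whose type is reached via P2.
    for activity, objects in P16_USED_SPECIFIC_OBJECT.items():
        if any(type_name in P2_HAS_TYPE.get(obj, set()) for obj in objects):
            found.add(activity)
    return found
```

Counting only one of the two paths would miss either casting A or
casting B; the union of both paths gives the complete answer.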

Some properties in the CRM are each associated with an additional
qualifying property; these are numbered with a ".1" extension in the
CRM documentation.  They do not appear in the property hierarchy list
but are included as part of the property declarations and referred to
in the class declarations.  For example, P62.1 mode of depiction: E55
Type is associated with the property E24 Physical Man-made Thing.  P62
depicts (is depicted by): E1 CRM Entity.  The range of these properties
of properties always falls within the type hierarchy E55 Type.  Their
purpose is to allow dynamic extensions to their parent property
through the use of property subtypes declared as subclasses of E55
Type.  This function is analogous to that of the P2 has type (is type
of) property, which all CRM classes inherit from E1 CRM Entity.
System implementations and schemas that do not support properties of
properties may use dynamic subtyping of the parent properties instead.


Extensions

Since the intended scope of the CRM is a subset of the `real' world,
it has been designed to be extensible through the linkage of
compatible external type hierarchies.

Compatibility of extensions with the CRM means that data structured
according to an extension must also remain valid as a CRM instance.  In
practical terms, this implies query containment: any queries based on
CRM concepts should retrieve a result set that is correct according to
the CRM's semantics, regardless of whether the knowledge base is
structured according to the CRM's semantics alone, or according to the
CRM plus compatible extensions.  For example, a query such as `list all
events' should recall 100% of the instances deemed to be events by the
CRM, regardless of how they are classified by the extension.

A sufficient condition for the compatibility of an extension with the
CRM is that CRM classes subsume all classes of the extension, and all
properties of the extension are either subsumed by CRM properties, or
are part of a path for which a CRM property is a shortcut.
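
Query containment and the class part of the sufficient condition can be
sketched as follows.  This is a hedged illustration only: "Excavation"
is a hypothetical extension class, the instances are invented, the CRM
hierarchy is reduced to the single class E5 Event, and the condition on
properties is not modelled.

```python
# CRM classes known to this toy model (heavily abridged).
CRM_CLASSES = {"E5 Event"}

# Subclass links: class -> direct superclass.
# "Excavation" is a hypothetical extension class.
SUBCLASS_OF = {"Excavation": "E5 Event"}

# Invented instances: instance -> most specific class.
INSTANCE_OF = {
    "battle 1": "E5 Event",
    "dig 1": "Excavation",
}


def ancestors(cls):
    """Return cls followed by all of its direct and indirect superclasses."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def classes_are_subsumed(extension_classes):
    """Check that every extension class is subsumed by some CRM class."""
    return all(
        any(a in CRM_CLASSES for a in ancestors(cls))
        for cls in extension_classes
    )


def query(crm_class):
    """List all instances of crm_class, including those typed by extensions."""
    return {
        instance
        for instance, cls in INSTANCE_OF.items()
        if crm_class in ancestors(cls)
    }
```

With these data, query("E5 Event") returns both "battle 1" and "dig 1",
so the CRM-level query recalls the event even though the extension
classifies it as an excavation.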

----------------------------------------------------------------------

NEW TEXT E55 Scope Note
----------------------------------------------------------------------

E55
Type

Subclass of:    E28 Conceptual Object
Superclass of:  E56 Language
               E57 Material
               E58 Measurement Unit

Scope note:

The (generic) class E55 Type comprises arbitrary concepts (universals)
external to the CRM and hence provides an interface to such domain
specific concepts (categories).  In this fashion, a connection between
the CRM and a particular (external) domain concept as its subclass
can be established.

This hierarchical relation allows for additional refinement, through
subtyping, of those classes which do not require further analysis of
their formal properties, but which nonetheless represent typological
distinctions important to a given user group.  The interpretation of
these subtypes (subclasses) is based on the agreement of specific
groups.

E55 Type reflects the characteristic use of the term `object type' for
naming data fields in museum documentation and particularly the notion
of typology in archaeology.  It has, however, nothing to do with the
term `type' in Natural History (cf. E83 Type Creation), although it
does include the notion of a `taxon'.

Ideally, (external) subclasses of the class E55 Type should be
organised into thesauri, with scope notes, illustrations, etc. to
clarify their meaning.  In general, it is expected that different
domains and cultural groups will develop different thesauri in
parallel.  Consistent reasoning on the expansion of subterms used in a
thesaurus is possible insofar as it conforms to both the classes and
the hierarchies of the CRM.

E56 Language, E57 Material and E58 Measurement Unit have been defined
explicitly as elements of the E55 Type hierarchy because they are used
categorically in the CRM without reference to instances of them,
i.e. the CRM does not foresee the description of instances of
instances of them.  For example, the property instance `P45 consists
of: gold' does not refer to a particular instance of gold.

Examples: ... (remains unchanged)

Properties: ... (remains unchanged)
----------------------------------------------------------------------

