[Crm-sig] issue 223 "P43 has dimension" should apply to E1 Entity, not E70 Thing

Vladimir Alexiev vladimir.alexiev at ontotext.com
Wed Apr 17 12:27:52 EEST 2013

> As a consequence of moving the range from E70 to E1,
> should  also be to move the domain of the property P43 has dimension (is
> dimension of): E54 Dimension  from E70 to E1, in order to be consistent

Hi Athina!

I quite agree with you. 
See the thread from Nov 2011: http://lists.ics.forth.gr/pipermail/crm-sig/2011-November/001693.html
with subject ISSUE: "P43 has dimension" should apply to E1 Entity, not E70 Thing

I'm sorry I didn't know about the related issue 156 at the time.
I gave a number of examples from various areas, which were dismissed with Philosophical arguments.

When I asked directly about the inconsistency between the range of P39 and the domain of P43:
> how do you explain this: "P39 measured" points to E1, "P43 has dimension" is a shortcut thereof, but P43 points lower than E1.

This was answered with the unbeatable argument of Convenience:
> P43 represents the common shortcut observed in existing documentation systems that measure particular Things

** The fact that this wasn't entered into the issue list http://www.cidoc-crm.org/issues.php until now
** is quite worrisome with regard to proper process in the CRM SIG mailing list.

Let's hope that now that Athina has raised it again (and even "reserved" the next issue number), it will be entered in the list.


Wolfgang Schmidle:
> What exactly is the question? Is this more than a mere technicality after the corresponding change to P39?

Hi Wolfgang! 
Doesn't this bother you:
P39 lets you measure various entities, but P43 doesn't let you record the result for all of them (!?)

> Can one expect situations where this specific shortcut describes the Real World differently than the fully developed path?

The purpose of shortcuts is to enable lighter recording (not having to go through Events every time).
Whatever arguments were accepted for lifting the range of P39 apply equally to the domain of P43.
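
To make the shortcut relation concrete, here is a minimal Python sketch (my own illustration, not anything from the CRM documents). Triples are plain tuples, and expand_p43 / contract_to_p43 are hypothetical helpers that convert between the P43 shortcut and the full path through an E16 Measurement (P39 measured, P40 observed dimension). The node names are made up.

```python
# Triples are (subject, predicate, object) tuples; all identifiers are illustrative.

def expand_p43(triples):
    """Rewrite each P43 shortcut into the full path via an E16 Measurement."""
    expanded = []
    counter = 0
    for s, p, o in triples:
        if p == "P43_has_dimension":
            counter += 1
            m = f"_:measurement{counter}"  # hypothetical node for the implied event
            expanded.append((m, "rdf:type", "E16_Measurement"))
            expanded.append((m, "P39_measured", s))
            expanded.append((m, "P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded


def contract_to_p43(triples):
    """Derive the P43 shortcuts implied by full measurement paths."""
    measured = {s: o for s, p, o in triples if p == "P39_measured"}
    return [
        (measured[s], "P43_has_dimension", o)
        for s, p, o in triples
        if p == "P40_observed_dimension" and s in measured
    ]


# "group1" stands for an Actor (a group), which P39 can measure.
shortcut = [("group1", "P43_has_dimension", "dim1")]
full_path = expand_p43(shortcut)
```

The round trip contract_to_p43(expand_p43(x)) only makes sense for every measured entity if P43 accepts whatever P39 can point to, which is the consistency argument above.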

> I don't fully understand Issue 156. Was
> the change introduced in order to measure E2 Temporal Entities? Why can E77 Persistent Item and its subclass E39 Actor now be
> measured when they couldn't be measured before?

Easy to give examples:
- Temporal (activity): speed of a car
- Temporal (event): number of people killed by the eruption of Vesuvius
- Actor (person): height, IQ
- Actor (group): number of members.
  Here Stephen Stead answered that you should instead model all individual members, which is obviously untenable if you don't know them.
See the thread cited above for a lot more examples.

Athina> but can you measure an idea

I think you can measure various things about it, eg:
- popularity (with a given population and in given time, e.g. measured through sociometrics)
- impact (in a given intellectual community, e.g. measured by polls or bibliometrics)

> Are the other subclasses (E52 Time-Span, E53 Place and E54 Dimension itself) also supposed to be measured with P43 / P39?

Sure you can measure E53 Place.
As for E52 and E54, I won't venture a guess, but just because you COULD attach P43 and P39 to them doesn't mean you SHOULD.

The other options are worse:
- create a class Measurable, meant to be used as auxiliary class on an entity that is measured.
  In CompSci these are called "mixins", "traits" (in Scala) or "interfaces" (in Java)
  See http://en.wikipedia.org/wiki/Mixin 
  (this is a useful pattern, maybe merits a thread?)
- split the properties into P39_measure_actor, P39_measure_place... (horrible)
- reorg the class hierarchy to make a branch for measurable things (impossible)
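
For readers unfamiliar with the mixin pattern mentioned in the first option, here is a minimal Python sketch. The class names (Measurable, Actor, Group, Idea) are hypothetical stand-ins chosen for illustration; they are not CRM classes and this is not a proposal for how the CRM should be implemented.

```python
class Measurable:
    """Mixin: adds dimension recording to any class that inherits it."""

    def add_dimension(self, kind, value, unit):
        # Create the list lazily so host classes need no special __init__.
        if not hasattr(self, "_dimensions"):
            self._dimensions = []
        self._dimensions.append((kind, value, unit))

    def dimensions(self):
        return list(getattr(self, "_dimensions", []))


class Actor:
    def __init__(self, name):
        self.name = name


class Group(Actor, Measurable):
    """An actor that has been declared measurable by mixing in Measurable."""


class Idea:
    """Not mixed in, so it has no add_dimension method."""


club = Group("Chess club")
club.add_dimension("number of members", 42, "count")
```

The cost of this option is visible even here: every class that might ever be measured has to opt in explicitly, which is why widening the domain seems simpler.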

Domains and ranges should describe POSSIBLE applications as precisely as possible, 
but should err on the side of wider inclusion, not on the side of exclusivity.

> (Issue 159 was closed in 2008, but the change to P39 is apparently not yet included in the latest Cidoc version; why?)

Sure it is. Search in the standard for "P39" from the bottom up, and you'll find it:
Amendments to version 4.2.5
P39: Changes in the range and the scope note of P39
BEFORE: Range: E70 Thing
AFTER: Range: E1 CRM Entity

> example "Number of coins in a silver hoard" for E54 Dimension. So, is the process of counting just a specific kind of
> measurement, even if there is no measurement unit, i.e. an E54 with an E60 but no E58?

It's better to introduce a specific Dimensionless unit (not use a number without unit).
"Count" is definitely not the only Dimensionless unit, consider:
- pairs (for socks)
- packs vs master-boxes (for cigarettes)
- radian vs degree (for angle)

See http://qudt.org/  :
"Dimensionless Quantities, or quantities of dimension 1, are those for which all the exponents of the factors corresponding to the base quantities in its quantity dimension are zero. 
Counts, ratios and plane angles are examples of dimensionless quantities."

What are these "exponents" and "base quantities"? See table "SI Base and Derived Quantities and Units"
- kind Dimensionless (U) with unit Unity (U)
- kind Length (L) with unit Meter (m)
- kind Time (T) with unit Second (s)
etc etc
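
A minimal Python sketch of the "exponents" idea, under my own simplifying assumptions: a quantity kind is a tuple of exponents over just three base quantities (length, mass, time), and the helper names are invented for illustration. QUDT itself uses a richer model.

```python
# Exponents over three base quantities, in the order (length, mass, time).
LENGTH = (1, 0, 0)
TIME = (0, 0, 1)
DIMENSIONLESS = (0, 0, 0)


def multiply(a, b):
    """Multiplying quantities adds their exponents."""
    return tuple(x + y for x, y in zip(a, b))


def divide(a, b):
    """Dividing quantities subtracts their exponents."""
    return tuple(x - y for x, y in zip(a, b))


def is_dimensionless(d):
    """True when every base-quantity exponent is zero."""
    return all(e == 0 for e in d)


speed = divide(LENGTH, TIME)     # length / time
angle = divide(LENGTH, LENGTH)   # arc length / radius, as for the radian
```

Here speed keeps non-zero exponents, while angle ends up with all zeros, which is why plane angles are listed among the dimensionless quantities.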

> Is counting included in the description "can be
> measured by some calibrated means" in the Scope note of E54?

Yep. In fact counting is one of the most precise measurement methods ;-)

> Why not use E60 directly without E54?

Because you always need E54_Dimension.P2_has_type to describe WHAT you measured/counted. 

E.g. to distinguish Preferred from Common shares of a company (in Turtle notation):

  <company> a E40_Legal_Body.
  <dim1> a E54_Dimension; P43i_is_dimension_of <company>;
     P91_has_unit <thesaurus/units/count>; P90_has_value 1000; P2_has_type <thesaurus/corporate/shares/preferred>.
  <dim2> a E54_Dimension; P43i_is_dimension_of <company>;
     P91_has_unit <thesaurus/units/count>; P90_has_value 5000; P2_has_type <thesaurus/corporate/shares/common>.

BTW, we could use blank nodes if we don't care about the URIs of the two dimensions.
We also don't need to mention the class if we rely on the domain inferencing of RDFS.
So we can shorten this to:

  <company> a E40_Legal_Body;
    P43_has_dimension [P91_has_unit <thesaurus/units/count>; P90_has_value 1000; P2_has_type <thesaurus/corporate/shares/preferred>];
    P43_has_dimension [P91_has_unit <thesaurus/units/count>; P90_has_value 5000; P2_has_type <thesaurus/corporate/shares/common>].
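
The same point as a small Python sketch, assuming a hypothetical Dimension record whose fields stand in for P90_has_value, P91_has_unit and P2_has_type (the field and function names are mine, chosen for illustration):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    value: int   # stands in for P90_has_value
    unit: str    # stands in for P91_has_unit
    kind: str    # stands in for P2_has_type


company_dimensions = [
    Dimension(1000, "count", "shares/preferred"),
    Dimension(5000, "count", "shares/common"),
]


def lookup(dimensions, kind):
    """Return the values recorded for one kind of dimension."""
    return [d.value for d in dimensions if d.kind == kind]


# Value and unit alone cannot tell us which number is which.
bare = [(d.value, d.unit) for d in company_dimensions]
```

With only the bare (value, unit) pairs, both entries are just "some count"; the type is what lets lookup answer "how many preferred shares?".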


> Is there, or should there be, a connection to P57 "has number of parts"? This seems to imply a counting process.

Yeah, P57 is an odd buddy. Thanks for opening another bag of worms :-)

Scope note says:
P57_has_number_of_parts: "Normally, the parts documented in this way would not be considered as worthy of individual attention.
For a more complete description, objects may be decomposed into their components and constituents using P46 is composed of (forms part of) and P45 consists of (is incorporated in). 
This allows each element to be described individually."

Does that mean P57_has_number_of_parts should NOT be used together with P46_is_composed_of ?

When modeling RKD paintings, we DID use them together (a painting plus its frame equals 2 parts).
But then we decided with Martin that since a lot of paintings didn't have parts (no info about frame), it's wasteful to model the painting itself as a part:

<obj/2926> a crm:E22_Man-Made_Object, crm:E84_Information_Carrier;
  crm:P2_has_type rkd-object:painting;
  crm:P57_has_number_of_parts 1;
  crm:P46_is_composed_of <obj/2926/part/2>. # auxiliary part/2: frame
<obj/2926/part/2> a crm:E22_Man-Made_Object;
  crm:P2_has_type rkd-object:frame.

And then even such a basic question was unclear to us: given a painting with a frame, should P57_has_number_of_parts be 1 (as above) or 2 (as we had previously)?

> Should P57 be a shortcut, too?

I agree, P57_has_number_of_parts is one particular Dimension of a physical object.

> (When counting words in a text, is there a measurement unit or not?)

Yep, it's "number of words". Different from "number of chars", "non-space chars", "pages", etc.
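
A minimal Python sketch of why the "what was counted" label matters: the same text yields different numbers depending on the unit. The function name and the exact counting rules (whitespace-separated words, every character including spaces) are my assumptions, chosen for illustration.

```python
def text_counts(text):
    """Return several counts for one text, keyed by what was counted."""
    return {
        "words": len(text.split()),
        "chars": len(text),
        "non-space chars": sum(1 for c in text if not c.isspace()),
    }


counts = text_counts("number of coins in a silver hoard")
# counts == {"words": 7, "chars": 33, "non-space chars": 27}
```

Three different values for one text, so a bare number without its unit would be ambiguous.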

> Why does P57 not apply here, apart from its domain being E19 Physical Object?

P57 is not very powerful: you can only use it to count parts that are the "obvious" and "only" constituents of an object. 
You cannot even say what kinds of parts you counted.
E.g. you would use it for "hoard - coins" ONLY if we have info about 1 kind of coins (no info about denominations), and no other type of object in the hoard. 

Words are not the only constituents of a text.
There are many (in some cases overlapping) hierarchies, eg chars, sentences, paragraphs, pages, chapters...

- The just released "Recommendations for the representation of hierarchical objects in Europeana"
has examples about this (e.g. Hierarchical description of objects from libraries)
- BI (Warehousing) also uses this all the time. E.g. build a Time dimension including Week, Month, Season, Quarter, Year, Fiscal Year...

Thanks for these thought-provoking questions!
