[crm-sig] Temporal concerns (general points)

Nicholas Crofts nickcrofts at yahoo.com
Tue Feb 5 14:26:42 EET 2002

Just to chip in my tuppence worth...

It seems to me that there are a couple of general
points to be made here about *complexity* which go
beyond the immediate question of how to map temporal

The first thing is that the CRM, like any proposed
solution to a problem, has to strike a balance between
sophistication, hence complexity, on the one hand, and
simplicity and lack of functionality on the other.
Increased complexity is the price we pay for a rich,
extensible and polyvalent solution. Lack of
functionality and inflexibility are the price of
oversimplification. Getting the balance right -
finding the optimal solution - is very hard, and I'm
sure we all have our own views on how close the CRM is
to achieving "optimal" status. I would argue that we
should be more explicit in discussing the
"cost/benefit" analysis of extensions and additions to
the CRM. In each case, is the added complexity

Secondly, we need to constantly remind ourselves that
the complexity of the CRM is inevitably greater than
that of any one information system since, within its
stated domain, it aims to cover the scope of all
existing (and possible?) systems. Within scope, and
reason, the CRM has to be able to cope with
*everyone's* data and *anyone's* point of view.
Compared with the optimal approach for a particular
requirement in a particular domain, the CRM is almost
bound to appear complex.

Finally, from a methodological point of view, there is
a natural tendency to give priority to dealing with the
most complex cases - following the reasoning that if
you can handle the difficult cases, the simple ones
are easy. (It could be argued that this approach to
dealing with complexity is the simplest ;-) However,
it does have the disadvantage that the "simple" cases
can end up looking complex. Shortcuts are, of course,
the general CRM answer to this. Perhaps we should be
more explicit in evaluating the general impact of
enhancements which are intended to deal with specific
complex cases: a large gain for a small minority of
cases might not be worth a small added inconvenience
that affects the majority.
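
A minimal sketch of the shortcut idea, offered as an illustration only and
not as part of the CRM: a flat "actor, birth year" record is expanded
automatically into the full path given in Tony's first example, quoted below.
The function name, the identifier scheme and the triple representation are
all hypothetical.

```python
def expand_birth_year(actor_id, year):
    """Expand a flat (actor, birth year) shortcut into the full path
    E39 Actor -> E63 Beginning of Existence -> E52 Time-Span -> E61 Time Primitive.

    Node identifiers are hypothetical: they are derived from the actor id
    so that the expansion is deterministic and needs no manual input.
    """
    event = f"{actor_id}/beginning-of-existence"   # E63 Beginning of Existence
    span = f"{event}/time-span"                    # E52 Time-Span
    return [
        (actor_id, "P92 was brought into existence by", event),
        (event, "P4 has time-span", span),
        (span, "P82 at most within", str(year)),   # E61 Time Primitive
    ]


# The data entry stays a single field; the intermediate nodes are generated.
for triple in expand_birth_year("actor:42", 1853):
    print(triple)
```

The point of the sketch is only that the intermediate nodes can be created
mechanically, so the person recording the date never has to see them.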

Just a thought on Tony's example. Martin suggested
that the string representation of a person's dates has
no semantic equivalent in the CRM. However, it struck
me that this string could plausibly be handled as an
*appellation* for a time-span (or period): the one
that is defined by the structured date of birth and
date of death.
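
A rough sketch of how that might look, under loudly stated assumptions: the
time-span node, the identifier scheme and all three relation labels are
placeholders invented for illustration, not CRM property names, and the
example values are made up.

```python
def lifespan_with_appellation(actor_id, birth_year, death_year, date_string):
    """Model the display string as a name for one time-span node.

    The span is bounded by the structured birth and death years; the
    contributor's display string is attached to it as an appellation.
    All labels below are hypothetical placeholders.
    """
    span = f"{actor_id}/lifespan"  # hypothetical time-span node
    return [
        (span, "begins in", str(birth_year)),     # from the structured birth date
        (span, "ends in", str(death_year)),       # from the structured death date
        (span, "is identified by", date_string),  # display string as appellation
    ]


for triple in lifespan_with_appellation("actor:42", 1853, 1890, "1853-1890"):
    print(triple)
```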

Best wishes for the meeting in California


--- martin <martin at ics.forth.gr> wrote:
> Hi Tony,
> I think this issue should be added to frequently
> asked questions.
> Let me comment a bit on that.
> Tony Gill wrote:
> > Folks,
> >
> > I'm a little concerned about the complexity of the temporal reasoning of
> > the CRM; it's getting increasingly powerful, offering all kinds of
> > temporal operators and sophisticated reasoning for disciplines that need
> > it (such as archaeology), but at the same time it's becoming increasingly
> > cumbersome for simple tasks, such as recording an individual's birth and
> > death dates.
> I agree that the temporal operators proposed are
> relatively complex, and we can
> rediscuss the need for them. They are however
> additional to the CRM version 2,
> and version 3.2. Indeed, the examples you give
> below are all features of the
> initial model, so I don't think that at the same
> time it's becoming increasingly
> cumbersome for simple tasks. The examples below had
> such mappings from the
> very beginning of the CRM.
> > As an example, here are some example mappings from the RLG Cultural
> > Materials data model (which I'm currently working on for the meeting):
> >
> > 1. Actor Start Date: Actor's birth year, if known.
> > E39 Actor <P92 brought into existence (was brought into existence by)> E63
> > Beginning of Existence <P4 has time-span (is time-span of)> E52 Time-Span
> > <P82 at most within> E61 Time Primitive
> Actually you can be more specific here, if it is a
> physical person who is born.
> Then you could use "Person" and "Birth". If the RLG
> model implies groups and
> persons, your mapping cannot be made more specific.
> > 2. Actor End Date: Actor's death year, if known.
> > E39 Actor <P93 took out of existence (was taken out of existence by)> E64
> > End of Existence <P4 has time-span (is time-span of)> E52 Time-Span <P82
> > at most within> E61 Time Primitive
> >
> > 3. Actor Date String: Text version of the actor's lifespan (birth and
> > death) dates for display, as provided by the contributor.
> > There is no mapping of this element's semantics to the CRM that I can see,
> > since there is no shortcut from actor to timespan without going through
> > beginning of existence and end of existence, and this element combines the
> > two concepts.
> There is a mapping for all text strings. The "has
> note" property. It can be specialized
> by a type. Text elements are not qualified like
> numerical database elements
> for precise formal queries. So we map those to "has
> note" in the CRM, which is
> neither less nor more formal, and attach them to the
> node we took them from.
> "display strings" are very application specific. If
> this is an issue, I propose to
> devote a discussion to that in the next meeting.
> > I'm not arguing that the model shouldn't support the sophisticated
> > temporal reasoning, but it seems a little perverse to have to use it for
> > something as simple as a person's birth and death dates... Couldn't we
> > have a direct short cut such as E21 Person <P4 has time-span (is time-span
> > of)> E52 Time-Span?
> >
> > Cheers,
> >
> > T.
> Maybe the first thing that comes to my mind is
> that the CRM is not a proposal for
> a metadata schema, i.e. it is not optimized for
> minimal storage and minimal data
> entry procedures of a specific application case.
> Rather, it is the common form, into which
> we can merge all data, independent from their
> source. In information integration
> systems, the global schema is normally virtual, only
> for formulating queries and for
> data transport between heterogeneous sources.
> The success of the CRM is to abstract from
> application-specific simplifications.
> Without doing that, it would fail as has any attempt
> before to create a common schema for
> the cultural domain.
> Concretely, I do not regard a person's birth date as a
> simple thing. Try to merge ULAN data
> with your database. ULAN does reasoning about all
> birth- and death related data. Try to
> automatically convert data, which do not have
> explicit birth events into a biography of the
> parents. Things that are simple for one application
> can make the next step of integration
> awfully complicated.
> If we register parents, birth date and birth place, 
> the intermediate node "birth", which seems
> cumbersome for a single date, becomes a welcome
> subunit to bring order into the data.
> Moreover, the CRM does not make use of "nested
> structures" as many programming languages
> and advanced databases do. The fact that we put
> "verbose" links between the intermediate nodes
> is not a complication for applications, but an
> explanation of their meaning. The textual length of
> a mapping path should not be taken for a measure of
> complexity, only the number of intermediate nodes,
> which could be avoided.
> E.g. if we take out the "Time-Span" entity, the four
> dates of the CIDOC Relational Model
> (begin of begin, end of begin, begin of end, end of
> end) would be flat among all other properties of
> Event, Period etc., a thing a C or Java Programmer
> wouldn't do. It would deprive us of the possibility
> to attach another Time-Span estimation for the same
> event. BTW, the ABC-Harmony model is
> equally or more indirect than the CRM. Actually
> instead of "Time-Span" and "Place", they
> use an intermediate node "context", which indirects
> to date and place, as Time-Span does with
> the dates. This is slightly more symmetric with respect to
> date and place, but introduces an intermediate
> node before the place as well.
> More important is that birth is a physical
> reality, and that we can have information about birth
> without dates. Then the simple task becomes
> insufficient, and we lose information in the
> easy schema or attach it to nodes not responsible
> for the information bits.
> The key element of the CRM is to make all events
> explicit. They seem to be the critical element
> for integration. To take out the birth and death
> events, which are really not marginal events, may be
> counterproductive.
> However, any short-cut you propose is compatible
> with the CRM, as long as you can define an
> AUTOMATIC transformation back into the CRM. As the
> CRM does not prescribe storage,
> there is nothing cumbersome about registering a
> birth date. You do it as before, but the
> mapping implies some intermediate steps, which are
> extremely helpful to merge data from
> completely different sources, and can be created
> automatically. If, however, you come across
> more complex data, the CRM form allows you to overcome
> the problem.
> The idea to associate a life-span with a Person is a
> new concept. So far, we tried to avoid
> "states" in favour of "events". The reason is, that
> the information about "events" is often
> more direct than about "states", i.e. more
> frequently associated with primary historical
> sources.
> Which single person would witness the whole life-span of
> another person? My understanding is that it is
> simpler from a logical point of view to
> regard a lifespan as a conclusion drawn from birth and
> death events than the other way round, in
> particular if there are many opinions on each event,
> as in the ULAN data. Therefore we
> recommend mapping a life-span to birth and death in a
> global schema.
=== message truncated === 

Nicholas Crofts
rue David-Dufour 5
Case postale 22
CH - 1211 Genève 8
tél +41 22 327 5271
fax +41 22 328 4382
