[crm-sig] CRM XML mapping utility

Nicholas Crofts nickcrofts at yahoo.com
Thu Aug 30 19:20:22 EEST 2001

Dear colleagues,

The long awaited CRM XML mapping utility I promised in
Barcelona is now available. I have mailed it to Martin
who will put it on the web site shortly. It's an
Access application, and requires Access to be
installed on your computer to run. The application is
written for Access '97. If you have a more recent
version you will be prompted to convert it when you
first open the file. I'd suggest saving the converted
version with a different name.

The application is intended to be simple to use and
has only one form. This form allows you to enter the
data needed to define detailed mappings from a source
format, such as Dublin Core or FRBR, to the CRM. This
definition can then be used to generate an XML file,
based on Martin's XML template (cf. annexe below).

The screen is a standard "Master - Detail" form. The
top half of the screen is the "master" section, the
lower half the "detail". The screen also separates
(more-or-less) left and right into source and target.
The comment fields are an exception here since they
span the entire width.

The fields at the top of the screen (the "master"
section) are used to define a high level source entity
such as a table, and the equivalent CRM entity to
which it maps. A "source condition" field and a "CRM
constraint" field allow restrictions to be placed on
the mapping. For example, a DC record can represent
different types of entities, so you would need to
express a condition such as "if DC.type = physical
object" to define the context of the mapping. A
comments field allows you to enter remarks. You should
create separate records for each high level entity
which needs to be mapped and for each "interpretation"
which can be made.

The lower half of the screen contains a list of "link
maps", which are concerned with the "detail" of
individual fields and columns. Any number of
individual link maps can be entered - just use the
"elevator" on the right to scroll through the list. An
asterisk in the "selector" at the left of this screen
section indicates that you are positioned on a new
entry. You should create separate link maps for each
field and for each distinct "interpretation".

Each link map consists of the following fields: the
"source element" is the name of the field or other
element being mapped. "Source condition" is used to
indicate any constraints which need to be placed on
this field mapping - in much the same way as the
master level field at the top of the screen. "Source
Path" expresses the relationship between the high
level source entity (table) and the field being
mapped. On the right hand (target) side of the screen,
"CRM entity" is the CRM entity to which the source
element maps. (nb source *fields* generally map to CRM
*entities* due to our "object oriented" approach -
almost everything is an entity in the CRM.) "CRM
constraint" is, again, a rule applied as a restriction
to the mapping. "CRM Path" consists of two fields, a
CRM property and a CRM entity, which together form the
"pathway" needed to arrive at the target entity. For
example, given a mapping of DC.Title.Lang (the
language used for a title in Dublin Core), we can
indicate that the target entity is E56 Language. The
path to get there goes through the property "P72 has
language (is language of)", which is a property of
"E35 Title".  "Constraint" is yet another constraint
field limiting, this time, the intermediate entity.
Finally, the "comment" field allows you to make
remarks about this link map.
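
As an illustration only, the fields of a link map described above
could be modelled along the following lines in Python. The class and
attribute names (LinkMap, DomainMap, crm_path_property and so on) are
labels invented here for the form fields; they are not taken from the
Access application or from the DTD.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LinkMap:
    """One 'detail' row: how a single source field maps to the CRM."""
    source_element: str            # name of the source field being mapped
    crm_entity: str                # target CRM entity
    source_condition: str = ""     # restriction on the source side
    source_path: str = ""          # relation between source entity and field
    crm_constraint: str = ""       # restriction on the target entity
    crm_path_property: str = ""    # CRM property leading to the target
    crm_path_entity: str = ""      # intermediate entity owning that property
    interm_constraint: str = ""    # restriction on the intermediate entity
    comment: str = ""

    def pathway(self) -> str:
        """Return intermediate entity, property and target as one line."""
        parts = (self.crm_path_entity, self.crm_path_property, self.crm_entity)
        return " -> ".join(p for p in parts if p)


@dataclass
class DomainMap:
    """One 'master' record together with its list of link maps."""
    source_entity: str
    crm_entity: str = ""           # left blank below: the text names none
    source_condition: str = ""
    crm_constraint: str = ""
    comment: str = ""
    link_maps: List[LinkMap] = field(default_factory=list)


# The DC.Title.Lang example from the text.
title_lang = LinkMap(
    source_element="DC.Title.Lang",
    crm_entity="E56 Language",
    crm_path_property="P72 has language (is language of)",
    crm_path_entity="E35 Title",
)

dc = DomainMap(
    source_entity="DC",
    source_condition="DC.type = physical object",
    link_maps=[title_lang],
)

print(title_lang.pathway())
```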

Several fields have lists provided, of CRM entities
and CRM properties. These are drawn from ver 3.2 of
the CRM. I regret that the list of properties in the
CRM path field does not reduce automatically according
to the entity selected. This would be possible but
would require a list of all properties of all
entities, which I had difficulty generating from the
existing documentation. If anyone feels like modifying
the application, the list of properties is stored in a
table called, not surprisingly, CRM_PROPERTIES. 

Three buttons, with rather naff icons, are sitting
under all the fields. These allow you to:
1. trash the current record - both the domain mapping
and all the link maps,
2. export the entire database to an XML file (that was
what all this was for, after all),
3. close the application.

The exported XML file can be opened directly in IE
versions 5.x.
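
For readers who want a feel for the shape of the export, here is a
rough Python sketch that builds one domain map as XML. It is not the
code used by the Access application. The element names domain_map,
src_domain_entity, src_domain_condition, src_range_entity and
target_path and the "cle" attribute come from this message; link_map,
interm_entity, property and crm_range_entity are assumed names chosen
for the example, as is the function export_domain_map itself.

```python
import xml.etree.ElementTree as ET


def export_domain_map(cle, src_entity, src_condition, links):
    """Build a <domain_map> element from plain Python values.

    `links` is a list of dicts with the keys 'src_range_entity',
    'property', 'interm_entity' and 'crm_entity'.
    """
    dm = ET.Element("domain_map", {"cle": str(cle)})
    ET.SubElement(dm, "src_domain_entity").text = src_entity
    ET.SubElement(dm, "src_domain_condition").text = src_condition
    for link in links:
        lm = ET.SubElement(dm, "link_map")                    # assumed name
        ET.SubElement(lm, "src_range_entity").text = link["src_range_entity"]
        path = ET.SubElement(lm, "target_path")
        ET.SubElement(path, "interm_entity").text = link["interm_entity"]  # assumed
        ET.SubElement(path, "property").text = link["property"]            # assumed
        ET.SubElement(lm, "crm_range_entity").text = link["crm_entity"]    # assumed
    return dm


element = export_domain_map(
    cle=1,
    src_entity="DC",
    src_condition="DC.type = physical object",
    links=[{
        "src_range_entity": "DC.Title.Lang",
        "property": "P72",        # ids only, without the English names
        "interm_entity": "E35",
        "crm_entity": "E56",
    }],
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```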

I took the text for all the pop up bubbles from
Martin's notes in the DTD, which is hidden on the web
site server
(http://cidoc.ics.forth.gr/docs/crm_mapping_dtd.txt).
Incidentally, Martin, this DTD is not a valid XML
document ;-)

For anyone unfamiliar with MS Access it is worth
pointing out that *everything* you enter or modify is
automatically saved, hence the absence of any "save"
button. This can be a little disconcerting at first.
Just skip on to the next record and back again if you
want to reassure yourself that your work really has
been recorded.

Remarks on the XML template:

I find that the number of "constraint" fields makes
the mapping difficult to follow. I think it might be
sufficient simply to place these constraints in the
free text comments fields since they are unlikely to
be "machine useable".

Two constraint fields appear in the link map :
<src_domain_condition></src_domain_condition> and
I left out the first, assuming it to be an error.

The constraint field:
<crm_domain_constraint></crm_domain_constraint>
appears in the target path. This conflicts with the
notes in the DTD so I renamed it to
<crm_interm_constraint> for consistency.

The notes for "Target path" imply a linked list of
intermediate entities and properties. As it stands,
the template does not support this properly since all
the intermediate entities and properties are placed
within the same <target_path> container and there is
no guarantee that the order will be preserved. At
present the application deals with only one
intermediate entity and property. Since this "path"
information is mainly informative, perhaps the comment
field is sufficient for explaining the detail ?
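
One possible way of making the order of a longer path explicit,
offered purely as a sketch, is to wrap each property/entity pair in
its own numbered element. The element names step, property and entity
and the attribute n are hypothetical and do not appear in the template
or the DTD; "Pxx" is a placeholder for whichever property links the
domain entity to E35.

```python
import xml.etree.ElementTree as ET


def build_target_path(steps):
    """Wrap each (property, entity) pair in its own numbered <step>."""
    path = ET.Element("target_path")
    for n, (prop, entity) in enumerate(steps, start=1):
        step = ET.SubElement(path, "step", {"n": str(n)})   # hypothetical
        ET.SubElement(step, "property").text = prop         # hypothetical
        ET.SubElement(step, "entity").text = entity         # hypothetical
    return path


def read_target_path(path):
    """Recover the steps in order, whatever order the children arrive in."""
    ordered = sorted(path.findall("step"), key=lambda s: int(s.get("n")))
    return [(s.findtext("property"), s.findtext("entity")) for s in ordered]


steps = [("Pxx", "E35"), ("P72", "E56")]
path = build_target_path(steps)

# Simulate a consumer that hands back the children in reverse order.
shuffled = ET.Element("target_path")
shuffled.extend(list(reversed(list(path))))

recovered = read_target_path(shuffled)
print(recovered)
```

Because each step carries its own position, the path can be rebuilt
even if a tool reorders the children of <target_path>.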

Martin's example mapping of DC.Title breaks
<src_domain_condition> into three separate elements :
<mapped_entity_of>, <op> and <value>. This break-down
doesn't figure in the template and I wasn't sure what
the operator types should be. To be consistent, a
similar approach ought to be applied to all the
constraint fields. However, this seemed like far too
much trouble so I just left it as a single element.
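
For what it is worth, splitting a simple condition into the three
elements used in Martin's example could look something like the
Python sketch below. The set of operators ("=" and "!=") is an
assumption, since the operator types are not defined anywhere, and
the choice of which part goes into <mapped_entity_of> and which into
<value> is likewise a guess. The function names are invented for the
example.

```python
import xml.etree.ElementTree as ET

# Assumed operator set; longer operators are listed first so that
# "!=" is not mistaken for "=".
OPERATORS = ("!=", "=")


def split_condition(text):
    """Split 'DC.type = physical object' into (left, operator, right)."""
    for op in OPERATORS:
        left, found, right = text.partition(op)
        if found:
            return left.strip(), op, right.strip()
    raise ValueError("no known operator in condition: " + text)


def condition_element(text):
    """Build a <src_domain_condition> with three child elements."""
    left, op, right = split_condition(text)
    cond = ET.Element("src_domain_condition")
    ET.SubElement(cond, "mapped_entity_of").text = left
    ET.SubElement(cond, "op").text = op
    ET.SubElement(cond, "value").text = right
    return cond


cond = condition_element("DC.type = physical object")
print(ET.tostring(cond, encoding="unicode"))
```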

Martin gives all references to CRM entities using both
id and name, in English. Since the CRM has now been
translated into French, and is being translated into
other languages, I decided to omit the names from the
XML listing and to use the id alone. Displaying the
full name in the user's language is a trivial matter
for an XSL style sheet.
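
The same lookup is shown here in Python rather than XSL, simply to
illustrate the idea of resolving an id to a name at display time. The
table LABELS and the function display_name are invented for this
example, and the French labels are illustrative and should be checked
against the official translation.

```python
# Names keyed by language, then by CRM id.
LABELS = {
    "en": {"E35": "Title", "E56": "Language"},
    "fr": {"E35": "Titre", "E56": "Langue"},   # illustrative, unverified
}


def display_name(crm_id, lang="en"):
    """Return 'id name' in the requested language, or the bare id."""
    name = LABELS.get(lang, {}).get(crm_id)
    return f"{crm_id} {name}" if name else crm_id


print(display_name("E56"))
print(display_name("E35", "fr"))
print(display_name("E35", "de"))   # no German table: falls back to the id
```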

If you study the XML output, you will also notice that
I have included a "cle" attribute in the domain_map.
This is just the internal id used by Access and can
safely be ignored for most purposes. However, it might
come in handy if you need to debug your mapping and
track down where some mysterious output is coming
from.

Since the src_domain_entity already specifies the high
level entity being mapped, the src_path element seems
superfluous. In the example given, DC->DC.Title is
implicit from
<src_domain_entity>DC</src_domain_entity> and
<src_range_entity>DC.Title</src_range_entity>. Indeed
<src_range_entity>Title</src_range_entity> would seem
to be sufficient. Are there cases in which the
src_path is non trivial?

You are free to use this application as you see fit,
to modify, mutilate or destroy it, and to distribute
it as you wish. My apologies if it causes you
headaches, but I accept no responsibility for any
damage it may cause to you or your computer. 

Best wishes

Nick Crofts

Annexe : Martin's XML template.

The following template illustrates the logical
structure of a mapping from some data structure
(schema, DTD, etc.) to the CRM: 

	<crm_domain_constraint></crm_domain_constraint>
	  <crm_domain_constraint></crm_domain_constraint>

Nicholas Crofts
rue David-Dufour 5
Case postale 22
CH - 1211 Genève 8
tél +41 22 327 5271
fax +41 22 328 4382