[Crm-sig] Question: How to model a 'file'

George Bruseker george.bruseker at gmail.com
Wed Apr 15 20:16:55 EEST 2020

Dear all,

Here is another humble modelling problem for which I don't feel that there
is a commonly agreed and documented answer, although it is a common
question. How do we connect an actual file with the semantic network? So
here is the scenario.

I have a file: a Word doc, a jpg image, a PowerPoint. I want to represent
it in CIDOC CRM and connect it to the semantic network and do so in a way
that would be interoperable with all other well-formed instances of CIDOC
CRM. How do I do that?

Well, part of the answer is clear. Part is unclear. Regarding the
representation of the fact that there is a digital object we have two
choices. If we use pure CRMbase then we have

E73 p2 has type E55 "Digital Object"

If we use CRM extensions then we have

D1 Digital Object
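
As a rough sketch, the two options above could be written out as plain
triples like this. The CRMdig namespace, the example.org URIs and the helper
names are assumptions made for illustration, not something stated in this
message.

```python
# Namespaces. The CRMdig URI and the example.org URIs are assumptions.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
CRMDIG = "http://www.ics.forth.gr/isl/CRMdig/"
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def crm_base_pattern(node):
    """Option 1: E73 Information Object, typed via P2 has type -> E55 Type."""
    digital_type = EX + "type/digital-object"  # hypothetical E55 instance
    return [
        (node, RDF_TYPE, CRM + "E73_Information_Object"),
        (node, CRM + "P2_has_type", digital_type),
        (digital_type, RDF_TYPE, CRM + "E55_Type"),
    ]


def crmdig_pattern(node):
    """Option 2: a single class assertion using CRMdig's D1 Digital Object."""
    return [(node, RDF_TYPE, CRMDIG + "D1_Digital_Object")]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(to_ntriples(crm_base_pattern(EX + "file/1")))
print(to_ntriples(crmdig_pattern(EX + "file/1")))
```

Both functions describe the same semantic node; they differ only in whether
the "digital object" notion comes from a type or from an extension class.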

Great. Now in the semantic network we can relate this in all sorts of
standard ways to other entities (p67 refers to, p129 is about, etc.). We
can use a creation event from CRM base or a digital machine event from
CRMdig to document when the file was created, by whom etc. Super. I can use
p1 is identified by E41 appellation to indicate the name of that digital
object (which may differ from the file name) and give it a type with p2 has
type. All standard and wonderful.

I still have to put the file itself, that actual digital object which I
want my user to be able to find and manipulate somehow in relation to the
semantic network.

How do people tend to do that? I have seen many variations but no common

So what is the go-to solution and should it perhaps be documented on the
CIDOC CRM site because it is a really common pattern?

I have seen

the file = E73... just put the file as the URN of the semantic node. But
then this means your file is accessible via a URN which is often not the
case and anyhow you probably want to distinguish your semantic node which
'stands for' the file from the actual file itself.

I have seen and used E41 Appellation as a pattern. So the D1 or E73 p1 is
identified by E41 Appellation p190 has symbolic content rdfs:Literal "file
name value goes here". Here you have the problem that you then also need to
store somehow a path by which to reach that file on some file system.
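
A minimal sketch of this appellation pattern, again as plain triples. The
example.org URIs, the Literal helper class and the function names are
hypothetical and only there to make the example self-contained.

```python
from dataclasses import dataclass

CRM = "http://www.cidoc-crm.org/cidoc-crm/"
EX = "http://example.org/"  # placeholder base URI (assumption)
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Literal:
    """Marks a triple object as a plain string literal rather than a URI."""
    value: str


def file_name_appellation(node, file_name):
    """node P1 is identified by E41 Appellation, P190 has symbolic content."""
    appellation = node + "/file-name"  # hypothetical URI for the appellation
    return [
        (node, CRM + "P1_is_identified_by", appellation),
        (appellation, RDF_TYPE, CRM + "E41_Appellation"),
        (appellation, CRM + "P190_has_symbolic_content", Literal(file_name)),
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples; escapes only backslash and quote."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(file_name_appellation(EX + "file/1", "report.docx")))
```

Note that the literal only carries the file name. Where the file actually
lives (the path or location) is not captured by these three triples.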

I guess another alternative would be to use p190 has symbolic content and
then throw the file in there as a blob. I don't particularly like this
solution, as I would hope to find strings at the end of p190 and not blobs.

Would maybe a sub property of p190 'is encoded in file' be an option in
order to use the blob solution?
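
To make the blob idea concrete, here is a small sketch. The property
"is_encoded_in_file" is hypothetical (it is not part of CIDOC CRM), and
encoding the bytes as an xsd:base64Binary literal is just one possible
assumption about how a blob would be stored.

```python
import base64

CRM = "http://www.cidoc-crm.org/cidoc-crm/"
EX = "http://example.org/"  # placeholder base URI (assumption)
RDFS_SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
XSD_BASE64 = "http://www.w3.org/2001/XMLSchema#base64Binary"

# Hypothetical sub property of P190; not defined by CIDOC CRM.
IS_ENCODED_IN_FILE = EX + "is_encoded_in_file"


def blob_statements(node, payload):
    """Return N-Triples lines declaring the sub property and attaching the blob."""
    encoded = base64.b64encode(payload).decode("ascii")
    return [
        f"<{IS_ENCODED_IN_FILE}> <{RDFS_SUBPROPERTY_OF}> "
        f"<{CRM}P190_has_symbolic_content> .",
        f'<{node}> <{IS_ENCODED_IN_FILE}> "{encoded}"^^<{XSD_BASE64}> .',
    ]


for line in blob_statements(EX + "file/1", b"hi"):
    print(line)
```

Even for a two-byte payload the object is an opaque encoded string, which
illustrates why a reader expecting human-readable strings at the end of
p190 might want the blob kept on a separate, explicitly named property.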

Anyhow, maybe there are already better solutions than I lay out above, and
I would be interested to hear them. Also I think it would be great to
identify the best practice and put it on the main site so that people follow
this strategy consistently.

Probably my examples hide multiple use cases requiring different patterns.
Anyhow, what do you think?

