[Crm-sig] Pages reproduced as spreads

Florian Kräutli fkraeutli at mpiwg-berlin.mpg.de
Fri Mar 10 18:21:55 EET 2017


Apologies for having inadvertently split this discussion into two threads. I hope this answer shows up in the right place.

Thank you Dominic and Christian-Emil. This is really useful.

Dominic, you pointed to another part of the problem which I haven't asked about here, but which appears in a different area of our model: how do I specify the page on which an expression appears in a PDF?

Martin and George, I hope you don't mind if I share your recommendation on this question here:

7) How to refer to a page?

The first point made was the distinction between pages in the physical work and pages in the digital work:

1) A page in the publication expression. Page breaks fall below the level of phrase boundaries, so pages are units at the symbolic level, and parthood should be expressed using P106.

2) A page in the digital image.

If you want to refer to pages in the PDF, it would be best to use a media-indexing annotation from 3DCoform.
The METS <area> construct gets an ID and has coordinates in the media object.
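
To make the METS suggestion a little more concrete, here is a minimal sketch in Python of how one spread could be described as two <area> regions. It is only an illustration under several assumptions that are mine, not part of the recommendation: each PDF page is available as an image file registered in the METS file section under a hypothetical ID such as IMG_0007, coordinates are in pixels, and the two book pages split the image exactly down the middle. The function name spread_areas and the TYPE values are made up for the example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Return the namespace-qualified name of a METS tag."""
    return f"{{{METS_NS}}}{tag}"


def spread_areas(file_id, width, height, left_page, right_page):
    """Describe one spread image as two METS <div>s, one per book page.

    Each <div> points through <fptr>/<area> at a rectangular region of the
    same image file: the left half for left_page, the right half for
    right_page. Splitting at width // 2 is an assumption of this sketch.
    """
    struct_map = ET.Element(q("structMap"), {"TYPE": "PHYSICAL"})
    spread = ET.SubElement(struct_map, q("div"), {"TYPE": "spread"})
    half = width // 2
    regions = [
        (left_page, (0, 0, half, height)),
        (right_page, (half, 0, width, height)),
    ]
    for label, (x1, y1, x2, y2) in regions:
        page = ET.SubElement(
            spread, q("div"), {"TYPE": "page", "ORDERLABEL": str(label)}
        )
        fptr = ET.SubElement(page, q("fptr"))
        ET.SubElement(fptr, q("area"), {
            "ID": f"{file_id}_p{label}",   # hypothetical identifier scheme
            "FILEID": file_id,             # must match a file in the fileSec
            "SHAPE": "RECT",
            "COORDS": f"{x1},{y1},{x2},{y2}",
        })
    return struct_map


# Hypothetical spread: a 2400x1600 px image showing book pages 12 and 13.
struct_map = spread_areas("IMG_0007", 2400, 1600, left_page=12, right_page=13)
print(ET.tostring(struct_map, encoding="unicode"))
```

Each <area> ID could then serve as the identifier that a page-level statement in the model points to. Front matter or pages added during digitisation would need to be handled by hand.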



> On 10 Mar 2017, at 16:29, crm-sig-request at ics.forth.gr wrote:
> 
> Send Crm-sig mailing list submissions to
> 	crm-sig at ics.forth.gr
> 
> To subscribe or unsubscribe via the World Wide Web, visit
> 	http://lists.ics.forth.gr/mailman/listinfo/crm-sig
> or, via email, send a message with subject or body 'help' to
> 	crm-sig-request at ics.forth.gr
> 
> You can reach the person managing the list at
> 	crm-sig-owner at ics.forth.gr
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Crm-sig digest..."
> 
> 
> Today's Topics:
> 
>   1. Re: Crm-sig Digest, Vol 122, Issue 8 (Dominic Oldman)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Fri, 10 Mar 2017 15:24:50 +0000
> From: Dominic Oldman <doint at oldman.me.uk>
> To: Christian-Emil Smith Ore <c.e.s.ore at iln.uio.no>,
> 	"crm-sig at ics.forth.gr" <crm-sig at ics.forth.gr>
> Subject: Re: [Crm-sig] Crm-sig Digest, Vol 122, Issue 8
> Message-ID:
> 	<CAHVLp01pyaJqx52N97JX=syvJfGrscPooig9S6a=3h0WcFBimg at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> Hi Florian,
> 
> Here is an offline discussion that we should have put on the list.
> 
> Cheers,
> 
> D
> 
> 
> orcid.org/0000-0002-5539-3126
> 
> On Fri, Mar 10, 2017 at 12:47 PM, Christian-Emil Smith Ore <
> c.e.s.ore at iln.uio.no> wrote:
> 
>> Do so, and send my regards. Please incorporate the following example:
>> 
>> 
>> Creating excerpts is a common activity in lexicography and history. An
>> excerpt is indeed a fragment of a text. The corresponding expression is a
>> fragment expression. See for example a paper slip for the word 'shovelfork'
>> (used to prepare a (small) field instead of ploughing). The text is a
>> fragment of a longer text dealing with somebody's childhood memories.
>> 
>> 
>> http://www.edd.uio.no/setelarkiv/setel1963769.jpg?
>> 
>> 
>> The entire paper slip represents a self-contained expression in which an
>> expression fragment is incorporated (in the corresponding work).
>> 
>> 
>> Best
>> 
>> Christian-Emil
>> ------------------------------
>> *From:* Dominic Oldman <doint at oldman.me.uk>
>> *Sent:* 10 March 2017 13:32
>> 
>> *To:* Christian-Emil Smith Ore
>> *Subject:* Re: [Crm-sig] Crm-sig Digest, Vol 122, Issue 8
>> 
>> Hi Christian,
>> 
>> I note that this didn't go on the list. Can I post this to the list, as I
>> think it is important generally?
>> 
>> D
>> 
>> orcid.org/0000-0002-5539-3126
>> 
>> On Fri, Mar 10, 2017 at 12:30 PM, Dominic Oldman <doint at oldman.me.uk>
>> wrote:
>> 
>>> Although I think then the scope note could be much clearer on F23 because
>>> it tends to suggest fragments isolated from the whole whereas in this case
>>> the section still resides within a whole. Although the scope note does
>>> state "excerpts" I still think this could be stated far more clearly with
>>> less ambiguity -  if it does mean that these excerpts can be identified
>>> sections of the information object within a whole text.
>>> 
>>> Can we put this on the agenda for the next meeting?
>>> 
>>> D
>>> 
>>> 
>>> orcid.org/0000-0002-5539-3126
>>> 
>>> On Fri, Mar 10, 2017 at 9:37 AM, Christian-Emil Smith Ore <
>>> c.e.s.ore at iln.uio.no> wrote:
>>> 
>>>> It is not necessarily so that the text printed on a page is a
>>>> self-contained expression; it is in general an F23 Expression Fragment?
>>>> 
>>>> 
>>>> Best
>>>> 
>>>> Christian-Emil
>>>> 
>>>> 
>>>> F22 Self-Contained Expression
>>>> 
>>>> This class comprises the immaterial realisations of individual works at
>>>> a particular time that are regarded as a complete whole. The quality of
>>>> wholeness reflects the intention of its creator that this expression should
>>>> convey the concept of the work. Such a whole can in turn be part of a
>>>> larger whole.
>>>> 
>>>> 
>>>> Inherent to the notion of work is the completion of recognisable
>>>> outcomes of the work. These outcomes, i.e. the Self-Contained Expressions,
>>>> are regarded as the symbolic equivalents of Individual Works, which form
>>>> the atoms of a complex work. A Self-Contained Expression may contain
>>>> expressions or parts of expressions from other work, such as citations or
>>>> items collected in anthologies. Even though they are incorporated in the
>>>> Self-Contained Expression, they are not regarded as becoming members of the
>>>> expressed container work by their inclusion in the expression, but are
>>>> rather regarded as foreign or referred to elements.
>>>> 
>>>> 
>>>> F22 Self-Contained Expression can be distinguished from F23 Expression
>>>> Fragment in that an F23 Expression Fragment was not intended by its creator
>>>> to make sense by itself. Normally creators would characterise an outcome of
>>>> a work as finished. In other cases, one could recognise an outcome of a
>>>> work as complete from the elaboration or logical coherence of its content,
>>>> or if there is any historical knowledge about the creator deliberately or
>>>> accidentally never finishing (completing) that particular expression. In
>>>> all those cases, one would regard an expression as self-contained.
>>>> 
>>>> 
>>>> ------------------------------
>>>> *From:* Dominic Oldman <doint at oldman.me.uk>
>>>> *Sent:* 09 March 2017 20:50
>>>> *To:* Christian-Emil Smith Ore
>>>> 
>>>> *Subject:* Re: [Crm-sig] Crm-sig Digest, Vol 122, Issue 8
>>>> 
>>>> 
>>>> So in this case the self-contained expression (information object)
>>>> identified as page 1 can then be represented by a part of a PDF image which
>>>> itself identifies parts (a physical page?) which are identified accordingly.
>>>> 
>>>> I'm still not sure whether this is what Florian means though - so await
>>>> his reply.
>>>> 
>>>> D
>>>> 
>>>> 
>>>> orcid.org/0000-0002-5539-3126
>>>> 
>>>> On Thu, Mar 9, 2017 at 7:31 PM, Christian-Emil Smith Ore <
>>>> c.e.s.ore at iln.uio.no> wrote:
>>>> 
>>>>> Hi
>>>>> There are many ways to number or put identifiers on parts of written or
>>>>> printed material: folio, sheet (verso/recto), page.
>>>>> If the physical original is known, perhaps a starting point would be to
>>>>> model the physical parts and their relationships.
>>>>> 
>>>>> The PDFs in question seem to be facsimiles of these physical parts (a
>>>>> single page, double pages, etc.). A possible way to model them is to see the
>>>>> PDFs as carriers of visual items representing the physical objects of the
>>>>> specific item (F5).
>>>>> 
>>>>> The first example in the scope note of P138 represents (has
>>>>> representation):
>>>>> - the digital file found at http://www.emunch.no/N/full/No-MM_N0001-01.jpg
>>>>>   (E36) represents page 1 of Edvard Munch's manuscript
>>>>>   MM N 1, Munch-museet (E73) mode of representation Digitisation (E55)
>>>>> 
>>>>> Best
>>>>> Christian-Emil
>>>>> ________________________________________
>>>>> From: Crm-sig <crm-sig-bounces at ics.forth.gr> on behalf of Dominic
>>>>> Oldman <DOldman at britishmuseum.org>
>>>>> Sent: 09 March 2017 17:59
>>>>> To: Florian Kräutli; crm-sig at ics.forth.gr
>>>>> Subject: Re: [Crm-sig] Crm-sig Digest, Vol 122, Issue 8
>>>>> 
>>>>> Hi Florian,
>>>>> 
>>>>> Just trying to understand.
>>>>> 
>>>>> You have an expression that is organised with page numbers. This is
>>>>> reproduced in the PDF. The expression page numbers are the same (the
>>>>> information object), but page 1 is spread over two carrier pages, i.e. page
>>>>> 1 is still page 1 as an information object, but the application (Adobe)
>>>>> spreads it over two application carrier pages. Is that right, or is it
>>>>> something else?
>>>>> 
>>>>> If the expression is the same (the same information object), then isn't
>>>>> page 1 still page 1?
>>>>> 
>>>>> Can you clarify?
>>>>> 
>>>>> D
>>>>> 
>>>>> 
>>>>> ________________________________________
>>>>> From: Crm-sig [crm-sig-bounces at ics.forth.gr] on behalf of Florian
>>>>> Kräutli [fkraeutli at mpiwg-berlin.mpg.de]
>>>>> Sent: 09 March 2017 10:38
>>>>> To: crm-sig at ics.forth.gr
>>>>> Subject: Re: [Crm-sig] Crm-sig Digest, Vol 122, Issue 8
>>>>> 
>>>>> Dear Martin,
>>>>> 
>>>>> many thanks for your input!
>>>>> 
>>>>> Our question at the moment is simply: does a page in the PDF represent
>>>>> one or two pages of the book?
>>>>> 
>>>>> Later on, we might have more specific questions that will require us to
>>>>> define the relationships between these two page identifiers (in the
>>>>> physical book and in the PDF) more explicitly. We would then also need to
>>>>> manually assess each PDF as, for instance, we cannot assume that page n in
>>>>> a book corresponds to page n/2 in a double-spread PDF. A PDF might contain
>>>>> some additional pages with information about the digitisation process.
>>>>> 
>>>>> For now, however, we only need a binary answer: double-spread yes or no.
>>>>> 
>>>>> All the best,
>>>>> 
>>>>> Florian
>>>>> 
>>>>> 
>>>>>> On 8 Mar 2017, at 11:00, crm-sig-request at ics.forth.gr wrote:
>>>>>> 
>>>>>> Today's Topics:
>>>>>> 
>>>>>>  1. Re: Pages reproduced as spreads (martin)
>>>>>> 
>>>>>> 
>>>>>> ----------------------------------------------------------------------
>>>>>> 
>>>>>> Message: 1
>>>>>> Date: Tue, 7 Mar 2017 18:24:17 +0200
>>>>>> From: martin <martin at ics.forth.gr>
>>>>>> To: crm-sig at ics.forth.gr
>>>>>> Subject: Re: [Crm-sig] Pages reproduced as spreads
>>>>>> Message-ID: <e4b3d793-40d5-f5d5-1f39-ff2404bab29b at ics.forth.gr>
>>>>>> Content-Type: text/plain; charset=UTF-8; format=flowed
>>>>>> 
>>>>>> Dear Florian,
>>>>>> 
>>>>>> There is no model without a question. Pages of books constitute a
>>>>>> partitioning of an information object. Each page number can be seen as
>>>>>> an identifier. Paragraphs belong to an alternative partitioning system.
>>>>>> The reproduction has its own partitioning, the scanned double pages.
>>>>>> Each scanned image represents, and actually also incorporates, the text of
>>>>>> two pages of the reproduced book.
>>>>>> Between alternative partitionings, one can define includes/overlaps
>>>>>> relations.
>>>>>> 
>>>>>> Whether this is elegant depends on what queries or functions you'd like to
>>>>>> support.
>>>>>> 
>>>>>> Best,
>>>>>> 
>>>>>> martin
>>>>>> 
>>>>>> On 7/3/2017 1:36 ??, Florian Kräutli wrote:
>>>>>>> Dear all,
>>>>>>> 
>>>>>>> I have a collection of Books (F5) that have been reproduced (F33) as PDFs (E84).
>>>>>>> In some cases, books have been digitised as spreads, i.e. one page in the PDF represents two pages in the book.
>>>>>>> 
>>>>>>> Is there an elegant way to model this?
>>>>>>> 
>>>>>>> Best,
>>>>>>> 
>>>>>>> Florian
>>>>>>> _______________________________________________
>>>>>>> Crm-sig mailing list
>>>>>>> Crm-sig at ics.forth.gr
>>>>>>> http://lists.ics.forth.gr/mailman/listinfo/crm-sig
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> --------------------------------------------------------------
>>>>>> Dr. Martin Doerr              |  Vox:+30(2810)391625        |
>>>>>> Research Director             |  Fax:+30(2810)391638        |
>>>>>>                               |  Email: martin at ics.forth.gr |
>>>>>>                                                             |
>>>>>>               Center for Cultural Informatics               |
>>>>>>               Information Systems Laboratory                |
>>>>>>                Institute of Computer Science                |
>>>>>>   Foundation for Research and Technology - Hellas (FORTH)   |
>>>>>>                                                             |
>>>>>>               N.Plastira 100, Vassilika Vouton,             |
>>>>>>                GR70013 Heraklion,Crete,Greece               |
>>>>>>                                                             |
>>>>>>             Web-site: http://www.ics.forth.gr/isl           |
>>>>>> --------------------------------------------------------------
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> ------------------------------
>>>>>> 
>>>>>> End of Crm-sig Digest, Vol 122, Issue 8
>>>>>> ***************************************
>>>>> 
>>>> 
>>>> 
>>> 
>> 
> ------------------------------
> 
> End of Crm-sig Digest, Vol 122, Issue 10
> ****************************************
