Lots of issues getting mixed in here.

I think Brian wants a safe place for his documents so that he can work on them in future. It's not a commonly held priority for our institutional repositories to provide this kind of service - a document management system would seem a better fit - but clearly an IR can preserve documents.

As Les says, the lack of reusability of the data in PDFs was at one time a feature rather than a bug - increasingly the opposite view is held. However - the sorts of reuse I see suggested tend to indicate a shift away from a document format altogether - or at least to a more straightforwardly web-native format.

I think .docx files fall between stools in this respect. PDF has advantages as suggested earlier in the thread for the general use case of making papers easily accessible to human readers. For machine readability we are probably already looking beyond downloadable files.

Paul

Paul Walk
(sent from phone)

On 25 Jul 2012, at 17:33, Chris Eaker <[log in to unmask]> wrote:

> Thanks for pointing this out, Leslie. I did not know this about Docx files. I can see how this would be a better format for preservation of not only content, but also formatting.
> 
> On Wed, Jul 25, 2012 at 7:57 AM, Leslie Carr <[log in to unmask]> wrote:
> I'd like to point out that Word files (docx) are an XML-based open standard format, and that our prejudice against them is probably rooted in historic antipathy towards previous proprietary formats rather than any genuine problem with the format itself.
> 
> PDF, on the other hand, is also an open standard, but it makes reuse very difficult.  10 years ago we thought that was a good thing. Now we believe the opposite.
> 
> Sent from my iPhone
> 
> On 25 Jul 2012, at 14:43, "Chris Eaker" <[log in to unmask]> wrote:
> 
> Sorry if I'm asking novice questions (but that's what I am): are you most interested in saving the content, the formatting, or both? If the content is the most important thing to preserve, then why not save the file as PDF and archive that as the master, so you have a copy with all formatting intact, and also save a txt file as an editable version that maintains the content (assuming you need to edit in the future)? I'm wary of archiving *.DOC/X files because they may not be readable in the long term.
> 
> On Wed, Jul 25, 2012 at 4:49 AM, Brian Kelly <[log in to unmask]> wrote:
> I've always deposited an MS Word copy of my papers in my local repository, together with a PDF copy.  I've done this because I've been told of the importance of preserving the master copy of a resource, rather than a lossy derivative version, such as PDF.  As I've experience in having to recreate an MS Word file from a PDF copy I know this can be a cumbersome process. I assume some authors may prefer to deposit a PDF copy as this may be regarded as providing a form of DRM by making it slightly more difficult to process the file.
> 
> What policies and practices do people have in place related to this? A Google search for "Policies on depositing MS Word files" suggests that PDFs are the norm.  Since the MS Office format has been an ISO standard since 2007 I assume the proprietary versus open standard format for deposits argument is not as strong as it was (subject to caveats about support for ISO/IEC 29500 Strict and the arguments about the validity of the standardisation process which I don't want to go into).
> 
> Thanks
> 
> Brian
> 
> 
> --
> --------------------------------------------------------
> Brian Kelly
> Innovation Support Centre, UKOLN, University of Bath, Bath, UK, BA2 7AY
> Phone: 01225 383943
> Email: [log in to unmask]
> Blog: http://ukwebfocus.wordpress.com/
> Twitter: http://twitter.com/briankelly
> Web: http://isc.ukoln.ac.uk/
> 
> 
> 
> --
> Christopher Eaker, P.E.
> Graduate Research Assistant
> Data Curation Education in Research Centers
> University of Tennessee, Knoxville