For information, Word 2013 (at least in the preview) will open PDF documents for editing.
As with most PDF-to-Word conversions, however, complex formatting can go awry.
Matthew
> -----Original Message-----
> From: Repositories discussion list [mailto:JISC-
> [log in to unmask]] On Behalf Of Leslie Carr
> Sent: 25 July 2012 15:28
> To: [log in to unmask]
> Subject: Re: Policies on depositing MS Word files
>
> Even better, the zip file contains a media directory with all the images
> embedded in the document. It's a godsend for third party copyright
> checking.
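
For anyone who wants to try this without unzipping by hand, here is a minimal Python sketch that lists the embedded media files. It assumes the images sit under a `word/media/` folder inside the archive, which is where Word normally puts them; the function name and the `media_prefix` parameter are my own, for illustration only.

```python
import zipfile


def list_embedded_media(docx, media_prefix="word/media/"):
    """Return the names of files stored under the media folder of a DOCX.

    `docx` may be a path or a file-like object. The default prefix is an
    assumption based on where Word usually stores embedded images.
    """
    with zipfile.ZipFile(docx) as archive:
        return [
            name
            for name in archive.namelist()
            # Skip explicit directory entries, keep only real files.
            if name.startswith(media_prefix) and not name.endswith("/")
        ]
```

Calling `list_embedded_media("paper.docx")` (a hypothetical file name) should give a list of image names that can then be checked one by one.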
>
> Sent from my iPhone
>
> On 25 Jul 2012, at 15:16, "Talat Chaudhri"
> <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>
> I was once given this tip in terms of reading DOCX files and seeing whether
> they are usable in the long term, which is quite an interesting practical test
> that anyone can do: re-name it as a zip file and then unzip it. Then check
> through the contents, which contain quite a lot of metadata,
> machine-readable packaging, formatting information and the actual content. On that
> basis, it seems fair to say that it will be fairly easily readable and processable
> in the future, even if current software platforms become unusable or
> unavailable. I second Les' remark that it's an entirely different thing to earlier
> DOC formats, which were proprietary and technically difficult to re-use. It's
> probably fair to say that DOCX is not all that bad from a preservation
> perspective.
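
The rename-and-unzip test can also be done in a few lines of Python, since a DOCX is an ordinary ZIP archive. The sketch below lists the packaged files and reads the basic document properties. It assumes those properties live in `docProps/core.xml`, which is the usual location but not guaranteed; the function names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_parts(docx):
    """Return the names of all files packaged inside a DOCX (a ZIP archive)."""
    with zipfile.ZipFile(docx) as archive:
        return archive.namelist()


def read_core_properties(docx, part="docProps/core.xml"):
    """Return {property name: text} from the core-properties part.

    The default part name is an assumption (the usual location); an empty
    dict is returned if the archive has no such part.
    """
    with zipfile.ZipFile(docx) as archive:
        if part not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read(part))
    # Strip the XML namespace from each tag, e.g. '{...}title' -> 'title'.
    return {child.tag.split("}")[-1]: (child.text or "") for child in root}
```

Nothing here depends on Word itself: only the standard library's ZIP and XML modules are used, which is rather the point about long-term readability.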
>
> I might add that ePub seems to take a very similar approach. The metadata it
> contains can in principle be very extensive although in practice it's far more
> restricted. By comparison, there is often a lot that can be extracted from
> DOCX files (really archives), though not in a standard metadata format. But
> there is much more than just traditional metadata, so I wouldn't like to
> restrict the debate just to that.
>
> I'd be interested to know if people agree or disagree with this position on
> technical grounds. It really is worth taking a look for yourself.
>
>
> Talat
>
> On 25/07/2012 14:32, Chris Eaker wrote:
> Sorry if I'm asking novice questions (but a novice is what I am): are you most
> interested in saving the content or the formatting or both? If the content is
> the most important thing to preserve, then why not just save the file as PDF
> and archive that as the master so you have a copy with all formatting intact,
> but then save a txt for an editable version that maintains content (assuming
> you need to edit in the future)? I'm wary of archiving *.DOC/X files because
> they may not be readable for the long-term.
>
> On Wed, Jul 25, 2012 at 4:49 AM, Brian Kelly
> <[log in to unmask]<mailto:[log in to unmask]>> wrote:
> I've always deposited an MS Word copy of my papers in my local repository,
> together with a PDF copy. I've done this because I've been told of the
> importance of preserving the master copy of a resource, rather than a lossy
> derivative version, such as PDF. As I have experience of having to recreate an
> MS Word file from a PDF copy, I know this can be a cumbersome process. I
> assume some authors may prefer to deposit a PDF copy as this may be
> regarded as providing a form of DRM by making it slightly more difficult to
> process the file.
>
> What policies and practices do people have in place related to this? A Google
> search for "Policies on depositing MS Word files" suggests that PDFs are the
> norm. Since the MS Office format has been an ISO standard since 2007 I
> assume the proprietary versus open standard format for deposits argument
> is not as strong as it was (subject to caveats about support for ISO/IEC 29500
> Strict and the arguments about the validity of the standardisation process
> which I don't want to go into).
>
> Thanks
>
> Brian
>
>
> --
> --------------------------------------------------------
> Brian Kelly
> Innovation Support Centre, UKOLN, University of Bath, Bath, UK, BA2 7AY
> Phone: 01225 383943
> Email: [log in to unmask]<mailto:[log in to unmask]>
> Blog: http://ukwebfocus.wordpress.com/
> Twitter: http://twitter.com/briankelly
> Web: http://isc.ukoln.ac.uk/
>
>
>
> --
> Christopher Eaker, P.E.
> Graduate Research Assistant
> Data Curation Education in Research Centers
> University of Tennessee, Knoxville
>
>
>
> --
> Dr Talat Chaudhri
> ------------------------------------------------------------
> Research Officer
> Innovation Support Centre
> UKOLN
> University of Bath
> Telephone: +44 (0)1970 626206
> Fax: +44 (0)1225 386838
> E-mail: [log in to unmask]<mailto:[log in to unmask]>
> Skype: talat.chaudhri
> Web: http://www.ukoln.ac.uk/ukoln/staff/t.chaudhri/
> ------------------------------------------------------------
>