Both TeX and Word provide an expletive-based approach to layout involving complex tables and floating figures.

Sent from my iPad

On 25 Jul 2012, at 15:52, "Matthew Dovey" <[log in to unmask]> wrote:

> For information, Word 2013 (at least in the preview) will open PDF documents for editing.
> 
> As with most PDF-to-Word conversion, however, complex formatting can go awry.
> 
> Matthew
> 
>> -----Original Message-----
>> From: Repositories discussion list [mailto:JISC-
>> [log in to unmask]] On Behalf Of Leslie Carr
>> Sent: 25 July 2012 15:28
>> To: [log in to unmask]
>> Subject: Re: Policies on depositing MS Word files
>> 
>> Even better, the zip file contains a media directory with all the images
>> embedded in the document. It's a godsend for third party copyright
>> checking.
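
A minimal sketch of pulling those embedded images out for checking, assuming the usual DOCX layout in which media are stored under `word/media/` inside the zip package. The function name `extract_docx_media` and the output directory are hypothetical, chosen only for illustration.

```python
import zipfile
from pathlib import Path


def extract_docx_media(docx, out_dir):
    """Copy every file under word/media/ in a DOCX package into out_dir.

    `docx` may be a path or a binary file-like object.
    Returns the list of paths written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx) as package:
        for name in package.namelist():
            # Skip directory entries and anything outside the media folder.
            if not name.startswith("word/media/") or name.endswith("/"):
                continue
            # Keep only the base file name so nothing is written outside out_dir.
            target = out_dir / Path(name).name
            target.write_bytes(package.read(name))
            written.append(target)
    return written
```

For example, `extract_docx_media("paper.docx", "paper_media")` would write the images from a (hypothetical) `paper.docx` into a `paper_media` folder. A document with no embedded images simply yields an empty list.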
>> 
>> Sent from my iPhone
>> 
>> On 25 Jul 2012, at 15:16, "Talat Chaudhri"
>> <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>> 
>> I was once given this tip for reading DOCX files and judging whether
>> they are usable in the long term, which is quite an interesting practical test
>> that anyone can do: rename the file as a zip file and then unzip it. Then check
>> through the contents, which include quite a lot of metadata, machine-readable
>> packaging, formatting information and the actual content. On that
>> basis, it seems fair to say that it will be fairly easily readable and processable
>> in the future, even if current software platforms become unusable or
>> unavailable. I second Les' remark that it's an entirely different thing to earlier
>> DOC formats, which were proprietary and technically difficult to re-use. It's
>> probably fair to say that DOCX is not all that bad from a preservation
>> perspective.
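
The rename-and-unzip test can also be done without renaming anything, since a DOCX file is an ordinary zip archive. The following is a rough sketch using only Python's standard library. It assumes the package keeps its descriptive metadata in `docProps/core.xml`, which is the usual location but is not guaranteed for every file; the function names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_docx_parts(docx):
    """Return (name, uncompressed size in bytes) for every part in the package."""
    with zipfile.ZipFile(docx) as package:
        return [(info.filename, info.file_size) for info in package.infolist()]


def read_core_properties(docx):
    """Return the core metadata as a dict of local element name -> text.

    Assumes the metadata part is docProps/core.xml; returns {} if it is absent.
    """
    with zipfile.ZipFile(docx) as package:
        if "docProps/core.xml" not in package.namelist():
            return {}
        root = ET.fromstring(package.read("docProps/core.xml"))
    # Element tags look like "{namespace}title"; keep only the local name.
    return {child.tag.split("}")[-1]: (child.text or "") for child in root}
```

Calling `list_docx_parts("paper.docx")` on a (hypothetical) `paper.docx` shows the XML parts that hold content, formatting and packaging information, while `read_core_properties` gives a quick view of fields such as title and creator where they are present.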
>> 
>> I might add that ePub seems to take a very similar approach. The metadata it
>> contains can in principle be very extensive although in practice it's far more
>> restricted. By comparison, there is often a lot that can be extracted from
>> DOCX files (really archives), though not in a standard metadata format. But
>> there is much more than just traditional metadata, so I wouldn't like to
>> restrict the debate just to that.
>> 
>> I'd be interested to know if people agree or disagree with this position on
>> technical grounds. It really is worth taking a look for yourself.
>> 
>> 
>> Talat
>> 
>> On 25/07/2012 14:32, Chris Eaker wrote:
>> Sorry if I'm asking novice questions (but a novice is what I am): are you most
>> interested in saving the content or the formatting or both? If the content is
>> the most important thing to preserve, then why not just save the file as PDF
>> and archive that as the master so you have a copy with all formatting intact,
>> but then save a txt for an editable version that maintains content (assuming
>> you need to edit in the future)? I'm wary of archiving *.DOC/X files because
>> they may not be readable for the long-term.
>> 
>> On Wed, Jul 25, 2012 at 4:49 AM, Brian Kelly
>> <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>> I've always deposited an MS Word copy of my papers in my local repository,
>> together with a PDF copy.  I've done this because I've been told of the
>> importance of preserving the master copy of a resource, rather than a lossy
>> derivative version, such as PDF.  As I have experience of having to recreate an
>> MS Word file from a PDF copy I know this can be a cumbersome process. I
>> assume some authors may prefer to deposit a PDF copy as this may be
>> regarded as providing a form of DRM by making it slightly more difficult to
>> process the file.
>> 
>> What policies and practices do people have in place related to this? A Google
>> search for "Policies on depositing MS Word files" suggests that PDFs are the
>> norm.  Since the MS Office format has been an ISO standard since 2007 I
>> assume the proprietary versus open standard format for deposits argument
>> is not as strong as it was (subject to caveats about support for ISO/IEC 29500
>> Strict and the arguments about the validity of the standardisation process
>> which I don't want to go into).
>> 
>> Thanks
>> 
>> Brian
>> 
>> 
>> --
>> --------------------------------------------------------
>> Brian Kelly
>> Innovation Support Centre, UKOLN, University of Bath, Bath, UK, BA2 7AY
>> Phone: 01225 383943
>> Email: [log in to unmask]<mailto:[log in to unmask]>
>> Blog: http://ukwebfocus.wordpress.com/
>> Twitter: http://twitter.com/briankelly
>> Web: http://isc.ukoln.ac.uk/
>> 
>> 
>> 
>> --
>> Christopher Eaker, P.E.
>> Graduate Research Assistant
>> Data Curation Education in Research Centers University of Tennessee,
>> Knoxville
>> 
>> 
>> 
>> --
>> Dr Talat Chaudhri
>> ------------------------------------------------------------
>> Research Officer
>> Innovation Support Centre
>> UKOLN
>> University of Bath
>> Telephone: +44 (0)1970 626206    Fax: +44 (0)1225 386838
>> E-mail: [log in to unmask]<mailto:[log in to unmask]>   Skype:
>> talat.chaudhri
>> Web: http://www.ukoln.ac.uk/ukoln/staff/t.chaudhri/
>> ------------------------------------------------------------
>> 
>