Hi Everyone,
...
> (3) A desire to avoid perceived preservation costs through maintaining a large number of formats in the repository (caused in part, by a lack of automated conversion tools)
>
...
> The 3rd issue remains a concern, but has not prevented the rise of research data repositories in recent years.
>
I just want to comment here that I use TeX myself (mostly XeLaTeX,
because I write in the Humanities and need Unicode support), and for
converting to Word for colleagues I use Pandoc
(http://johnmacfarlane.net/pandoc/), which translates to and from many
different formats. So I think that, if you are familiar with Haskell,
issue 3 you refer to is less of a problem now.
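
As a rough illustration of that conversion step, here is a minimal
Python sketch that wraps the Pandoc command line. The function names
(pandoc_command, convert) and the file names are placeholders of my
own, not part of Pandoc; it assumes the pandoc executable is installed
and on the PATH, and it only covers the plain LaTeX-to-docx case.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source: Path, target: Path) -> list[str]:
    """Build the argument list for a LaTeX-to-Word conversion.

    Hypothetical helper: -f, -t and -o are Pandoc's options for input
    format, output format and output file.
    """
    return ["pandoc", str(source), "-f", "latex", "-t", "docx",
            "-o", str(target)]


def convert(source: Path, target: Path) -> bool:
    """Run Pandoc if it is available; return True on success."""
    if shutil.which("pandoc") is None:
        # Assumption: pandoc is found via PATH; if not, convert nothing.
        return False
    result = subprocess.run(pandoc_command(source, target), check=False)
    return result.returncode == 0
```

Calling convert(Path("paper.tex"), Path("paper.docx")) would then be
one way to script the conversion for a batch of files, though complex
LaTeX macros may well need manual attention afterwards.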
All the best,
Chris
>
>
>
>
> --
> Gareth Knight
> Research Data Management Project Manager
> London School of Hygiene & Tropical Medicine (LSHTM)
> Keppel Street, London, WC1E 7HT
> Email: (+44) 020 7927 2564
>
>
> >>> Panyarak Ngamsritragul <[log in to unmask]> 26/07/2012 09:44 >>>
> One critical problem of word processor formats, either MS Word's docx or
> OpenOffice or LibreOffice's open document formats, is that most of them are
> not truly backward compatible. I am sure you could face some trouble in
> opening MS Word documents you created a few years ago. This sort
> of trouble is not found with PDF readers, if I am not wrong.
>
> MS Word now claims to (try to) support open document
> formats, but this support is still far from perfect, and it is still quite
> doubtful whether MS is willing to comply with the open document formats.
>
> It is sad to know that the MS Word format could become a common format. If
> you are familiar with TeX, you should know that you can still work with
> the TeX files you created more than 20 years ago...
>
> Panyarak Ngamsritragul
> Department of Mechanical Engineering
> Prince of Songkla University.
>
> On Thu, 26 Jul 2012, David Groenewegen wrote:
>
> > I think the other "problem" with Word comes from the word processor wars of
> > the early 90s, when it was unclear what would be the most common format
> > (remember WordPerfect? WordStar? MacWrite?). It made people nervous about its
> > longevity.
> >
> > But Word has been the default standard for creating documents for a longish
> > time now - there can't be many people or companies who don't rely on it (and
> > even if you don't I bet you still have the capacity to deal with it). If the
> > ability to access the billions of Word documents out there disappeared
> > tomorrow through some bizarre circumstance where every single one of the
> > hundreds of millions of copies of Word
> > (<http://blogs.technet.com/b/office2010/archive/2009/10/07/new-ways-to-try-and-buy-microsoft-office-2010.aspx>)
> > and all the various compatible tools (<https://docs.google.com/>) stopped
> > working, someone would have to invent a way of overcoming this pretty quick
> > smart.
> >
> > Please note: I'm not saying that Word is perfect, or that I'm thrilled with
> > this outcome, or that Word is better than <insert your favourite here>, or
> > that it isn't the result of Microsoft exploiting its market share.
> >
> > What I am saying is that a Word document is probably the last format we need
> > to worry about for preservation purposes for the foreseeable future. Except
> > maybe PDF.
> >
> > D
> >
> > On 26/07/2012 2:33 AM, Chris Eaker wrote:
> >> Thanks for pointing this out, Leslie. I did not know this about Docx
> >> files. I can see how this would be a better format for preservation of
> >> not only content, but also formatting.
> >>
> >> On Wed, Jul 25, 2012 at 7:57 AM, Leslie Carr <[log in to unmask]> wrote:
> >>
> >> I'd like to point out that Word files (docx) are an XML-based open
> >> standard format, and that our prejudice against them is probably
> >> rooted in historic antipathy towards previous proprietary formats
> >> rather than any genuine problem with the format itself.
> >>
> >> PDF, on the other hand, is also an open standard, but it makes reuse
> >> very difficult. 10 years ago we thought that was a good thing. Now
> >> we believe the opposite.
> >>
> >> Sent from my iPhone
> >>
> >> On 25 Jul 2012, at 14:43, "Chris Eaker" <[log in to unmask]> wrote:
> >>
> >> Sorry if I'm asking novice questions (but that's what I am), are you
> >> most interested in saving the content or the formatting or both? If
> >> the content is the most important thing to preserve, then why not
> >> just save the file as PDF and archive that as the master so you have
> >> a copy with all formatting intact, but then save a txt for an
> >> editable version that maintains content (assuming you need to edit
> >> in the future)? I'm wary of archiving *.DOC/X files because they may
> >> not be readable for the long-term.
> >>
> >> On Wed, Jul 25, 2012 at 4:49 AM, Brian Kelly <[log in to unmask]> wrote:
> >> I've always deposited an MS Word copy of my papers in my local
> >> repository, together with a PDF copy. I've done this because I've
> >> been told of the importance of preserving the master copy of a
> >> resource, rather than a lossy derivative version, such as PDF. As
> >> I've experience in having to recreate an MS Word file from a PDF
> >> copy I know this can be a cumbersome process. I assume some authors
> >> may prefer to deposit a PDF copy as this may be regarded as
> >> providing a form of DRM by making it slightly more difficult to
> >> process the file.
> >>
> >> What policies and practices do people have in place related to this?
> >> A Google search for "Policies on depositing MS Word files" suggests
> >> that PDFs are the norm. Since the MS Office format has been an ISO
> >> standard since 2007 I assume the proprietary versus open standard
> >> format for deposits argument is not as strong as it was (subject to
> >> caveats about support for ISO/IEC 29500 Strict
> >> and the arguments about the validity of the standardisation process
> >> which I don't want to go into).
> >>
> >> Thanks
> >>
> >> Brian
> >>
> >>
> >> --
> >> --------------------------------------------------------
> >> Brian Kelly
> >> Innovation Support Centre, UKOLN, University of Bath, Bath, UK, BA2 7AY
> >> Phone: 01225 383943
> >> Email: [log in to unmask]
> >> Blog: http://ukwebfocus.wordpress.com/
> >> Twitter: http://twitter.com/briankelly
> >> Web: http://isc.ukoln.ac.uk/
> >>
> >>
> >>
> >> --
> >> Christopher Eaker, P.E.
> >> Graduate Research Assistant
> >> Data Curation Education in Research Centers
> >> University of Tennessee, Knoxville
> >>
> >>
> >
> > --
> > David Groenewegen
> > Director, Research Data
> > Australian National Data Service
> > Physical Address: 680 Blackburn Road, Clayton, Victoria
> > Postal Address: c/o Monash University VIC 3800
> > AUSTRALIA
> >
> > Ph: +61 3 9902 0570
> > Fx: +61 3 9902 0599
> > Mb: +61 (0) 409 969 658
> > [log in to unmask]
> >
> > --
> > This message has been scanned for viruses and
> > dangerous content by MailScanner, and is
> > believed to be clean.
> >
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.