Some responses to the message on archiving ejournals I sent earlier
today...
Forwarded message:
> The thread continues on web4lib ...
>
>
> ---------- Forwarded message ----------
> Date: Thu, 11 Apr 1996 12:12:13 -0700
> From: Hal Kirkwood <[log in to unmask]>
> To: Multiple recipients of list <[log in to unmask]>
> Subject: Re: "Archiving" e-journals
>
> Bill, (and others who are interested)
>
> I have also been considering the same issues. I have not come up with
> anything other than slave labor for the transferring of docs to our web
> page. (Thus we have none accessible at this time)
>
> Not only is the loss of a link to the web page an issue but there are many
> e-journals/e-newsletters that are sent by e-mail only.
>
> I have heard of software that automatically downloads the entire contents
> of a web site (WebWhacker ?) to your hard drive....this may speed the
> process somewhat.
>
> And I haven't seen any info on broader initiatives.
>
> So this message isn't much help to you. I just wanted to let you know that
> you're not alone. If you want to discuss this further feel free to email
> me.
>
> Regards,
>
> -hal
>
>
> *****ORIGINAL MESSAGE*****
> ....text deleted.....
> >Is anyone else addressing this issue at their library? Have you
> >developed any effective strategies that don't resemble using slave
> >labor to re-write web pages? I don't have any experience with
> >"mirroring" any web sites. Is this a possible automated solution to
> >the problem? Are there any broader initiatives looking into the
> >problem of archiving and preserving e-journals locally?
> >
> >Thanks in advance for your replies!
> >
> >--Bill Pardue
> >
> >----------------------------------------------------------
> > Bill Pardue--Electronic Resources Librarian
> > Galvin Library, Illinois Institute of Tech.
> > Chicago, IL 60616
> >312-567-3615/312-567-5318 (fax) [log in to unmask]
> >----------------------------------------------------------
>
> ***************************************************************
> Hal Kirkwood [log in to unmask] *
> Assistant Librarian *
> Fraser Library ---- Business/Computer Science/Math *
> S.U.N.Y. at Geneseo 1 College Circle Geneseo, NY 14454 *
> Work#716/245-5334 Fax#716/245-5003 *
> My opinions do not represent the university. *
> Personal Home Page http://137.238.50.66/~hal/hal.html *
> ***************************************************************
>
> Date: Thu, 11 Apr 1996 13:10:04 -0700
> From: Bill Pardue <[log in to unmask]>
> To: Multiple recipients of list <[log in to unmask]>
> Subject: re: "Archiving" e-journals
>
> Thanks to those who replied. Seems like lots of folks are trying to
> think through this same problem. Suggestions have come in two forms:
>
> 1) Use "print driver" software such as Novell's Envoy or the Adobe
> Acrobat PDFWriter to "capture" the printed versions of pages. The
> major downside is that you have to sit there and keep printing out
> pages as you find them, naming files, etc. You also lose the
> "webbiness" of interrelated HTML documents. This is probably a good
> solution for some "flatter" e-journals, however.
>
> 2) Use a downloading tool like Webwhacker (which I heard of for the
> first time today).
>
> I'm trying the Webwhacker solution. I decided to try it out on one
> web site. I started downloading around 10am. It's now 3:50pm and
> it's still downloading the same site. The depth to which it
> downloads a web site can be set as a preference, but for the site in
> question, you pretty much had to let it get files all the way down
> the tree. If nothing else, it makes you realize how much goes into
> maintaining even a moderately-sized web site. If we stick with
> Webwhacker, it looks like we'll just point it at a web site at the
> end of the day, let it go overnight and see what we've got in the
> morning.
>
> Again, thanks for the input!
>
> Oh, yeah...if you'd like to check out a trial copy of Webwhacker,
> connect to the ForeFront page:
>
> http://www.ff.com
>
> and follow the links to products. It's a 30-day trial and you have
> to register to try it out.
>
> --Bill Pardue
> ----------------------------------------------------------
> Bill Pardue--Electronic Resources Librarian
> Galvin Library, Illinois Institute of Tech.
> Chicago, IL 60616
> 312-567-3615/312-567-5318 (fax) [log in to unmask]
> --------------------------------------------------------
>
> Date: Thu, 11 Apr 1996 14:11:21 -0700
> From: Kurt Foss <[log in to unmask]>
> To: Multiple recipients of list <[log in to unmask]>
> Subject: Re: "Archiving" e-journals
>
> The preference *also* includes downloading all files _linked_ to a
> site, so unless that's what you had in mind, you might wanna be sure
> you're not downloading half the Net. ;->
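Bill's depth preference and Kurt's warning describe two limits on a mirroring run: how many links deep to go, and whether to leave the starting host. Below is a minimal Python sketch of that scoping logic. It is not a description of how WebWhacker works internally (the thread does not say). The names crawl_plan and get_links and the journal.example URLs are made up for illustration, and the "site" is an in-memory dictionary, so nothing is actually fetched.

```python
from collections import deque
from urllib.parse import urljoin, urlparse


def crawl_plan(start_url, get_links, max_depth=2):
    """Return the URLs a mirroring run would visit, breadth-first.

    get_links(url) is a hypothetical callback returning the href values
    found on a page; a real tool would fetch and parse the page here.
    """
    home = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for href in get_links(url):
            # Resolve relative links and drop any #fragment.
            target = urljoin(url, href).split("#", 1)[0]
            if urlparse(target).netloc != home:
                continue  # off-site link: skipping it avoids "half the Net"
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


# Tiny in-memory stand-in for a real site (assumed, made-up URLs).
PAGES = {
    "http://journal.example/": ["issue1.html", "http://elsewhere.example/"],
    "http://journal.example/issue1.html": ["article1.html", "/"],
    "http://journal.example/article1.html": ["deep.html"],
}

plan = crawl_plan("http://journal.example/", lambda u: PAGES.get(u, []), max_depth=2)
```

With max_depth=2 the plan holds the home page, issue1.html and article1.html. The off-site link is never queued, and deep.html is left out because it sits three links down.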
>
> rgds ~ Kurt
>
> _____
> Bill Pardue wrote:
>
> > I'm trying the Webwhacker solution. I decided to try it out on one
> > web site. I started downloading around 10am. It's now 3:50pm and
> > it's still downloading the same site. The depth to which it
> > downloads a web site can be set as a preference, but
> >---
> KURT FOSS ~ k
> U_of Wisconsin-Madison * School Of Journalism and Mass Comm.
> 5020 Vilas Communication Hall * 821 Univ. Ave. * Madison, WI 53706
> Email: [log in to unmask] * [log in to unmask] * CIS: 70541,1040
> Phone: W 608.263.3391 or .4080 * FAX 262.1361 * H 271.1210
> ONline@UW-Madison <http://www.journalism.wisc.edu/>
> Technology Editor, EPW8 <http://sunsite.unc.edu/nppa/epw8home.html>
> Interactive snappy quote <http://www.xmission.com:80/~mgm/quotes>
> <!-- # -->
>
> Date: Thu, 11 Apr 1996 14:35:16 -0700
> From: Aaron Bradley <[log in to unmask]>
> To: Multiple recipients of list <[log in to unmask]>
> Subject: Re: "Archiving" e-journals
>
> Bill, Hal, et al.:
>
> I really don't believe that local mirroring of electronic journals
> is viable or desirable, for a variety of reasons. By this I'm
> referring to a large-scale, large-scope transfer and local
> archiving of electronic resources that Bill and Hal seem
> to be talking about: I think that the mirroring of a few, specific
> journals is possible and beneficial.
>
> First of all, as Bill and Hal have found, the logistics can be
> horrible. Plain text or binary files (like documents in PostScript)
> are fairly straightforward, but once you start dealing with
> any sort of complex html files you're bound to have to edit
> them manually to clear up the referencing. And wait until
> more sites start using Java: it will make writing the
> data-fetching robot a nightmare. And there's the whole
> resource allocation problem of dealing with such
> huge amounts of data, as Bill has just brought up
> in a post as I'm writing this:
> >web site. I started downloading around 10am. It's now 3:50pm and
> >it's still downloading the same site. The depth to which it
>
> Secondly, I think a better use of Web resources -- both
> locally and network wide -- is to work on efficient indexing
> and tracking of remote electronic journals. That is to say
> it's better to have a large index of Internet journals rather
> than a large collection of them. Indexing is a task that can
> be accomplished relatively easily -- at least in comparison
> with trying to make large chunks of data site-conformant.
> If one is going to engage in that sort of manual labour it's
> probably better applied to cataloguing of journals.
>
> Finally, the copyright issue alluded to in the original post
> could be an insurmountable hurdle. True, it's relatively
> easy to assume that any journal freely available full-text
> wouldn't mind having you mirror them, but you couldn't
> be sure: unless there was a stated release you'd have
> to check with individual producers on a case-by-case
> basis. This is simply untenable for a large number of
> journals.
>
> The primary drawbacks of relying on remote sites are
> that the site will become temporarily or permanently
> unavailable, or that the resources will move to another
> server or directory structure.
>
> It is in regard to availability that I think mirroring of
> a few specific journals is worthwhile and tenable,
> and maybe this is what Bill and Hal were actually
> referring to. Even one mirror site of an electronic
> journal assures a lot better chance of access than
> one site alone. A mirror site often provides a
> geographically distant server, which enables the
> user to select the most efficient site. You still
> have the logistical problems I described earlier,
> but you have it on a smaller scale. If you can't
> work out a common format with the information
> producer, you can at least write a script, or even
> a word-processing macro, that will make the mirrored
> data site-compliant: you can afford to invest some
> time in this if you're only dealing with a couple of
> journals. I think that such mirroring arrangements
> would work especially well between subject specialist
> libraries and electronic information producers in the
> library's subject field, being obviously mutually
> beneficial.
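The "script ... that will make the mirrored data site-compliant" could look something like the Python sketch below. It rests on assumptions the thread does not state: that the only fix needed is pointing absolute links at the original host to a local mirror directory, and that attribute values are double-quoted. The function name localize_links, the host journal.example and the /mirror/journal prefix are all hypothetical.

```python
import re
from urllib.parse import urlparse

# Matches href="..." and src="..." (double-quoted values only, an assumption).
ATTR = re.compile(r'(?P<attr>\b(?:href|src))\s*=\s*"(?P<url>[^"]*)"', re.IGNORECASE)


def localize_links(html, origin_host, local_prefix):
    """Point absolute links at origin_host to a local mirror directory."""
    def fix(match):
        parts = urlparse(match.group("url"))
        if parts.netloc.lower() != origin_host.lower():
            return match.group(0)  # relative or third-party link: leave alone
        path = parts.path or "/"
        query = ("?" + parts.query) if parts.query else ""
        fragment = ("#" + parts.fragment) if parts.fragment else ""
        return '{}="{}{}{}{}"'.format(
            match.group("attr"), local_prefix, path, query, fragment)
    return ATTR.sub(fix, html)


page = ('<a href="http://journal.example/v1/art2.html#refs">Next</a> '
        '<img src="fig1.gif"> '
        '<a HREF="http://other.example/">Elsewhere</a>')
fixed = localize_links(page, "journal.example", "/mirror/journal")
```

Only the first link changes, becoming /mirror/journal/v1/art2.html#refs. The relative image reference and the third-party link pass through untouched. Pages with unquoted or single-quoted attributes would need a real HTML parser rather than a regular expression.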
>
> As to the mutability of pages, both from directory
> to directory and site to site, it's really an issue that
> affects all information retrieval on the Internet, and as
> such I don't believe that massive mirroring is the
> answer. And there's the simple fact that if you're
> going to lose track of it for linking, you're also going
> to lose track of it for mirroring. You could end
> up in the dreadful position where one site holds
> a complete archive and another only a few issues,
> and still another a different selection: you're never
> sure whether you're looking at all the issues of
> those titles available. This has already happened
> for quite a few journals. Once again, I think human
> and system resources are better spent in tracking
> changes to remote resources, rather than trying
> to replicate them. It's an issue that becomes more
> pressing every day in the indexing, and particularly
> cataloguing, of academically-oriented electronic
> texts. I think there comes a point where you have
> to accept the fact that you can't guarantee the
> continued existence of any resource on the 'net,
> and that the best thing you can do is try to find
> mechanisms for coping with the plastic nature of
> your references.
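One simple mechanism for "tracking changes to remote resources" is to keep a fingerprint of each referenced page and compare it on the next visit. The Python sketch below is only an illustration of that idea, not something proposed in the thread. The names fingerprint and compare_snapshots, the use of SHA-256, and the example URLs are all assumptions, and the fetching step is left out so the comparison logic stands alone.

```python
import hashlib


def fingerprint(content):
    """Return a hex digest identifying the bytes of a fetched page."""
    return hashlib.sha256(content).hexdigest()


def compare_snapshots(previous, current):
    """Classify each previously indexed URL.

    Both arguments map URL -> digest; a URL that is absent from current,
    or mapped to None, is treated as a failed fetch.
    """
    report = {}
    for url, old in previous.items():
        new = current.get(url)
        if new is None:
            report[url] = "missing"
        elif new == old:
            report[url] = "unchanged"
        else:
            report[url] = "changed"
    return report


# Made-up index entries from an earlier visit and from today.
last_week = {
    "http://journal.example/v1/": fingerprint(b"Volume 1 contents"),
    "http://journal.example/v2/": fingerprint(b"Volume 2 contents"),
    "http://journal.example/about.html": fingerprint(b"About us"),
}
today = {
    "http://journal.example/v1/": fingerprint(b"Volume 1 contents"),
    "http://journal.example/v2/": fingerprint(b"Volume 2 contents, revised"),
}
report = compare_snapshots(last_week, today)
```

Here v1 is reported as unchanged, v2 as changed, and about.html as missing. A "missing" result does not tell you whether the page was deleted or moved, so a person still has to follow it up.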
>
> I look forward to continued discussion of this issue!
> Bill, I'm particularly interested in finding out on
> exactly what scale you envision mirroring, as
> it's not quite clear to me.
>
> Regards,
> Aaron Bradley
> [log in to unmask]
> (BTW - http://www.cfcsc.dnd.ca/links/per/pera.html - my
> sad list of journals in my subject area)
>
>
> Date: Thu, 11 Apr 1996 15:23:14 -0700
> From: Robb Scholten <[log in to unmask]>
> To: Multiple recipients of list <[log in to unmask]>
> Subject: Re: "Archiving" e-journals
>
>
> To add to the discussion so far, there may be a way to utilize what is
> currently under consideration in the federal Digital Library Project: the
> "central repository" idea. This would be a place where one registers
> themselves along with a mode of payment for resources expended. From that
> one site, they will be able to transparently search all other registered
> sites for materials. Whatever is retrieved would be charged to their
> central account, and copyright would be recorded in some CCC database.
>
> This would eliminate two problems:
> 1. having to possess dozens of passwords to different resources (and
> having to send your credit card number the same number of times over
> insecure networks, even when an institutional purchase order is what is
> wanted);
> 2. forcing some one company to become a "Dialog of the Internet," some
> kind of superstore that would acquire and locally store all important
> databases from other vendors.
>
> This is the obvious next step from the Z39.50 solution for using local
> interfaces to search remote databases. This would also force a certain
> kind of standard protocol on the part of the information creators. Once
> you have a critical mass of databases all participating in this project,
> those folks who refuse to build their data using the normal protocols
> would be hurting.
>
> I am looking forward to some solution for our present conundrum.
>
> Robb Scholten
>
> www.paperchase.com
> Beth Israel Hospital
>
>
>
>
--
Chris Rusbridge
Programme Director, Electronic Libraries Programme
The Library, University of Warwick, Coventry CV4 7AL, UK
Phone 01203 524979 Fax 01203 524981
Email [log in to unmask]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%