I believed we were talking about information, not data.
It is, as Phil writes, very easy to generate huge amounts of data. Whether
one puts this on the Internet depends on the depth of one's pockets and
the data's value to others. In the case of the Large Hadron Collider, for example,
this raw data is of interest only to a small cadre of physicists. (A much
larger group are interested in any derived information.) I don't expect it
to be online, nor Google to index it. I do expect it to flow across the
Internet on demand, because that is simply the highway.
BTW, the list of data generators is much wider: it includes marine science,
distributed sensor networks, all the video-collecting resources, finance,
plus many others. But this is undigested data, properly the subject of
e-research. Simulation experiments generate large amounts of data too, and
this is why climate science appears in Phil's list.
Arthur Sale
-----Original Message-----
From: Phil Barker
Sent: Wednesday, 16 September 2009 6:35 PM
To: Arthur Sale
Subject: Re: Digital Preservation - The Planets Way: 17-19 November 2009,
Swiss Federal Archives, Bern, Switzerland
Arthur Sale wrote:
>
> Back of envelope engineering-style estimate (ie very rough):
>
> Assume 25,000 research journals worldwide, 10 articles per issue, 4
> issues per year, each article = 200 kB on average. Total = 200 x 10^9
> bytes = 0.2 TB (terabytes), i.e. 0.0002 PB, annually.
>
That's a very narrow view of information in the academic/research world.
Contrast with
"The Large Hadron Collider will produce roughly 15 petabytes (15 million
gigabytes) of data annually"
http://public.web.cern.ch/public/en/LHC/Computing-en.html
I believe climate science, astronomy and genomics experiments will often
kick out a terabyte of data before you've got your lab book and pencil
ready to start jotting it down :-)
Phil.
>
> Even more crudely, double to allow for all theses, conference articles
> and books = 0.4 TB annually.
>
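For anyone who wants to rerun the back-of-envelope numbers, here is a minimal sketch using only the figures quoted in this thread. It assumes decimal units (1 kB = 10^3 bytes, 1 TB = 10^12 bytes, 1 PB = 10^15 bytes); the variable and function names are purely illustrative, and the inputs are the same rough guesses as in the estimate, not measured values.

```python
# Rough annual volume of published research output, in decimal (SI) units.
# All inputs are the guesses quoted in the thread, not measurements.
JOURNALS = 25_000
ARTICLES_PER_ISSUE = 10
ISSUES_PER_YEAR = 4
BYTES_PER_ARTICLE = 200_000  # 200 kB per article on average

TB = 10**12  # terabyte (decimal)
PB = 10**15  # petabyte (decimal)


def annual_bytes(multiplier=1):
    """Bytes per year; multiplier=2 adds theses, conference articles and books."""
    articles = JOURNALS * ARTICLES_PER_ISSUE * ISSUES_PER_YEAR
    return articles * BYTES_PER_ARTICLE * multiplier


journals_only = annual_bytes()
with_other_outputs = annual_bytes(2)
lhc_raw = 15 * PB  # the CERN figure quoted in the thread

print(f"journals only:      {journals_only / TB} TB = {journals_only / PB} PB")
print(f"with other outputs: {with_other_outputs / TB} TB")
print(f"LHC raw data is about {lhc_raw // with_other_outputs} times larger")
```

Changing any of the four inputs at the top is enough to test a different set of assumptions.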
> The academic+research world is a small contributor to the quantity of
> 'information' on the Internet. It takes too much time to generate each
> item! Now if we were asking about quality... But then we would have to
> address the question of the missing quality information (ie non-OA) as
> well.
>
> Arthur Sale
>
> -----Original Message-----
> From: Repositories discussion list On Behalf Of Leslie Carr
> Sent: Wednesday, 16 September 2009 12:17 AM
> Subject: Re: Digital Preservation - The Planets Way: 17-19 November
> 2009, Swiss Federal Archives, Bern, Switzerland
>
> On 15 Sep 2009, at 14:29, Planets Project News Update wrote:
>
> > There has been an explosion in the volume of information world-wide
> > which will grow to 180 exabytes by 2011.
>
> Does anyone have any estimates for the amount of information in the
> academic/research world?
>
> --
> Les
>
--
Phil Barker Learning Technology Adviser
ICBL, School of Mathematical and Computer Sciences
Mountbatten Building, Heriot-Watt University,
Edinburgh, EH14 4AS
Tel: 0131 451 3278 Fax: 0131 451 3327
Web: http://www.icbl.hw.ac.uk/~philb/
--
Heriot-Watt University is a Scottish charity
registered under charity number SC000278.