It depends on whether we mean knowledge, information or data! The original quote of 180 exabytes spoke of 180 exabytes of information, whereas I think it meant 180 exabytes of data!
I think I would argue that the knowledge content of the LHC would simply be either that the existence of the Higgs Boson is proven, or (unfortunately) still uncertain (well, actually I wouldn't argue that too strongly, since it seems pretty likely that they'll discover some other things along the way, and anyway I have a tendency to incline towards scientific instrumentalism).
The information content of the LHC would be the data which supports the claim that the Higgs Boson exists, which would still be many orders of magnitude less than the total data output - the rest of the LHC data output would be "noise".
However, there is the problem that future analysis or theory might discover some additional information (and hence knowledge) in the parts of the data currently regarded as "noise". Then again, at the petabyte scale, you can't keep everything just in case it proves useful (a maxim I don't personally observe, which is why my house is so cluttered).
I'd also argue that duplication of data may increase the amount of data stored (and is generally the way most people do backups/preservation) but does not itself increase the amount of information or knowledge. I expect that 180 exabyte figure includes a lot of duplication (especially of audio and video!)
Matthew
> -----Original Message-----
> From: Andrew Treloar [mailto:[log in to unmask]]
> Sent: 16 September 2009 13:01
> To: Matthew Dovey
> Cc: [log in to unmask]
> Subject: Re: amount of academic data (was re:Digital Preservation - The
> Planets Way)
>
> Only if people believe the bit without going back and re-analysing (and re-re-
> analysing) the data. I don't think a paper that just says "Yes" will get accepted
> on its own...
>
> On 16/09/2009, at 21:58 , Matthew Dovey wrote:
>
> > You could argue that the entire knowledge content of the 100s
> > Petabytes of output from the LHC will be a single bit (indicating if
> > the Higgs Boson was found).
> >
> > Matthew
> >
> >> -----Original Message-----
> >> From: Repositories discussion list [mailto:JISC-
> >> [log in to unmask]] On Behalf Of Leslie Carr
> >> Sent: 16 September 2009 12:55
> >> To: [log in to unmask]
> >> Subject: Re: amount of academic data (was re:Digital Preservation -
> >> The Planets Way)
> >>
> >> On 16 Sep 2009, at 12:33, Andrew Treloar wrote:
> >>
> >>>> But as others have said, research data swamps the articles.
> >>>
> >>> And will increasingly do so. In fact, I can see a day when the size
> >>> of the entire journal literature will be a rounding error on the
> >>> total size of all research outputs. In some disciplines we are there
> >>> already.
> >>
> >> Lest we forget - size isn't everything. Journal papers are valuable
> >> precisely because they summarise scientific observations, turning
> >> Petabytes of data and information into a fraction of a megabyte of
> >> knowledge.
> >>
> >> Well, a fraction of a megabyte of PDF. Probably only a kilobyte of
> >> mathML.
> >> It's just the opposite of a picture being worth a thousand words :-)
> >> --
> >> Les
>
>
> --
> Andrew Treloar, PhD, MACS PCP, FRYE (2005) - http://andrew.treloar.net/
> Deputy Director, Australian National Data Service - http://ands.org.au/
> Monash University, Room 156, 700 Blackburn Rd, Clayton, 3168, Australia
> [P: +61 (0)3 990 20572 | M: +61 (0)407 202 501 | F: +61 (0)3 990 20599]
> *NOTE: Availability for meetings at http://andrew.treloar.net/calendar/