DIGITALCLASSICIST Archives

DIGITALCLASSICIST@JISCMAIL.AC.UK


DIGITALCLASSICIST  October 2009

Subject:

Re: How much server space would the Classical world occupy?

From:

Scott <[log in to unmask]>

Reply-To:

The Digital Classicist List <[log in to unmask]>

Date:

Sat, 17 Oct 2009 01:08:45 -0400

Content-Type:

text/plain


I think that James Cummings missed my point by a mile. Perhaps I did not
word it to his understanding.

My problem was two-fold: 
1.  How to store data electronically and permanently in a manner that will
allow future retrieval.  Katrina took my rather large selection of 5 1/4"
floppies that I had not completed copying onto 3 1/2" floppies.  What do you
know?   My new PC does not have a 5 1/4 drive and when I replace my current
PC I am advised that few will offer 3 1/2" drives--just CDs...unless they
decide to offer only DVD drives.  A friend of mine a while back was able to
purchase the complete works of Beethoven and several others of his ilk to
play on his tape recorder.  I hope that he maintained his tape recorder
well.  Right now I cannot get an answer when I start asking about how long
certain data recorders and players will be available other than "You will
just have to keep up with the times."  That answer, to my mind, makes the
storing of petabytes and exabytes problematical--NOT because we cannot store
them right now but because I fear that we will not be able to retrieve the
data in the future.  
2.  Whom do we trust to maintain the storage?  I use the SF librarian as
a bad example. I did not expect to have someone agreeing with the librarian
that Danielle Steel and Barbara Bradford should replace Jane Austen and the
Bronte sisters.  There goes the library as a source of cultural history
and community cultural literacy. 

Furthermore, I consider the size issue to be somewhat of a red herring:
I do not believe that more is always better nor do I know of any Classicist
or Medievalist who does.  I was in a Mensa group that later incorporated itself
as a museum of antiquities.  The guys in charge had a very large collection
of 2000-year-old oil lamps in good condition.  Several lamps in very good
condition were set aside for display at the future museum--the rest were for
sale because they added nothing.  A note that they were common in that
period sufficed for everything else.

We need representative samples and we need diversity, along with comments on
the type, degree, and location of the diverse samples and on how widespread
the indicated diversity was.  We cannot learn that much more
from 100,000 identical oil lamps from Jerusalem than we can from 10;
however, any and all fragments of clay seals and vellum or parchment might
well be of great value. 
   

Scott Catledge

-----Original Message-----
From: The Digital Classicist List [mailto:[log in to unmask]]
On Behalf Of James Cummings
Sent: Thursday, October 15, 2009 5:27 AM
To: [log in to unmask]
Subject: Re: [DIGITALCLASSICIST] How much server space would the Classical
world occupy?

Hi all,  [Long, apologies]

I think I'm still concerned about the notion of 'size' with 
respect to textual resources.  Ok, let's say the IADB database is 
1gb, which these days is indeed a trivial amount of space.  Is 
that 1gb in the database's format? What about an SQL or XML 
export? That would probably be bigger since one assumes the 
binary format that the database uses achieves some optimization. 
  But what if I used a clever compression algorithm on the SQL 
dump?  That might make it significantly smaller.  I suppose there 
is nothing wrong with having a rough idea, especially to compare 
it to another discipline where a similarly collected rough idea 
is being used.  And the various eResearch people will tell us 
that they are dealing with data streams in the order of gigabytes 
per second... somehow implying that means their data is somehow 
more important or better.  Let's not fall into the trap of 
believing that more is better... how many shards of  Samian Ware 
are there that introduce no real new knowledge other than a sense 
of scale?  I think size might be a poor comparator for usefulness 
in preservation of knowledge about our cultural heritage.  While 
a stream of astro-physics data may return a huge amount of data 
in size, the number of factors being measured may be quite 
limited, whereas a single textual resource may contain a huge 
number of new pieces of information, corroborations of existing 
knowledge, or  contradicting and problematising details.  I think 
the complexity of the data is relevant to the perceived worth and 
its clamouring for long-term preservation.
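
To make the point about format-dependent 'size' concrete, here is a 
minimal sketch in Python.  It is not the IADB's actual schema or export 
format: the records, field names, and the choice of CSV, JSON, XML and 
gzip are all invented for illustration.  It only shows that the same 
records occupy quite different numbers of bytes depending on how they 
are serialised and whether they are compressed.

```python
import csv
import gzip
import io
import json
import xml.etree.ElementTree as ET


def make_records(n):
    """Build n invented 'finds' records; every field name is hypothetical."""
    return [
        {"id": i, "context": f"C{i % 50:04d}", "type": "oil lamp", "condition": "good"}
        for i in range(n)
    ]


def as_csv(records):
    """Serialise the records as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue().encode("utf-8")


def as_json(records):
    """Serialise the records as a JSON array of objects."""
    return json.dumps(records).encode("utf-8")


def as_xml(records):
    """Serialise the records as simple element-per-field XML."""
    root = ET.Element("finds")
    for rec in records:
        item = ET.SubElement(root, "find")
        for key, value in rec.items():
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def size_report(records):
    """Return {format: (raw_bytes, gzipped_bytes)} for the same records."""
    report = {}
    for name, encode in (("csv", as_csv), ("json", as_json), ("xml", as_xml)):
        raw = encode(records)
        report[name] = (len(raw), len(gzip.compress(raw)))
    return report


if __name__ == "__main__":
    for name, (raw, packed) in size_report(make_records(10_000)).items():
        print(f"{name:4s} raw={raw:>9,d} bytes  gzip={packed:>8,d} bytes")
```

On highly repetitive records like these, the verbose formats are the 
largest uncompressed and all of them shrink considerably under gzip, 
which is why a single figure in gigabytes says little without the 
format attached.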

I disagree with Scott Catledge that storing Petabytes and 
Exabytes of information will be that problematic.  Such storage 
(in redundant manners) is already possible should the desire for 
preservation be enough to produce the funding to secure it.  It 
is the funding of centralised storage for humanities disciplines 
which is unlikely in the medium term, not the ability to do it. 
The librarian you mention is doing what librarians do: they have 
a set amount of space and weed their collection to fit -- 
librarians are not (necessarily) archivists.  Is the answer then 
for each of us (where 'us' includes individuals, projects, and 
institutions) to store and make available the data that is 
perceived as important by us? Then the challenges are in the area 
of interoperability and the exposure, authentication, and 
integration of metadata and data in some file-sharing 
cloud-computing data web wonderland where we each fund the 
preservation of what we believe to be the 'important' data. 
There is little technical challenge in doing this on a much more 
significant scale than digital classicism has reached on the web; 
the challenges are mostly human, political, and financial.  We'd 
only have ourselves to blame for the non-preservation of (and/or 
failure to fully expose and properly license) data if it is not 
available to successive generations, and in many ways that is 
already the case.  However, this introduces the same sampling 
errors that you were mentioning but instead shifts the blame away 
from some archivist and onto the community itself.  I guess I'm 
more of a 'preserve everything we can' sort of person rather than 
a 'carefully weed out redundancies' one, which probably explains the 
size of the network attached storage I have at home. :-(

Sorry for the long rambling post,

-James


Melissa Terras wrote:
> Just a follow up on how big the Silchester data set is, from Mike
> Rains at York Archaeological Trust
> 
>> The current size of the Silchester IADB database is just over one
>> gigabyte (1080mb), approximately 350,000 records split between Finds,
>> Contexts, Photos, Plans, etc. To this you could add the large amount of
>> stuff (high resolution images, spreadsheets, Word documents, Illustrator
>> files, etc) on the file server in Reading which hasn't reached the IADB
>> yet. We're probably about two thirds of the way through Silchester, so
>> that by the end of the project I wouldn't be surprised if the total goes
>> over 2gb (or more - every year I think we generate more digital data than
>> in previous years). I would guess that the Insula IX excavation
>> represents less than 1% of the total area of Silchester.
> 
> M
> 
> 2009/10/14 Scott <[log in to unmask]>:
>> I believe that we need to prepare for exabytes of information; petabytes
>> will run out too quickly.
>>
>> The most crucial question should be how are we going to store this amount
>> of data--and whom can we trust to decide what to maintain and what to
>> discard.  I heard a secondary school librarian brag that she had rid her
>> library of all the obsolete books by Hardy, Austen, and the Bronte sisters
>> to make room for the great classics being written now that are pertinent
>> to our new society.  I remarked to her, "Adolf Hitler and Stalin would be
>> proud of you."  Before I am bombarded by angry librarians, they should
>> know that I have only the greatest respect for librarians in general and
>> the ones that I know now in particular.  The librarian in question was
>> from San Francisco.
>>
>> Despite the alleged claim by some Classicists that they only need a small
>> sample on which they can build their theories, I have read too many
>> satires on the USA in which extraterrestrial or 3rd or 4th millennium
>> Terran archeologists explore our civilization and try to recreate our
>> times and manners, usually with very credible answers that could hardly
>> be more inaccurate.  A poll was done in 1957 on geographic knowledge of
>> college freshmen by selecting one school in each state and interviewing
>> the first willing freshman.  I was selected at MSC and answered all of
>> the questions correctly and pointed out that two of the questions had
>> more than one answer (capitals of The Netherlands and Bolivia)--I was
>> very interested in political geography from elementary school on.
>> Mississippi was rated as having the most knowledgeable students in the
>> USA in the field of geography--and I went to high school in FL.  This
>> poll, which was quickly discounted for many reasons--not all valid--is an
>> excellent example of making wide statements on an invalid sample.  Just
>> how do any historians--I am more of a Medievalist than a
>> Classicist--decide that their limited samples are sufficient to make a
>> conclusion?  I hedge my bets by making my sample equal to my population
>> (e.g., the listed names on a particular codex) or by generalizing (e.g.,
>> I wrote an article on the quadra nomina just to show that agnomina and
>> other such names were used, in a reply to an article that stressed the
>> tria nomina and ignored the existence of the other name forms).  That
>> they existed was my point--not what percentage of the population had
>> them--that would be another paper.
>>
>> N. Scott Catledge, PhD/STD
>> Professor Emeritus
>> history & languages
>>
>> -----Original Message-----
>> From: The Digital Classicist List [mailto:[log in to unmask]]
>> On Behalf Of Willard McCarty
>> Sent: Tuesday, October 13, 2009 10:15 AM
>> To: [log in to unmask]
>> Subject: Re: [DIGITALCLASSICIST] How much server space would the Classical
>> world occupy?
>>
>> I'd guess that people here know about the genre to which this question
>> belongs, perhaps best exemplified by Michael Lesk's asking "how much
>> information is there in the world?" (Googling for his name and the
>> question will turn up some things which illustrate.) Lesk used to count
>> it in terabytes, but I suppose the figure has gone up somewhat, now that
>> we commonly have terabyte discs. It strikes me, however, that one should
>> also be asking what we would not have if all that can be stored on a
>> hard disc in whatever format were all that there is. What would happen
>> to the library if ALL that we had was the buildings and the books and
>> other resources in them?
>>
>> Yours,
>> WM
>>
>> Melissa Terras wrote:
>>> But you may also want to make the comment that Classicists are *used* to
>>> dealing with data loss, and extrapolating findings from the smallest
>>> scrap available. For example, pay packets and the Roman Army - someone
>>> out there will know better than me, but I remember reading somewhere a
>>> calculation of how many payslips would have been created (millions) and
>>> how many have survived (a handful) - yet we can understand a lot from
>>> the extant material.
>>>
>>> Additionally, it's not good archival practice to keep everything... you
>>> have to make choices about what you will save, and what you will
>>> discard!
>>>
>>> M
>>>
>>> Paradoxographer wrote:
>>>> Hello everyone, and thank you all for your contributions and help.
>>>>
>>>> To answer James' question about motivation ... I'm currently working
>>>> in research in the field of records and information management (though
>>>> a classicist by education and inclination, hence my continued
>>>> membership of this list). I am trying to get a feel for the volume of
>>>> material involved to inform a case I intend to argue in a paper /
>>>> article against the view - common in the records and archives field -
>>>> that we are entering a 'digital dark age' because of our current
>>>> inability to preserve more than a tiny fraction of born-digital
>>>> material. I know that the figures for current rates of information
>>>> creation are not exactly models of precision either, but they are
>>>> frequently bandied about in journals and conferences, and for my
>>>> purposes orders of magnitude will suffice.
>>>>
>>>> And I entirely agree that images, archaeological reports / records,
>>>> etc would have to be taken into consideration for any proper
>>>> assessment: the reason I provisionally excluded them was that I feared
>>>> it was too much like asking 'how long is a piece of string?' and did
>>>> not want to try the patience of the list with impossible questions!
>>>>
>>>> Kind Regards,
>>>>
>>>> Rachel Hardiman
>>>>
>> --
>> Willard McCarty, Professor of Humanities Computing,
>> King's College London: staff.cch.kcl.ac.uk/~wmccarty/
>>
> 
> 
> 
