

CCP4BB Archives



CCP4BB@JISCMAIL.AC.UK



CCP4BB  August 2007


Subject: Re: Richard Reid and the PDB
From: Bernhard Rupp <[log in to unmask]>
Reply-To: [log in to unmask]
Date: Fri, 17 Aug 2007 13:24:45 -0700


The PDB is missing a business opportunity. If authors pay
1000s of dollars for publication in high impact journals,
they might as well pay a few bucks for image deposition.
If I could get my images stored reliably and perpetually 
for something like $20-50 a pop, I'd do it. Do you know
where your favourite frames from 1998 are? 

Image storage is a good idea *in itself*, but as an enforcement tool
it will only make the *exceedingly few* Reids more inventive.

PS: Frames for sale. 
http://www.ruppweb.org/new_comp/frame_maker.html

-----Original Message-----
From: CCP4 bulletin board [mailto:[log in to unmask]] On Behalf Of Kim Henrick
Sent: Friday, August 17, 2007 7:04 AM
To: [log in to unmask]
Subject: [ccp4bb] Richard Reid and the PDB

After Richard Reid, more than 100 million people each year have to have
their shoes examined, and one effect is that older buildings like Heathrow
Terminal 3 have become the most painful places on earth. The cost of someone
trying to light their shoelaces has affected us all.


On the discussion about archiving image data sets: I guess that less than
1% of the image sets for PDB entries are useful to software development (and
can be obtained privately). I guess that maybe 1 in 10,000 entries has a
serious problem that may require referees to look at the images (and these
can be accessed upon demand).


The cost of disks for your PC, kitchen table disks from a supermarket, may
be $1 per Gbyte on USB I/O, but an archive centre required to maintain the
data will probably need RAID 0/1 (RAID 10). This has high performance and
high data protection, i.e. it can tolerate multiple drive failures as long as
they fall in different mirror pairs, but it has a high redundancy cost
overhead. In case you haven't noticed, a large collection of disks has
failures. Look up the problems that the series of Landsat satellites have had
from 1980 onwards, arising out of the volume of data and the short life of
computer compatible tapes and optical discs. Archiving data lacks glamour;
it's the boring day-to-day rectification and storage of information, and very
little money gets spent on this task. For remote sensing the most significant
cost is transmission/correction and archiving of the data. Three semi-trailer
loads of Landsat tapes were found (literally) moldering in a damp basement in
Baltimore after people and funding agencies lost interest. Oh yes, and
detectors change every 5 years and processing software gets lost.

At the EBI, before we even get a single disk, we pay £100,000 for a cabinet.
Disks cost around £500 for 300 gigabytes (and those are not the best disks,
which are around the same cost for 146 gigabytes). Disk technology changes
every 5 years, so one cost of an archive is recovering the data every 5 years
onto the next generation of hardware. Molecular biology and structure
research is carried out by 1000s of groups, not centrally through a single
international treaty setup such as a telescope that is run centrally and
financed to do the data archiving. Molecular biology uses some in-house data
collection, but most is carried out at synchrotrons. Despite the fact that
there are many beamlines, most data again comes from fewer than 10 sites.
These major synchrotron sites are committed to data storage by various
methods of storage hierarchy, and a better solution than a central archive is
issuing a DOI or set of DOIs for the data associated with a PDB entry and
linking the DOIs to that entry. Over the last 5-7 years many countries have
spent billions of dollars on GRID and distributed data storage; use this
technology to leave the data where it is and pick it up on demand. Google's
solution to large datasets such as single-file tomograms is to ship disks;
there is no simple cheap FTP/WWW solution to large datasets.
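
To make the hardware arithmetic above concrete, here is a rough sketch in
Python. Only the £100,000 cabinet, the £500 per 300 gigabyte disk and the
5-year refresh cycle come from the figures quoted above. The archive size,
the single cabinet, the mirror factor of 2 (a stand-in for RAID 10) and the
constant prices across hardware generations are all assumptions made purely
for illustration, and staff, power and transfer costs are left out.

```python
import math


def disks_needed(archive_gb, disk_gb=300, mirror_factor=2):
    """Number of disks needed when every block is stored mirror_factor times.

    mirror_factor=2 is an assumed stand-in for RAID 10 mirroring.
    """
    return math.ceil(archive_gb * mirror_factor / disk_gb)


def archive_hardware_cost(archive_gb, years, cabinet_gbp=100_000,
                          disk_gbp=500, disk_gb=300, mirror_factor=2,
                          refresh_years=5):
    """Hardware-only cost in GBP, re-buying cabinet and disks each refresh.

    Assumes one cabinet per hardware generation and constant prices,
    which is a simplification and not something the post states.
    """
    generations = math.ceil(years / refresh_years)
    disks = disks_needed(archive_gb, disk_gb, mirror_factor)
    per_generation = cabinet_gbp + disks * disk_gbp
    return generations * per_generation


if __name__ == "__main__":
    # Hypothetical 30,000 GB archive kept for 10 years.
    print(disks_needed(30_000))
    print(archive_hardware_cost(30_000, years=10))
```

Even under these generous assumptions the cabinet and the repeated
migration, not the raw price per gigabyte, make up most of the bill.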

The cost of a central archive is several million dollars per year to set up
and run long term, and who will pay? 40% of the PDB comes from the USA (the
biggest single contributor), but with the difficulties in funding from the
EU and national funding priorities, is the USA to carry this cost? Is the
cost to be shared as in the table below? So far only the USA, Japan and
Europe (through the UK, EU and EMBL) pay for the PDB.
The USA also pays for UniProt, and other large-scale data gathering efforts
are carried out by nationally funded centres, not by the large number of
individuals and countries that the PDB comes from.
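
As a rough illustration of what sharing the cost by deposition count could
look like, the sketch below uses a few of the 1999-2007 totals from the
deposition table in this message. The annual cost of 3 million dollars is a
hypothetical placeholder for "several million dollars", and splitting
strictly in proportion to depositions is only one possible rule, not a
proposal made in the post.

```python
# Totals for 1 January 1999 to 26 June 2007, copied from the table
# of PDB depositions in this message.
TOTALS = {
    "UNITED_STATES": 15244,
    "JAPAN": 5229,
    "UNITED_KINGDOM": 3431,
    "GERMANY": 2438,
}
ALL_DEPOSITIONS = 41289  # TOTAL row of the same table


def share_percent(country):
    """Percentage of all depositions that came from the given country."""
    return 100.0 * TOTALS[country] / ALL_DEPOSITIONS


def proportional_cost(country, annual_cost):
    """Annual cost share if costs were split strictly by deposition count."""
    return annual_cost * TOTALS[country] / ALL_DEPOSITIONS


if __name__ == "__main__":
    hypothetical_annual_cost = 3_000_000  # assumed figure, in dollars
    for name in TOTALS:
        pct = share_percent(name)
        cost = proportional_cost(name, hypothetical_annual_cost)
        print(f"{name:15s} {pct:5.1f}%  ${cost:,.0f}")
```

Over this period the table puts the USA at roughly 37% of depositions,
close to the 40% quoted above.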

The administration needed to get all the datasets costs far more than the
$1/gigabyte of a USB disk, which is next to useless for an archive.
The costs of storage are rapidly decreasing, but there has not been a great
change in latencies and bandwidth. If everything gets faster and cheaper at
the same rate then nothing really changes, i.e. more structures are done.

Why inspect the shoes of every PDB entry and every structural biologist when
we can detect the very rare suspect problem and get an agreed course of
action?

kim

PDB Depositions (1 January 1999 to 26 June 2007)
Country        1999 2000 2001 2002 2003 2004 2005 2006 2007 Total
ARGENTINA        0    0    0    0    0    2    1    6    7    16
AUSTRALIA       52   46   45   59   59   75   94   91   51   572
AUSTRIA         13    2    7    1    2   22   26   20    5    98
BELGIUM         29   28   41   24   38   27   36   50   29   302
BRAZIL           7    2   12   16   34   24   34   78   30   237
CANADA         109  117  131  115  157  185  280  334  183  1611
CHILE            0    1    0    0    0    1    2    0    0     4
CHINA           22   28   32   29   50   66  132  121   61   541
CROATIA          0    1    0    0    1    0    0    5    0     7
CZECH_REPUBLIC   2    1    4    6    5    4   12    3    4    41
CUBA             0    0    0    0    0    1    0    0    0     1
DENMARK         19   34   26   31   44   45   37   58    9   303
FINLAND         14   10   11   23   20   28   37   41   20   204
FRANCE         144  183  183  177  208  254  281  243  138  1811
GERMANY        198  234  222  207  263  315  343  436  220  2438
GREECE           6   20    8    7   17   12   16   12    8   106
HONG_KONG        2    3    7    3    7   11    5    8    9    55
HUNGARY          2    1    5    3    4    5    5    9    1    35
INDIA           35   39   45   71   67   86  112  174   65   694
IRELAND          0    2    1    0    1    2    3    7    0    16
ISRAEL          25   13   32   27   30   38   28   33   24   250
ITALY           35   56   80   80  115  100  127  118   54   765
JAPAN          150  220  240  279  528  702 1102  889 1119  5229
LITHUANIA        0    0    1    0    0    0    0    0    0     1
MEXICO           3    5    2    4    5    3    3    1    2    28
NETHERLANDS     42   20   28   21   32   34   29   30   18   254
NEW_ZEALAND     15   20   14   12   13   16   15   18   12   135
NORWAY          10    5    5   10   14    9   25   19   20   117
PAKISTAN         0    0    0    7    3    0    0    3    0    13
PERU             0    0    0    0    0    1    0    0    0     1
POLAND           3    4   16   10    5   17   11   23   10    99
PORTUGAL         8   15    7   10   15   19   14   10   11   109
RUSSIA           6    7    5    8   13   18   10   26   15   108
SINGAPORE        0    2    3    2   15   13   34   37   22   128
SLOVAKIA         0    0    4    3    2    5    1    0    1    16
SLOVENIJA        0    1    2    3    1    5    0    6    0    18
SOUTH_AFRICA     0    0    0    1    0    1    1    0    1     4
SOUTH_KOREA     43   27   30   34   66   56   61   90   43   450
SPAIN           27   36   38   34   33   54   70   81   34   407
SWEDEN          56   48   92   67   93   90  119  109   92   766
SWITZERLAND     49   29   29   35   53   46   58   98   29   426
TAIWAN           7   16   14   22   41   56   60   88   35   339
THAILAND         0    0    0    0    3    0    4    0    0     7
UNITED_KINGDOM 241  314  286  342  390  427  538  598  295  3431
UNITED_STATES 1148 1210 1322 1387 1765 2119 2295 2573 1425 15244
COMMERCIAL     173  156  169  284  465  363  467  576  276  2929
UNKNOWN         45    4    0    0    0    0    0    0    0    49
VENEZUELA        1    0    0    0    1    0    0    0    0     2
ORGANISATION    65   51   74   97  100  100  151  163   71   872
TOTAL         2806 3011 3273 3551 4778 5457 6679 7285 4449 41289
