CCP4BB Archives

CCP4BB@JISCMAIL.AC.UK


CCP4BB  April 2012

Subject:

Re: very informative - Trends in Data Fabrication

From:

aaleshin <[log in to unmask]>

Reply-To:

aaleshin <[log in to unmask]>

Date:

Thu, 5 Apr 2012 15:30:06 -0700


Well, it looks like my opinion about the importance of validating data at the moment of submission has not attracted much support. That is sad but understandable.

Having professionals automatically redo the PDB structures is a good idea. I suggested a similar thing myself 10 years ago at Accelrys (we were developing a tool for detecting and remodeling changes in protein-ligand structures due to ligand binding), but there was not much financial interest. How much the raw images would enhance the remodeling process is an open question, but good luck in getting it funded.

> c) Discarding your primary data is generally considered bad form...
Agreed, but it is a big burden on labs to maintain archives of their raw data indefinitely. Even the IRS allows records to be discarded after some time. What is wrong with partially integrated data in terms of structure validation?

> @AlexA:  Arguing with the PDB is not really useful. 
I did not argue yet, but I'll take your advice.

> They did not generate the bad data.
This is genuinely American thinking! But they could create conditions that would prevent the deposition of bad data.

I think I should stop heating up this discussion. 

Regards,
Alex

On Apr 5, 2012, at 2:11 PM, Bernhard Rupp (Hofkristallrat a.D.) wrote:

> I also don't really worry about the images as a primary means of fraud
> prevention, although that may be a useful side effect. These cases are
> spectacular but so rare that fraud prevention alone would not justify
> the effort. It may be a useful political instrument to make that argument
> and get funding, but that is a bit of a double-edged sword, and harm can
> be done; see (5).
> 
> The real point to me seems -
> a) is there something in the images and in between casually indexed main
> reflections we do not use
> right now that allows us to ultimately get better structures?
> I think there is, and it has been told before, from superstructures,
> modulation, diffuse contributions etc etc.
> A processed data file does not help here. But do we need the old image data
> for that or rather use new ones from
> modern detectors? Where is the cost/benefit cutoff here?
> 
> b) looking at how some structures are refined, there is little reason to
> believe that data processing would be done more
> competently by untrained casual users (except that much of the data
> processing is done with the help of beam
> line personnel who rather know how to do it). Had we images, the next step
> then could be PDB_reprocess.
> A processed data file does not help much there either.
> 
> c) Discarding your primary data is generally considered bad form...
> 
> @AlexA:  Arguing with the PDB is not really useful. They did not generate
> the bad data.
> 
> Best, BR
> 
> -----Original Message-----
> From: CCP4 bulletin board [mailto:[log in to unmask]] On Behalf Of Ronald
> E Stenkamp
> Sent: Thursday, April 05, 2012 1:04 PM
> To: [log in to unmask]
> Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication
> 
> This discussion has been interesting, and it's provided an interesting forum
> for those interested in dealing with fraud in science.  I've not contributed
> anything to this thread, but the message from Alexander Aleshin prodded me
> to say some things that I haven't heard expressed before.
> 
> 1.  The sky is not falling!  The errors in the birch pollen antigen pointed
> out by Bernhard are interesting, and the reasons behind them might be
> troubling.  However, the self-correcting functions of scientific research
> found the errors, and current publication methods permitted an airing of the
> problem.  It took some effort, but the scientific method prevailed.
> 
> 2.  Depositing raw data frames will make little difference in identifying
> and correcting structural problems like this one.  Nor will new requirements
> for deposition of this or that detail.  What's needed for finding the
> problems is time and interest on the part of someone who's able to look at a
> structure critically.  Deposition of additional information could be
> important for that critical look, but deposition alone (at least with
> today's software) will not be sufficient to find incorrect structures.
> 
> 3.  The responsibility for a fraudulent or wrong or poorly-determined
> structure lies with the investigator, not the society of crystallographers.
> My political leanings are left-of-central, but I still believe in individual
> responsibility for behavior and actions.  If someone messes up a structure,
> they're accountable for the results.
> 
> 4.  Adding to the deposition requirements will not make our science more
> efficient.  Perhaps it's different in other countries, but the
> administrative burden for doing research in the United States is growing.
> It would be interesting to know the balance between the waste that comes
> from a wrong structure and the waste that comes from having each of us deal
> with additional deposition requirements.
> 
> 5.  The real danger that arises from cases of wrong or fraudulent science is
> that it erodes the trust we have in each other's results.  No one has time or
> resources to check everything, so science is based on trust.  There are
> efforts underway outside crystallographic circles to address this larger
> threat to all science, and we should be participating in those discussions
> as much as possible.
> 
> Ron
> 
> On Thu, 5 Apr 2012, aaleshin wrote:
> 
>> Dear John, Thank you for a very informative letter about the IUCr
>> activities towards archiving the experimental data. I feel that I did
>> not explain myself properly. I do not object to archiving the raw data; I
>> just believe that the current methodology of validating data at the PDB is
>> insufficiently robust and requires modification.
>> Implementation of raw image storage and validation will take
>> considerable time, while the recent incidents of presumed data
>> fraud demonstrate that the issue is urgent. Moreover, presenting
>> calculated structure factors in place of the experimental data is not
>> the only abuse that the current validation procedure encourages.
>> There may be more numerous occurrences of data "massaging", such as
>> overestimation of the resolution or data quality, and the system does not
>> allow them to be verified. The IUCr and PDB follow the American taxation
>> policy, where the responsibility for fraud is placed on people, and
>> the agency does not take sufficient action to prevent it. I believe
>> this is inefficient and inhumane. Making a routine
>> check of submitted data at a slightly lower level would reduce the
>> temptation to overestimate the unclearly defined quality statistics
>> and make model fabrication more difficult to accomplish. Many people
>> do it unknowingly, and catching them afterwards does no good.
>> 
>> I suggested turning the current incident, which might be too complex
>> for burning heretics, into something productive that is done as soon as
>> possible, something that will prevent fraud from occurring.
>> 
>> Since my persistent "trolling" at ccp4bb did not have any effect
>> (until now), I wrote a "bad-English" letter to the PDB administration,
>> encouraging them to take urgent action. Those who are willing to count
>> grammar mistakes in it can read the message below.
>> 
>> With best regards,
>> Alexander Aleshin, staff scientist
>> Sanford-Burnham Medical Research Institute
>> 10901 North Torrey Pines Road
>> La Jolla, California 92037
>> 
>> Dear PDB administrators;
>> 
>> I am writing to you regarding the recently publicized story about the
>> submission of calculated structure factors to the PDB entry 3k79
>> (http://journals.iucr.org/f/issues/2012/04/00/issconts.html). This
>> presumed fraud (or mistake) occurred just several years after another,
>> more massive fabrication of PDB structures (Acta Cryst.
>> (2010). D66, 115) that affected many scientists including myself. The
>> repetitiveness of these events indicates that the current mechanism of
>> structure validation by the PDB is not sufficiently robust. Moreover, it is
>> completely incapable of detecting smaller mischief such as overestimation of
>> the data resolution and quality.
>> 
>>            There are two approaches to handling fraud problems: (1)
>> increasing policing and punishment, or (2) making fraud too difficult to
>> carry out. Obviously, the second approach is more humane and efficient.
>> 
>>            This issue has been discussed on several occasions by the
>> ccp4bb community, and some members began promoting the idea of
>> submitting raw crystallographic images as a fraud repellent. However,
>> this validation approach is neither easy nor cheap; moreover, it requires
>> considerable manpower to conduct on a day-to-day basis. Indeed, indexing
>> data sets is sometimes a nontrivial problem and cannot be accomplished
>> automatically.
>> For this reason, submitting the indexed and partially integrated data
>> (such as .x files from HKL2000 or the output.mtz file from Mosflm)
>> appears to be a cheaper substitute for storing and validating images.
>> 
>>            Analysis of the partially integrated data provides almost the
>> same means of fraud prevention as the images.  Indeed, the
>> observed cases of data fraud suggest that they would likely be
>> attempted by a biochemist-crystallographer, who is insufficiently
>> educated to fabricate the partially processed data. A method
>> developer, on the contrary, does not have a reasonable incentive to forge a
>> particular structure, unless he teams up with a similarly minded biologist.
>> But the latter scenario is very improbable and has not been detected yet.
>> 
>>            The most valuable benefit of using the partially processed
>> data as a validation tool would be the standardization of the definition
>> of data resolution and the detection of inappropriate massaging of
>> experimental data.
>> 
>>            Implementation of this approach requires a minuscule
>> adaptation of the current system, which most practicing
>> crystallographers would accept (in my humble opinion). The data storage
>> requirement would be only ~1000-fold higher than the current one,
>> and transferring the new data to the PDB could still be done over the Internet.
>> Moreover, storing the raw data is not required after the validation is done.
>> 
>>            A program such as Scala of CCP4 could be easily adapted to
>> process the validation data and compare them with a conventional set of
>> structure factors.  Precise consistency of the two sets is not necessary.
>> They only need to agree within statistically meaningful boundaries,
>> and if they don't, the author could be asked to provide a detailed
>> algorithm of his/her data processing. Finally, the standardized method
>> could be used to determine the resolution of submitted data, which could
>> be reported together with values provided by the author.
>> 
>>            To implement this validation approach, the PDB would need to
>> raise some funds, but small enough to be sacrificed out of our common
>> feeder. Anyway, it is easier and cheaper than the raw image approach
>> and can serve as a basis for a transfer to it in the future (if
>> required). Since it appears to be a joint project of
>> CCP4 and PDB, I ask all crystallographers who feel an urgent need for
>> upgrading the structure validation protocol to encourage them to
>> consider this issue as quickly as possible. People who commit crimes are
>> not always bad people; let's show our governments a good way to handle this
>> problem.
>> 
>> 
>> 
>> Sincerely,
>> 
>> Alexander Aleshin, Staff Scientist
>> 
>> Sanford-Burnham Institute for Medical Research,
>> 
>> La Jolla, CA, USA.
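
The consistency check proposed in the letter (compare the deposited structure factors with a set re-derived from the partially integrated data, and require agreement only within statistical limits) could look roughly like the sketch below. It is an illustration only, not how Scala or the PDB works: the `Reflection` type, the function names, and the thresholds `max_r`, `max_outliers` and `n_sigma` are all hypothetical, and a real implementation would read MTZ files, handle symmetry and anisotropic scaling, and work in resolution shells.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, Tuple

HKL = Tuple[int, int, int]  # Miller indices of a reflection


@dataclass(frozen=True)
class Reflection:
    amplitude: float  # structure-factor amplitude |F|
    sigma: float      # estimated standard uncertainty of |F|


DataSet = Dict[HKL, Reflection]


def common_indices(a: DataSet, b: DataSet) -> list:
    """Miller indices present in both data sets."""
    common = sorted(a.keys() & b.keys())
    if not common:
        raise ValueError("data sets share no reflections")
    return common


def scale_factor(ref: DataSet, other: DataSet) -> float:
    """Least-squares scale k minimizing sum (|F_ref| - k*|F_other|)^2."""
    common = common_indices(ref, other)
    num = sum(ref[h].amplitude * other[h].amplitude for h in common)
    den = sum(other[h].amplitude ** 2 for h in common)
    return num / den


def r_factor(ref: DataSet, other: DataSet) -> float:
    """R = sum | |F_ref| - k*|F_other| | / sum |F_ref| over common reflections."""
    common = common_indices(ref, other)
    k = scale_factor(ref, other)
    num = sum(abs(ref[h].amplitude - k * other[h].amplitude) for h in common)
    den = sum(ref[h].amplitude for h in common)
    return num / den


def outlier_fraction(ref: DataSet, other: DataSet, n_sigma: float = 3.0) -> float:
    """Fraction of common reflections differing by more than n_sigma combined sigmas."""
    common = common_indices(ref, other)
    k = scale_factor(ref, other)
    outliers = 0
    for h in common:
        diff = abs(ref[h].amplitude - k * other[h].amplitude)
        combined = sqrt(ref[h].sigma ** 2 + (k * other[h].sigma) ** 2)
        if diff > n_sigma * combined:
            outliers += 1
    return outliers / len(common)


def sets_agree(ref: DataSet, other: DataSet,
               max_r: float = 0.05, max_outliers: float = 0.01) -> bool:
    """Hypothetical acceptance rule: low R-factor and few n-sigma outliers."""
    return (r_factor(ref, other) <= max_r
            and outlier_fraction(ref, other) <= max_outliers)
```

Under these assumptions a deposition would be flagged for follow-up with the author when `sets_agree(deposited, reprocessed)` returns `False`. The two sets are put on a common scale first because the deposited and re-derived amplitudes need not share an absolute scale.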
