CCP4BB@JISCMAIL.AC.UK


Subject: Re: Eigers, and local CBF formats
From: Gerard Bricogne
Date: Mon, 14 Mar 2016 17:10:12 +0000

Dear Herb,

     Thank you for your message. Before we respond to your suggestion
about "opening" our code, we would like to double-check that there is
no misunderstanding about what it does. We didn't think that what we
wrote in two different e-mails could be misunderstood, but of course
it might still be :-) .

     In our message of Thursday morning, addressed to Graeme, it was
clearly stated that

  "[...] even if the primary use of our converter is also to provide
  intermediate files (hidden from the user) for processing with
  autoPROC/XDS, at least these files are intended to (and _should_) be
  fully standardised and allow processing with any other package that
  supports mini-cbf/CBF files. Regarding autoPROC itself, we are not
  proposing that users convert HDF5 files into mini-cbf/CBF files
  before running it - the documentation is very clear about that:
  users should give autoPROC the HDF5 data directly."
  
and similarly in our Friday afternoon message, addressed to Herman,
that 

  "autoPROC uses HDF5 files directly as input and doesn't leave
  miniCBF files around that might get archived and incur storage
  costs. That point might not have been clear and [could] therefore
  [be] causing confusion."

     There should therefore not be any ambiguity about the fact that
"autoPROC deals directly with the HDF5 file" in the sense that the
user doesn't have to pre-convert its contents into mini-CBF files that
might be thought to give rise to an extra archiving burden: autoPROC
provides this conversion efficiently on-the-fly for XDS, while always
extracting all required metadata, needed by itself or by XDS, directly
from the original HDF5 file. No need arises to archive those miniCBFs
in order to be able to repeat the processing, later or elsewhere, from
the HDF5 file. At no time did we claim that everything within autoPROC
works on the HDF5 input "natively" - it couldn't, as it uses XDS as
the main processing engine - and yet we feel that this is perhaps what
you might be assuming we have claimed when you formulate hopes that we
might help in getting rid of the computational overhead of these
conversions.

     Because of the above (metadata always read from original HDF data
and miniCBF files only present temporarily and hidden from users), our
converter in principle doesn't *need* to write a fully populated
miniCBF header. We could also have gone for the bare-bones approach as
in e.g. the Dectris H5ToXds tool, and just provided binaries for our
other, supported platforms. However, we thought that if we already
wrote a miniCBF file, we should do it properly right from the start -
following available Eiger/HDF5/NXmx and CBF/miniCBF specifications as
much as possible.

     Incidentally, users of autoPROC were very appreciative that our
converter does write correct miniCBF images since that allowed them to
use those files directly in their existing workflows and (internal)
deposition systems - thus avoiding a breakdown of procedure with the
appearance of Eiger/HDF5 datasets. Achieving this "transferability of
(re)processing" for Eiger (or any) datasets outside the synchrotron
where they were collected seemed to us a top-priority imperative from
the start, and is especially so now, as serious efforts are being made
to archive raw data that could play in testing improvements of data
processing programs the same role that archived merged data played in
testing improvements of refinement programs.


     Apologies if we are the ones who are misunderstanding the
assumptions behind your enquiry and are misinterpreting them as a
possible misunderstanding on your part :-) - we would be grateful if
you could confirm what you have in mind.


     With best wishes,
 
Gerard, Clemens, Claus & Peter


--
On Fri, Mar 11, 2016 at 01:06:14PM -0500, Herbert J. Bernstein wrote:
> Dear Colleagues,
> 
>   I am very pleased to hear that "autoPROC uses HDF5 files directly as
> input".  Might it be possible to "open" that portion of your code to the
> developer community  as a useful worked example for others?
> 
>   For Eiger 16M images we see a very large fraction of processing time
> going into conversions to CBFs now. The more people who adapt their code to
> work directly with the HDF5 format version of these files, the less use of
> computer resources in conversions we will face and the more resources will
> then be available for actual processing.
> 
>   Regards,
>     Herbert
> 
> On Fri, Mar 11, 2016 at 12:20 PM, Clemens Vonrhein wrote:
> 
> > Dear Herman,
> >
> > thank you very much for the supportive message, which describes very
> > well the environment we found ourselves in with regard to Eiger/HDF5
> > data. This is why we implemented a method into autoPROC so that our
> > users can process data coming off these exciting, new detectors. Far
> > from being "ad hoc", we tried to make the method as generally
> > applicable across beamlines as possible, and also to populate the
> > image headers as completely and accurately as we could (as a point of
> > reference: the very first solution had to be done in just over 1 week
> > between access to test data and users wanting to collect real data).
> >
> > To avoid any confusion that might arise from the very valid arguments
> > Herb is making about required lifetime of datasets for the need of
> > (re)processing at a later stage: autoPROC uses HDF5 files directly as
> > input and doesn't leave miniCBF files around that might get archived
> > and incur storage costs. That point might not have been clear and
> > might therefore be causing confusion. However, we are trying to open up the
> > capabilities and features of our software to as many users as possible
> > - even if we are not providing our software in an open-source model -
> > to avoid the impression of autoPROC as a "black box". This was the
> > reason for sending our initial reply to this thread.
> >
> > From a (partially) outside viewpoint it seems that there are several
> > reasons for having e.g. those miniCBF variants written at different
> > beamlines - as described and rightly lamented by Herb. There are of
> > course very practical restrictions and pressures that everyone is
> > reacting to. But there exists also the unique luxury at a beamline to
> > ignore image headers and metadata completely, not least because XDS
> > provides the great feature of being image header agnostic and
> > completely general. The beamline control software responsible for
> > populating the image header or metadata (through some detector API or
> > by generating them already beforehand) can at the same time write
> > e.g. an XDS.INP file - or any other input/command for some other data
> > processing software. This means there is no necessity to have
> > complete, self-consistent and correct image headers or metadata in
> > place in order to process the data at that point.
> >
> > This is nothing new: we've seen that for a long time e.g. with
> > beamlines producing completely wrong image headers when it was assumed
> > that processing would be done with Denzo and a def.site file. However,
> > this shifts focus into a direction whereby data can only (reliably) be
> > processed at the time of data collection and within a particular
> > software environment at the synchrotron. Of course, it is a very
> > important and valuable feature to process data while the crystal or
> > other samples are still available and decisions can be made for this
> > or the next data collection strategies. It is where a great strength
> > of beamline-specific solutions lies.
> >
> > Some approaches to processing just worry about producing an input file
> > to the processing program that has all the necessary information
> > harvested from whatever local way metadata are stored. If the images
> > are now archived in that way they can be reprocessed, but only at that
> > synchrotron or in the same way - making it nearly impossible to do the
> > same in different and new ways, for instance with another program.
> >
> > It is crucial to achieve true transferability of reprocessing by
> > providing complete and correct metadata (not something that is just a
> > derived product of these metadata). In that respect, autoPROC is a
> > useful external tool to provide extensive checking of metadata by
> > looking directly at them (e.g. in HDF5) and not at a derived subset
> > (e.g. an XDS.INP file archived with the data).
> >
> > We don't think there is any synchrotron-independent developer of
> > processing software that is happy about having to support all the
> > variants regarding image headers and metadata. We would be happy not
> > having to provide a list of beamline specifics [1] or workarounds
> > regarding buggy image headers or incomplete HDF5 metadata [2] in order
> > to provide users with a workable solution for their project.
> >
> > It is also important to recognize that "the user" means different
> > things to different people. For the detector manufacturer it is
> > typically the synchrotron and beamline staff, while for the beamline
> > scientist it is the actual people coming to collect data on their
> > samples. What is left out of that picture is the processing software
> > and its developers - whereas in a very real sense it is only through
> > those external packages and developers that a synchrotron user truly
> > interacts with the data collected at the beamline, as shown by the
> > fact that it is most often the data processing package that gets the
> > blame if something doesn't work, when the cause of such hiccups might
> > actually lie further upstream.
> >
> > So maybe your last sentence could be adjusted to:
> >
> >   When detector producers, beamlines and processing packages speak one
> >   common language, the users will very quickly follow.
> >
> > It has been a very useful discussion and we think that a number of
> > important matters have been brought up.
> >
> > Cheers
> >
> > Clemens, Gerard, Andrew, Claus & Peter
> >
> > [1] http://www.globalphasing.com/autoproc/wiki/index.cgi?BeamlineSettings
> > [2] http://www.globalphasing.com/autoproc/wiki/index.cgi?DataProcessingHdf5
> >
> > On Fri, Mar 11, 2016 at 01:37:02PM +0000, Herman Schreuder wrote:
> > > I fully agree. For me the Madness lies in the development of a new
> > > detector and image format without consulting beforehand the relevant
> > > software developers and in each beamline apparently implementing
> > > their own, mutually incompatible local format. All major data
> > > processing programs have a way to unambiguously describe detector
> > > and goniometer geometry and I see no reason why such information
> > > cannot be written into the headers.
> > >
> > > Once users have images, they want to have them processed as quickly
> > > as possible and when a chaos with new image formats has been
> > > created, one cannot blame Gerard and others for solving this problem
> > > in a maybe ad hoc manner. When the beamlines and detector producers
> > > speak one common language, the users will very quickly follow.
> >
