JISCMail - CCPEM Archives

On Fri, 15 Mar 2019, 14:27 Chris Dagdigian, <[log in to unmask]> wrote:

Thank you for allowing me into this list!

I'm an HPC/scientific-computing person tasked with tuning future HPC
systems for better Relion/EM support in future designs.

My particular focus is to stress test and benchmark various large-scale
storage offerings including very large parallel filesystems to see which
platforms and which configuration options are most suitable for
supporting large-scale Relion usage. I know coming from the genomics
world that storage design has a large impact on research throughput and
there are key metrics like small-file performance that are indicators
for how a storage platform will handle a genomics-heavy workload -- I
want to learn similar optimizations and key metrics for EM related
scientific workflows.

I've been reading the documentation, papers, tutorials and published
benchmarks and it looks like:

- The overwhelming focus of published benchmarks centers on CPU vs GPU
performance on single-node and MPI-connected systems with little to no
reported data about storage related benchmarks and optimizations

- The standard benchmark data set used in various papers and sites
online appears pretty small - small enough now to fit in RAM on larger
nvlinked GPU or large memory compute systems and small enough to not
really put much stress on a very large or very fast parallel filesystem
when writing output or reading in particles or maps

If this is not too intrusive of a query I'd welcome some advice and
guidance on ...

1) Relion-friendly datasets structured similarly to the popular
benchmarking data where particles and maps are already present and can
be easily fed into command-line invocations of relion so that I can go
out and hammer some big filesystems with reproducible benchmarking runs

2) Guidance on which portions of the relion3 workflow are most
storage-intensive (reads and writes, ideally). I think I have a good
idea of this from the online tutorial and other published materials.
Since others have already focused on GPU vs CPU vs Mixed I figured I can
focus a bit more on storage and IO optimization

And in the interest of reproducibility if someone has already done
large/parallel filesystem testing and tuning I'd love to use the same
methods & input data so that I can add more data to what has already
been collected.

Regards
Chris

########################################################################

To unsubscribe from the CCPEM list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCPEM&A=1