On your question 2): one thing we noticed is that the new Bayesian polishing
overloads our NFS storage server if we run a job with more than 2 MPI
processes (with 30 threads each).
Any other 10-20 jobs run fine simultaneously, but a single polishing job
easily overloads the server.
Maybe there are options in polishing to rectify this, in case anybody
knows?
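
In case it helps anyone reproduce or test this: below is a minimal
sketch (assuming Python 3 on Linux; the data directory is hypothetical)
that mimics this access pattern by streaming movie files from N
concurrent workers, so one can find the concurrency level at which the
NFS server saturates. On repeat runs, note that files smaller than
client RAM may be served from the page cache rather than the server.

    # Minimal concurrent-read stress test for an NFS mount (sketch).
    # Assumes Python 3 on Linux; DATA_DIR is a hypothetical path --
    # point it at a directory of movie stacks on the mount under test.
    import os
    import time
    from multiprocessing import Pool

    DATA_DIR = "/nfs/scratch/Movies"   # hypothetical path
    N_WORKERS = 8                      # vary this to find the saturation point
    CHUNK = 4 * 1024 * 1024            # 4 MiB sequential reads

    def read_file(path):
        """Stream one file from the mount; return bytes read."""
        total = 0
        with open(path, "rb") as f:
            while True:
                buf = f.read(CHUNK)
                if not buf:
                    return total
                total += len(buf)

    if __name__ == "__main__":
        files = [os.path.join(DATA_DIR, n) for n in os.listdir(DATA_DIR)]
        files = [p for p in files if os.path.isfile(p)]
        t0 = time.time()
        with Pool(N_WORKERS) as pool:
            total = sum(pool.map(read_file, files))
        dt = time.time() - t0
        print(f"{N_WORKERS} workers: {total / dt / 1e6:.1f} MB/s aggregate")

Sweeping N_WORKERS up from 2 should show where aggregate throughput
flattens out, which is roughly where our server falls over.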
Best
Leonid
Prof. Leonid Sazanov
IST Austria
Am Campus 1
A-3400 Klosterneuburg
Austria
Phone: +43 2243 9000 3026
E-mail: [log in to unmask]
On 15.03.19 15:15, Chris Dagdigian wrote:
> Thank you for allowing me into this list!
>
> I'm an HPC/scientific-computing person tasked with tuning future HPC
> systems for better Relion/EM support.
>
> My particular focus is to stress test and benchmark various
> large-scale storage offerings including very large parallel
> filesystems to see which platforms and which configuration options are
> most suitable for supporting large-scale Relion usage. I know from
> the genomics world that storage design has a large impact on research
> throughput, and that there are key metrics -- such as small-file
> performance -- that indicate how a storage platform will handle a
> genomics-heavy workload. I want to learn similar optimizations and
> key metrics for EM-related scientific workflows.
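>
> As a concrete illustration of that metric, here is a minimal sketch
> (assuming Python 3; TEST_DIR is a hypothetical path on the filesystem
> under test) that measures create/read/delete rates for many 4 KiB
> files -- the metadata-heavy pattern that tends to separate storage
> platforms on genomics-style workloads:
>
>     # Small-file I/O benchmark sketch: ops/sec for create+write, read,
>     # and delete of many 4 KiB files. TEST_DIR is hypothetical.
>     import os
>     import time
>
>     TEST_DIR = "/mnt/bigfs/smallfile_test"   # hypothetical mount point
>     N_FILES = 10000
>     PAYLOAD = b"x" * 4096                    # 4 KiB per file
>
>     os.makedirs(TEST_DIR, exist_ok=True)
>     paths = [os.path.join(TEST_DIR, f"f{i:06d}") for i in range(N_FILES)]
>
>     t0 = time.time()
>     for p in paths:
>         with open(p, "wb") as f:
>             f.write(PAYLOAD)
>     print(f"create+write: {N_FILES / (time.time() - t0):.0f} files/s")
>
>     t0 = time.time()
>     for p in paths:
>         with open(p, "rb") as f:
>             f.read()
>     print(f"read:         {N_FILES / (time.time() - t0):.0f} files/s")
>
>     t0 = time.time()
>     for p in paths:
>         os.remove(p)
>     print(f"delete:       {N_FILES / (time.time() - t0):.0f} files/s")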
>
> I've been reading the documentation, papers, tutorials and published
> benchmarks and it looks like:
>
> - The overwhelming focus of published benchmarks centers on CPU vs GPU
> performance on single-node and MPI-connected systems, with little to no
> reported data about storage-related benchmarks and optimizations
>
> - The standard benchmark data set used in various papers and sites
> online appears pretty small -- small enough to fit in RAM on larger
> NVLink-connected GPU or large-memory compute systems, and small enough
> not to put much stress on a very large or very fast parallel
> filesystem when writing output or reading in particles or maps
>
>
> If this is not too intrusive a query, I'd welcome some advice and
> guidance on ...
>
> 1) Relion-friendly datasets structured similarly to the popular
> benchmarking data, where particles and maps are already present and can
> be fed straight into command-line invocations of relion, so that I can
> hammer some big filesystems with reproducible benchmarking runs
>
>
> 2) Guidance on which portions of the relion3 workflow are the most
> storage-intensive (for reads and writes, ideally). I think I have a
> good idea of this from the online tutorial and other published
> materials. Since others have already focused on GPU vs CPU vs mixed
> setups, I figured I could focus a bit more on storage and I/O
> optimization; a measurement sketch follows below.
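>
> To put rough numbers on that myself, the minimal sketch below
> (Linux-only, assuming Python 3; the placeholder command should be
> replaced with the relion step under test) wraps a command and reports
> the block I/O charged to it and its waited-for children via
> getrusage(RUSAGE_CHILDREN):
>
>     # Sketch: run one workflow step and report its storage I/O.
>     # On Linux, ru_inblock/ru_oublock count 512-byte blocks actually
>     # read from / written to storage, so page-cache hits don't appear
>     # as reads. The default command is a placeholder.
>     import resource
>     import subprocess
>     import sys
>
>     cmd = sys.argv[1:] or ["cp", "/etc/hosts", "/tmp/hosts.copy"]
>     subprocess.run(cmd, check=True)
>     ru = resource.getrusage(resource.RUSAGE_CHILDREN)
>     print(f"read from storage:  {ru.ru_inblock * 512 / 1e6:.1f} MB")
>     print(f"written to storage: {ru.ru_oublock * 512 / 1e6:.1f} MB")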
>
>
> And in the interest of reproducibility if someone has already done
> large/parallel filesystem testing and tuning I'd love to use the same
> methods & input data so that I can add more data to what has already
> been collected.
>
>
> Regards
> Chris
>
########################################################################
To unsubscribe from the CCPEM list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCPEM&A=1