
Subject: Re: buying a cluster
From: James Holton <[log in to unmask]>
Reply-To: James Holton <[log in to unmask]>
Date: Tue, 4 Dec 2018 12:15:57 -0800
Content-Type: text/plain
Parts/Attachments: text/plain (191 lines)

Graeme's suggestion of a standard benchmarking dataset is a good one, 
but I'm not so sure a 24 GB download is going to get a lot of hits. 
In fact, in some countries you have to pay for internet by the GB, and 
it costs as much as mobile phone data! The large size also makes it hard 
to separate two important aspects of data processing: the CPU vs the 
disk.  My goal was to isolate the CPU as much as possible, so I made a 
"standard" image-processing benchmark data set that is hyper-compressed: 
an 11 MB download, similar in size to the XDS package itself.  This 
hyper-compression is possible because the data set is simulated, which 
allows me to add the noise on the client side.  I also expand the data 
10x by repeating the same 360-degree sweep with symbolic links.  The 
footprint on local disk is only 2.3 GB, so it can usually fit into the 
ramdisk at /dev/shm on most Linux systems. The whole hyper-decompression 
process is done automatically by my XDS and DIALS benchmarking scripts, 
or you can just use this one script directly:
http://bl831.als.lbl.gov/~jamesh/benchmarks/get_test_data.com
It will automatically download and decompress the 3600-image test set on 
most Linux and Mac systems.
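
A minimal Python sketch of the symlink-expansion idea described above, assuming a decompressed 360-image sweep with hypothetical file names (the real get_test_data.com script also handles the download and the client-side noise generation):

#!/usr/bin/env python3
"""Sketch: expand one 360-image sweep into a 10x-larger set of symlinks
staged on the /dev/shm ramdisk.  File names and paths are hypothetical."""
import os

SRC_DIR = "/dev/shm/test_data/sweep"      # decompressed 360-image sweep (assumed)
DEST_DIR = "/dev/shm/test_data/expanded"  # 3600-image set built from symlinks
SWEEP_SIZE = 360
REPEATS = 10

os.makedirs(DEST_DIR, exist_ok=True)
for rep in range(REPEATS):
    for i in range(1, SWEEP_SIZE + 1):
        src = os.path.join(SRC_DIR, f"image_{i:04d}.cbf")
        # image numbers 0001..3600, each repeat pointing back at the same file
        dst = os.path.join(DEST_DIR, f"image_{rep * SWEEP_SIZE + i:04d}.cbf")
        if not os.path.lexists(dst):
            os.symlink(src, dst)

print(f"created {REPEATS * SWEEP_SIZE} symlinked images in {DEST_DIR}")

Because every link points back at the same 360 real files, the on-disk footprint stays small while the processing programs see a full 3600-image data set.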

Whether you use my benchmark or not, the important thing about any 
benchmark is to run the exact same benchmark on as many machines as 
possible; only then do you have enough "controls" to isolate what the 
important features are.
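
One way to keep the measurement comparable across machines is a small wrapper that records the wall-clock time of the same driver command everywhere. A Python sketch, assuming a hypothetical run_benchmark.sh script that does the actual processing:

#!/usr/bin/env python3
"""Sketch: time the same benchmark command and print one comparable
record per machine (hostname, CPU string, wall-clock seconds)."""
import platform
import subprocess
import time

BENCHMARK_CMD = ["./run_benchmark.sh"]   # hypothetical driver script

start = time.time()
subprocess.run(BENCHMARK_CMD, check=True)
elapsed = time.time() - start

print(f"{platform.node()}\t{platform.processor()}\t{elapsed:.1f} s")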

As for GHz, everything "should" scale with GHz unless something else is 
holding it back, like disk I/O, the network, or a myriad of other 
things.  The most important factor for data processing, however, has 
always been and still is the last-level CPU cache size.  I think the 
reason multi-socket machines are faster than single-chip multi-core 
machines is that all the sockets are really just a way to get more 
cache into the box.
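
If you want to record cache size alongside clock speed when comparing machines, Linux exposes it in sysfs. A small Python sketch (Linux-only; the sysfs paths are standard, everything else is illustrative):

#!/usr/bin/env python3
"""Sketch: read per-level CPU cache sizes from Linux sysfs and report
the last-level cache, so benchmark results can be sorted by it."""
import glob

caches = {}
for index in glob.glob("/sys/devices/system/cpu/cpu0/cache/index*"):
    with open(f"{index}/level") as f:
        level = int(f.read())
    with open(f"{index}/size") as f:
        size = f.read().strip()          # e.g. "32K", "1024K", "14080K"
    caches[level] = size

if not caches:
    raise SystemExit("no sysfs cache info found (Linux only)")

last_level = max(caches)
print(f"last-level (L{last_level}) cache: {caches[last_level]}")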

Data processing is kind of a different animal from most other 
crystallographic applications.  Perhaps because of the pressure of 
modern detectors, many image-processing programs have now embraced the 
multi-CPU revolution.  The problem, however, is that the more cores you 
have in your processor, the lower the GHz will be for a single core.  
This seems to be a thermal-management constraint. What that means is 
that if you get a processor with lots and lots of cores you can process 
data really fast, but almost all the downstream steps, like molecular 
replacement and refinement, will be slower.  In fact, even different 
phases of data processing benefit alternately from lots of cores vs lots 
of GHz, so ideally you'd like to have two machines: one with a lot of 
cores and the other with a single really fast processor, and alternate 
between these two machines using scripts.
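
As a rough illustration of that two-machine workflow (the host names and commands below are hypothetical placeholders, not any particular setup), the alternation can be scripted over ssh:

#!/usr/bin/env python3
"""Sketch: send the parallel-friendly stage to a many-core box and the
largely single-threaded stages to a high-clock box."""
import subprocess

MANY_CORE_HOST = "xeon-manycore"   # hypothetical: lots of cores, lower GHz
FAST_CLOCK_HOST = "xeon-w2155"     # hypothetical: fewer cores, high turbo GHz

def run_on(host, command):
    """Run a shell command on a remote host and wait for it to finish."""
    subprocess.run(["ssh", host, command], check=True)

# Integration/scaling parallelize well: send them to the many-core box.
run_on(MANY_CORE_HOST, "cd /data/project && xds_par")

# Molecular replacement and refinement are mostly single-threaded:
# send them to the box with the fastest clock.
run_on(FAST_CLOCK_HOST, "cd /data/project && ./run_mr_and_refine.sh")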

But if you can only get one machine, I currently recommend the Xeon 
W-2155 as a good general-purpose crystallography processor.  It has 10 
cores, so runs DIALS and other multi-CPU programs nicely up to this 
puzzling 8-10 core ceiling, but still has the GHz to run single-threaded 
programs nice and fast.  It's not ridiculously expensive either.

My $1,440 worth,

-James Holton
MAD Scientist


On 12/2/2018 10:56 PM, [log in to unmask] wrote:
> Re: publishing benchmarks - great idea - expand on what James described earlier.
>
> Most programs are GHz-dependent (for most “sensible” definitions of GHz - not the mega-hyper-pipelined, stall-prone P4, say); however, I see your point that “threaded” and “optimised for vector systems (e.g. AVX512)” would be very useful.
>
> I am certainly not advocating that computers > 3 years old should be thrown away ;-) I am one of those folks with a bad hoarding instinct; “it’s good for parts” and “it still works fine” are both in my lexicon. If you are coot-ing and want to refine a modest structure, probably most machines < 10 years old will be fine.
>
> What I was trying to say is that your experience of how fast something is will depend on your use case, and that the boffins in Santa Clara and Sunnyvale have not been sitting on their hands this past decade.
>
> Finally, processing “modern” data sets can be a challenge even on fairly hefty machines - if you pull data04 from https://zenodo.org/record/1443110 you will find a 3-minute data set [1] which (even with XDS and a script tweaked for speed) can take a long while on a modern-ish machine. A 10-year-old Core 2 Duo will not get this done in the same kind of time frame.
>
> best wishes Graeme
>
> PS kudos to folks for sharing the data online
>
> [1] which would make a fun challenge benchmark :-)
>
> On 3 Dec 2018, at 02:16, Markus Heckmann <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>
> Hi Graeme,
>
> I suspect that this conclusion depends very closely on (i) the shape of the problem and (ii) the extent to which the binary has been optimised for the given platform.
>
> I do hope some of this information is analyzed and either published or at least put on the CCP4 wiki.
>
> I am pretty sure that there are some applications (heavily threaded, making extensive use of vector operations) which would be massively quicker on 2018 hardware than on something a decade old. Certainly, though, if you are comparing a not-highly-optimised single-threaded binary, then your conclusion is probably a valid one.
>
> I really ask all the program developers (on the CCP4BB) to put a clear table on their websites saying whether a given program is purely GHz-dependent and not multi-threaded.
>
>
>
> Also, how much power the machines take to get the work done is a non-trivial factor…
>
> But what about the environment? Trashing a decent machine from 2015 for the latest Threadripper 2? These old machines have 80-90%+ efficient, Gold-rated power supplies. Many (as with Apple's planned obsolescence) are *forcibly* destroyed, not refurbished at all.
>
> Does DIALS run that much quicker? How much time is saved in a PhD student's career if data processing speeds up from 15 min to 10 min?
> Sure, it's perfect for use at the synchrotron, but otherwise?
>
> Maybe the beamlines/synchrotrons should allow for remote data processing and even refinement. Maybe all program devs need to publish benchmarks - it would help users greatly.
>
> These days I have a feeling science has copied the typical Electron/web-framework programmers: programs/websites getting fatter, not more efficient, and hoping everyone has 128 GB of RAM.
>
> Markus
>
>
> Cheerio Graeme
>
>
>
>> On 30 Nov 2018, at 19:32, James Holton <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>>
>> I have a dissenting opinion about computers "moving on a bit".  At least when it comes to most crystallography software.
>>
>> Back in the late 20th century I defined some benchmarks for common crystallographic programs with the aim of deciding which hardware to buy.  By about 2003 the champion of my refmac benchmark (https://bl831.als.lbl.gov/~jamesh/benchmarks/index.html#refmac) was the new (at the time) AMD "Opteron" at 1.4 GHz.  That ran in 74 seconds.
>>
>> Last year, I bought a rather expensive 4-socket Intel Xeon E7-8870 v3 (turbos to 3.0 GHz), which is the current champion of my XDS benchmark.  The same old refmac benchmark on this new machine, however, runs in 68.6 seconds.  Only a smidge faster than that old Opteron (which I threw away years ago).
>>
>> The Xeon X5550 in consideration here takes 74.1 seconds to run this same refmac benchmark, so price/performance wise I'd say that's not such a bad deal.
>>
>> The fastest time I have for refmac to date is 41.4 seconds on a Xeon W-2155, but if you scale by GHz you can see this is mostly due to its fast clock speed (turbo to 4.5 GHz). With a few notable exceptions like XDS, HKL2k and SHELX, which are multi-processing and optimized to take advantage of the latest processor features using Intel compilers, most crystallographic software is either written in Python or compiled with gcc.  In both these cases you end up with performance pretty much scaling with GHz.  And GHz is heat.
>>
>> Admittedly, the correlation is not perfect, and software has changed a wee bit over the years, so comparisons across the decades are not exactly fair, but the lesson I have learned from all my benchmarking is that single-core raw performance has not changed much in the last ~10 years or so.  Almost all the speed increase we have seen has come from parallelization.
>>
>> And one should not be too quick to dismiss clusters in favor of a single box with a high core count. The latter can be held back by memory contention and other hard-to-diagnose problems.  Even with parallel execution many crystallography programs don't get any faster beyond using about 8-10 cores.  Don't let 100% utilization fool you!  Use a timer and you'll see.  I'm not really sure why that is, but it is the reason that same Xeon W-2155 that leads my refmac benchmark is also my champion system for running DIALS and phenix.refine.
>>
>> My two cents,
>>
>> -James Holton
>> MAD Scientist
>>
>>
>> On 11/26/2018 1:10 AM, V F wrote:
>>> Dear all,
>>> Thanks for all the off/list replies.
>>>
>>>> To be honest, how much are they paying you to take it? Can you sell it for
>>>> scrap?
>>> Maybe I will give it a pass.
>>>
>>>> To compare, two dual-CPU servers with Skylake Gold 6148 - that is 40 cores -
>>>> will probably beat the whole lot even if you could keep the cluster going.
>>>> And keeping clusters busy is a time-consuming challenge... I know!
>>>> If they are 250W servers, then you are looking at £8000 per year to power
>>>> and cool them. The two modern servers will be more like £1500 per year to run.
>>>> And the servers will only cost about £6000... the economics and planet don't
>>>> stack up!
>>> By servers do you mean tower/standalone?
>>>
>>> Thanks for the detailed explanation. From 2012, we already have many
>>> Dell Precision T5600s with 2 x Xeon E5-2643 (8 cores, 16 threads), and
>>> I was hoping parallelisation with clusters might be of some help. Looks
>>> like not.
>>>
>>> These are running so well (about 45 min for a typical dataset
>>> reduction with DIALS) that I am not sure buying new ones is useful.
