I checked that the random_seed was the same for the Relion-1.4 and vlion runs. I used the run2_submit.script that came with relion13_tutorial/betagal/PrecalculatedResults:

mpiexec -n 50  --bynode `which relion_refine_mpi` --o Refine3D/run2 --auto_refine --split_random_halves --i particles_autopick_sort_class2d_class3d.star --particle_diameter 200 --angpix 3.54 --ref 3i3e_lp50A.mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --auto_local_healpix_order 4 --offset_range 4 --offset_step 2 --sym D2 --low_resol_join_halves 40 --norm --scale  --j 2 --random_seed 1401784870

I did another run with 50 MPI processes. This time the final resolution (without masking) came out the same for both vlion and Relion 1.4: 8.23256. The timing results show that Relion 1.4 ran faster in this case:

With vlion:

resources_used.cput=44:38:09
resources_used.walltime=00:41:46

With Relion 1.4:
resources_used.cput=31:05:21
resources_used.walltime=00:30:18
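
To put the two runs side by side, the resources_used values above can be converted to seconds and divided. This is only a minimal sketch: the helper name to_seconds and the dictionary layout are made up for illustration and are not part of RELION or the batch system.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the two job reports above.
timings = {
    "vlion": {"cput": "44:38:09", "walltime": "00:41:46"},
    "Relion 1.4": {"cput": "31:05:21", "walltime": "00:30:18"},
}


def ratio(field: str) -> float:
    """vlion time divided by Relion 1.4 time for the given field."""
    return to_seconds(timings["vlion"][field]) / to_seconds(timings["Relion 1.4"][field])


wall_ratio = ratio("walltime")
cpu_ratio = ratio("cput")
print(f"walltime, vlion / Relion 1.4: {wall_ratio:.2f}")
print(f"cput,     vlion / Relion 1.4: {cpu_ratio:.2f}")
```

For this particular run, vlion took roughly 1.4 times as long as Relion 1.4 in both wall time and CPU time.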

Would be interesting to see what kind of results others are getting. Thanks!

From: Sargis Dallakyan <[log in to unmask]>
To: Dimitry Tegunov <[log in to unmask]>, <[log in to unmask]>
Sent: 1/25/2016 4:16 PM
Subject: Re: Vectorized RELION

Hi Dimitry,

Thanks, I used double precision for both Relion-1.4 and vlion. I haven't looked at the actual maps yet. I'm now running both with 50 MPI processes, where the speed-up might be even better.

As Steve Ludtke suggested, I'll try to figure out how to compute an FSC between the final maps from the original code and from the optimized version, to see what the difference is.
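
One way such a map-to-map FSC could be computed is sketched below with NumPy. This is an illustration only, not RELION's own implementation: it assumes both maps are cubic arrays of the same size already loaded into memory (reading the .mrc files, for example with the mrcfile package, is left out), and the function name fsc is made up for this example.

```python
import numpy as np


def fsc(map1: np.ndarray, map2: np.ndarray) -> np.ndarray:
    """Fourier shell correlation between two equally sized cubic volumes.

    Returns one value per integer-radius shell, from 0 up to Nyquist.
    """
    if map1.shape != map2.shape or len(set(map1.shape)) != 1 or map1.ndim != 3:
        raise ValueError("expected two cubic 3D volumes of identical shape")
    n = map1.shape[0]
    f1 = np.fft.fftn(map1)
    f2 = np.fft.fftn(map2)

    # Integer frequency index along each axis (0, 1, ..., -2, -1).
    freq = np.fft.fftfreq(n, d=1.0 / n)
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    n_shells = n // 2 + 1           # shells 0 .. Nyquist
    keep = shell < n_shells         # drop the corners beyond Nyquist
    idx = shell[keep]

    # Per-shell sums of the cross term and of each map's power.
    cross = np.bincount(idx, weights=(f1 * np.conj(f2)).real[keep], minlength=n_shells)
    power1 = np.bincount(idx, weights=(np.abs(f1) ** 2)[keep], minlength=n_shells)
    power2 = np.bincount(idx, weights=(np.abs(f2) ** 2)[keep], minlength=n_shells)

    denom = np.sqrt(power1 * power2)
    return np.divide(cross, denom, out=np.zeros(n_shells), where=denom > 0)


# Synthetic check: a volume compared with itself, and with a noisy copy.
rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 16, 16))
noisy = volume + rng.normal(scale=0.5, size=volume.shape)
curve_same = fsc(volume, volume)
curve_noisy = fsc(volume, noisy)
print(curve_noisy)
```

For a box of n voxels and a pixel size of angpix, shell s (with s >= 1) corresponds to a resolution of n * angpix / s in Angstrom. If the two runs really produce near-identical maps, the curve should stay close to 1 out to high resolution.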

Cheers,
Sargis

From: Dimitry Tegunov <[log in to unmask]>
To: <[log in to unmask]>, Sargis Dallakyan <[log in to unmask]>
Sent: 1/25/2016 3:26 PM
Subject: Re: Vectorized RELION

Hi Sargis,

Glad to hear it works! And sorry to see the speed-up is less substantial than in my benchmark and/or on a single machine. I hope IT will get it installed on our cluster later this week so I can clock it there.

Really curious about the difference in resolution. Was that in single-precision mode? In double, the values shouldn't have changed enough to justify a difference like this. The algorithm is still exactly the same.

Cheers,
Dimitry