Hello,
On the first point, if you have a lot of peaks (in total) then Python will
eventually become slow. (Peaks are the one part of the data model which has
this problem.) So that in itself could be the cause. But of course if you
have large numbers of spectra then the program also has to load all the
data from disk for contouring, which takes time. It's hard to know what to
do about that (except to throw more memory / CPU at it).
(There is another performance issue which hits you after N hours: not
everything that should be garbage collected actually is. On that front we
have some ideas about what the problems are, but just haven't gotten around
to sorting them out yet.)
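To illustrate the general kind of thing that can keep objects alive, here is a minimal sketch in plain Python. It is not taken from the program and is not a diagnosis of the actual problem: the Node class and the registry list are hypothetical stand-ins, and the behaviour shown is that of CPython.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a data model object (not a real class)."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []


def make_cycle():
    """Build a parent/child pair that refer to each other; return a weak ref."""
    parent = Node("parent")
    child = Node("child")
    parent.children.append(child)
    child.parent = parent  # reference cycle: parent <-> child
    return weakref.ref(parent)


registry = []  # hypothetical long-lived list that holds on to objects


def make_registered():
    """Create an object that a long-lived container still refers to."""
    node = Node("registered")
    registry.append(node)
    return weakref.ref(node)


gc.disable()  # stop automatic collection so the steps below are deterministic

cycle_ref = make_cycle()
alive_before = cycle_ref() is not None  # cycle survives reference counting
gc.collect()                            # the cycle collector frees it
alive_after = cycle_ref() is not None

leaked_ref = make_registered()
gc.collect()
still_alive = leaked_ref() is not None  # the registry keeps it alive
registry.clear()
released = leaked_ref() is None         # freed once the last reference goes

gc.enable()

print(alive_before, alive_after, still_alive, released)
```

The second case is the awkward one in practice: the collector is working correctly, but something still holds a reference, so the object is never eligible for collection.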
On the second point, Tim just did some timings on one of his projects. He
found that he could clone peaks at 185 per second if they didn't have
assignments but that dropped to around 5 per second if they did. (When
you have assignments you can multiply the number of objects involved in
the cloning by an order of magnitude.) Now the clone peaks functionality
calls a function called copySubTree() in memops.general.Util. That is a
highly complex function, so you might not want to look at it (much). But
one thing it does at the end is to check the validity of the copied tree:
top.checkAllValid()
If you take that one line out then the peak cloning with assignments is
around 60 per second. (Obviously all these timings are on his computer so
the relative timings are what to pay attention to.) So if you want to
play slightly dangerously you could comment out that line, at the cost
that the copied tree is then no longer checked. It's a hard one to call.
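To put those rates in terms of the 4000-peak list mentioned in the original message, here is a small worked calculation. The rates are the ones Tim measured on his computer, so the ratios matter more than the absolute times; the function and variable names are just for illustration.

```python
# Cloning rates in peaks per second, as measured on Tim's computer.
RATES = {
    "no assignments": 185.0,
    "assignments, with checkAllValid": 5.0,
    "assignments, without checkAllValid": 60.0,
}

N_PEAKS = 4000  # size of the peak list in the original message


def clone_time_seconds(n_peaks, peaks_per_second):
    """Estimated time to clone n_peaks at a constant rate."""
    return n_peaks / peaks_per_second


estimates = {
    label: clone_time_seconds(N_PEAKS, rate) for label, rate in RATES.items()
}

for label, seconds in estimates.items():
    print(f"{label}: {seconds:.0f} s ({seconds / 60:.1f} min)")

# Relative gain from skipping the validity check when peaks have assignments.
speedup = (
    RATES["assignments, without checkAllValid"]
    / RATES["assignments, with checkAllValid"]
)
print(f"speedup from skipping the check: {speedup:.0f}x")

# Rate reported in the original message: 4000 peaks in about 50 minutes.
observed_rate = N_PEAKS / (50 * 60)
print(f"reported rate: {observed_rate:.1f} peaks per second")
```

On these numbers the validity check accounts for most of the cloning time when assignments are present.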
Wayne
On Thu, 16 Aug 2007, Gary S. Thompson wrote:
> Dear All (especially developers)
>
> we have some quite large projects with large peak lists at Leeds and I
> noticed a couple of bottlenecks / performance problems
>
> 1. starting projects with large numbers of spectra and peak lists can be
> quite slow (several minutes, I can time it if you want ;-))
> 2. more importantly, cloning of peak lists can be incredibly slow. I am
> currently cloning a peak list with 4000 peaks in it, and it is going to
> take ~ 50 minutes, giving a throughput of
> 4000/(50*60) = 1.3 peaks per second...
>
>
>
> regards
> gary
>
> --
> -------------------------------------------------------------------
> Dr Gary Thompson
> Astbury Centre for Structural Molecular Biology,
> University of Leeds, Astbury Building,
> Leeds, LS2 9JT, West-Yorkshire, UK Tel. +44-113-3433024
> email: [log in to unmask] Fax +44-113-2331407
> -------------------------------------------------------------------
>