Wayne Boucher wrote:
>Hello,
>
>On the first point, if you have a lot of peaks (in total) then Python will
>eventually become slow. (That is the one part of the data model which has
>this problem.) So that in itself could be causing problems. But of
>course if you have large numbers of spectra then the program also has to
>load all the data from disk for contouring, which takes time. It's a hard
>problem to know what to do (except to throw more memory / CPU at it).
>
>
Does Analysis load the whole data model into memory each time it starts?
Could it do some lazy loading, e.g. load proxies for top-level objects
such as spectra and peak lists, which would provide basic information
until the full objects are actually used?
>(There is another performance issue which hits you after N hours, and that
>is that not everything that should be being garbage collected is garbage
>collected, and on that front we have some ideas what the problems are, but
>just haven't gotten around to sorting them out yet.)
>
>
Presumably these are circular references and orphaned data structures
that are still held in lists, so that the lists are the only things
referring to them?
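
I am only guessing at the cause, but the two cases behave differently in
Python, as this small sketch shows. Node and registry are invented names,
not anything from the data model: a pure reference cycle is reclaimed by
the cycle collector, whereas an object that is still sitting in a list is
never reclaimed until it is removed from that list.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    a, b = Node("a"), Node("b")
    a.other, b.other = b, a   # circular reference between a and b
    return weakref.ref(a)     # weak reference does not keep a alive


gc.disable()  # switch off automatic collection so the demo is deterministic

cycle_ref = make_cycle()
alive_before_collect = cycle_ref() is not None  # cycle keeps itself alive
gc.collect()                                    # cycle collector frees it
alive_after_collect = cycle_ref() is not None

registry = []  # stands in for a long-lived list inside the program


def make_registered():
    node = Node("c")
    registry.append(node)     # the list is now the only strong reference
    return weakref.ref(node)


listed_ref = make_registered()
gc.collect()
listed_still_alive = listed_ref() is not None   # list entry blocks collection

registry.clear()                                # drop the last reference
listed_alive_after_clear = listed_ref() is not None

gc.enable()
```

So cycles alone should eventually be cleaned up, but anything left behind
in a list has to be removed explicitly.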
>On the second point, Tim just did some timings on one of his projects. He
>found that he could clone peaks at 185 per second if they didn't have
>assignments but that dropped to around 5 per second if they did. (When
>you have assignments you can multiply the number of objects involved in
>the cloning by an order of magnitude.) Now the clone peaks functionality
>calls a function called copySubTree() in memops.general.Util. That is a
>highly complex function, so you might not want to look at it (much). But
>one thing it does at the end is to check the validity of the copied tree:
>
> top.checkAllValid()
>
>If you take that one line out then the peak cloning with assignments is
>around 60 per second. (Obviously all these timings are on his computer so
>the relative timings are what to pay attention to.) So if you want to
>play slightly dangerously you could comment out that line. It's a hard
>one to call.
>
>
One question here: do you need validity checks during a clone
operation? Surely the peak list you start with should be valid anyway,
so as long as you make a faithful copy (i.e. there isn't a bug) the
resulting peak list should also be valid. So would this change really
be dangerous?
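
Rather than commenting the line out, the check could perhaps be made
optional. The sketch below is not copySubTree() and does not use its
interface; Peak, check_valid and clone_peaks are hypothetical names, and
the validity rule is made up just to have something to check.

```python
import copy


class Peak:
    """Hypothetical peak: a position plus optional assignments."""

    def __init__(self, position, assignments=()):
        self.position = tuple(position)
        self.assignments = list(assignments)

    def check_valid(self):
        # Made-up rule: no more assignments than dimensions.
        if len(self.assignments) > len(self.position):
            raise ValueError("more assignments than dimensions")


def clone_peaks(peaks, validate=True):
    """Deep-copy a list of peaks, optionally re-checking each copy."""
    clones = copy.deepcopy(peaks)
    if validate:
        for peak in clones:
            peak.check_valid()
    return clones


peaks = [Peak((8.1, 120.5), ["HN", "N"])]
checked = clone_peaks(peaks)                    # safe default
unchecked = clone_peaks(peaks, validate=False)  # skips the expensive check
```

For scale, at the rates Tim measured, 4000 assigned peaks would take about
4000/5 = 800 seconds (roughly 13 minutes) with the check and about
4000/60 = 67 seconds without it.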
regards
gary
>Wayne
>
>On Thu, 16 Aug 2007, Gary S. Thompson wrote:
>
>
>
>>Dear All (especially developers)
>>
>>We have some quite large projects with large peak lists at Leeds and I
>>noticed a couple of bottlenecks / performance problems:
>>
>>1. Starting projects with large numbers of spectra and peak lists can be
>>quite slow (several minutes; I can time it if you want ;-)).
>>2. More importantly, cloning of peak lists can be incredibly slow. I am
>>currently cloning a peak list with 4000 peaks in it, and it is going to take
>>~50 minutes, giving a throughput of
>> 4000/(50*60) = 1.3 peaks per second...
>>
>>
>>
>>regards
>>gary
>>
>>--
>>-------------------------------------------------------------------
>>Dr Gary Thompson
>>Astbury Centre for Structural Molecular Biology,
>>University of Leeds, Astbury Building,
>>Leeds, LS2 9JT, West-Yorkshire, UK Tel. +44-113-3433024
>>email: [log in to unmask] Fax +44-113-2331407
>>-------------------------------------------------------------------
>>
>>
>>
>
>.
>
>
>
--
-------------------------------------------------------------------
Dr Gary Thompson
Astbury Centre for Structural Molecular Biology,
University of Leeds, Astbury Building,
Leeds, LS2 9JT, West-Yorkshire, UK Tel. +44-113-3433024
email: [log in to unmask] Fax +44-113-2331407
-------------------------------------------------------------------