Right, Brian thinks it's worth updating the v1 code so I will do that once
we see if the dust settles on this change in the v2 code (that readlink
error was a bit of a surprise!).
And Brian will point out (because he did last week in Cambridge!) that
there are two sorts of people who might want to stick with v1: those who
are close to finishing a project (in particular Ph.D. students) and those
people who are close to doing structure calculations with ARIA, because
the current publicly available ARIA does not work with v2 (that will
hopefully be sorted soon).
Wayne
On Fri, 23 Jan 2009, Andrew Fowler wrote:
> And again, thanks to Wayne for figuring all this out.
>
> I'll add that while it would be nice to have these large files working under
> v1, you guys should probably just focus on v2. It's easy enough to work
> around for this case, and I'll certainly upgrade for my next project. I see
> no real reason for anyone (including me) not to upgrade other than
> inertia...
>
> Andrew
>
>
> On 1/23/09 8:18 AM, "Wayne Boucher" wrote:
>
> > Ah, it turns out I spoke too soon. I made a dumb mistake in the looping
> > code and when I fixed it I discovered that the OSX fseek function is
> > broken (at least with default compiler flags) for files >= 2 Gb, no matter
> > how you try to use it (so you cannot incrementally just keep seeking, it
> > fails at 2 Gb). So I've replaced fseek with fseeko, which is not broken.
> > I've also tested this on 64-bit Linux now.
> >
> > And I should have mentioned in the first email that on 64-bit Linux I
> > think that v1 of the code should work for these large files. It is only
> > 32-bit Linux and OSX where there are problems. (And I think the problems
> > could be solved on both with clever use of platform-dependent compiler
> > flags but then you have to start worrying about whether Python has also
> > been compiled with these flags. I think it's not worth going there.)
> >
> > Wayne
> >
> > On Fri, 23 Jan 2009, Wayne Boucher wrote:
> >
> >> Hello,
> >>
> >> As Andrew discovered yesterday, there is a problem with importing large
> >> (>= 2 Gb) files in Analysis. I have now investigated further and it was
> >> indeed a 32-bit problem. So some numbers which should have been positive
> >> were coming out negative.
> >>
> >> It turns out there were two problems. First of all, one of the types in
> >> the C code should have been "long long" rather than just "long". (The
> >> latter on many operating systems, including the default on OSX, is 4
> >> bytes, whereas the former is 8 bytes, and you need 8 bytes to cope
> >> with these large files.)
> >>
> >> The second problem was that the system function we use to skip around the
> >> data file on disk (fseek) uses long, not long long, for the offset. I've
> >> gotten around this by adding a function which will skip at most 2^30 bytes
> >> (= 1 Gb) in one go.
> >>
> >> As it happens, in v2 the first change had already been made. And I've
> >> just added the second change to the update server. So v2 users should be
> >> able to use >= 2 Gb files now.
> >>
> >> In v1 neither change had been made so I've done that in our internal code
> >> but I haven't put the changes on the update server for two reasons. One,
> >> our client for uploading the code is broken in v1 because it uses ftp and
> >> our server no longer allows that. And two, some other code has changed in
> >> the relevant files and although I think the changes are consistent I'd
> >> rather play safe.
> >>
> >> If any v1 users want to load >= 2 Gb files then let me know and I'll sort
> >> out the two issues above.
> >>
> >> As Andrew discovered, there is a work-around, namely to split your data
> >> files up (in his case, by re-processing, and that is the best way).
> >>
> >> As it happens the CCPN data model means that in Analysis you can have
> >> multiple spectra inside one experiment. So that makes this work-around
> >> slightly less nasty.
> >>
> >> Wayne
> >>
>
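For anyone curious what the two fixes in the first message look like, here
is a minimal sketch in C. It is not the actual Analysis code: the names
byte_offset, fits_in_int32 and seek_forward_chunked are made up for
illustration. It assumes the offset is held in a 64-bit integer and that
plain fseek is only ever asked to move a bounded number of bytes at a time.
Note that, as the follow-up message says, this incremental approach still
failed on OSX at 2 Gb, which is why fseek was replaced with fseeko there.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: byte offset of a data point, computed in 64 bits
   so that it cannot wrap around to a negative number at 2^31. */
static int64_t byte_offset(int64_t point_index, int64_t bytes_per_point)
{
    return point_index * bytes_per_point;
}

/* True if the value could be stored in a 4-byte signed integer. */
static int fits_in_int32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* Hypothetical helper: move forward `offset` bytes from the current
   position, never asking fseek for more than `max_step` bytes in one go.
   Returns 0 on success and -1 if any individual fseek call fails. */
static int seek_forward_chunked(FILE *fp, int64_t offset, long max_step)
{
    assert(offset >= 0 && max_step > 0);
    while (offset > 0) {
        long step = (offset > max_step) ? max_step : (long) offset;
        if (fseek(fp, step, SEEK_CUR) != 0)
            return -1;
        offset -= step;
    }
    return 0;
}
```

With these definitions, seek_forward_chunked(fp, offset, 1L << 30) would
skip at most 2^30 bytes (= 1 Gb) per fseek call, and
fits_in_int32(byte_offset(536870912, 4)) is false, because that offset is
exactly 2^31.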