Hi,
I'm using IMPUTE2 (v2.3.2 on Linux) to impute data from Illumina HumanOmni5, and I am having problems with the .gen output files.
When I tried to convert the .gen files to binary PLINK format, I got the following error for chromosome 3:
--data: 76k variants converted.
Error: Line 76889 of .gen file has fewer tokens than expected.
So I tried reading it into R, and again I got an error involving that same line:
Read 40.2% of 174306 rows
Error in fread("chr3_chunkAll") :
embedded nul in string: '\0\0\0\0\0\ ...
In addition: Warning message:
In fread("chr3_chunkAll") :
Bumped column 2041 to type character on data row 76889, field contains ''. Coercing previously read values in this column from logical, integer or numeric back to character which may not be lossless; e.g., if '00' and '000' occurred before they will now be just '0', and there may be inconsistencies with treatment of ',,' and ',NA,' too (if they occurred in this column before the bump). If this matters please rerun and set 'colClasses' to 'character' for this column. Please note that column type detection uses a sample of 1,000 rows (100 rows at 10 points) so hopefully this message should be very rare. If reporting to datatable-help, please rerun and include the output from verbose=TRUE.
When I view the problematic line with less in a terminal, everything seems fine, but when I open the problematic SNP in gedit I see this:
3 SNPXXX posXXX G A 1 0 0 1 ... 1 0 0 1 \00\00\00\00\00\00\00\00 ... 1 0 0
Has anyone encountered a similar situation? When I rerun IMPUTE2 on the specific chunk that had the issue, everything works fine, but there are similar problems in other chromosomes. Could this just be a memory allocation problem during the imputation process?
My summary and warning files seem to be normal (compared to chunks that do not have this specific error).
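For anyone who wants to look for the same symptom in their own chunks, here is a minimal Python sketch that flags lines containing NUL bytes or an unusual number of tokens. The function name scan_gen_file is hypothetical, and the sketch assumes that most lines in the file are intact, so that the most common token count can be taken as the expected one. It is only a rough check, not a validated tool.

```python
from collections import Counter


def scan_gen_file(path):
    """Return [(line_number, reason), ...] for suspicious lines in a .gen file."""
    token_counts = []   # token count per line, in file order
    nul_lines = set()   # 1-based numbers of lines containing NUL bytes

    # Binary mode keeps NUL bytes intact instead of failing on them or hiding them.
    with open(path, "rb") as handle:
        for line_number, raw in enumerate(handle, start=1):
            if b"\x00" in raw:
                nul_lines.add(line_number)
            token_counts.append(len(raw.split()))

    if not token_counts:
        return []

    # Assumption: most lines are intact, so the most common token count is
    # the expected one (5 leading fields + 3 probabilities per sample).
    expected = Counter(token_counts).most_common(1)[0][0]

    problems = []
    for line_number, count in enumerate(token_counts, start=1):
        if line_number in nul_lines:
            problems.append((line_number, "contains NUL bytes"))
        elif count != expected:
            problems.append((line_number, f"{count} tokens, expected {expected}"))
    return problems
```

Calling scan_gen_file("chr3_chunkAll") would return a list of (line number, reason) pairs, which could then be used to decide which chunks to rerun.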
Thanks,
Constanza