Hi
I am trying to process some bgen format files in qctool but am running into an odd error. I have a set of bgen files, one per chromosome, plus a shared .sample file. Chromosomes 3 to 23 run in qctool as normal, but when I try to process chr 1 and chr 2 I get the following warnings:
!! WARNING: There are no SNPs in the source files (after exclusions, translation, aligning and matching between cohorts where relevant).
!! Warnings were encountered. To proceed anyway, please run again with the -force option.
The bgen files I am working with were derived from another set of bgens which all run fine in qctool. The derivation step involved removing a small number of individuals with -excl-samples and applying the -bgen-omit-sample-identifier-block option.
Some commands fail completely unless I add the -force option (e.g. -snp-stats), whereas other commands run anyway (e.g. -incl-rsids). So there clearly are SNPs in the source files.
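One check that might help narrow this down is to look at the variant count stored in the bgen header itself. The sketch below assumes the layout given in the BGEN specification: a 4-byte offset, then a header block whose second and third fields are the number of variant data blocks and the number of samples, each a little-endian 32-bit unsigned integer. The function names read_u32_le and bgen_header_summary are illustrative only and are not part of qctool. Whether qctool takes its "( 0 snps)" figure from this header field is an assumption here, not something the log confirms.

```shell
#!/usr/bin/env bash
# Sketch only: inspect the counts stored in a BGEN file header.
# Assumed layout (BGEN spec): bytes 0-3 offset, 4-7 header length,
# 8-11 number of variant data blocks, 12-15 number of samples,
# 16-19 magic number, all integers little-endian.

# Print the little-endian uint32 found at byte offset $2 of file $1.
read_u32_le() {
  local file=$1 offset=$2 b0 b1 b2 b3
  # od prints the four bytes as unsigned decimals separated by spaces
  read -r b0 b1 b2 b3 < <(od -A n -t u1 -j "$offset" -N 4 "$file")
  if [ -z "$b3" ]; then
    echo "could not read 4 bytes at offset $offset of $file" >&2
    return 1
  fi
  echo $(( b0 + (b1 << 8) + (b2 << 16) + (b3 << 24) ))
}

# Print the variant and sample counts recorded in the header of file $1.
bgen_header_summary() {
  local file=$1 variants samples
  variants=$(read_u32_le "$file" 8) || return 1
  samples=$(read_u32_le "$file" 12) || return 1
  echo "variants_in_header=$variants"
  echo "samples_in_header=$samples"
}
```

Running bgen_header_summary on filtered_01.bgen and on one of the chromosomes that works would show whether the header of the chr 1 file really records zero variants, even though variant data evidently follows it.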
I have included below some annotated code that shows what runs and what does not, as well as the log file from running that code.
Plink throws a similar error (no SNPs) when I try to read these files in.
I am running qctool on a University supercomputer but the error has been replicated on a different machine by external collaborators.
Any guidance you could offer would be much appreciated!
KR
Laura
====================================================================================
CODE:
# code to check issues with chr 1 in hrc freeze
# load the qctool module
module add apps/qctool/2.2.0
# check md5sum
md5sum ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen
# try to run snp summary
# the following fails with the no-SNPs warning
qctool -g ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen -s ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample -snp-stats -osnp ../output/chr1_snp_summary_bgen_all.txt
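# the following gets past the warning with -force and starts processing SNPs, but the log shows it aborting on a zlib assertion after about 2.86 million SNPs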
qctool -force -g ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen -s ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample -snp-stats -osnp ../output/chr1_snp_summary_bgen_all.txt
# extract some SNPs
# the following runs
qctool -g ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen \
-s ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample \
-incl-rsids ../input/grs_snp_list.txt \
-og ../output/chr1_selection.gen \
-os ../output/chr1_selection.sample
# try to run snp summary
# the following runs
qctool -g ../output/chr1_selection.gen -s ../output/chr1_selection.sample -snp-stats -osnp ../output/chr1_snp_summary_gen_subset.txt
# try to convert to vcf
# the following fails with the no-SNPs warning
qctool -g ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen -s ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample -og ../output/chr1_complete.vcf
# the following runs
#qctool -force -g ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen -s ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample -og ../output/chr1_complete.vcf
# the following runs (vcf from derived snp selection)
qctool -g ../output/chr1_selection.gen -s ../output/chr1_selection.sample -og ../output/chr1_selection.vcf
====================================================================================
LOG FILE:
Welcome to qctool
(version: 2.2.0, revision: unknown)
(C) 2009-2020 University of Oxford
Opening genotype files : [ ] (0/1,0.0s,0.0/s)
Opening genotype files : [******************************] (1/1,0.0s,94.3/s)
Opening genotype files : [******************************] (1/1,0.0s,90.3/s)
========================================================================
Input SAMPLE file(s): "../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample"
Output SAMPLE file: "(n/a)".
Sample exclusion output file: "(n/a)".
Input GEN file(s):
( 0 snps) "../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen (bgen v1.2; 17451 unnamed samples; zlib compression)"
(total 0 snps in 1 sources).
Number of samples: 17451
Output GEN file(s): (n/a)
Output SNP position file(s): (n/a)
Sample filter: .
# of samples in input files: 17451.
# of samples after filtering: 17451 (0 filtered out).
========================================================================
!! WARNING: There are no SNPs in the source files (after exclusions, translation, aligning and matching between cohorts where relevant).
!! Warnings were encountered. To proceed anyway, please run again with the -force option.
Thank you for using qctool.
Welcome to qctool
(version: 2.2.0, revision: unknown)
(C) 2009-2020 University of Oxford
Opening genotype files : [ ] (0/1,0.0s,0.0/s)
Opening genotype files : [******************************] (1/1,0.0s,95.5/s)
Opening genotype files : [******************************] (1/1,0.0s,91.5/s)
========================================================================
Input SAMPLE file(s): "../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample"
Output SAMPLE file: "(n/a)".
Sample exclusion output file: "(n/a)".
Input GEN file(s):
( 0 snps) "../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen (bgen v1.2; 17451 unnamed samples; zlib compression)"
(total 0 snps in 1 sources).
Number of samples: 17451
Output GEN file(s): (n/a)
Output SNP position file(s): (n/a)
Sample filter: .
# of samples in input files: 17451.
# of samples after filtering: 17451 (0 filtered out).
========================================================================
!! WARNING: There are no SNPs in the source files (after exclusions, translation, aligning and matching between cohorts where relevant).
!! Warnings were encountered, but proceeding anyway as -force was supplied.
========================================================================
SNPSummaryComponent: the following components are in place:
HWEComputation
AlleleFrequencyComputation
InfoComputation
MissingnessComputation
Processing SNPs : (0/?,0.0s,0.0/s)
Processing SNPs : (41/?,1.0s,40.4/s)
Processing SNPs : (454/?,2.0s,223.9/s)
....REDACTED TO SAVE SPACE ...
Processing SNPs : (2860447/?,7039.5s,406.3/s)
Processing SNPs : (2860846/?,7040.5s,406.3/s)
qctool: ../../genfile/include/genfile/zlib.hpp:70: void genfile::zlib_uncompress(const byte_t*, const byte_t*, std::vector<T>*) [with T = unsigned char; genfile::byte_t = unsigned char]: Assertion `result == 0' failed.
chr1_error_inv.sh: line 13: 4768 Aborted qctool -force -g ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen -s ../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample -snp-stats -osnp ../output/chr1_snp_summary_bgen_all.txt
Welcome to qctool
(version: 2.2.0, revision: unknown)
(C) 2009-2020 University of Oxford
Opening genotype files : [ ] (0/1,0.0s,0.0/s)
Opening genotype files : [******************************] (1/1,0.0s,64.5/s)
Opening genotype files : [******************************] (1/1,0.0s,62.5/s)
========================================================================
Input SAMPLE file(s): "../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample"
Output SAMPLE file: "../output/chr1_selection.sample".
Sample exclusion output file: "(n/a)".
Input GEN file(s):
(not computed) "snp-id-data-filtered:../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen (bgen v1.2; 17451 unnamed samples; zlib compression)"
(total 1 sources, number of snps not computed).
Number of samples: 17451
Output GEN file(s): "../output/chr1_selection.gen"
Output SNP position file(s): (n/a)
Sample filter: .
SNP filter: RSID in { set of 773 }.
# of samples in input files: 17451.
# of samples after filtering: 17451 (0 filtered out).
========================================================================
Processing SNPs : (0/?,0.0s,0.0/s)
Processing SNPs : (16/?,3.0s,5.4/s)
Processing SNPs : (30/?,4.1s,7.3/s)
Processing SNPs : (47/?,6.9s,6.8/s)
Processing SNPs : (57/?,2.2s,26.3/s)
Total: 57SNPs.
========================================================================
Number of SNPs:
-- in input file(s): (not computed).
-- in output file(s): 57
Number of samples in input file(s): 17451.
Output GEN files: (57 snps) "../output/chr1_selection.gen"
(total 57 snps).
Output SAMPLE files: "../output/chr1_selection.sample" (17451 samples)
========================================================================
Thank you for using qctool.
Welcome to qctool
(version: 2.2.0, revision: unknown)
(C) 2009-2020 University of Oxford
Opening genotype files : [ ] (0/1,0.0s,0.0/s)
Opening genotype files : [******************************] (1/1,0.1s,7.3/s)
Opening genotype files : [******************************] (1/1,0.1s,7.2/s)
========================================================================
Input SAMPLE file(s): "../output/chr1_selection.sample"
Output SAMPLE file: "(n/a)".
Sample exclusion output file: "(n/a)".
Input GEN file(s):
(not computed) "../output/chr1_selection.gen"
(total 1 sources, number of snps not computed).
Number of samples: 17451
Output GEN file(s): (n/a)
Output SNP position file(s): (n/a)
Sample filter: .
# of samples in input files: 17451.
# of samples after filtering: 17451 (0 filtered out).
========================================================================
SNPSummaryComponent: the following components are in place:
HWEComputation
AlleleFrequencyComputation
InfoComputation
MissingnessComputation
Processing SNPs : (0/?,0.0s,0.0/s)
Processing SNPs : (20/?,1.0s,19.7/s)
Processing SNPs : (41/?,2.0s,20.0/s)
Processing SNPs : (57/?,2.8s,20.2/s)
Total: 57SNPs.
========================================================================
Number of SNPs:
-- in input file(s): (not computed).
-- in output file(s): 0
Number of samples in input file(s): 17451.
========================================================================
Thank you for using qctool.
Welcome to qctool
(version: 2.2.0, revision: unknown)
(C) 2009-2020 University of Oxford
Opening genotype files : [ ] (0/1,0.0s,0.0/s)
Opening genotype files : [******************************] (1/1,0.0s,93.6/s)
Opening genotype files : [******************************] (1/1,0.0s,89.5/s)
========================================================================
Input SAMPLE file(s): "../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/swapped.sample"
Output SAMPLE file: "(n/a)".
Sample exclusion output file: "(n/a)".
Input GEN file(s):
( 0 snps) "../../../datasets/dataset_gi_hrc_g0m_g1/freeze/out/filtered_01.bgen (bgen v1.2; 17451 unnamed samples; zlib compression)"
(total 0 snps in 1 sources).
Number of samples: 17451
Output GEN file(s): "../output/chr1_complete.vcf"
Output SNP position file(s): (n/a)
Sample filter: .
# of samples in input files: 17451.
# of samples after filtering: 17451 (0 filtered out).
========================================================================
!! WARNING: There are no SNPs in the source files (after exclusions, translation, aligning and matching between cohorts where relevant).
!! Warnings were encountered. To proceed anyway, please run again with the -force option.
Thank you for using qctool.
Welcome to qctool
(version: 2.2.0, revision: unknown)
(C) 2009-2020 University of Oxford
Opening genotype files : [ ] (0/1,0.0s,0.0/s)
Opening genotype files : [******************************] (1/1,0.1s,7.3/s)
Opening genotype files : [******************************] (1/1,0.1s,7.3/s)
========================================================================
Input SAMPLE file(s): "../output/chr1_selection.sample"
Output SAMPLE file: "(n/a)".
Sample exclusion output file: "(n/a)".
Input GEN file(s):
(not computed) "../output/chr1_selection.gen"
(total 1 sources, number of snps not computed).
Number of samples: 17451
Output GEN file(s): "../output/chr1_selection.vcf"
Output SNP position file(s): (n/a)
Sample filter: .
# of samples in input files: 17451.
# of samples after filtering: 17451 (0 filtered out).
========================================================================
VCFFormatSNPDataSink::write_header(): FORMAT entries are:
##FORMAT=<ID=GP,Type=Float,Number=G,Description="Genotype call probabilities">
Processing SNPs : (0/?,0.0s,0.0/s)
Processing SNPs : (4/?,1.1s,3.6/s)
Processing SNPs : (8/?,2.2s,3.7/s)
Processing SNPs : (13/?,3.4s,3.8/s)
Processing SNPs : (54/?,4.6s,11.7/s)
Processing SNPs : (57/?,-3.7s)
Total: 57SNPs.
========================================================================
Number of SNPs:
-- in input file(s): (not computed).
-- in output file(s): 57
Number of samples in input file(s): 17451.
Output GEN files: (57 snps) "../output/chr1_selection.vcf"
(total 57 snps).
========================================================================
Thank you for using qctool.