allel.read_vcf RuntimeError: VCF file is missing mandatory header line ("#CHROM...") #376

Open
@nathanvranken

When I try to read data from a VCF file that was simulated with msprime, I keep running into the same error.

Here is the head of the VCF file (20220112_msprime_length10000000.0_split100000.0_gene_flow100.0_chr2.vcf.gz):

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##source=tskit 0.2.3
##contig=<ID=2,length=10000000>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  S.0     S.1     S.2     S.3     S.4     I.0     I.1     I.2     I.3     I.4     M.0     M.1     M.2     M.3     M.4     C.0     C.1     C.2     C.3     C.4
2       785     .       0       1       .       PASS    .       GT      0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     1|1     1|1     1|1     1|1     1|1
2       1002    .       0       1       .       PASS    .       GT      0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|0     0|1     0|0     0|0     0|0     0|0     0|0     0|0     0|0

When running the following code:
allel.read_vcf(vcf, fields=['calldata/GT'], samples=None, alt_number=1, region='2:0-10000', tabix='tabix')

I consistently get the following error:
RuntimeError: VCF file is missing mandatory header line ("#CHROM...")

Does anyone have an idea what might be causing this error?
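To help narrow this down, here is a minimal diagnostic sketch using only the Python standard library. It is not part of scikit-allel, and the helper names find_chrom_header and diagnose are made up for illustration. It rests on an assumption that has not been confirmed for this file: that with region= and tabix= set, the data is read through the external tabix program, so a missing index or a header that is not tab-delimited could leave the parser without a "#CHROM" line. The sketch only reports three facts about the file: whether a "#CHROM" line follows the "##" lines, whether that line is tab-delimited, and whether a .tbi or .csi index sits next to the file.

```python
import gzip
import os


def find_chrom_header(path):
    """Return the '#CHROM' column header line of a VCF, or None if absent.

    Works for plain-text files and for gzip/bgzip-compressed files
    (chosen by the '.gz' suffix, which is an assumption about naming).
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information lines come before the column header
            # The first line that is not '##' must be the '#CHROM' line.
            return line.rstrip("\n") if line.startswith("#CHROM") else None
    return None  # empty file or only '##' lines


def diagnose(path):
    """Collect simple facts about the file that might explain the error."""
    header = find_chrom_header(path)
    return {
        "has_chrom_line": header is not None,
        # The VCF column header must be separated by tabs, not spaces.
        "tab_delimited": header is not None and header.startswith("#CHROM\tPOS\t"),
        # Region queries through tabix need an index next to the data file.
        "has_index": any(os.path.exists(path + ext) for ext in (".tbi", ".csi")),
    }
```

Calling diagnose(vcf) on the file above returns a small dict. If has_index is False, or tab_delimited is False, that would be worth fixing before retrying the read_vcf call with region=.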
