Indexing Error for VariantTable, requires values to be monotonically increasing #384

Open
@npb596

Hello,

I have been receiving the below error:

    vcf_first_vt = allel.VariantTable({'CHROM' : vcf_first['variants/CHROM'], 'POS' : vcf_first['variants/POS'], 'REF' : vcf_first['variants/REF'], 'ALT' : vcf_first['variants/ALT'][:,0], 'GT' : vcf_first['calldata/GT'][:,0,0], 'PS' : vcf_first['calldata/PS'][:,0]}, index=('CHROM','POS'))
  File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 4517, in __init__
    self.set_index(index)
  File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 4542, in set_index
    index = SortedMultiIndex(self[index[0]], self[index[1]],
  File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 4036, in __init__
    l1 = SortedIndex(l1, copy=copy)
  File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 3384, in __init__
    raise ValueError('values must be monotonically increasing')
ValueError: values must be monotonically increasing

For clarity: the vcf_first_vt definition shown above is a line in my Python script, and it is the line that raises the error. I can avoid the error as long as the chromosomes are in lexicographic order (e.g. chr1, chr10, chr2 instead of chr1, chr2, chr10) and I remove chromosome names without numbers (e.g. chrX and chrY). This is odd to me, as I assume something like "chr1" should be treated as a string (as per the example here: https://scikit-allel.readthedocs.io/en/stable/model/ndarray.html?highlight=sortedmultiindex#sortedmultiindex). I suppose lexicographic sorting makes sense if the names are compared as strings, but I don't understand why they need to be sorted in any particular order at all. Is there a way of defining a VariantTable that I'm missing that would allow chromosomes to appear in an arbitrary order? If not, would it be possible to make this requirement more explicit?
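
To illustrate the ordering behaviour described above, here is a minimal pure-Python sketch. It does not call scikit-allel; the helper `is_monotonically_increasing` and the `records` list are hypothetical stand-ins, and the assumption (based only on the error message in the traceback) is that the index check compares CHROM values as plain strings.

```python
def is_monotonically_increasing(values):
    """Return True if every value is >= the value before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical (CHROM, POS) records in the "natural" chromosome order.
records = [("chr1", 500), ("chr2", 100), ("chr10", 300), ("chrX", 200)]

chroms = [chrom for chrom, _ in records]
# As strings, "chr2" > "chr10" because "2" > "1", so this order fails the check.
natural_ok = is_monotonically_increasing(chroms)

# Possible workaround: sort by (CHROM, POS) with ordinary string/int comparison
# before building an index on those two columns.
sorted_records = sorted(records)
sorted_chroms = [chrom for chrom, _ in sorted_records]
sorted_ok = is_monotonically_increasing(sorted_chroms)

print(natural_ok)
print(sorted_chroms)
print(sorted_ok)
```

With NumPy arrays, the same reordering could presumably be obtained with `np.lexsort((pos, chrom))` and applied to every column before constructing the table, though that has not been verified against this VCF.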
