ov.pp.qc dtype is not supported: bool #243

joseagraz opened this issue Dec 24, 2024 · 1 comment

joseagraz commented Dec 24, 2024

Describe the bug
Executing the command below raises a TypeError (dtype is not supported: bool). The incoming adata looks fine and contains no boolean columns. Stepping through the OmicVerse source code shows no issues; the error surfaces only when control returns to the main script. I’ve tried using Scrublet and omitting the batch key, but the same error persists. Any guidance on pinpointing the problem would be greatly appreciated.

Further digging suggests there is a mix of CPU and GPU data. Regardless of the settings in my script, ov.pp.qc forces GPU execution and I can't see how to disable it.
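One way to check the suspected CPU/GPU mix is to look at which package owns the type of each array held by adata. The sketch below is only an illustration: the helper names `array_backend` and `report_backends` are hypothetical and not part of OmicVerse, and it assumes that host arrays come from numpy or scipy while device arrays come from cupy or cupyx.

```python
def array_backend(obj):
    """Return the top-level package that defines obj's type (e.g. 'numpy', 'scipy', 'cupy')."""
    return type(obj).__module__.split(".")[0]


def report_backends(named_arrays):
    """Map each name to the package that owns the corresponding array type."""
    return {name: array_backend(arr) for name, arr in named_arrays.items()}
```

For example, `report_backends({"X": adata.X, **dict(adata.layers)})` would list one backend per matrix; a mixture of host and device packages in the result would support the CPU/GPU hypothesis.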

[Screenshot from 2024-12-23 19-44-22]

Command:
adata=ov.pp.qc(adata,
    tresh={'mito_perc': 0.1, 'nUMIs': 500, 'detected_genes': 250},
    mt_genes=None,
    doublets_method='sccomposite',
    batch_key='batch')

--------------------------------------------------------------

Error:
Exception has occurred: TypeError
dtype is not supported: bool
File "pre-process-omicverse.py", line 618, in
adata=ov.pp.qc(adata,
^^^^^^^^^^^^^^^
TypeError: dtype is not supported: bool
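The traceback above only shows the call site in the main script. To see where the TypeError is actually raised, the call can be wrapped so that the innermost Python frame is reported. This is a minimal sketch using only the standard library; `deepest_frame` and `run_and_locate` are hypothetical helper names, and if the exception originates in compiled code the innermost frame shown will be the Python function that called into it.

```python
import traceback


def deepest_frame(exc):
    """Return (filename, lineno, function) of the innermost Python frame of exc."""
    frames = traceback.extract_tb(exc.__traceback__)
    last = frames[-1]
    return last.filename, last.lineno, last.name


def run_and_locate(func, *args, **kwargs):
    """Call func; on TypeError, print where it was raised, then re-raise."""
    try:
        return func(*args, **kwargs)
    except TypeError as exc:
        filename, lineno, name = deepest_frame(exc)
        print(f"TypeError raised in {name} ({filename}:{lineno}): {exc}")
        traceback.print_exc()  # full stack, including library frames
        raise
```

Usage would look like `adata = run_and_locate(ov.pp.qc, adata, tresh=..., batch_key='batch')`, with the same arguments as the command above.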

--------------------------------------------------------------

Incoming adata types:
print(adata.var.dtypes)

gene_ids object
feature_types category
genome category
dtype: object

print(adata.obs.dtypes)

batch category
dtype: object

adata.obs.batch.unique()
['1', '2']
Categories (2, object): ['1', '2']

adata.var.index
Index(['MIR1302-2HG', 'FAM138A', 'OR4F5', 'AL627309.1', 'AL627309.3',
'AL627309.2', 'AL627309.5', 'AL627309.4', 'AP006222.2', 'AL732372.1',
...
'AC133551.1', 'AC136612.1', 'AC136616.1', 'AC136616.3', 'AC136616.2',
'AC141272.1', 'AC023491.2', 'AC007325.1', 'AC007325.4', 'AC007325.2'],
dtype='object', length=36601)
adata.var.gene_ids
MIR1302-2HG ENSG00000243485
FAM138A ENSG00000237613
OR4F5 ENSG00000186092
AL627309.1 ENSG00000238009
AL627309.3 ENSG00000239945
...
AC141272.1 ENSG00000277836
AC023491.2 ENSG00000278633
AC007325.1 ENSG00000276017
AC007325.4 ENSG00000278817
AC007325.2 ENSG00000277196
Name: gene_ids, Length: 36601, dtype: object
adata.var.feature_types
MIR1302-2HG Gene Expression
FAM138A Gene Expression
OR4F5 Gene Expression
AL627309.1 Gene Expression
AL627309.3 Gene Expression
...
AC141272.1 Gene Expression
AC023491.2 Gene Expression
AC007325.1 Gene Expression
AC007325.4 Gene Expression
AC007325.2 Gene Expression
Name: feature_types, Length: 36601, dtype: category
Categories (1, object): ['Gene Expression']
adata.var.genome
MIR1302-2HG GRCh38
FAM138A GRCh38
OR4F5 GRCh38
AL627309.1 GRCh38
AL627309.3 GRCh38
...
AC141272.1 GRCh38
AC023491.2 GRCh38
AC007325.1 GRCh38
AC007325.4 GRCh38
AC007325.2 GRCh38
Name: genome, Length: 36601, dtype: category
Categories (1, object): ['GRCh38']

--------------------------------------------------------------

Resulting adata types:
print(adata.var.dtypes)

gene_ids object
feature_types category
genome category
mt bool
n_cells_by_counts int32
total_counts float32
mean_counts float32
pct_dropout_by_counts float64
log1p_total_counts float32
log1p_mean_counts float32
dtype: object

print(adata.obs.dtypes)

batch category
n_genes_by_counts int32
total_counts float32
log1p_n_genes_by_counts float64
log1p_total_counts float32
total_counts_mt float32
pct_counts_mt float32
log1p_total_counts_mt float32
nUMIs float32
mito_perc float32
detected_genes int32
cell_complexity float64
sccomposite_doublet int64
sccomposite_consistency int64
dtype: object
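In the resulting dtypes above, the only bool column is mt in adata.var. The following sketch lists bool columns in a pandas DataFrame and casts them to an integer type. Whether this cast makes the error go away is an assumption, since the traceback does not show which consumer rejects bool; `bool_columns` and `cast_bool_columns` are hypothetical helper names, not OmicVerse functions.

```python
import pandas as pd


def bool_columns(df):
    """Return the names of columns stored with the NumPy bool dtype."""
    return df.select_dtypes(include="bool").columns.tolist()


def cast_bool_columns(df, dtype="int8"):
    """Return a copy of df with bool columns cast to dtype; other columns are untouched."""
    out = df.copy()
    for col in bool_columns(out):
        out[col] = out[col].astype(dtype)
    return out
```

Applied to this report, `bool_columns(adata.var)` should name mt, and `adata.var = cast_bool_columns(adata.var)` would store it as 0/1 integers for testing purposes.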

--------------------------------------------------------------

print(f"Last run with scvi-tools version: {ov.__version__}")
Last run with scvi-tools version: 1.6.11

--------------------------------------------------------------

tinycss2==1.4.0
tokenizers==0.21.0
tomli==2.2.1
toolz==1.0.0
torch==2.4.1
torch-geometric==2.6.1
torchvision==0.19.1
tornado==6.4.2
tqdm==4.67.1
traitlets==5.14.3
transformers==4.47.1
treelite==4.3.0
truststore==0.8.0
typeguard==4.4.1
types-python-dateutil==2.9.0.20241206
typing-extensions==4.12.2
typing-utils==0.1.0
tzdata==2024.2
uc-micro-py==1.0.3
ucx-py==0.39.2
ucxx==0.39.1
uri-template==1.3.0
urllib3==1.26.19
uvicorn==0.34.0
wcwidth==0.2.13
webcolors==24.11.1
webencodings==0.5.1
websocket-client==1.8.0
websockets==10.4
werkzeug==3.1.3
wget==3.2
wheel==0.43.0
widgetsnbextension==4.0.13
wrapt==1.17.0
xarray==2024.11.0
xgboost==2.1.1
xyzservices==2024.9.0
yarl==1.18.3
zict==3.0.0
zipp==3.21.0
zope.interface==7.2
zstandard==0.23.0

Dockerfile
FROM ghcr.io/scverse/rapids_singlecell:latest

USER root
WORKDIR /opt
SHELL ["/bin/bash", "-c"]

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    libgraphviz-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# base

RUN pip install --no-cache-dir eeisp==0.5.0 debugpy uvicorn

# Add the script to the Docker image

COPY ./start_conda_enviroment.sh /usr/local/bin/start_conda_enviroment.sh
RUN chmod +x /usr/local/bin/start_conda_enviroment.sh

# Version specifiers are quoted so the shell does not treat ">" as a redirect
RUN conda install scrublet pymde harmonypy phate "pandas>=2.1" \
    -c nvidia -c conda-forge -c bioconda \
    -n base \
    && pip install metatime mofax "pydeseq2>=0.4.1" \
    pygam==0.8.0 phate multiprocess ctxcore ktplotspy \
    griffe python-dotplot datetime cython git+https://github.com/Starlitnightly/omicverse.git \
    torch_geometric

# start the server

CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "80", "--reload"]

docker-composer.yml
services:
  app:
    build: .
    container_name: python-server
    command: uvicorn src.main:app --host 0.0.0.0 --port 80 --reload
    ports:
      - 80:80 # port for the server, helpful for troubleshooting
      - 5678:5678 # port for debugging communication
    # working_dir:
    #   - /opt
    volumes:
      - /media/KPMP_Data/Privately_Available_Data:/media/KPMP_Data/Privately_Available_Data
    runtime: nvidia # Essential for GPU access
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all # "all" allocates all GPUs; replace with a number to limit
              capabilities: [gpu]

Starlitnightly (Owner) commented

I will update the GPU mode in a future version.
