
RecursionError when downloading datasets with python 3.12: set requirements accordingly? #67

Open
cleong110 opened this issue Mar 26, 2024 · 7 comments

Comments

@cleong110
Contributor

cleong110 commented Mar 26, 2024


Steps to reproduce on my own machine:

conda create -n sign_language_datasets pip 
conda activate sign_language_datasets 
python --version # 3.12 by default
python -m pip install sign-language-datasets webvtt-py

# create a download_dgs_corpus.py file with the following contents
import tensorflow_datasets as tfds
import sign_language_datasets.datasets
from sign_language_datasets.datasets.config import SignDatasetConfig

import itertools
import sys
print(sys.getrecursionlimit())
# sys.setrecursionlimit(50)
# the default config includes both pose and video
dgs_corpus = tfds.load('dgs_corpus')

# run it
python download_dgs_corpus.py 

It works in Colab (Python 3.10), but not on my machine in an env with Python 3.12. When I create a conda env with Python 3.10, it works without issue.

When I create an env with 3.11, I get "no module named lxml", but that's a different issue. Edit: I was installing in my base environment, so never mind this part.

This is apparently the upstream issue tensorflow/datasets#4666.
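
A minimal sketch of a guard that could go at the top of download_dgs_corpus.py, assuming (per this report only) that the failure shows up on Python 3.12 and newer with the tfds build installed at the time. The helper name python_version_at_risk is hypothetical and not part of sign-language-datasets or tfds.

```python
import sys


def python_version_at_risk(version_info=None):
    """Return True for Python 3.12+, where this issue reports the RecursionError.

    The 3.12 threshold is an assumption taken from this report, not a
    documented tfds constraint.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= (3, 12)


if python_version_at_risk():
    print(
        "Warning: running on Python %d.%d; tfds.load('dgs_corpus') may hit "
        "a RecursionError (see tensorflow/datasets#4666)." % sys.version_info[:2]
    )
```

The check only warns rather than exits, since whether the load fails may depend on the installed tfds version and not just the interpreter.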

@cleong110
Contributor Author

cleong110 commented Mar 26, 2024

OK, installed lxml and now I'm getting "Failed to get url https://nlp.biu.ac.il/~amit/datasets/dgs.json. HTTP code: 404.", which seems new but unrelated to this. Edit: never mind, Python 3.11 seems to work fine; I was installing in my conda base env.

@cleong110
Contributor Author

So it really does seem that Python 3.12 is the issue, as noted in tensorflow/datasets#4666.

@cleong110
Contributor Author

cleong110 commented Mar 26, 2024

Never mind the nevermind: if you have Python 3.11, you need to install lxml manually, or downloading the DGS corpus crashes with the default config. But that's a DGS-corpus-specific issue I suppose, so maybe never mind the neverminding of the nevermind?
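
A small sketch for catching the missing lxml dependency before the download starts, rather than partway through. It uses only the standard library; the helper name missing_modules is hypothetical, and lxml is the only module this thread names as missing.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# lxml is the module named in this thread; the check itself is generic.
missing = missing_modules(["lxml"])
if missing:
    print("Install before loading dgs_corpus: " + ", ".join(missing))
```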

@abir-g

abir-g commented May 25, 2024

Thanks for this.

@cleong110
Contributor Author

According to tensorflow/datasets#4666 (comment), this is now fixed in the latest version of tfds.

If we can confirm that, we can close this issue.

@cleong110
Contributor Author

Gave it a go. New conda env, python 3.12, pip install sign_language_datasets. Ended up with tfds-nightly-4.9.5.dev202406050044, not 4.9.6, the version of tfds which supposedly solves this.
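
A rough sketch for checking which tfds distribution an environment ended up with, and whether it is older than 4.9.6. The helper names are hypothetical, and release_tuple is a deliberate simplification that only reads the leading numeric components of a dotted version string; the packaging library would be more robust for unusual version formats.

```python
from importlib import metadata

# Both distributions provide the tensorflow_datasets package.
TFDS_DISTRIBUTIONS = ("tensorflow-datasets", "tfds-nightly")


def installed_tfds_distributions():
    """Map each installed tfds distribution name to its version string."""
    found = {}
    for name in TFDS_DISTRIBUTIONS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


def release_tuple(version):
    """Parse the leading numeric components, e.g. '4.9.5.dev1' -> (4, 9, 5)."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


for name, version in installed_tfds_distributions().items():
    # 4.9.6 is the version this thread says should contain the fix.
    status = "older than 4.9.6" if release_tuple(version) < (4, 9, 6) else "4.9.6 or newer"
    print(f"{name} {version}: {status}")
```

If both distributions are listed, they may be overwriting each other's files, which could explain an import failure after switching from one to the other.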

@cleong110
Contributor Author

Did some shenanigans: uninstalled tfds-nightly, then ran pip install tensorflow-datasets, but then it couldn't be imported. After pip install -U --force-reinstall tensorflow-datasets, it now seems to work.
