CUDA memory misused #12

Open
@why986

Hi Zielon,
When I used your code to train INSTA on my own video, training failed with the error shown below. Note that, to reduce VRAM usage, I disabled -O and passed only --fp16 --cuda_ray, as suggested in #2.

Below is the error report.

  0% 0/1500 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/wanghy/INSTA-pytorch/main_insta.py", line 160, in <module>
    trainer.train(train_loader, valid_loader, max_epoch)
  File "/home/wanghy/INSTA-pytorch/insta/utils.py", line 625, in train
    self.train_one_epoch(train_loader)
  File "/home/wanghy/INSTA-pytorch/insta/utils.py", line 838, in train_one_epoch
    self.model.update_extra_state(loader)
  File "/home/wanghy/miniconda3/envs/insta-pytorch/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/wanghy/INSTA-pytorch/insta/renderer.py", line 572, in update_extra_state
    cas_xyzs, faces = self.warp_position(triangles, cas_xyzs)
  File "/home/wanghy/INSTA-pytorch/insta/renderer.py", line 293, in warp_position
    distances, closest_points, closest_faces, closest_bcs = m(triangles, xyzs.unsqueeze(dim=0))
  File "/home/wanghy/miniconda3/envs/insta-pytorch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/wanghy/miniconda3/envs/insta-pytorch/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/wanghy/INSTA-pytorch/bvh/bvh_distance_queries/bvh_search_tree.py", line 109, in forward
    output = BVHFunction.apply(
  File "/home/wanghy/miniconda3/envs/insta-pytorch/lib/python3.9/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/wanghy/INSTA-pytorch/bvh/bvh_distance_queries/bvh_search_tree.py", line 42, in forward
    outputs = bvh_distance_queries_cuda.distance_queries(
RuntimeError: triangles must be a CUDA tensor
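
The message says that the `triangles` argument reaching `bvh_distance_queries_cuda.distance_queries` is not on the GPU. One possibility, which is only an assumption and not confirmed from the code, is that the mesh data stays on the CPU when -O is disabled. A minimal sketch for narrowing this down is shown here. The helper names `find_non_cuda` and `to_cuda_if_needed` are hypothetical and not part of INSTA-pytorch; they rely only on the `is_cuda` attribute and the `.to()` method that `torch.Tensor` provides.

```python
def find_non_cuda(**tensors):
    """Return the names of the given tensors that are not on a CUDA device.

    Only the `is_cuda` attribute is inspected, so any torch.Tensor works.
    Objects without that attribute are reported as non-CUDA.
    """
    return [name for name, t in tensors.items() if not getattr(t, "is_cuda", False)]


def to_cuda_if_needed(tensor, device="cuda"):
    """Return the tensor unchanged if it is already on CUDA, else move it to `device`."""
    if getattr(tensor, "is_cuda", False):
        return tensor
    return tensor.to(device)


# Hypothetical usage just before the failing call in warp_position (renderer.py):
#
#   print(find_non_cuda(triangles=triangles, xyzs=xyzs))
#   triangles = to_cuda_if_needed(triangles)
#   distances, closest_points, closest_faces, closest_bcs = m(triangles, xyzs.unsqueeze(dim=0))
```

If `find_non_cuda` reports `triangles`, moving it with `to_cuda_if_needed` may serve as a temporary workaround, although it would not explain why the tensor was created on the CPU in the first place.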

PS: My machine has an NVIDIA GeForce RTX 3090 with 24 GB of VRAM and over 200 GB of RAM, so I am also confused about why it cannot meet the memory requirement when -O is enabled.
