Describe the bug
When using e.g. warp_perspective or remap with the REPLICATE border mode, certain transformations can result in an illegal cuda memory access.
Steps/Code to reproduce bug
I was able to reproduce the illegal memory access on RTX 3090 and RTX 4090 cards, on both Intel and AMD cpus.
When compiling cvcuda with -DCMAKE_CUDA_FLAGS_RELEASE=-O0, the following assert triggers instead of an illegal memory access:
CV-CUDA/src/cvcuda/include/cvcuda/cuda_tools/BorderWrap.hpp, line 123 in 56a4d2a
Expected behavior
It would be great if these transformations didn't trigger an illegal memory access.
Environment overview
Environment details
Additional context
After spending a bit of time debugging, it appears to me that the crash is caused by an integer overflow in InterpolationWrap.hpp. When the given transformation is quite extreme, as in the repro above, some of the computed source coordinates may fall outside the range of a signed 32-bit int. When those coordinates are cast to an int32_t, they seem to get clamped to INT32_MIN/INT32_MAX, e.g. in
constexpr inline IndexType __host__ __device__ GetIndexForInterpolation(float c)
(CV-CUDA/src/cvcuda/include/cvcuda/cuda_tools/InterpolationWrap.hpp, line 53 in 56a4d2a)
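To illustrate the clamping I mean, here is a minimal host-side sketch. It is not the CV-CUDA code: SaturatingFloorToInt is a hypothetical helper, and the floor-then-cast rounding is an assumption made for the example. On the host an out-of-range float-to-int cast is undefined behavior, so the saturation that I observed on the device is written out explicitly here.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of CV-CUDA): converts a float coordinate to
// an int32 index, saturating at the int32 limits instead of relying on an
// out-of-range cast, which is undefined behavior in host C++.
inline std::int32_t SaturatingFloorToInt(float c)
{
    if (std::isnan(c))
    {
        return 0; // assumption: treat NaN as index 0
    }

    const float f = std::floor(c);

    // 2^31 is exactly representable as a float, so these comparisons are exact.
    if (f >= 2147483648.0f)
    {
        return std::numeric_limits<std::int32_t>::max();
    }
    if (f <= -2147483648.0f)
    {
        return std::numeric_limits<std::int32_t>::min();
    }

    return static_cast<std::int32_t>(f);
}
```

With an extreme transformation, a source coordinate such as 1e20f ends up as INT32_MAX, which is the value that then reaches InterpolationWrap::operator[].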
Calling InterpolationWrap::operator[] with INT32_MAX can result in an integer overflow when it attempts to compute the coordinates of the neighboring pixels, e.g. at CV-CUDA/src/cvcuda/include/cvcuda/cuda_tools/InterpolationWrap.hpp, line 434 in 56a4d2a.
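The neighbor computation is essentially x1 + 1 on a signed 32-bit int, which is undefined behavior when x1 is INT32_MAX. The sketch below shows one way to detect that case and to compute the neighbor without overflowing, by widening to 64 bits first. Both function names are hypothetical and only serve the illustration; this is not how CV-CUDA currently computes the neighbor.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true when computing x1 + 1 in int32 would overflow.
inline bool NeighborIndexOverflows(std::int32_t x1)
{
    return x1 == std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: returns x1 + 1, saturating at INT32_MAX.
// The addition is done in 64 bits, so it can never overflow.
inline std::int32_t SafeNeighborIndex(std::int32_t x1)
{
    const std::int64_t wide = static_cast<std::int64_t>(x1) + 1;
    const std::int64_t maxIndex = std::numeric_limits<std::int32_t>::max();
    return static_cast<std::int32_t>(wide > maxIndex ? maxIndex : wide);
}
```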
I believe the reason this triggers an illegal memory access when using the REPLICATE border mode is that the c < 0 check in GetIndexWithBorder is skipped when called with x1 + 1, since the function is inlined and has already been called with x1, and therefore the compiler treats the check as unreachable code for x1 + 1 (CV-CUDA/src/cvcuda/include/cvcuda/cuda_tools/BorderWrap.hpp, line 86 in 56a4d2a).
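If that analysis is right, the mechanism would be the following: because signed overflow is undefined, the compiler may assume x1 + 1 > x1, so once x1 is known to be non-negative it can drop the c < 0 branch for x1 + 1, while at runtime INT32_MAX + 1 wraps to a negative index. One possible mitigation, sketched below under my own assumptions, is to clamp x1 to [-1, size] before adding 1. ReplicateIndex and ReplicateNeighbors are hypothetical stand-ins, not the actual GetIndexWithBorder implementation, and size is assumed to be at least 1.

```cpp
#include <cstdint>

// Hypothetical stand-in for a REPLICATE border lookup: clamps c to [0, size - 1].
inline std::int32_t ReplicateIndex(std::int32_t c, std::int32_t size)
{
    if (c < 0)
    {
        return 0;
    }
    if (c >= size)
    {
        return size - 1;
    }
    return c;
}

struct NeighborPair
{
    std::int32_t x1;
    std::int32_t x2;
};

// Hypothetical helper: returns the REPLICATE indices for x1 and x1 + 1.
// x1 is first clamped to [-1, size]; any value outside that range maps to the
// same border pixels anyway, and after clamping x1 + 1 cannot overflow.
inline NeighborPair ReplicateNeighbors(std::int32_t x1, std::int32_t size)
{
    const std::int32_t safe = x1 < -1 ? -1 : (x1 > size ? size : x1);
    return {ReplicateIndex(safe, size), ReplicateIndex(safe + 1, size)};
}
```

For a row of 10 pixels, ReplicateNeighbors(INT32_MAX, 10) yields the last pixel for both neighbors, which is what I would expect REPLICATE to produce for a coordinate far to the right of the image.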