Failed to build TensorRT-LLM backend for Triton server. #728

Open
sdecoder opened this issue Mar 24, 2025 · 1 comment

Comments

@sdecoder

Greetings, I have come across the following issue when trying to build the TensorRT-LLM backend for the Triton server:
/home/nvidia/projects/triton-inference-server/tensorrtllm_backend/inflight_batcher_llm/../tensorrt_llm/cpp/include/tensorrt_llm/common/dataType.h:40:30: error: ‘kFP4’ is not a member of ‘nvinfer1::DataType’; did you mean ‘kFP8’?

I followed the instructions found here:
https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/docs/build.md
1.1. The command I am using, run from /home/nvidia/projects/triton-inference-server/tensorrtllm_backend/inflight_batcher_llm, is:
bash scripts/build.sh
1.2. Platform: Jetson AGX Orin;
1.3. TensorRT version: TensorRT-10.3.0.26
1.4. CUDA version: 12.6
I do believe the TensorRT version and CUDA version are compatible.
Is anyone willing to take a look at this and give me a hint?
Thank you everyone.
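
For completeness, a minimal sketch to confirm which TensorRT version the installed headers actually report (assuming `NvInferVersion.h` is on the include path; the compile command and include directory below are illustrative, not from my build logs):

```cpp
// check_trt_version.cpp - print the TensorRT version the compiler sees.
// Example build (paths are illustrative):
//   g++ check_trt_version.cpp -I/usr/include/aarch64-linux-gnu -o check_trt_version
#include <NvInferVersion.h>
#include <cstdio>

int main() {
    // nvinfer1::DataType::kFP4 is only declared in newer TensorRT headers,
    // so the error above suggests the headers found during the build are
    // older than what the TensorRT-LLM sources expect.
    std::printf("TensorRT headers: %d.%d.%d.%d\n",
                NV_TENSORRT_MAJOR, NV_TENSORRT_MINOR,
                NV_TENSORRT_PATCH, NV_TENSORRT_BUILD);
    return 0;
}
```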

@dog-dev-mel

I guess the TensorRT-LLM backend is not available on Jetson devices.
