-
-
Notifications
You must be signed in to change notification settings - Fork 8.5k
[V1] Add VLLM_ALLOW_INSECURE_SERIALIZATION env var #17490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[V1] Add VLLM_ALLOW_INSECURE_SERIALIZATION env var #17490
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
This changes `vllm.v1.serial_utils` to disable the pickle fallback by default. Add an environment variable that can be used to turn the pickle fallback back on. Follow-up to vllm-project#17427. Signed-off-by: Russell Bryant <[email protected]>
381b54f
to
f12b92a
Compare
@njhill I updated this to disable pickle by default and enabled CI. We'll see what happens .. |
It broke. I'll look for a bit to see if it's easy ... |
Signed-off-by: Nick Hill <[email protected]>
The new test failure is something that's been fixed on main.. let's wait for the rest to finish to see if anything else pops up though before rebasing. |
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
The V1 test failures are structured output and unrelated. It looks like the model didn't terminate the JSON object before it hit the output length limit. I'll try to add something to the prompt telling the model to keep its response as short as possible to see if that helps make it less likely to fail. |
Update all prompts with an instruction to keep responses short. We see ocassional failures when output has not reached the end before hitting the output length limit, particularly in the JSON cases. Hopefully this helps. Signed-off-by: Russell Bryant <[email protected]>
I am trying to repro the other legitimate failure locally |
Signed-off-by: Nick Hill <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good to me!
This change is related to vllm-project#17490. This test didn't run on that PR, so the failure was missed. This code currently relies on the pickle fallback, so pickle support must be enabled to use it for now. Signed-off-by: Russell Bryant <[email protected]>
This is a follow-up to vllm-project#17490 and improves related logging. - If serialization fails, include a pointer to the environment variable that will enable `pickle` fallback serialization. - Only log the warning about using insecure serialization once. Previously we were logging it several times. Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: Russell Bryant <[email protected]> Signed-off-by: Nick Hill <[email protected]> Co-authored-by: Nick Hill <[email protected]> Signed-off-by: 汪志鹏 <[email protected]>
Signed-off-by: Russell Bryant <[email protected]> Signed-off-by: Nick Hill <[email protected]> Co-authored-by: Nick Hill <[email protected]> Signed-off-by: Mu Huai <[email protected]>
We occasionally see the JSON format structured output tests fail in CI. PR vllm-project#17490 included a change to the prompts asking to make the response as short as possible. This change includes a couple more things to help: - Increase the output length limit. The failures occur when we cut off the output before a JSON object is properly terminated. - Set `additionalProperties` to `False` in each JSON schema used. This should restrict the model from adding properties not specified in the schemas, unnecessarily increasing the size of the JSON object output and making it more likely to hit the length limit before it finishes. Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: Russell Bryant <[email protected]> Signed-off-by: Nick Hill <[email protected]> Co-authored-by: Nick Hill <[email protected]>
Signed-off-by: Russell Bryant <[email protected]> Signed-off-by: Nick Hill <[email protected]> Co-authored-by: Nick Hill <[email protected]> Signed-off-by: Yuqi Zhang <[email protected]>
Signed-off-by: Russell Bryant <[email protected]> Signed-off-by: Nick Hill <[email protected]> Co-authored-by: Nick Hill <[email protected]> Signed-off-by: minpeter <[email protected]>
commit f12b92a
Author: Russell Bryant [email protected]
Date: Tue Apr 29 21:09:15 2025 -0400