Agent does not see message argument in ToolMock #492

Open
jmatejcz opened this issue Mar 28, 2025 · 2 comments
Labels
bug Something isn't working invalid This doesn't seem right priority/minor Lower-priority tasks that can be picked up when time allows or planned for later.

Comments

@jmatejcz
Contributor

Describe the bug
A mock of PublishROS2MessageTool was created in PR #487.
When given this tool, the agent does not see the message argument at all.

To Reproduce
Steps to reproduce the behavior:

  1. Go to branch jm/feat/tool-benchmark-custom-interfaces
  2. Run
     python src/rai_bench/rai_bench/examples/tool_calling_agent_test_bench.py
  3. Check logs for agent output

Expected behavior
The agent should see all of the tool's arguments.

Screenshots

Image

Output of the agent when asked what arguments the tool has:

AIMessage(content='The `publish_ros2_message` tool has the following arguments:\n\n1. **topic**: A string that specifies the topic to which the message will be published.\n2. **message_type**: A string that represents the type of the message that will be published (e.g., `std_msgs/msg/String`). \n\nIf you need further assistance

When the arguments are declared explicitly in the mock class and a new field is added to args_schema:

Image

The agent outputs:

AIMessage(content='The `publish_ros2_message` tool has the following arguments:\n\n1. **topic**: The topic to which the message will be published (as a string).\n2. **message_type**: The type of the message being published (as a string).\n3. **costam**: This appears to be a placeholder for the actual message content that you want to publish (as a string).\n\nIf you need

So for some reason the agent ignores only the message argument.

When the agent uses the original tool, the problem does not occur:

Image

Platform

  • OS: Ubuntu 22.04
  • ROS 2 Version - Humble

Version
Release number or commit hash.

@jmatejcz jmatejcz added the bug Something isn't working label Mar 28, 2025
@maciejmajek
Member

maciejmajek commented Mar 29, 2025

Seems like a problem with the OpenAI model.
gpt-4o-mini: https://smith.langchain.com/public/36837843-cf82-40de-9775-89af3e242201/r
qwen2.5-coder:32b: https://smith.langchain.com/public/75ee8985-f24d-4c1c-9471-1d4f82c5aea1/r

Details

from pprint import pprint

from langchain_community.tools.convert_to_openai import format_tool_to_openai_tool
from langchain_core.runnables import RunnableConfig
from rai.utils.model_initialization import get_llm_model, get_tracing_callbacks
from rai_bench.tool_calling_agent_bench.mocked_tools import MockPublishROS2MessageTool

llm = get_llm_model(model_type="complex_model")

# Mocked tool restricted to a single topic and message type
tool = MockPublishROS2MessageTool(
    available_topics=["/ari_test_topic"],
    available_message_types=["std_msgs/msg/String"],
)

llm_with_tool = llm.bind_tools([tool])

# Invoke with tracing enabled so the run can be inspected in LangSmith
response = llm_with_tool.invoke(
    "Publish a message to the /ari_test_topic topic with the message 'Hello, world!'",
    config=RunnableConfig(callbacks=get_tracing_callbacks()),
)

pprint(response.model_dump())
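
To compare models without reading the full dump by eye, the tool calls can be checked for missing arguments. The following is only a sketch: `missing_tool_args` is a hypothetical helper, and `sample` is a hand-written stand-in shaped like the `tool_calls` entry of a dumped message, not real model output.

```python
from typing import Any


def missing_tool_args(dumped: dict[str, Any], expected: set[str]) -> dict[str, set[str]]:
    """Map each tool call name to the expected arguments it did not supply."""
    report: dict[str, set[str]] = {}
    for call in dumped.get("tool_calls", []):
        supplied = set(call.get("args", {}))
        report[call["name"]] = expected - supplied
    return report


# Hand-written stand-in for response.model_dump(); an assumption for illustration.
sample = {
    "content": "",
    "tool_calls": [
        {
            "name": "publish_ros2_message",
            "args": {"topic": "/ari_test_topic", "message_type": "std_msgs/msg/String"},
            "id": "call_0",
            "type": "tool_call",
        }
    ],
}

print(missing_tool_args(sample, {"topic", "message_type", "message"}))
```

Passing `response.model_dump()` in place of `sample` should give the same kind of report for a real run, assuming the dump exposes `tool_calls` in this shape.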

@jmatejcz
Contributor Author

I've tested it further. It seems that OpenAI models (4o, 4o-mini) do not recognize arguments of type dict. Other types such as int, str, and list do not cause this problem.
Other models I've tested, such as llama3.2 or qwen2.5-coder, do not have this issue.
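
One way to spot arguments that might be affected is to scan a tool's parameter schema for object-typed fields that declare no inner properties, which is how a free-form dict is commonly represented in JSON Schema. This is a rough sketch: `open_ended_object_params` is a hypothetical helper, and `example_schema` is hand-written for illustration, not the schema actually generated for MockPublishROS2MessageTool.

```python
from typing import Any


def open_ended_object_params(schema: dict[str, Any]) -> list[str]:
    """Return parameter names typed as "object" with no declared properties."""
    flagged = []
    for name, spec in schema.get("properties", {}).items():
        if spec.get("type") == "object" and not spec.get("properties"):
            flagged.append(name)
    return flagged


# Hand-written illustration of a tool parameter schema (an assumption).
example_schema = {
    "type": "object",
    "properties": {
        "topic": {"type": "string"},
        "message_type": {"type": "string"},
        # A free-form dict argument usually ends up as a bare object.
        "message": {"type": "object"},
    },
    "required": ["topic", "message", "message_type"],
}

print(open_ended_object_params(example_schema))
```

If the flagged fields match the arguments the model fails to list, that would support the dict-type explanation. It does not by itself show why a given model drops them.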

@maciejmajek maciejmajek added invalid This doesn't seem right priority/minor Lower-priority tasks that can be picked up when time allows or planned for later. labels Apr 17, 2025