From 0f77ac86a9f711749ae59f32bdd770fc6e92d6af Mon Sep 17 00:00:00 2001
From: Angus Jelinek
Date: Thu, 22 May 2025 15:44:05 -0700
Subject: [PATCH] Add all supported otel mappings

---
 .../trace_with_opentelemetry.mdx | 204 ++++++++++++++----
 1 file changed, 166 insertions(+), 38 deletions(-)

diff --git a/docs/observability/how_to_guides/trace_with_opentelemetry.mdx b/docs/observability/how_to_guides/trace_with_opentelemetry.mdx
index 54af1856a..a91e1e702 100644
--- a/docs/observability/how_to_guides/trace_with_opentelemetry.mdx
+++ b/docs/observability/how_to_guides/trace_with_opentelemetry.mdx
@@ -111,47 +111,175 @@ if __name__ == "__main__":
 You should see a trace in your LangSmith dashboard [like this one](https://smith.langchain.com/public/4f2890b1-f105-44aa-a6cf-c777dcc27a37/r).
 
-## Supported OpenTelemetry Attribute Mapping
+## Supported OpenTelemetry Attribute and Event Mapping
 
 When sending traces to LangSmith via OpenTelemetry, the following attributes are mapped to LangSmith fields.
-| OpenTelemetry Attribute | LangSmith Field | Notes |
-| ---------------------------------- | ------------------------------------- | ---------------------------------------------------------------------------- |
-| `langsmith.trace.name` | Run Name | Overrides the span name for the run |
-| `langsmith.span.kind` | Run Type | Values: `llm`, `chain`, `tool`, `retriever`, `embedding`, `prompt`, `parser` |
-| `langsmith.span.id` | Run ID | Unique identifier for the span |
-| `langsmith.trace.id` | Trace ID | Unique identifier for the trace |
-| `langsmith.span.dotted_order` | Dotted Order | Position in the execution tree |
-| `langsmith.span.parent_id` | Parent Run ID | ID of the parent span |
-| `langsmith.trace.session_id` | Session ID | Session identifier for related traces |
-| `langsmith.trace.session_name` | Session Name | Name of the session |
-| `langsmith.span.tags` | Tags | Custom tags attached to the span |
-| `gen_ai.system` | `metadata.ls_provider` | The GenAI system (e.g., "openai", "anthropic") |
-| `gen_ai.prompt` | `inputs` | The input prompt sent to the model |
-| `gen_ai.completion` | `outputs` | The output generated by the model |
-| `gen_ai.prompt.{n}.role` | `inputs.messages[n].role` | Role for the nth input message |
-| `gen_ai.prompt.{n}.content` | `inputs.messages[n].content` | Content for the nth input message |
-| `gen_ai.completion.{n}.role` | `outputs.messages[n].role` | Role for the nth output message |
-| `gen_ai.completion.{n}.content` | `outputs.messages[n].content` | Content for the nth output message |
-| `gen_ai.request.model` | `invocation_params.model` | The model name used for the request |
-| `gen_ai.response.model` | `invocation_params.model` | The model name returned in the response |
-| `gen_ai.request.temperature` | `invocation_params.temperature` | Temperature setting |
-| `gen_ai.request.top_p` | `invocation_params.top_p` | Top-p sampling setting |
-| `gen_ai.request.max_tokens` | `invocation_params.max_tokens` | Maximum tokens setting |
-| `gen_ai.request.frequency_penalty` | `invocation_params.frequency_penalty` | Frequency penalty setting |
-| `gen_ai.request.presence_penalty` | `invocation_params.presence_penalty` | Presence penalty setting |
-| `gen_ai.request.seed` | `invocation_params.seed` | Random seed used for generation |
-| `gen_ai.request.stop_sequences` | `invocation_params.stop` | Sequences that stop generation |
-| `gen_ai.request.top_k` | `invocation_params.top_k` | Top-k sampling parameter |
-| `gen_ai.request.encoding_formats` | `invocation_params.encoding_formats` | Output encoding formats |
-| `gen_ai.usage.input_tokens` | `usage_metadata.input_tokens` | Number of input tokens used |
-| `gen_ai.usage.output_tokens` | `usage_metadata.output_tokens` | Number of output tokens used |
-| `gen_ai.usage.total_tokens` | `usage_metadata.total_tokens` | Total number of tokens used |
-| `gen_ai.usage.prompt_tokens` | `usage_metadata.input_tokens` | Number of input tokens used (deprecated) |
-| `gen_ai.usage.completion_tokens` | `usage_metadata.output_tokens` | Number of output tokens used (deprecated) |
-| `input.value` | `inputs` | Full input value, can be string or JSON |
-| `output.value` | `outputs` | Full output value, can be string or JSON |
-| `langsmith.metadata.{key}` | `metadata.{key}` | Custom metadata |
+### Core LangSmith Attributes
+
+:::caution Run Hierarchy Attributes
+The attributes that determine run hierarchy (`langsmith.span.id`, `langsmith.trace.id`, `langsmith.span.dotted_order`, `langsmith.span.parent_id`) should generally not be set manually; they are primarily used internally by the LangSmith SDK when tracing with OpenTelemetry. While setting these attributes can improve performance, it is not recommended for most use cases because incorrect values can interfere with proper run tree construction.
+For more details on how these attributes work, see the [Run Data Format documentation](https://docs.smith.langchain.com/reference/data_formats/run_data_format#what-is-dotted_order).
+:::
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| ------------------------------ | ---------------- | ---------------------------------------------------------------------------- |
+| `langsmith.trace.name` | Run Name | Overrides the span name for the run |
+| `langsmith.span.kind` | Run Type | Values: `llm`, `chain`, `tool`, `retriever`, `embedding`, `prompt`, `parser` |
+| `langsmith.span.id` | Run ID | Unique identifier for the span |
+| `langsmith.trace.id` | Trace ID | Unique identifier for the trace |
+| `langsmith.span.dotted_order` | Dotted Order | Position in the execution tree |
+| `langsmith.span.parent_id` | Parent Run ID | ID of the parent span |
+| `langsmith.trace.session_id` | Session ID | Session identifier for related traces |
+| `langsmith.trace.session_name` | Session Name | Name of the session |
+| `langsmith.span.tags` | Tags | Custom tags attached to the span (comma-separated) |
+| `langsmith.metadata.{key}` | `metadata.{key}` | Custom metadata with the `langsmith` prefix |
+
+### GenAI Standard Attributes
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| --------------------------------------- | ----------------------------- | ------------------------------------------------------------- |
+| `gen_ai.system` | `metadata.ls_provider` | The GenAI system (e.g., "openai", "anthropic") |
+| `gen_ai.operation.name` | Run Type | Maps "chat"/"completion" to "llm", "embedding" to "embedding" |
+| `gen_ai.prompt` | `inputs` | The input prompt sent to the model |
+| `gen_ai.completion` | `outputs` | The output generated by the model |
+| `gen_ai.prompt.{n}.role` | `inputs.messages[n].role` | Role for the nth input message |
+| `gen_ai.prompt.{n}.content` | `inputs.messages[n].content` | Content for the nth input message |
+| `gen_ai.prompt.{n}.message.role` | `inputs.messages[n].role` | Alternative format for role |
+| `gen_ai.prompt.{n}.message.content` | `inputs.messages[n].content` | Alternative format for content |
+| `gen_ai.completion.{n}.role` | `outputs.messages[n].role` | Role for the nth output message |
+| `gen_ai.completion.{n}.content` | `outputs.messages[n].content` | Content for the nth output message |
+| `gen_ai.completion.{n}.message.role` | `outputs.messages[n].role` | Alternative format for role |
+| `gen_ai.completion.{n}.message.content` | `outputs.messages[n].content` | Alternative format for content |
+| `gen_ai.tool.name` | `invocation_params.tool_name` | Tool name, also sets run type to "tool" |
+
+### GenAI Request Parameters
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| ---------------------------------- | ------------------------------------- | --------------------------------------- |
+| `gen_ai.request.model` | `invocation_params.model` | The model name used for the request |
+| `gen_ai.response.model` | `invocation_params.model` | The model name returned in the response |
+| `gen_ai.request.temperature` | `invocation_params.temperature` | Temperature setting |
+| `gen_ai.request.top_p` | `invocation_params.top_p` | Top-p sampling setting |
+| `gen_ai.request.max_tokens` | `invocation_params.max_tokens` | Maximum tokens setting |
+| `gen_ai.request.frequency_penalty` | `invocation_params.frequency_penalty` | Frequency penalty setting |
+| `gen_ai.request.presence_penalty` | `invocation_params.presence_penalty` | Presence penalty setting |
+| `gen_ai.request.seed` | `invocation_params.seed` | Random seed used for generation |
+| `gen_ai.request.stop_sequences` | `invocation_params.stop` | Sequences that stop generation |
+| `gen_ai.request.top_k` | `invocation_params.top_k` | Top-k sampling parameter |
+| `gen_ai.request.encoding_formats` | `invocation_params.encoding_formats` | Output encoding formats |
+
+### GenAI Usage Metrics
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| -------------------------------- | ------------------------------ | ----------------------------------------- |
+| `gen_ai.usage.input_tokens` | `usage_metadata.input_tokens` | Number of input tokens used |
+| `gen_ai.usage.output_tokens` | `usage_metadata.output_tokens` | Number of output tokens used |
+| `gen_ai.usage.total_tokens` | `usage_metadata.total_tokens` | Total number of tokens used |
+| `gen_ai.usage.prompt_tokens` | `usage_metadata.input_tokens` | Number of input tokens used (deprecated) |
+| `gen_ai.usage.completion_tokens` | `usage_metadata.output_tokens` | Number of output tokens used (deprecated) |
+
+### TraceLoop Attributes
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| ---------------------------------------- | ---------------- | ------------------------------------------------ |
+| `traceloop.entity.input` | `inputs` | Full input value from TraceLoop |
+| `traceloop.entity.output` | `outputs` | Full output value from TraceLoop |
+| `traceloop.entity.name` | Run Name | Entity name from TraceLoop |
+| `traceloop.span.kind` | Run Type | Maps to LangSmith run types |
+| `traceloop.llm.request.type` | Run Type | "embedding" maps to "embedding", others to "llm" |
+| `traceloop.association.properties.{key}` | `metadata.{key}` | Custom metadata with the `traceloop` prefix |
+
+### OpenInference Attributes
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| ------------------------- | ------------------------ | ----------------------------------------- |
+| `input.value` | `inputs` | Full input value, can be string or JSON |
+| `output.value` | `outputs` | Full output value, can be string or JSON |
+| `openinference.span.kind` | Run Type | Maps various kinds to LangSmith run types |
+| `llm.system` | `metadata.ls_provider` | LLM system provider |
+| `llm.model_name` | `metadata.ls_model_name` | Model name from OpenInference |
+| `tool.name` | Run Name | Tool name when span kind is "TOOL" |
+| `metadata` | `metadata.*` | JSON string of metadata to be merged |
+
+### LLM Attributes
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| ---------------------------- | ------------------------------------- | ------------------------------------ |
+| `llm.input_messages` | `inputs.messages` | Input messages |
+| `llm.output_messages` | `outputs.messages` | Output messages |
+| `llm.token_count.prompt` | `usage_metadata.input_tokens` | Prompt token count |
+| `llm.token_count.completion` | `usage_metadata.output_tokens` | Completion token count |
+| `llm.token_count.total` | `usage_metadata.total_tokens` | Total token count |
+| `llm.usage.total_tokens` | `usage_metadata.total_tokens` | Alternative total token count |
+| `llm.invocation_parameters` | `invocation_params.*` | JSON string of invocation parameters |
+| `llm.presence_penalty` | `invocation_params.presence_penalty` | Presence penalty |
+| `llm.frequency_penalty` | `invocation_params.frequency_penalty` | Frequency penalty |
+| `llm.request.functions` | `invocation_params.functions` | Function definitions |
+
+### Prompt Template Attributes
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| ------------------------------- | --------------- | -------------------------------------------------- |
+| `llm.prompt_template.variables` | Run Type | Sets run type to "prompt"; used with `input.value` |
+
+### Retriever Attributes
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| ------------------------------------------- | ----------------------------------- | --------------------------------------------- |
+| `retrieval.documents.{n}.document.content` | `outputs.documents[n].page_content` | Content of the nth retrieved document |
+| `retrieval.documents.{n}.document.metadata` | `outputs.documents[n].metadata` | Metadata of the nth retrieved document (JSON) |
+
+### Tool Attributes
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| ----------------------- | ---------------------------------- | ----------------------------------------- |
+| `tools` | `invocation_params.tools` | Array of tool definitions |
+| `tool_arguments` | `invocation_params.tool_arguments` | Tool arguments as JSON or key-value pairs |
+
+### Logfire Attributes
+
+| OpenTelemetry Attribute | LangSmith Field | Notes |
+| ----------------------- | ------------------ | --------------------------------------------------------------------------- |
+| `prompt` | `inputs` | Logfire prompt input |
+| `all_messages_events` | `outputs` | Logfire message events output |
+| `events` | `inputs`/`outputs` | Logfire events array; input events map to `inputs`, choice events to `outputs` |
+
+## OpenTelemetry Event Mapping
+
+| Event Name | LangSmith Field | Notes |
+| --------------------------- | -------------------- | ---------------------------------------------------------------- |
+| `gen_ai.content.prompt` | `inputs` | Extracts prompt content from event attributes |
+| `gen_ai.content.completion` | `outputs` | Extracts completion content from event attributes |
+| `gen_ai.system.message` | `inputs.messages[]` | System message in conversation |
+| `gen_ai.user.message` | `inputs.messages[]` | User message in conversation |
+| `gen_ai.assistant.message` | `outputs.messages[]` | Assistant message in conversation |
+| `gen_ai.tool.message` | `outputs.messages[]` | Tool response message |
+| `gen_ai.choice` | `outputs` | Model choice/response with finish reason |
+| `exception` | `status`, `error` | Sets status to "error" and extracts exception message/stacktrace |
+
+### Event Attribute Extraction
+
+For message events, the following attributes are extracted:
+
+- `content` → message content
+- `role` → message role
+- `id` → tool_call_id (for tool messages)
+- `gen_ai.event.content` → full message JSON
+
+For choice events:
+
+- `finish_reason` → choice finish reason
+- `message.content` → choice message content
+- `message.role` → choice message role
+- `tool_calls.{n}.id` → tool call ID
+- `tool_calls.{n}.function.name` → tool function name
+- `tool_calls.{n}.function.arguments` → tool function arguments
+- `tool_calls.{n}.type` → tool call type
+
+For exception events:
+
+- `exception.message` → error message
+- `exception.stacktrace` → error stacktrace (appended to message)
 
 ## Logging Traces with the Traceloop SDK