Commit a17e9ef

Add collector proxy docs (#802)
1 parent 4cf31c7 commit a17e9ef

2 files changed: +177, -0 lines changed

Lines changed: 176 additions & 0 deletions
@@ -0,0 +1,176 @@
# [Beta] LangSmith Collector-Proxy

:::tip Note
The LangSmith Collector-Proxy feature is currently in Beta and subject to change. [View the source code on GitHub](https://github.com/langchain-ai/langsmith-collector-proxy).
:::

The LangSmith Collector-Proxy is a middleware service that aggregates, compresses, and bulk-uploads OpenTelemetry (OTEL) tracing data from your applications to LangSmith. It is optimized for large-scale, parallel environments that generate high volumes of spans.

## When to Use the Collector-Proxy

The Collector-Proxy is particularly valuable when:

- You're running multiple instances of your application in parallel and need to aggregate their traces efficiently.
- You want more efficient tracing than direct OTEL API calls to LangSmith (the collector optimizes batching and compression).
- You're using a language that doesn't have a native LangSmith SDK.

## Key Features

- **Efficient Data Transfer**
  Batches multiple spans into fewer, larger uploads.
- **Compression**
  Uses zstd to minimize payload size.
- **OTLP Support**
  Accepts OTLP JSON and Protobuf over HTTP POST.
- **Semantic Translation**
  Maps GenAI/OpenInference conventions to the LangSmith Run model.
- **Flexible Batching**
  Flush by span count or time interval.

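
The "flush by span count or time interval" behaviour can be pictured with a small sketch. This is an illustration only, not the proxy's actual code: `SpanBuffer`, its methods, and the injectable `clock` are hypothetical names chosen for the example.

```python
import time
from typing import Callable, List, Optional


class SpanBuffer:
    """Toy buffer that flushes on span count or elapsed time (hypothetical)."""

    def __init__(
        self,
        batch_size: int = 100,
        flush_interval_ms: int = 1000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.batch_size = batch_size
        self.flush_interval_s = flush_interval_ms / 1000.0
        self._clock = clock  # injectable so the timing logic can be tested
        self._spans: List[dict] = []
        self._last_flush = clock()

    def add(self, span: dict) -> Optional[List[dict]]:
        """Buffer a span and return a batch if a flush condition is met."""
        self._spans.append(span)
        return self.maybe_flush()

    def maybe_flush(self) -> Optional[List[dict]]:
        """Return the buffered spans if the buffer is full or stale, else None."""
        full = len(self._spans) >= self.batch_size
        stale = (self._clock() - self._last_flush) >= self.flush_interval_s
        if self._spans and (full or stale):
            batch, self._spans = self._spans, []
            self._last_flush = self._clock()
            return batch
        return None
```

A real service would call something like `maybe_flush()` from a timer as well as on every insert, so that a quiet period still triggers the time-based flush.
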
29+
30+
## Configuration
31+
32+
Configure via environment variables:
33+
34+
| Variable | Description | Default |
35+
| -------------------- | --------------------------------- | --------------------------------- |
36+
| `HTTP_PORT` | Port to run the proxy server | `4318` |
37+
| `LANGSMITH_ENDPOINT` | LangSmith backend URL | `https://api.smith.langchain.com` |
38+
| `LANGSMITH_API_KEY` | API key for LangSmith | **Required** (env var or header) |
39+
| `LANGSMITH_PROJECT` | Default tracing project | Default project if not specified |
40+
| `BATCH_SIZE` | Spans per upload batch | `100` |
41+
| `FLUSH_INTERVAL_MS` | Flush interval in milliseconds | `1000` |
42+
| `MAX_BUFFER_BYTES` | Max uncompressed buffer size | `10485760` (10 MB) |
43+
| `MAX_BODY_BYTES` | Max incoming request body size | `209715200` (200 MB) |
44+
| `MAX_RETRIES` | Retry attempts for failed uploads | `3` |
45+
| `RETRY_BACKOFF_MS` | Initial backoff in milliseconds | `100` |
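
As a rough illustration of how these variables and defaults fit together, the sketch below reads them from an environment mapping. The `load_config` helper and `NUMERIC_DEFAULTS` table are hypothetical names for this example; the proxy itself has its own configuration loader.

```python
import os
from typing import Mapping, Optional

# Numeric settings and the defaults listed in the table above.
NUMERIC_DEFAULTS = {
    "HTTP_PORT": 4318,
    "BATCH_SIZE": 100,
    "FLUSH_INTERVAL_MS": 1000,
    "MAX_BUFFER_BYTES": 10 * 1024 * 1024,   # 10485760 (10 MB)
    "MAX_BODY_BYTES": 200 * 1024 * 1024,    # 209715200 (200 MB)
    "MAX_RETRIES": 3,
    "RETRY_BACKOFF_MS": 100,
}


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a settings dict from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    cfg = {name: int(env.get(name, default)) for name, default in NUMERIC_DEFAULTS.items()}
    cfg["LANGSMITH_ENDPOINT"] = env.get("LANGSMITH_ENDPOINT", "https://api.smith.langchain.com")
    # No default for these two: the key may arrive per request in a header,
    # and an unset project falls back to `default` at request time.
    cfg["LANGSMITH_API_KEY"] = env.get("LANGSMITH_API_KEY")
    cfg["LANGSMITH_PROJECT"] = env.get("LANGSMITH_PROJECT")
    return cfg
```
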
### Project Configuration

The Collector-Proxy resolves the LangSmith project in the following priority order:

1. If a project is specified in the request headers (`Langsmith-Project`), that project is used.
2. If no project is specified in the headers, the project set in the `LANGSMITH_PROJECT` environment variable is used.
3. If neither is set, traces go to the `default` project.

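
The priority order above amounts to a three-step fallback. A minimal sketch, assuming headers and environment arrive as plain mappings (`resolve_project` is a hypothetical name, not a function from the proxy):

```python
from typing import Mapping


def resolve_project(headers: Mapping[str, str], env: Mapping[str, str]) -> str:
    """Pick the project: request header first, then env var, then `default`."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return (
        lowered.get("langsmith-project")
        or env.get("LANGSMITH_PROJECT")
        or "default"
    )
```

For example, `resolve_project({}, {"LANGSMITH_PROJECT": "staging"})` returns `"staging"`, while a request carrying `Langsmith-Project: checkout` overrides it.
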
### Authentication

The API key can be provided either:

- As an environment variable (`LANGSMITH_API_KEY`)
- In the request headers (`X-API-Key`)

## Deployment (Docker)

You can deploy the Collector-Proxy with Docker:

1. **Build the image**

   ```bash
   docker build \
     -t langsmith-collector-proxy:beta .
   ```

2. **Run the container**

   ```bash
   docker run -d \
     -p 4318:4318 \
     -e LANGSMITH_API_KEY=<your_api_key> \
     -e LANGSMITH_PROJECT=<your_project> \
     langsmith-collector-proxy:beta
   ```

## Usage

Point any OTLP-compatible client or the OpenTelemetry Collector exporter at:

```bash
# The per-signal variable is used as-is. OpenTelemetry SDKs append /v1/traces to the
# generic OTEL_EXPORTER_OTLP_ENDPOINT, so that one would be set to http://<host>:4318.
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://<host>:4318/v1/traces
export OTEL_EXPORTER_OTLP_HEADERS="X-API-Key=<your_api_key>,Langsmith-Project=<your_project>"
```

Send a test trace:

```bash
curl -X POST http://localhost:4318/v1/traces \
  -H "Content-Type: application/json" \
  --data '{
    "resourceSpans": [
      {
        "resource": {
          "attributes": [
            {
              "key": "service.name",
              "value": { "stringValue": "test-service" }
            }
          ]
        },
        "scopeSpans": [
          {
            "scope": {
              "name": "example/instrumentation",
              "version": "1.0.0"
            },
            "spans": [
              {
                "traceId": "T6nh/mMkIONaoHewS9UWIw==",
                "spanId": "0tEqJwCpvU0=",
                "name": "parent-span",
                "kind": "SPAN_KIND_INTERNAL",
                "startTimeUnixNano": 1747675155185223936,
                "endTimeUnixNano": 1747675156185223936,
                "attributes": [
                  {
                    "key": "gen_ai.prompt",
                    "value": {
                      "stringValue": "{\"text\":\"Hello, world!\"}"
                    }
                  },
                  {
                    "key": "gen_ai.usage.input_tokens",
                    "value": {
                      "intValue": "5"
                    }
                  },
                  {
                    "key": "gen_ai.completion",
                    "value": {
                      "stringValue": "{\"text\":\"Hi there!\"}"
                    }
                  },
                  {
                    "key": "gen_ai.usage.output_tokens",
                    "value": {
                      "intValue": "3"
                    }
                  }
                ],
                "droppedAttributesCount": 0,
                "events": [],
                "links": [],
                "status": {}
              }
            ]
          }
        ]
      }
    ]
  }'
```

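
For clients without an OpenTelemetry SDK, the same test payload can be assembled programmatically. The sketch below mirrors the curl example using only the Python standard library; `build_test_trace` and `build_request` are hypothetical helper names, and the `localhost:4318` base URL assumes the default `HTTP_PORT`.

```python
import base64
import json
import os
import time
import urllib.request
from typing import Optional


def _attr(key: str, value: dict) -> dict:
    """Wrap a key and a typed value in the OTLP attribute shape."""
    return {"key": key, "value": value}


def build_test_trace(name: str = "parent-span", service: str = "test-service") -> dict:
    """Build a one-span OTLP JSON payload shaped like the curl example."""
    end_ns = time.time_ns()
    start_ns = end_ns - 1_000_000_000  # pretend the span lasted one second
    span = {
        # IDs are base64-encoded bytes, as in the curl example:
        # 16 random bytes for the trace ID, 8 for the span ID.
        "traceId": base64.b64encode(os.urandom(16)).decode("ascii"),
        "spanId": base64.b64encode(os.urandom(8)).decode("ascii"),
        "name": name,
        "kind": "SPAN_KIND_INTERNAL",
        "startTimeUnixNano": start_ns,
        "endTimeUnixNano": end_ns,
        "attributes": [
            _attr("gen_ai.prompt", {"stringValue": json.dumps({"text": "Hello, world!"})}),
            _attr("gen_ai.usage.input_tokens", {"intValue": "5"}),
            _attr("gen_ai.completion", {"stringValue": json.dumps({"text": "Hi there!"})}),
            _attr("gen_ai.usage.output_tokens", {"intValue": "3"}),
        ],
    }
    return {
        "resourceSpans": [{
            "resource": {"attributes": [_attr("service.name", {"stringValue": service})]},
            "scopeSpans": [{
                "scope": {"name": "example/instrumentation", "version": "1.0.0"},
                "spans": [span],
            }],
        }]
    }


def build_request(
    payload: dict,
    base_url: str = "http://localhost:4318",
    api_key: Optional[str] = None,
    project: Optional[str] = None,
) -> urllib.request.Request:
    """Prepare (but do not send) the POST to /v1/traces."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-API-Key"] = api_key
    if project:
        headers["Langsmith-Project"] = project
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(f"{base_url}/v1/traces", data=body, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen(build_request(build_test_trace()))` would send the trace to a locally running proxy.
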
## Health & Scaling

- **Liveness**: `GET /live` → 200
- **Readiness**: `GET /ready` → 200

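
A deployment script or orchestrator hook might poll these endpoints before sending traffic. The following is a minimal sketch under that assumption; `is_healthy` and its injectable `opener` parameter are hypothetical and exist only to keep the example testable without a running proxy.

```python
import urllib.error
import urllib.request
from typing import Callable


def is_healthy(
    base_url: str,
    path: str = "/ready",
    timeout: float = 2.0,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """Return True if GET <base_url><path> answers with HTTP 200."""
    try:
        with opener(f"{base_url}{path}", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and non-2xx responses all count as unhealthy.
        return False
```

For instance, `is_healthy("http://localhost:4318", "/live")` checks liveness, and the default path checks readiness.
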
## Horizontal Scaling

To ensure full traces are batched correctly, route spans with the same trace ID to the same instance (e.g., via consistent hashing).

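
One way to route by trace ID is a consistent-hash ring placed in front of the proxy instances. The sketch below is a generic illustration of that technique, not something shipped with the proxy; `TraceRouter`, the `replicas` count, and the instance names are assumptions made for the example.

```python
import bisect
import hashlib
from typing import List


class TraceRouter:
    """Consistent-hash ring mapping trace IDs to proxy instances (hypothetical)."""

    def __init__(self, instances: List[str], replicas: int = 64) -> None:
        if not instances:
            raise ValueError("at least one instance is required")
        # Each instance gets several virtual points on the ring to even out load.
        self._ring = sorted(
            (self._hash(f"{instance}#{i}"), instance)
            for instance in instances
            for i in range(replicas)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def route(self, trace_id: str) -> str:
        """Return the instance owning the first ring point after the trace's hash."""
        idx = bisect.bisect(self._keys, self._hash(trace_id)) % len(self._ring)
        return self._ring[idx][1]
```

Because every span of a trace shares the same trace ID, `route` sends them all to the same instance, and adding or removing an instance only moves the traces whose ring points it owned.
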
## Fork & Extend

Fork the [Collector-Proxy repo on GitHub](https://github.com/langchain-ai/langsmith-collector-proxy) and implement your own converter:

- Create a custom `GenAiConverter` in `internal/translator`
- Register it in `cmd/collector` to handle bespoke OTLP conventions.

docs/observability/how_to_guides/index.md

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ Set up LangSmith tracing to get visibility into your production applications.
 - [Upload files with traces](./how_to_guides/upload_files_with_traces)
 - [Print out logs from the LangSmith SDK (Python Only)](./how_to_guides/output_detailed_logs)
 - [Troubleshooting: Missing or Misrouted Traces](./how_to_guides/toubleshooting_variable_caching)
+- [Using the LangSmith Collector Proxy](./how_to_guides/collector_proxy)
 
 ## Tracing projects UI & API