# OpenTelemetry LLM Observability Integration
This MCP server now includes comprehensive OpenTelemetry support for LLM observability, compatible with any OpenTelemetry backend including Jaeger, New Relic, Grafana, Datadog, Honeycomb, and more.
## Features
- Universal Compatibility: Works with any OpenTelemetry-compatible backend
- Comprehensive Metrics: Request counts, token usage, latency, error rates
- Distributed Tracing: Full request lifecycle tracking with spans
- Flexible Configuration: Environment-based configuration for different backends
- Zero-Code Integration: Drop-in replacement for existing observability tools
## Quick Start

### 1. Install Dependencies

The OpenTelemetry dependencies are already included in `package.json`:

```bash
npm install
```

### 2. Configure Your Backend

#### Jaeger (Local Development)
```bash
# Start Jaeger locally
docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest

# Configure the MCP server
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=llm-observability-mcp
```

#### New Relic

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.nr-data.net:4318
export OTEL_EXPORTER_OTLP_HEADERS="api-key=YOUR_NEW_RELIC_LICENSE_KEY"
export OTEL_SERVICE_NAME=llm-observability-mcp
```

#### Grafana Cloud

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-central-0.grafana.net/otlp
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic $(echo -n YOUR_INSTANCE_ID:YOUR_API_KEY | base64)"
export OTEL_SERVICE_NAME=llm-observability-mcp
```
#### Datadog

Datadog does not expose a public OTLP intake endpoint; instead, run the Datadog Agent with OTLP ingestion enabled and point the MCP server at it:

```bash
# OTLP ingest on a local Datadog Agent (OTLP ingestion must be enabled in the Agent config)
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=llm-observability-mcp
```
### 3. Start the MCP Server

```bash
# Start with stdio transport
npm run mcp:stdio

# Start with HTTP transport
npm run mcp:http
```

## Usage

### Using the OpenTelemetry Tool

The MCP server provides a new tool: `capture_llm_observability_opentelemetry`

#### Required Parameters
- `userId`: The distinct ID of the user
- `model`: The model used (e.g., `"gpt-4"`, `"claude-3"`)
- `provider`: The LLM provider (e.g., `"openai"`, `"anthropic"`)

#### Optional Parameters

- `traceId`: Trace ID for grouping related events
- `input`: The input to the LLM (messages, prompt, etc.)
- `outputChoices`: The output from the LLM
- `inputTokens`: Number of tokens in the input
- `outputTokens`: Number of tokens in the output
- `latency`: Latency of the LLM call in seconds
- `httpStatus`: HTTP status code of the LLM call
- `baseUrl`: Base URL of the LLM API
- `operationName`: Name of the operation being performed
- `error`: Error message if the request failed
- `errorType`: Type of error (e.g., `"rate_limit"`, `"timeout"`)
- `mcpToolsUsed`: List of MCP tools used during the request
#### Example Usage

```json
{
  "userId": "user-123",
  "model": "gpt-4",
  "provider": "openai",
  "inputTokens": 150,
  "outputTokens": 75,
  "latency": 2.5,
  "httpStatus": 200,
  "operationName": "chat-completion",
  "traceId": "trace-abc123"
}
```
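A client assembling arguments for this tool needs to supply the three required parameters and may omit any optional one. As an illustrative sketch (the helper name and validation logic are not part of the server), the payload above could be built and checked like this:

```javascript
// Hypothetical helper that assembles and validates an argument payload for
// the capture_llm_observability_opentelemetry tool. The required-field list
// mirrors the parameter reference above.
function buildObservabilityPayload(params) {
  const required = ["userId", "model", "provider"];
  for (const field of required) {
    if (typeof params[field] !== "string" || params[field].length === 0) {
      throw new Error(`Missing required parameter: ${field}`);
    }
  }
  // Drop undefined optional fields so the payload stays compact.
  return Object.fromEntries(
    Object.entries(params).filter(([, v]) => v !== undefined)
  );
}
```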
## Configuration Reference

### Environment Variables

| Variable | Description | Default |
|---|---|---|
| `OTEL_SERVICE_NAME` | Service name for OpenTelemetry | `llm-observability-mcp` |
| `OTEL_SERVICE_VERSION` | Service version | `1.0.0` |
| `OTEL_ENVIRONMENT` | Environment name | `development` |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Default OTLP endpoint | - |
| `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT` | Metrics endpoint | - |
| `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` | Traces endpoint | - |
| `OTEL_EXPORTER_OTLP_LOGS_ENDPOINT` | Logs endpoint | - |
| `OTEL_EXPORTER_OTLP_HEADERS` | Headers for authentication (format: `"key1=value1,key2=value2"`) | - |
| `OTEL_METRIC_EXPORT_INTERVAL` | Metrics export interval in ms | `10000` |
| `OTEL_METRIC_EXPORT_TIMEOUT` | Metrics export timeout in ms | `5000` |
| `OTEL_TRACES_SAMPLER_ARG` | Sampling ratio (0.0-1.0) | `1.0` |
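The `OTEL_EXPORTER_OTLP_HEADERS` value is a comma-separated list of `key=value` pairs. A minimal sketch of how such a string can be parsed into a header object (the function name is illustrative; the OTLP exporters do this internally):

```javascript
// Parse the OTEL_EXPORTER_OTLP_HEADERS format ("key1=value1,key2=value2")
// into a plain header object. Values may themselves contain "=", e.g.
// base64 padding in a Basic-auth credential, so split on the first "=" only.
function parseOtlpHeaders(raw) {
  const headers = {};
  if (!raw) return headers;
  for (const pair of raw.split(",")) {
    const idx = pair.indexOf("=");
    if (idx === -1) continue; // skip malformed entries
    headers[pair.slice(0, idx).trim()] = pair.slice(idx + 1).trim();
  }
  return headers;
}
```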
### Backend-Specific Configuration

#### New Relic

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.nr-data.net:4318
export OTEL_EXPORTER_OTLP_HEADERS="api-key=YOUR_LICENSE_KEY"
```

#### Jaeger

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
```

#### Grafana Cloud

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-central-0.grafana.net/otlp
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic $(echo -n YOUR_INSTANCE_ID:YOUR_API_KEY | base64)"
```
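If you generate the Grafana Cloud credential from Node rather than the shell, the same Basic-auth header value can be produced with `Buffer` (instance ID and API key are placeholders):

```javascript
// Build the Grafana Cloud Basic-auth header value in Node, equivalent to
// the `echo -n ID:KEY | base64` shell pipeline above.
function grafanaAuthHeader(instanceId, apiKey) {
  const token = Buffer.from(`${instanceId}:${apiKey}`).toString("base64");
  return `Authorization=Basic ${token}`;
}
```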
#### Honeycomb

```bash
# Set the base endpoint only; the SDK appends the per-signal path (e.g. /v1/traces)
export OTEL_EXPORTER_OTLP_ENDPOINT=https://api.honeycomb.io
export OTEL_EXPORTER_OTLP_HEADERS="x-honeycomb-team=YOUR_API_KEY"
```
#### Lightstep

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=https://ingest.lightstep.com:443
export OTEL_EXPORTER_OTLP_HEADERS="lightstep-access-token=YOUR_ACCESS_TOKEN"
```
## Metrics Collected

### Counters

- `llm.requests.total`: Total number of LLM requests
- `llm.tokens.total`: Total tokens used (input + output)

### Histograms

- `llm.latency.duration`: Request latency in milliseconds

### Gauges

- `llm.requests.active`: Number of active requests

### Trace Attributes

- `llm.model`: The model used
- `llm.provider`: The provider name
- `llm.user_id`: The user ID
- `llm.operation`: The operation name
- `llm.input_tokens`: Input token count
- `llm.output_tokens`: Output token count
- `llm.total_tokens`: Total token count
- `llm.latency_ms`: Latency in milliseconds
- `llm.http_status`: HTTP status code
- `llm.base_url`: API base URL
- `llm.error`: Error message (if any)
- `llm.error_type`: Error type classification
- `llm.input`: Input content (optional)
- `llm.output`: Output content (optional)
- `llm.mcp_tools_used`: MCP tools used
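As a sketch of how tool parameters map onto these `llm.*` attribute names (a real implementation would pass the result to `span.setAttributes()` from `@opentelemetry/api`; this pure function only shows the mapping, and the derived `llm.total_tokens` and seconds-to-milliseconds conversion are assumptions based on the parameter reference above):

```javascript
// Map tool parameters onto the llm.* span attribute names listed above.
function toSpanAttributes(p) {
  const attrs = {
    "llm.model": p.model,
    "llm.provider": p.provider,
    "llm.user_id": p.userId,
    "llm.operation": p.operationName,
    "llm.input_tokens": p.inputTokens,
    "llm.output_tokens": p.outputTokens,
    // The tool takes latency in seconds; the attribute is in milliseconds.
    "llm.latency_ms": p.latency !== undefined ? p.latency * 1000 : undefined,
    "llm.http_status": p.httpStatus,
  };
  if (p.inputTokens !== undefined && p.outputTokens !== undefined) {
    attrs["llm.total_tokens"] = p.inputTokens + p.outputTokens;
  }
  // Drop unset attributes; OpenTelemetry attribute values must be defined.
  return Object.fromEntries(
    Object.entries(attrs).filter(([, v]) => v !== undefined)
  );
}
```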
## Testing with Jaeger

### 1. Start Jaeger

```bash
docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest
```

### 2. Configure MCP Server

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=llm-observability-mcp
npm run mcp:stdio
```
### 3. View Traces

Open http://localhost:16686 to view traces in the Jaeger UI.
## Migration from PostHog
The OpenTelemetry tool is designed to be a drop-in replacement for the PostHog tool. Both tools can coexist, allowing for gradual migration:
- PostHog Tool: `capture_llm_observability`
- OpenTelemetry Tool: `capture_llm_observability_opentelemetry`
Both tools accept the same parameters, making migration straightforward.
## Troubleshooting

### Common Issues

#### No Data in Backend
- Verify endpoint URLs are correct
- Check authentication headers
- Ensure network connectivity
- Check server logs for errors
#### High Resource Usage

- Adjust the sampling ratio: `OTEL_TRACES_SAMPLER_ARG=0.1`
- Increase export intervals: `OTEL_METRIC_EXPORT_INTERVAL=30000`
#### Missing Traces
- Verify OpenTelemetry is enabled (check for endpoint configuration)
- Check for initialization errors in logs
- Ensure proper service name configuration
### Debug Mode

Enable debug logging:

```bash
export DEBUG=true
npm run mcp:stdio
```
## Advanced Configuration

### Custom Headers

```bash
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer token,Custom-Header=value"
```
### Multiple Backends

Configure different endpoints for metrics and traces:

```bash
export OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=https://metrics.example.com
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=https://traces.example.com
```
### Sampling Configuration

```bash
# Sample 10% of traces
export OTEL_TRACES_SAMPLER_ARG=0.1
```
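Ratio-based sampling works by hashing each trace ID into the range [0, 1) and keeping the trace when the hash falls below the configured ratio, so the decision is deterministic per trace. The SDK's `TraceIdRatioBased` sampler uses a comparable scheme on the trace ID bytes; the sketch below is a simplified illustration, not the SDK's exact algorithm:

```javascript
// Simplified ratio-based sampling decision: derive a bucket in [0, 1)
// from the first 32 bits of the hex trace ID and compare to the ratio.
function shouldSample(traceIdHex, ratio) {
  const bucket = parseInt(traceIdHex.slice(0, 8), 16) / 0x100000000;
  return bucket < ratio;
}
```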
## Support
For issues or questions:
- Check the troubleshooting section above
- Review server logs with `DEBUG=true`
- Verify OpenTelemetry configuration
- Test with Jaeger locally first