
OpenTelemetry LLM Observability Integration

This MCP server now includes comprehensive OpenTelemetry support for LLM observability, compatible with any OpenTelemetry backend including Jaeger, New Relic, Grafana, Datadog, Honeycomb, and more.

Features

  • Universal Compatibility: Works with any OpenTelemetry-compatible backend
  • Comprehensive Metrics: Request counts, token usage, latency, error rates
  • Distributed Tracing: Full request lifecycle tracking with spans
  • Flexible Configuration: Environment-based configuration for different backends
  • Zero-Code Integration: Drop-in replacement for existing observability tools

Quick Start

1. Install Dependencies

The OpenTelemetry dependencies are already included in the package.json:

npm install

2. Configure Your Backend

Jaeger (Local Development)

# Start Jaeger locally
docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest

# Configure the MCP server
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=llm-observability-mcp

New Relic

export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.nr-data.net:4318
export OTEL_EXPORTER_OTLP_HEADERS="api-key=YOUR_NEW_RELIC_LICENSE_KEY"
export OTEL_SERVICE_NAME=llm-observability-mcp

Grafana Cloud

export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-central-0.grafana.net/otlp
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic $(echo -n YOUR_INSTANCE_ID:YOUR_API_KEY | base64)"
export OTEL_SERVICE_NAME=llm-observability-mcp

Datadog

# Datadog ingests OTLP through the Datadog Agent rather than its public API.
# Enable OTLP ingestion on the Agent, then point the exporter at its local OTLP/HTTP port:
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=llm-observability-mcp

3. Start the MCP Server

# Start with stdio transport
npm run mcp:stdio

# Start with HTTP transport
npm run mcp:http

Usage

Using the OpenTelemetry Tool

The MCP server provides a new tool: capture_llm_observability_opentelemetry

Required Parameters

  • userId: The distinct ID of the user
  • model: The model used (e.g., "gpt-4", "claude-3")
  • provider: The LLM provider (e.g., "openai", "anthropic")

Optional Parameters

  • traceId: Trace ID for grouping related events
  • input: The input to the LLM (messages, prompt, etc.)
  • outputChoices: The output from the LLM
  • inputTokens: Number of tokens in the input
  • outputTokens: Number of tokens in the output
  • latency: Latency of the LLM call in seconds
  • httpStatus: HTTP status code of the LLM call
  • baseUrl: Base URL of the LLM API
  • operationName: Name of the operation being performed
  • error: Error message if the request failed
  • errorType: Type of error (e.g., "rate_limit", "timeout")
  • mcpToolsUsed: List of MCP tools used during the request
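Taken together, the required and optional parameters above can be sketched as a TypeScript shape (the interface name is illustrative, not part of the server's public API; field names mirror the parameter list):

```typescript
// Illustrative shape of the tool's arguments; the interface name is hypothetical.
interface LlmObservabilityEvent {
  // Required
  userId: string;
  model: string;       // e.g. "gpt-4"
  provider: string;    // e.g. "openai"
  // Optional
  traceId?: string;
  input?: unknown;
  outputChoices?: unknown;
  inputTokens?: number;
  outputTokens?: number;
  latency?: number;    // seconds
  httpStatus?: number;
  baseUrl?: string;
  operationName?: string;
  error?: string;
  errorType?: string;  // e.g. "rate_limit", "timeout"
  mcpToolsUsed?: string[];
}

const event: LlmObservabilityEvent = {
  userId: "user-123",
  model: "gpt-4",
  provider: "openai",
  inputTokens: 150,
  outputTokens: 75,
};
```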

Example Usage

{
  "userId": "user-123",
  "model": "gpt-4",
  "provider": "openai",
  "inputTokens": 150,
  "outputTokens": 75,
  "latency": 2.5,
  "httpStatus": 200,
  "operationName": "chat-completion",
  "traceId": "trace-abc123"
}
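When the tool is invoked over MCP, these arguments travel inside a standard JSON-RPC 2.0 tools/call envelope. A sketch of the full request as an MCP client would send it (the wrapper shape follows the MCP specification; the id value is arbitrary):

```typescript
// JSON-RPC 2.0 envelope for an MCP "tools/call" request.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "capture_llm_observability_opentelemetry",
    arguments: {
      userId: "user-123",
      model: "gpt-4",
      provider: "openai",
      inputTokens: 150,
      outputTokens: 75,
      latency: 2.5,
      httpStatus: 200,
      operationName: "chat-completion",
      traceId: "trace-abc123",
    },
  },
};

console.log(JSON.stringify(request));
```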

Configuration Reference

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| OTEL_SERVICE_NAME | Service name for OpenTelemetry | llm-observability-mcp |
| OTEL_SERVICE_VERSION | Service version | 1.0.0 |
| OTEL_ENVIRONMENT | Environment name | development |
| OTEL_EXPORTER_OTLP_ENDPOINT | Default OTLP endpoint | - |
| OTEL_EXPORTER_OTLP_METRICS_ENDPOINT | Metrics endpoint | - |
| OTEL_EXPORTER_OTLP_TRACES_ENDPOINT | Traces endpoint | - |
| OTEL_EXPORTER_OTLP_LOGS_ENDPOINT | Logs endpoint | - |
| OTEL_EXPORTER_OTLP_HEADERS | Headers for authentication (format: "key1=value1,key2=value2") | - |
| OTEL_METRIC_EXPORT_INTERVAL | Metrics export interval in ms | 10000 |
| OTEL_METRIC_EXPORT_TIMEOUT | Metrics export timeout in ms | 5000 |
| OTEL_TRACES_SAMPLER_ARG | Sampling ratio (0.0-1.0) | 1.0 |
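How these variables interact follows the OpenTelemetry OTLP exporter specification: a signal-specific endpoint is used verbatim, while the generic OTEL_EXPORTER_OTLP_ENDPOINT gets the signal path appended for OTLP/HTTP. A self-contained sketch of that resolution logic:

```typescript
// Sketch of OTLP/HTTP endpoint resolution per the OpenTelemetry spec:
// a signal-specific variable wins and is used as-is; otherwise the
// signal path (/v1/traces, /v1/metrics, /v1/logs) is appended to the
// generic OTEL_EXPORTER_OTLP_ENDPOINT.
function resolveOtlpEndpoint(
  signal: "traces" | "metrics" | "logs",
  env: Record<string, string | undefined>,
): string | undefined {
  const specific = env[`OTEL_EXPORTER_OTLP_${signal.toUpperCase()}_ENDPOINT`];
  if (specific) return specific; // used verbatim
  const base = env["OTEL_EXPORTER_OTLP_ENDPOINT"];
  if (!base) return undefined;
  return `${base.replace(/\/$/, "")}/v1/${signal}`;
}

const env = { OTEL_EXPORTER_OTLP_ENDPOINT: "http://localhost:4318" };
console.log(resolveOtlpEndpoint("traces", env)); // http://localhost:4318/v1/traces
```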

Backend-Specific Configuration

New Relic

export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.nr-data.net:4318
export OTEL_EXPORTER_OTLP_HEADERS="api-key=YOUR_LICENSE_KEY"

Jaeger

export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

Grafana Cloud

export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-central-0.grafana.net/otlp
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic $(echo -n YOUR_INSTANCE_ID:YOUR_API_KEY | base64)"

Honeycomb

# Use the base endpoint; the OTLP/HTTP exporter appends the signal path (e.g. /v1/traces)
export OTEL_EXPORTER_OTLP_ENDPOINT=https://api.honeycomb.io
export OTEL_EXPORTER_OTLP_HEADERS="x-honeycomb-team=YOUR_API_KEY"

Lightstep

export OTEL_EXPORTER_OTLP_ENDPOINT=https://ingest.lightstep.com:443/api/v2/otel/trace
export OTEL_EXPORTER_OTLP_HEADERS="lightstep-access-token=YOUR_ACCESS_TOKEN"

Metrics Collected

Counters

  • llm.requests.total: Total number of LLM requests
  • llm.tokens.total: Total tokens used (input + output)

Histograms

  • llm.latency.duration: Request latency in milliseconds

Gauges

  • llm.requests.active: Number of active requests
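What the server records into each instrument kind can be illustrated with a self-contained sketch (plain TypeScript standing in for the OpenTelemetry SDK; the real server would use @opentelemetry/api instruments with the same metric names):

```typescript
// Minimal in-memory stand-ins for the three instrument kinds,
// illustrating what is recorded per LLM request.
const counters = new Map<string, number>();
const histograms = new Map<string, number[]>();
let activeRequests = 0; // gauge: llm.requests.active

function add(name: string, value: number) {
  counters.set(name, (counters.get(name) ?? 0) + value);
}
function record(name: string, value: number) {
  const bucket = histograms.get(name) ?? [];
  bucket.push(value);
  histograms.set(name, bucket);
}

// Simulate one request: 150 input + 75 output tokens, 2500 ms latency.
activeRequests++;
add("llm.requests.total", 1);
add("llm.tokens.total", 150 + 75);
record("llm.latency.duration", 2500);
activeRequests--;

console.log(counters.get("llm.tokens.total")); // 225
```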

Trace Attributes

  • llm.model: The model used
  • llm.provider: The provider name
  • llm.user_id: The user ID
  • llm.operation: The operation name
  • llm.input_tokens: Input token count
  • llm.output_tokens: Output token count
  • llm.total_tokens: Total token count
  • llm.latency_ms: Latency in milliseconds
  • llm.http_status: HTTP status code
  • llm.base_url: API base URL
  • llm.error: Error message (if any)
  • llm.error_type: Error type classification
  • llm.input: Input content (optional)
  • llm.output: Output content (optional)
  • llm.mcp_tools_used: MCP tools used
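A hypothetical helper shows how a request record maps onto the llm.* attribute keys above, including the seconds-to-milliseconds conversion for latency (the function name is illustrative, not the server's actual API):

```typescript
// Flattens a request record into llm.* span attributes.
// Note: the tool accepts latency in seconds but reports llm.latency_ms.
function toSpanAttributes(e: {
  model: string;
  provider: string;
  userId: string;
  inputTokens?: number;
  outputTokens?: number;
  latency?: number; // seconds
  httpStatus?: number;
}): Record<string, string | number> {
  const attrs: Record<string, string | number> = {
    "llm.model": e.model,
    "llm.provider": e.provider,
    "llm.user_id": e.userId,
  };
  if (e.inputTokens !== undefined) attrs["llm.input_tokens"] = e.inputTokens;
  if (e.outputTokens !== undefined) attrs["llm.output_tokens"] = e.outputTokens;
  if (e.inputTokens !== undefined && e.outputTokens !== undefined)
    attrs["llm.total_tokens"] = e.inputTokens + e.outputTokens;
  if (e.latency !== undefined) attrs["llm.latency_ms"] = e.latency * 1000;
  if (e.httpStatus !== undefined) attrs["llm.http_status"] = e.httpStatus;
  return attrs;
}

const attrs = toSpanAttributes({
  model: "gpt-4", provider: "openai", userId: "user-123",
  inputTokens: 150, outputTokens: 75, latency: 2.5,
});
console.log(attrs["llm.total_tokens"]); // 225
```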

Testing with Jaeger

1. Start Jaeger

docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest

2. Configure MCP Server

export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=llm-observability-mcp
npm run mcp:stdio

3. View Traces

Open http://localhost:16686 to view traces in Jaeger UI.

Migration from PostHog

The OpenTelemetry tool is designed to be a drop-in replacement for the PostHog tool. Both tools can coexist, allowing for gradual migration:

  1. PostHog Tool: capture_llm_observability
  2. OpenTelemetry Tool: capture_llm_observability_opentelemetry

Both tools accept the same parameters, making migration straightforward.

Troubleshooting

Common Issues

No Data in Backend

  1. Verify endpoint URLs are correct
  2. Check authentication headers
  3. Ensure network connectivity
  4. Check server logs for errors

High Resource Usage

  1. Adjust sampling ratio: OTEL_TRACES_SAMPLER_ARG=0.1
  2. Increase export intervals: OTEL_METRIC_EXPORT_INTERVAL=30000

Missing Traces

  1. Verify OpenTelemetry is enabled (check for endpoint configuration)
  2. Check for initialization errors in logs
  3. Ensure proper service name configuration

Debug Mode

Enable debug logging:

export DEBUG=true
npm run mcp:stdio

Advanced Configuration

Custom Headers

export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer token,Custom-Header=value"
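Parsing this comma-separated format correctly requires splitting each pair on the first "=" only, since values such as base64 strings or bearer tokens may themselves contain "=". A sketch of that parsing (a hypothetical helper, not the server's actual code):

```typescript
// Parses the "key1=value1,key2=value2" format used by
// OTEL_EXPORTER_OTLP_HEADERS. Only the first '=' splits a pair, so
// values containing '=' (e.g. base64 padding) survive intact.
function parseOtlpHeaders(raw: string): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const pair of raw.split(",")) {
    const i = pair.indexOf("=");
    if (i === -1) continue; // skip malformed entries
    headers[pair.slice(0, i).trim()] = pair.slice(i + 1).trim();
  }
  return headers;
}

const parsed = parseOtlpHeaders("Authorization=Bearer token,Custom-Header=value");
console.log(parsed["Authorization"]); // Bearer token
```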

Multiple Backends

Configure different endpoints for metrics and traces:

export OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=https://metrics.example.com
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=https://traces.example.com

Sampling Configuration

# Sample 10% of traces
export OTEL_TRACES_SAMPLER_ARG=0.1
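The idea behind ratio-based head sampling can be sketched in a few lines (a simplified illustration, not the sampler the OpenTelemetry SDK actually uses): hash the trace ID into [0, 1) and keep the trace when the hash falls below the configured ratio, so the decision is deterministic per trace.

```typescript
// Simplified ratio-based sampling decision, deterministic per trace ID.
function shouldSample(traceId: string, ratio: number): boolean {
  let h = 0;
  for (const ch of traceId) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h / 0x100000000 < ratio;
}

console.log(shouldSample("trace-abc123", 1.0)); // true: ratio 1.0 keeps everything
```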

Support

For issues or questions:

  1. Check the troubleshooting section above
  2. Review server logs with DEBUG=true
  3. Verify OpenTelemetry configuration
  4. Test with Jaeger locally first