Initial implementation of opentelemetry-llm tool

2025-07-14 16:27:29 -05:00
parent 63cf87a6c6
commit 1f201a093f
14 changed files with 3191 additions and 61 deletions

README.md
@@ -1,21 +1,24 @@
# LLM Observability MCP Server
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
A Model Context Protocol (MCP) server that provides comprehensive LLM observability tools supporting both PostHog and OpenTelemetry backends.
## Overview
This project is an MCP server designed to track and observe Large Language Model (LLM) interactions using both [PostHog's LLM Observability](https://posthog.com/docs/llm-observability) and **OpenTelemetry** for universal observability across any backend that supports OpenTelemetry (Jaeger, New Relic, Grafana, Datadog, Honeycomb, etc.).
The server can be run as a local process communicating over `stdio` or as a remote `http` server, making it compatible with any MCP client, such as AI-powered IDEs (e.g., VS Code with an MCP extension, Cursor) or custom applications.
## Features
- **Dual Backend Support**: Choose between PostHog and OpenTelemetry, or use both
- **Universal OpenTelemetry**: Works with any OpenTelemetry-compatible backend
- **Comprehensive Metrics**: Request counts, token usage, latency, error rates
- **Distributed Tracing**: Full request lifecycle tracking with spans
- **Flexible Transport**: Run as local `stdio` process or standalone `http` server
- **Dynamic Configuration**: Environment-based configuration for different backends
- **Zero-Code Integration**: Drop-in replacement for existing observability tools
## Installation for Development
@@ -46,10 +49,29 @@ Follow these steps to set up the server for local development.
The server is configured via environment variables.
### PostHog Configuration
| Variable | Description | Default | Example |
| ----------------- | --------------------------------------------------------------------------- | --------- | ------------------------------------- |
| `POSTHOG_API_KEY` | Your PostHog Project API Key (required for PostHog tool) | - | `phc_...` |
| `POSTHOG_HOST` | The URL of your PostHog instance | - | `https://us.i.posthog.com` |
### OpenTelemetry Configuration
| Variable | Description | Default | Example |
| ------------------------------- | --------------------------------------------------------------------------- | -------------------------- | ------------------------------------- |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OpenTelemetry collector endpoint | - | `http://localhost:4318` |
| `OTEL_EXPORTER_OTLP_HEADERS` | Headers for authentication (comma-separated key=value pairs) | - | `api-key=YOUR_KEY` |
| `OTEL_SERVICE_NAME` | Service name for traces and metrics | `llm-observability-mcp` | `my-llm-app` |
| `OTEL_SERVICE_VERSION` | Service version | `1.0.0` | `2.1.0` |
| `OTEL_ENVIRONMENT` | Environment name | `development` | `production` |
| `OTEL_TRACES_SAMPLER_ARG` | Sampling ratio (0.0-1.0) | `1.0` | `0.1` |
| `OTEL_METRIC_EXPORT_INTERVAL` | Metrics export interval in milliseconds | `10000` | `30000` |
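Since `OTEL_EXPORTER_OTLP_HEADERS` takes comma-separated `key=value` pairs, multiple headers can be packed into the one variable (the header names and values below are placeholders):

```shell
# Two headers in one variable: comma-separated key=value pairs,
# with no spaces around '=' or ','. Values are placeholders.
export OTEL_EXPORTER_OTLP_HEADERS="api-key=YOUR_KEY,x-tenant-id=my-team"
```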
### General Configuration
| Variable | Description | Default | Example |
| ----------------- | --------------------------------------------------------------------------- | --------- | ------------------------------------- |
| `TRANSPORT_MODE` | The transport protocol to use. Can be `http` or `stdio`. | `http` | `stdio` |
| `DEBUG` | Set to `true` to enable detailed debug logging. | `false` | `true` |
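Putting the three tables together, a minimal shell configuration enabling both backends might look like the following; every value is a placeholder to adjust for your deployment:

```shell
# PostHog backend (placeholder project key)
export POSTHOG_API_KEY="phc_your_project_key"
export POSTHOG_HOST="https://us.i.posthog.com"

# OpenTelemetry backend (local Jaeger via OTLP/HTTP)
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_SERVICE_NAME="llm-observability-mcp"
export OTEL_ENVIRONMENT="development"

# General server settings
export TRANSPORT_MODE="stdio"
export DEBUG="true"
```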
@@ -177,25 +199,98 @@ async function main() {
main().catch(console.error);
```
## Available Tools
### PostHog Tool: `capture_llm_observability`
Captures LLM usage in PostHog for observability, including requests, responses, and performance metrics.
| Parameter | Type | Required | Description |
| --------------- | ------------------- | -------- | ----------------------------------------------- |
| `userId` | `string` | Yes | The distinct ID of the user. |
| `model` | `string` | Yes | The model used (e.g., `gpt-4`, `claude-3`). |
| `provider` | `string` | Yes | The LLM provider (e.g., `openai`, `anthropic`). |
| `traceId` | `string` | No | The trace ID to group related AI events. |
| `input` | `any` | No | The input to the LLM (e.g., messages, prompt). |
| `outputChoices` | `any` | No | The output choices from the LLM. |
| `inputTokens` | `number` | No | The number of tokens in the input. |
| `outputTokens` | `number` | No | The number of tokens in the output. |
| `latency` | `number` | No | The latency of the LLM call in seconds. |
| `httpStatus` | `number` | No | The HTTP status code of the LLM API call. |
| `baseUrl` | `string` | No | The base URL of the LLM API. |
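As a sketch of calling this tool over HTTP: the request framing below mirrors the curl example shown for the OpenTelemetry tool later in this README, and the `/mcp` endpoint and payload shape are assumptions to verify against your server.

```shell
# Build the tool-call payload for the PostHog tool (example values)
PAYLOAD='{
  "tool": "capture_llm_observability",
  "arguments": {
    "userId": "user-123",
    "model": "gpt-4",
    "provider": "openai",
    "inputTokens": 250,
    "outputTokens": 80,
    "latency": 2.1,
    "httpStatus": 200
  }
}'

# Sanity-check that the payload is well-formed JSON before sending
echo "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload ok"

# Send to a locally running HTTP server (uncomment with the server up):
# curl -X POST http://localhost:3000/mcp \
#   -H "Content-Type: application/json" \
#   -d "$PAYLOAD"
```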
### OpenTelemetry Tool: `capture_llm_observability_opentelemetry`
Captures LLM usage using OpenTelemetry for universal observability across any OpenTelemetry-compatible backend.
### Parameters Comparison
| Parameter | Type | Required | Description | PostHog | OpenTelemetry |
| --------------- | ------------------- | -------- | ----------------------------------------------- | ------- | ------------- |
| `userId` | `string` | Yes | The distinct ID of the user. | ✅ | ✅ |
| `model` | `string` | Yes | The model used (e.g., `gpt-4`, `claude-3`). | ✅ | ✅ |
| `provider` | `string` | Yes | The LLM provider (e.g., `openai`, `anthropic`). | ✅ | ✅ |
| `traceId` | `string` | No | The trace ID to group related AI events. | ✅ | ✅ |
| `input` | `any` | No | The input to the LLM (e.g., messages, prompt). | ✅ | ✅ |
| `outputChoices` | `any` | No | The output choices from the LLM. | ✅ | ✅ |
| `inputTokens` | `number` | No | The number of tokens in the input. | ✅ | ✅ |
| `outputTokens` | `number` | No | The number of tokens in the output. | ✅ | ✅ |
| `latency` | `number` | No | The latency of the LLM call in seconds. | ✅ | ✅ |
| `httpStatus` | `number` | No | The HTTP status code of the LLM API call. | ✅ | ✅ |
| `baseUrl` | `string` | No | The base URL of the LLM API. | ✅ | ✅ |
| `operationName` | `string` | No | The name of the operation being performed. | ❌ | ✅ |
| `error` | `string` | No | Error message if the request failed. | ❌ | ✅ |
| `errorType` | `string` | No | Type of error (e.g., rate_limit, timeout). | ❌ | ✅ |
| `mcpToolsUsed` | `string[]` | No | List of MCP tools used during the request. | ❌ | ✅ |
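The four OpenTelemetry-only parameters are most useful for recording failures. Below is a hypothetical arguments payload for a rate-limited request, using the field names from the table above; the overall request framing is an assumption based on the test example later in this README.

```shell
# Example payload exercising the OpenTelemetry-only fields (hypothetical values)
OTEL_PAYLOAD='{
  "tool": "capture_llm_observability_opentelemetry",
  "arguments": {
    "userId": "user-123",
    "model": "gpt-4",
    "provider": "openai",
    "httpStatus": 429,
    "operationName": "chat-completion",
    "error": "Rate limit exceeded, retry after 20s",
    "errorType": "rate_limit",
    "mcpToolsUsed": ["capture_llm_observability_opentelemetry"]
  }
}'

# Verify the payload parses as JSON before sending it to the server
echo "$OTEL_PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload ok"
```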
## Quick Start with OpenTelemetry
### 1. Choose Your Backend
**For local testing with Jaeger:**
```bash
# Start Jaeger with OTLP support
docker run -d --name jaeger \
-e COLLECTOR_OTLP_ENABLED=true \
-p 16686:16686 \
-p 4318:4318 \
jaegertracing/all-in-one:latest
```
**For New Relic:**
```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.nr-data.net:4318
export OTEL_EXPORTER_OTLP_HEADERS="api-key=YOUR_LICENSE_KEY"
```
### 2. Configure Environment
```bash
# Copy example configuration
cp .env.example .env
# Edit .env with your backend settings
# For Jaeger:
echo "OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318" >> .env
echo "OTEL_SERVICE_NAME=llm-observability-mcp" >> .env
```
### 3. Start the Server
```bash
npm run mcp:http
# or
npm run mcp:stdio
```
### 4. Test the Integration
```bash
# Test with curl
curl -X POST http://localhost:3000/mcp \
-H "Content-Type: application/json" \
-d '{
"tool": "capture_llm_observability_opentelemetry",
"arguments": {
"userId": "test-user",
"model": "gpt-4",
"provider": "openai",
"inputTokens": 100,
"outputTokens": 50,
"latency": 1.5,
"httpStatus": 200,
"operationName": "test-completion"
}
}'
```
## Development
@@ -203,6 +298,12 @@ This is the core tool provided by the server. It captures LLM usage in PostHog f
- **Run tests**: `npm test`
- **Lint and format**: `npm run lint` and `npm run format`
## Documentation
- [OpenTelemetry Setup Guide](OPENTELEMETRY.md) - Complete OpenTelemetry configuration
- [Usage Examples](examples/opentelemetry-usage.md) - Practical examples for different backends
- [Environment Configuration](.env.example) - All available configuration options
## License
[MIT License](https://opensource.org/licenses/MIT)