TL;DR: OpenTelemetry (OTel) is a free, open-source standard for collecting observability data — traces, metrics, and logs — from your application. It's vendor-neutral, meaning you can send that data to Grafana, Datadog, Jaeger, or any platform without rewriting your code. When you ask AI to "add monitoring" to your app, there's a good chance it reaches for OpenTelemetry. Understanding what it generates — even at a high level — means the difference between an app you can debug and one that's a black box in production. If your app has an API, a database, or multiple services, you need this.

Why AI Coders Need to Know This

Here's the uncomfortable truth about building with AI: your app works great until it doesn't, and when it doesn't, you're flying blind.

When you ask Claude, Cursor, or Copilot to build a full-stack app, you get something that works on your machine. It handles requests, talks to a database, maybe calls a few external APIs. Impressive. But here's what you don't get: any way to see what's happening inside that app once real people start using it.

Traditional developers spent years learning to instrument their code — adding logging, tracking errors, measuring performance. They built that muscle over time. As a vibe coder, you're shipping production apps in days, not years. And when something goes wrong at 2 AM, "it works on my machine" doesn't help your users.

This is where OpenTelemetry comes in. It's the industry standard — backed by every major cloud provider, built by over 40,000 contributors on GitHub, and now the second-most active project in the Cloud Native Computing Foundation (behind only Kubernetes). In March 2026, OpenTelemetry Profiles entered public alpha, adding CPU and memory profiling to the existing traces, metrics, and logs — making it even more relevant for understanding performance issues.

Here's why you specifically need to care:

  • AI generates multi-service architectures by default. Ask AI to build an app with auth, payments, and a database, and you'll get multiple moving parts. When a user reports "the checkout page is broken," you need to trace that request across every service to find the problem.
  • Observability tools are converging on OpenTelemetry. Grafana, Datadog, New Relic, Honeycomb — they all accept OTel data natively. Learn one standard, use it everywhere.
  • Debugging with AI gets 10x better when you can paste a trace into your prompt and say "here's exactly what happened — why did it fail?"

Application monitoring tells you something broke. OpenTelemetry tells you exactly what broke, where it broke, and how long each step took along the way.

The Real Scenario

You've been building a SaaS app — a project management tool. You used Claude to scaffold the whole thing: a React frontend, an Express API, a PostgreSQL database, and Stripe for payments. You deployed it on Railway, shared it with some beta testers, and went to bed feeling great.

Next morning: three users report that creating a new project "just spins forever." You open the app yourself — works fine. You check your logs — nothing obvious. You look at your error tracker — no crashes. The app isn't breaking; it's just... slow. For some users. Sometimes.

Without OpenTelemetry: You're guessing. Is it the database? The API? Railway having a bad day? You add console.log statements everywhere, redeploy, wait for it to happen again, and hope you catch it.

With OpenTelemetry: You open Jaeger (or Grafana Tempo), find a slow request trace, and see the entire journey: the API received the request in 2ms, validated the auth token in 15ms, then spent 4.7 seconds waiting on a database query. You click into that span and see the exact SQL query — a missing index on the projects table that causes a full table scan. You paste the trace into Claude, it suggests the index, you add it, problem solved.

That's the difference. And here's how to get there.

What to Ask Your AI

📋 Prompt

"Add OpenTelemetry auto-instrumentation to this Node.js Express app. Set up tracing, metrics, and basic logging. Use the OTLP exporter configured for Jaeger running locally on port 4318. Create the instrumentation as a separate file that loads before the app starts. Include instrumentations for HTTP, Express, and PostgreSQL."

That prompt is specific enough that your AI won't just give you a generic snippet — it'll generate a working setup tailored to your stack. Let's look at what Claude would generate.

What AI Generated

When you give Claude that prompt with a Node.js Express project, here's the instrumentation file it creates:

// instrumentation.js — load this BEFORE your app code
// Run with: node --require ./instrumentation.js app.js

const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-http');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { Resource } = require('@opentelemetry/resources');
const { ATTR_SERVICE_NAME, ATTR_SERVICE_VERSION } = require('@opentelemetry/semantic-conventions');
const { diag, DiagConsoleLogger, DiagLogLevel } = require('@opentelemetry/api');

// Optional: turn on diagnostic logging to debug OTel itself
// diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.INFO);

const sdk = new NodeSDK({
  // Identify your service in traces and metrics
  resource: new Resource({
    [ATTR_SERVICE_NAME]: 'my-project-api',
    [ATTR_SERVICE_VERSION]: '1.0.0',
  }),

  // Send traces to Jaeger via OTLP
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces',
  }),

  // Send metrics to a collector via OTLP
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://localhost:4318/v1/metrics',
    }),
    exportIntervalMillis: 15000, // Send metrics every 15 seconds
  }),

  // Auto-instrument common libraries — HTTP, Express, pg, etc.
  instrumentations: [
    getNodeAutoInstrumentations({
      // Disable fs instrumentation (too noisy for most apps)
      '@opentelemetry/instrumentation-fs': { enabled: false },
    }),
  ],
});

// Start the SDK before your app loads
sdk.start();
console.log('✅ OpenTelemetry instrumentation active');

// Gracefully shut down on exit so traces aren't lost
process.on('SIGTERM', () => {
  sdk.shutdown()
    .then(() => console.log('OpenTelemetry shut down'))
    .catch((err) => console.error('OTel shutdown error', err))
    .finally(() => process.exit(0));
});

And your package.json would need these dependencies:

{
  "dependencies": {
    "@opentelemetry/sdk-node": "^0.57.0",
    "@opentelemetry/exporter-trace-otlp-http": "^0.57.0",
    "@opentelemetry/exporter-metrics-otlp-http": "^0.57.0",
    "@opentelemetry/sdk-metrics": "^1.30.0",
    "@opentelemetry/auto-instrumentations-node": "^0.56.0",
    "@opentelemetry/resources": "^1.30.0",
    "@opentelemetry/semantic-conventions": "^1.30.0",
    "@opentelemetry/api": "^1.9.0"
  }
}

You start your app with:

node --require ./instrumentation.js app.js

That --require flag is critical — it tells Node to load the instrumentation file before any of your app code runs, so OpenTelemetry can wrap (monkey-patch) libraries like http, express, and pg before your code loads them.

Now let's break down what each part actually does.

Understanding Each Part: Traces, Metrics, and Logs

OpenTelemetry calls these signals — you'll also hear them described as the three pillars of observability. Think of them as three different ways of looking at what your app is doing — like checking a construction site with blueprints, a time-lapse camera, and a daily progress log.

Traces — Following a Request From Start to Finish

A trace follows a single request through your entire system. When a user clicks "Create Project," the trace captures every step: the HTTP request hitting your API, the auth middleware checking the token, the database query inserting the row, the Stripe call if payment is involved, and the response going back to the browser.

Each step is called a span. Spans are nested — the overall request is the parent span, and each sub-operation (database call, external API call) is a child span. Together they form a tree that shows exactly what happened and how long each part took.
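You don't have to create spans by hand — the auto-instrumentation you set up above already generates them for every HTTP request and database query — but seeing one in code makes the parent/child idea concrete. Here's a rough sketch using the OpenTelemetry API; the span name, the createProject function, and the pool variable are made-up examples, not something the SDK gives you:

// Sketch: a manual parent span wrapping a piece of business logic.
// `pool` is assumed to be a pg Pool defined elsewhere in your app.
const { trace } = require('@opentelemetry/api');

const tracer = trace.getTracer('my-project-api');

async function createProject(userId, name) {
  // Parent span for the whole operation
  return tracer.startActiveSpan('create-project', async (span) => {
    try {
      // Any spans created inside this callback — including the auto-instrumented
      // pg query span — become children of 'create-project'
      const result = await pool.query(
        'INSERT INTO projects (user_id, name) VALUES ($1, $2) RETURNING id',
        [userId, name]
      );
      return result.rows[0];
    } catch (err) {
      span.recordException(err); // attach the error so it shows up in the trace
      throw err;
    } finally {
      span.end(); // always end the span, or it never gets exported
    }
  });
}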

Why this matters to you: Without traces, debugging a slow request across multiple services is like trying to find a leak in a building by standing outside and looking at the roof. Traces let you open up every wall and see exactly where the water's coming from. When a user says "it's slow," you pull up the trace and see that the database query took 4 seconds — not the API, not the network, not the frontend.

Metrics — The Dashboard Gauges

Metrics are numerical measurements collected over time. Think of them as the gauges on a car dashboard: engine temperature, fuel level, RPM. For your app, metrics track things like:

  • Request count: How many API calls per minute?
  • Error rate: What percentage of requests are failing?
  • Response time: How fast (or slow) are your endpoints?
  • Active connections: How many database connections are open?
  • Memory and CPU usage: Is the app running hot?

Metrics are great for spotting trends and setting alerts. "If error rate goes above 5% for 3 minutes, send me a Slack notification." You don't look at individual metrics — you look at patterns. Is response time creeping up over the last hour? Are errors spiking after that last deploy?
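Auto-instrumentation gives you the standard HTTP metrics for free, but you can also record your own business-level numbers with a few lines of code. A quick sketch using the OpenTelemetry metrics API — the metric names and the checkout handler are invented for illustration:

// Sketch: custom metrics for a hypothetical checkout flow.
const { metrics } = require('@opentelemetry/api');

const meter = metrics.getMeter('my-project-api');

// A counter only goes up — good for "how many times did X happen?"
const checkoutAttempts = meter.createCounter('checkout.attempts', {
  description: 'Number of checkout attempts',
});

// A histogram records a distribution — good for durations and sizes
const checkoutDuration = meter.createHistogram('checkout.duration', {
  description: 'Time spent processing a checkout',
  unit: 'ms',
});

async function handleCheckout(req, res) {
  const started = Date.now();
  checkoutAttempts.add(1, { plan: req.body.plan }); // attributes let you slice by plan later
  // ... the actual checkout logic ...
  checkoutDuration.record(Date.now() - started, { plan: req.body.plan });
  res.sendStatus(200);
}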

Logs — The Detailed Diary

Logs are timestamped text records of events — the most familiar form of observability. You've already used console.log(). OpenTelemetry takes logging further by correlating logs with traces. Instead of a disconnected log entry that says "Database error," you get a log entry that's linked to the exact trace and span where it happened.

This correlation is the superpower. Without it, logs are like reading a diary with the pages shuffled — you can see individual entries but can't follow the story. With OpenTelemetry's correlated logs, every entry is tagged with a trace ID and span ID, so you can jump from a log message straight to the full trace of what was happening when that log was written.
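In a real setup, correlation usually comes from a logging bridge — OpenTelemetry ships pino and winston instrumentations that inject trace IDs into your existing logs — but the idea is simple enough to sketch by hand. The helper name and log shape here are just examples:

// Sketch: tagging your own log lines with the current trace and span IDs.
const { trace } = require('@opentelemetry/api');

function logWithTrace(message, extra = {}) {
  const span = trace.getActiveSpan();    // span for the request being handled (if any)
  const ctx = span ? span.spanContext() : {};
  console.log(JSON.stringify({
    message,
    trace_id: ctx.traceId,  // paste this into Jaeger's search to jump to the full trace
    span_id: ctx.spanId,
    ...extra,
  }));
}

// Inside any instrumented Express handler:
// logWithTrace('creating project', { userId: req.user.id });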

Profiles — The Newest Pillar (Alpha in 2026)

As of March 2026, OpenTelemetry added a fourth signal: profiles. Profiling captures what your CPU and memory are actually doing — which functions are burning the most processing time, where memory is being allocated. Think of it as a slow-motion instant replay of your app's internal workload.

Profiles are still in public alpha, so you probably won't use them today. But they're worth knowing about because they solve a real problem: traces tell you which operation was slow, but profiles tell you why that specific function took so long. Expect AI tools to start generating profiling setup code as this matures throughout 2026.

How the Pieces Fit Together

Here's the mental model: your app generates observability data (traces, metrics, logs). OpenTelemetry collects that data and exports it to wherever you want to view it. The viewing tools are separate — that's the whole point.

The architecture looks like this:

Your App (with OTel SDK)
    │
    ▼
OTel Collector (optional middleman — batches, filters, routes)
    │
    ├──▶ Jaeger / Grafana Tempo (traces)
    ├──▶ Prometheus / Grafana Mimir (metrics)
    └──▶ Loki / Elasticsearch (logs)

The OTel Collector is an optional piece that sits between your app and your backends. Think of it as a mail sorting facility — your app drops off all its observability data, and the collector sorts it, batches it, and sends it to the right destinations. For simple setups, you can skip the collector and export directly from your app to Jaeger or Grafana.

For a local development setup, here's a Docker Compose that spins up Jaeger to receive and display your traces:

# docker-compose.yaml — local observability stack
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"  # Jaeger UI
      - "4318:4318"    # OTLP HTTP receiver
    environment:
      - COLLECTOR_OTLP_ENABLED=true

Run docker compose up, start your app with the instrumentation file, make some requests, and open http://localhost:16686. You'll see every request traced end-to-end. This is the "aha moment" for most developers — suddenly you can see your app working.

What AI Gets Wrong About OpenTelemetry

AI tools are good at generating OTel setup code, but they consistently make a few mistakes you need to watch for:

1. Outdated Package Names

OpenTelemetry's JavaScript packages have been through multiple naming conventions. AI models trained on older data might generate imports like @opentelemetry/exporter-jaeger (deprecated) instead of the current OTLP exporters. If you see package names with specific vendor names in them, that's usually the old way. The modern approach uses OTLP (OpenTelemetry Protocol) exporters that work with any compatible backend.

Quick check: If your AI generates an exporter package with a vendor name (like exporter-jaeger, exporter-zipkin), ask it to use the OTLP exporter instead. The OTLP format is the universal standard — it works with Jaeger, Grafana, Datadog, and everything else.
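For example, if the generated code looks like the commented line below, ask for the OTLP version instead — the same exporter the instrumentation file above already uses:

// Old, vendor-specific style (deprecated):
// const { JaegerExporter } = require('@opentelemetry/exporter-jaeger');

// Current, vendor-neutral style — speaks OTLP to Jaeger, Grafana, Datadog, and friends:
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');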

2. Forgetting the --require Flag

This is the most common mistake. AI will often generate the instrumentation file but tell you to import it at the top of your app.js. That doesn't work reliably — OpenTelemetry needs to load before any other code so it can wrap (monkey-patch) libraries like express and pg. The correct approach is using --require ./instrumentation.js in your start command, or the newer --import flag for ES modules.

3. Not Disabling Noisy Instrumentations

The auto-instrumentations-node package instruments everything by default — including filesystem operations. This creates a firehose of spans for every file read, which buries the actually useful data. Always disable the fs instrumentation unless you specifically need it. Good AI-generated code will do this; most doesn't.
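The fix is one small block in the auto-instrumentation config. The fs entry is the one that matters most; dns and net are shown here as examples of other chatty low-level instrumentations you might also silence:

// In instrumentation.js — pass this array as the NodeSDK `instrumentations` option.
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const instrumentations = [
  getNodeAutoInstrumentations({
    '@opentelemetry/instrumentation-fs': { enabled: false },   // file reads/writes
    '@opentelemetry/instrumentation-dns': { enabled: false },  // DNS lookups
    '@opentelemetry/instrumentation-net': { enabled: false },  // raw TCP sockets
  }),
];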

4. Missing the Service Name

AI-generated code often leaves the service name as a placeholder like "my-service" or omits it entirely. Without a meaningful service name, all your traces show up as "unknown_service" in Jaeger — useless when you have multiple services. Always set ATTR_SERVICE_NAME to something descriptive like "project-api" or "auth-service".

5. Confusing OpenTelemetry with Specific Platforms

AI sometimes conflates OpenTelemetry with Datadog or New Relic, generating vendor-specific instrumentation instead of the open standard. If you see dd-trace or newrelic in the generated code, that's a vendor-specific agent — not OpenTelemetry. Ask specifically for OTel if that's what you want.

How to Debug with AI Using OpenTelemetry Data

Once you have OpenTelemetry running, your debugging workflow transforms completely. Here's the process:

Step 1: Find the Problem Trace

Open Jaeger (or your trace viewer), filter by the endpoint that's having issues, and sort by duration. Find a slow or errored trace. Each trace has a unique trace ID — a long hex string like abc123def456....

Step 2: Copy the Trace Details

Most trace viewers let you export a trace as JSON. Copy the relevant parts — especially the span names, durations, and any error messages or status codes.

Step 3: Give It to Your AI

📋 Debug Prompt

"Here's an OpenTelemetry trace from my Express API. The /api/projects endpoint is taking 4.7 seconds. The trace shows the database span (pg.query) taking 4.6 seconds with this query: SELECT * FROM projects WHERE user_id = $1. The projects table has 50,000 rows. Why is this slow and how do I fix it?"

Now your AI isn't guessing. It has the exact query, the exact timing, and the exact context. It'll tell you to add an index on user_id, and it'll be right — because you gave it real data instead of "my app is slow sometimes."

Step 4: Verify the Fix

After applying the fix, make the same request and check the new trace. That pg.query span should drop from 4.6 seconds to milliseconds. OpenTelemetry gives you proof that your fix actually worked, not just a feeling.

Pro debugging pattern: When something breaks, grab the trace ID from your logs (OpenTelemetry adds it automatically), look up that trace in Jaeger, find the failing span, and feed the error details to your AI. This three-step loop — trace → identify → prompt — is the fastest way to debug complex issues.

OpenTelemetry vs. the Alternatives

You might be wondering: why not just use Sentry? Or Datadog's own agent? Here's the honest breakdown:

Approach        | Best For                                        | Limitation
console.log()   | Quick debugging during development              | No structure, no traces, doesn't scale
Sentry          | Error tracking, crash reporting                 | Focused on errors, not full observability
Datadog Agent   | Full observability if you're all-in on Datadog  | Vendor lock-in, expensive at scale
OpenTelemetry   | Universal standard, works with any backend      | More setup than single-vendor solutions

The real answer for most vibe coders: use OpenTelemetry alongside Sentry. Sentry catches and groups your errors beautifully. OpenTelemetry gives you deep tracing and metrics. They complement each other — Sentry tells you what crashed, OTel traces show you why. Sentry even supports ingesting OTel data directly now.

Getting Started: The Minimum Viable Setup

You don't need to boil the ocean. Here's the simplest path to getting OpenTelemetry running:

  1. Install the packages — Copy the package.json dependencies from above and run npm install.
  2. Create instrumentation.js — Use the file shown above. Change the service name to match your app.
  3. Start Jaeger locally — Run the Docker Compose snippet above. If you don't have Docker, you can also use Grafana Cloud's free tier as a backend.
  4. Update your start command — Change node app.js to node --require ./instrumentation.js app.js.
  5. Make some requests — Hit your API endpoints, then open http://localhost:16686 to see your traces.

That's it. Five steps, maybe 15 minutes. You now have full distributed tracing on your application. Every HTTP request, every database query, every external API call — all visible, all timed, all connected.

When to Level Up

Start with traces only — they give you the most insight for the least effort. Then add these as your app grows:

  • Custom spans: Wrap important business logic (like payment processing) in manual spans so you can trace them separately.
  • Metrics and alerts: Set up Prometheus + Grafana to track error rates and response times, with alerts when things go wrong.
  • The OTel Collector: When you have multiple services, use the collector to batch and route all their data centrally.
  • Sampling: At high traffic, you don't need to trace every request. Configure head or tail sampling to keep costs down (see the sketch after this list).
  • Profiles (alpha): Once the profiling signal stabilizes later in 2026, add CPU and memory profiling for deep performance analysis.
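
When you get to the sampling step, it's a small addition to the instrumentation file. Here's a sketch that keeps roughly 10% of traces — the ratio is up to you, and you may need to add @opentelemetry/sdk-trace-base to your dependencies if npm hasn't already pulled it in alongside the SDK:

// Sketch: sample ~10% of traces, keeping child spans consistent with their parent's decision.
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { ParentBasedSampler, TraceIdRatioBasedSampler } = require('@opentelemetry/sdk-trace-base');

const sdk = new NodeSDK({
  // ...same resource, exporters, and instrumentations as before...
  sampler: new ParentBasedSampler({
    root: new TraceIdRatioBasedSampler(0.1), // trace roughly 1 in 10 requests
  }),
});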

Frequently Asked Questions

What is OpenTelemetry in plain English?

OpenTelemetry is a free, open-source toolkit that collects data about what your app is doing — every request, every database call, every error. Think of it like installing sensors throughout a building: temperature gauges, motion detectors, and security cameras. Except instead of monitoring a building, you're monitoring your software. It collects three types of data: traces (what happened during a request), metrics (how many times something happened), and logs (detailed notes about events). The key thing is it's a universal standard — it works with almost every monitoring platform, so you're never locked into one vendor.

Do I need OpenTelemetry for a small app or side project?

For a simple side project with no users, probably not yet — basic error tracking with Sentry and uptime monitoring is enough. But the moment your app has multiple services talking to each other (a frontend calling an API calling a database), OpenTelemetry becomes valuable fast. It shows you exactly where things slow down or break across those connections. And since AI tools often scaffold microservice architectures by default, you might hit this point sooner than you think.

What's the difference between OpenTelemetry and tools like Sentry or Datadog?

OpenTelemetry collects the data. Tools like Sentry, Datadog, Grafana, and Honeycomb display and analyze it. Think of OpenTelemetry as the security cameras, and those tools as the monitor room where you watch the feeds. You can swap out the monitor room without reinstalling all the cameras — that's the whole point. OpenTelemetry is vendor-neutral, so you can switch from Grafana to Datadog without changing any of your application code.

Will OpenTelemetry slow down my app?

In practice, the performance impact is minimal — typically 1-3% overhead for most applications. OpenTelemetry is designed with performance in mind and includes sampling controls so you can collect data on only a percentage of requests in high-traffic apps. For most vibe coder projects, you'll never notice the difference. The insight you gain from seeing exactly what's happening inside your app far outweighs the tiny performance cost.

How do I ask AI to add OpenTelemetry to my existing project?

Start with this prompt: "Add OpenTelemetry auto-instrumentation to this Node.js project. Use the OTLP exporter and configure it to send traces and metrics to [your platform]. Include the HTTP, Express, and database instrumentations." Replace the platform with wherever you want to view the data — Grafana Cloud, Jaeger, or even the console for local development. The AI will add the right packages and create a tracing setup file. Make sure it creates the setup as a separate file that loads before your app code with the --require flag.