Monitoring & Observability

Metrics, Traces, Errors — Know What's Happening

Three pillars of observability — metrics, traces, logs. The tools you'll want from day one.

4 min read · Level 2/5 · #nodejs #monitoring #observability
What you'll learn
  • Capture errors with Sentry
  • Emit metrics
  • Add distributed tracing

You can’t fix what you can’t see. Three pillars of observability:

  1. Logs — what happened
  2. Metrics — how many, how fast
  3. Traces — what called what

Error Tracking — Sentry

The single most impactful tool to add to a new Node app.

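npm install @sentry/node

import * as Sentry from "@sentry/node";

// Initialize as early as possible, before the rest of the app is loaded
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 0.1,   // sample 10% of transactions for performance tracing
});

// Uncaught exceptions and unhandled rejections are reported automatically.
// Report handled errors explicitly:
try {
  riskyOperation();   // placeholder for your own code
} catch (err) {
  Sentry.captureException(err);
}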

Sentry records every uncaught exception, plus traces of slow operations, complete with stack traces, request context, and any user info you attach. You’ll often catch bugs before users report them.
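Because events carry request context and user info, it can be worth scrubbing sensitive fields before they leave your process. The sketch below assumes Sentry's `beforeSend` hook and the usual event shape (`event.request.headers`, `event.user`); the `scrubEvent` helper and the list of header names are illustrative choices, not part of Sentry.

```javascript
// Hypothetical helper: redact sensitive fields from a Sentry-style event object.
const SENSITIVE_HEADERS = ["authorization", "cookie", "x-api-key"]; // assumed list, adjust to your app

function scrubEvent(event) {
  const headers = event.request?.headers;
  if (headers) {
    for (const name of Object.keys(headers)) {
      if (SENSITIVE_HEADERS.includes(name.toLowerCase())) {
        headers[name] = "[redacted]";
      }
    }
  }
  // Keep the user id so errors can still be grouped per user,
  // but drop directly identifying fields.
  if (event.user) {
    delete event.user.email;
    delete event.user.ip_address;
  }
  return event; // returning null instead would drop the event entirely
}

// Wiring it up (not executed here):
// Sentry.init({ dsn: process.env.SENTRY_DSN, beforeSend: scrubEvent });
```

Keeping the scrubbing logic in a plain function makes it easy to unit test without sending anything to Sentry.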

Metrics

Typical metrics: request count, error rate, p95 latency, queue depth, DB pool size. Common stacks:

  • Prometheus + Grafana — standard for self-hosted
  • Datadog, New Relic — hosted, batteries included
  • Cloud-native — CloudWatch, Google Cloud Monitoring
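
npm install prom-client

import client from "prom-client";

// Assumes an existing Express `app`.
const httpDuration = new client.Histogram({
  name: "http_request_duration_seconds",
  help: "Duration of HTTP requests in seconds",
  labelNames: ["method", "route", "status"],
});

app.use((req, res, next) => {
  const end = httpDuration.startTimer();
  res.on("finish", () => {
    // Label with the matched route pattern (e.g. "/users/:id"), not the raw
    // path, so every distinct URL doesn't create its own time series.
    const route = req.route ? req.baseUrl + req.route.path : "unmatched";
    end({ method: req.method, route, status: res.statusCode });
  });
  next();
});

// Expose all registered metrics in the Prometheus text format
app.get("/metrics", async (req, res) => {
  res.set("Content-Type", client.register.contentType);
  res.end(await client.register.metrics());
});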

Scrape /metrics with Prometheus or whichever stack you use.
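To make "p95 latency" from the list of typical metrics concrete: it is the value that 95% of requests stay at or below. Here is a minimal sketch using the nearest-rank method on raw samples. The `percentile` helper and the sample values are made up for illustration, and Prometheus itself estimates quantiles from histogram buckets rather than raw samples, so treat this as intuition only.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Made-up request latencies in milliseconds: mostly fast, two slow outliers.
const latenciesMs = [12, 15, 11, 14, 250, 13, 16, 12, 18, 900,
                     14, 13, 15, 17, 12, 14, 16, 13, 15, 14];

console.log("p50:", percentile(latenciesMs, 50)); // p50: 14
console.log("p95:", percentile(latenciesMs, 95)); // p95: 250
```

The median looks healthy while p95 exposes the slow tail, which is why latency is usually tracked as a high percentile rather than an average.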

Tracing

Distributed tracing shows the path of a request across services:

[ingress 5ms] → [my Node 80ms]
                   ├─ [Postgres 30ms]
                   └─ [Redis 2ms]
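
One way to read a trace like this is to subtract the time spent in child spans from the parent's duration, which leaves the time spent in that service's own code. A small sketch with the numbers from the diagram, assuming the child calls run one after another; the span objects and the `selfTime` helper are hypothetical and not an OpenTelemetry API.

```javascript
// Hypothetical in-memory representation of the "my Node" span and its children.
const nodeSpan = {
  name: "my Node",
  durationMs: 80,
  children: [
    { name: "Postgres", durationMs: 30, children: [] },
    { name: "Redis", durationMs: 2, children: [] },
  ],
};

// Time spent in the span itself, excluding its direct children.
// Only valid when children run sequentially (no overlap).
function selfTime(span) {
  const inChildren = span.children.reduce((sum, child) => sum + child.durationMs, 0);
  return span.durationMs - inChildren;
}

console.log(selfTime(nodeSpan)); // 48
```

Here 48 of the 80 ms are spent in the Node process itself, so the database and cache would not be the first place to look.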

The standard: OpenTelemetry.

npm install @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node

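Start the OpenTelemetry SDK before the rest of your app is loaded so it can patch libraries as they are imported. Auto-instrumentation then hooks Express, fetch, pg, ioredis, and dozens more out of the box. Ship traces to any OTel-compatible backend (Jaeger, Honeycomb, Datadog, Grafana Tempo).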

Uptime Monitoring

External “is the site up?” checks — completely separate from your infra (so you find out when your infra is on fire).

  • BetterStack / Better Uptime
  • UptimeRobot
  • Pingdom

Have the monitor hit your /healthz endpoint every 30s and page you when it fails.
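
The endpoint the monitor hits can be very small. Below is a minimal sketch of a `/healthz` handler, written against the plain `(req, res)` interface so it should work with `node:http` as well as Express. The response body fields are an assumption, not something the uptime services require; most of them only look at the status code.

```javascript
// Minimal liveness handler: answers 200 as long as the process can serve requests.
function healthz(req, res) {
  res.statusCode = 200;
  res.setHeader("Content-Type", "application/json");
  res.setHeader("Cache-Control", "no-store"); // never let a proxy answer for us
  res.end(JSON.stringify({
    status: "ok",
    uptimeSeconds: Math.round(process.uptime()),
  }));
}

// With an existing Express app (not executed here):
// app.get("/healthz", healthz);
```

Keep the handler cheap: it runs every 30 seconds, and a check that queries every dependency can itself become a source of load or false alarms.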

The Pragmatic Minimum

For a small project, start with:

  1. Sentry for errors (free tier is generous)
  2. Uptime monitor for liveness
  3. Platform-native metrics (Render, Fly, Vercel all give you basics)

Add tracing and Prometheus when you actually need them.
