Monitoring & Observability

Three pillars of observability — metrics, traces, logs. The tools you'll want from day one.


What you'll learn

Capture errors with Sentry
Emit metrics
Add distributed tracing

You can’t fix what you can’t see. Three pillars of observability:

Logs — what happened
Metrics — how many, how fast
Traces — what called what

Error Tracking — Sentry

The single most impactful tool to add to a new Node app.

npm install @sentry/node

import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 0.1,   // 10% of requests are traced
});

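// Uncaught exceptions and unhandled rejections are reported automatically.
// Report handled errors explicitly (doWork is a placeholder for your own code):
try {
  await doWork();
} catch (err) {
  Sentry.captureException(err);
}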

Sentry records every uncaught exception, plus traces for the sampled requests, and attaches stack traces, request context, and user info (when you set it). You’ll often catch bugs before users report them.
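To make the "request context, user info" part concrete, here is a minimal sketch of a helper that attaches a user and tags to a single error report. The name `reportError` and its options are made up for illustration; `withScope`, `setUser`, `setTag`, and `captureException` are the Sentry SDK calls it relies on. The Sentry module is passed in as a parameter (an assumption made here so the helper can be tried with a stub).

```javascript
// `sentry` is expected to be the "@sentry/node" module,
// e.g. import * as Sentry from "@sentry/node".
// reportError and its option names are hypothetical, not part of Sentry.
function reportError(sentry, err, { user, tags = {} } = {}) {
  // withScope keeps the user and tags local to this one report.
  sentry.withScope((scope) => {
    if (user) scope.setUser(user);
    for (const [key, value] of Object.entries(tags)) {
      scope.setTag(key, value);
    }
    sentry.captureException(err);
  });
}
```

In a request handler you might call `reportError(Sentry, err, { user: { id: req.user.id }, tags: { route: "/checkout" } })`, assuming your auth layer sets `req.user`.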

Metrics

Typical metrics: request count, error rate, p95 latency, queue depth, DB pool size. Common stacks:

Prometheus + Grafana — standard for self-hosted
Datadog, New Relic — hosted, batteries included
Cloud-native — CloudWatch, Google Cloud Monitoring

npm install prom-client

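import express from "express";
import client from "prom-client";

const app = express();

const httpDuration = new client.Histogram({
  name: "http_request_duration_seconds",
  help: "HTTP request duration in seconds",
  labelNames: ["method", "route", "status"],
});

app.use((req, res, next) => {
  const end = httpDuration.startTimer({ method: req.method });
  res.on("finish", () => {
    // Label with the matched route pattern (e.g. "/users/:id"), not the raw
    // path, so URL parameters don't create an unbounded number of series.
    const route = req.route?.path ?? "unmatched";
    end({ route, status: res.statusCode });
  });
  next();
});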

app.get("/metrics", async (req, res) => {
  res.set("Content-Type", client.register.contentType);
  res.end(await client.register.metrics());
});

Scrape /metrics with Prometheus or whichever stack you use.
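A histogram like the one above does not store individual request durations, only cumulative counts per bucket, so a figure such as p95 latency is estimated from those counts. The sketch below shows the idea with linear interpolation inside the matching bucket, which is roughly what Prometheus's `histogram_quantile` does. The function name and the input shape are assumptions made for illustration.

```javascript
// buckets: cumulative counts sorted by upper bound `le`, ending with
// le = Infinity (the shape of a Prometheus histogram's `_bucket` series).
// estimateQuantile is a hypothetical helper, not part of prom-client.
function estimateQuantile(q, buckets) {
  if (!(q > 0 && q <= 1)) throw new RangeError("q must be in (0, 1]");
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;

  const rank = q * total; // how many observations fall at or below the answer
  let prevLe = 0;
  let prevCount = 0;
  for (const { le, count } of buckets) {
    if (count >= rank) {
      // The open-ended bucket has no upper bound to interpolate towards.
      if (le === Infinity) return prevLe;
      const fraction = (rank - prevCount) / (count - prevCount);
      return prevLe + (le - prevLe) * fraction;
    }
    prevLe = le;
    prevCount = count;
  }
  return prevLe;
}
```

The estimate is only as fine-grained as the bucket boundaries, so pick buckets that bracket the latencies you care about.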

Tracing

Distributed tracing shows the path of a request across services:

[ingress 5ms] → [my Node 80ms]
                   ├─ [Postgres 30ms]
                   └─ [Redis 2ms]
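Services can only be stitched into one picture like this if each hop passes its trace ID to the next. Tracing libraries typically do that with the W3C Trace Context `traceparent` HTTP header, and instrumentation normally handles it for you. The sketch below only illustrates what travels between hops; the function names are made up, and it handles version `00` of the header only.

```javascript
// traceparent format: version-traceId-parentSpanId-flags
// e.g. 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Hypothetical helper: returns null for a missing or malformed header.
function parseTraceparent(header) {
  const match = TRACEPARENT.exec(header ?? "");
  if (!match) return null;
  const [, traceId, parentSpanId, flags] = match;
  // All-zero IDs are invalid according to the spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) return null;
  return { traceId, parentSpanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Hypothetical helper: builds the header for an outgoing call. The trace ID
// stays the same; the current span's ID becomes the callee's parent ID.
function childTraceparent(parent, spanId) {
  const flags = parent.sampled ? "01" : "00";
  return `00-${parent.traceId}-${spanId}-${flags}`;
}
```

Because every hop keeps the same trace ID and records its parent's span ID, a backend can rebuild the tree shown above from spans reported independently by each service.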

The standard: OpenTelemetry.

npm install @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node

Auto-instrumentation hooks Express, fetch, pg, ioredis, and dozens more out of the box. Ship traces to any OTel-compatible backend (Jaeger, Honeycomb, Datadog, Grafana Tempo).
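Auto-instrumentation covers library calls, but not your own business logic. For that you can open spans by hand through `@opentelemetry/api`. The following is a minimal sketch of a wrapper that does so; `makeWithSpan` and `withSpan` are illustrative names, and the API module is passed in as a parameter (an assumption made here so the helper can be tried with a stub).

```javascript
// `otel` is expected to be the "@opentelemetry/api" module,
// e.g. import * as otel from "@opentelemetry/api".
// makeWithSpan / withSpan are hypothetical names, not part of OpenTelemetry.
function makeWithSpan(otel, tracerName) {
  const tracer = otel.trace.getTracer(tracerName);

  return function withSpan(spanName, fn) {
    // startActiveSpan makes the span current, so spans created inside fn
    // (including auto-instrumented ones) become its children.
    return tracer.startActiveSpan(spanName, async (span) => {
      try {
        return await fn(span);
      } catch (err) {
        span.recordException(err);
        span.setStatus({ code: otel.SpanStatusCode.ERROR, message: err.message });
        throw err;
      } finally {
        span.end(); // always end the span, or it is never exported
      }
    });
  };
}
```

With `const withSpan = makeWithSpan(otel, "my-node-app")`, a call such as `await withSpan("charge-card", () => chargeCard(order))` would show up as its own bar in the trace; `chargeCard` and `order` stand in for your own code.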

Uptime Monitoring

External “is the site up?” checks — completely separate from your infra (so you find out when your infra is on fire).

BetterStack / Better Uptime
UptimeRobot
Pingdom

Point the monitor at your /healthz endpoint every 30s and have it page you when a check fails.
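What /healthz returns is up to you. A minimal sketch, assuming an Express-style `res` object and a set of dependency checks you supply yourself: it answers 200 when every check passes and 503 otherwise. `makeHealthHandler`, `withTimeout`, and the 2-second default are illustrative choices.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// checks: { name: () => Promise<any> | any }; a check fails by throwing.
// makeHealthHandler is a hypothetical helper, not an Express API.
function makeHealthHandler(checks, timeoutMs = 2000) {
  return async function healthz(_req, res) {
    const entries = await Promise.all(
      Object.entries(checks).map(async ([name, check]) => {
        try {
          // Promise.resolve().then(...) also catches synchronous throws.
          await withTimeout(Promise.resolve().then(() => check()), timeoutMs);
          return [name, "ok"];
        } catch (err) {
          return [name, `failed: ${err.message}`];
        }
      })
    );
    const healthy = entries.every(([, status]) => status === "ok");
    res.status(healthy ? 200 : 503).json({ healthy, checks: Object.fromEntries(entries) });
  };
}
```

You would register it with something like `app.get("/healthz", makeHealthHandler({ db: () => pool.query("SELECT 1") }))`, where `pool` is a hypothetical Postgres pool. The timeout matters: a check that hangs should count as a failure rather than leave the monitor waiting.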

The Pragmatic Minimum

For a small project, start with:

Sentry for errors (free tier is generous)
Uptime monitor for liveness
Platform-native metrics (Render, Fly, Vercel all give you basics)

Add tracing and Prometheus when you actually need them.
