Metrics, Traces, Errors — Know What's Happening
Monitoring & Observability
Three pillars of observability — metrics, traces, logs. The tools you'll want from day one.
What you'll learn
- Capture errors with Sentry
- Emit metrics
- Add distributed tracing
You can’t fix what you can’t see. Three pillars of observability:
- Logs — what happened
- Metrics — how many, how fast
- Traces — what called what
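To make the first pillar concrete, here is a minimal sketch of a structured log line, assuming you write one JSON object per event to stdout and let your platform collect it. The `formatLogLine` helper and its field names are hypothetical and not part of any library.

```javascript
// Hypothetical helper: formats one log event as a single JSON line.
// Core keys are written last so caller-supplied fields cannot overwrite them.
function formatLogLine(level, message, fields = {}, now = new Date()) {
  return JSON.stringify({
    ...fields,
    time: now.toISOString(),
    level,
    message,
  });
}

// One line per event, written to stdout.
console.log(formatLogLine("info", "order created", { orderId: "o_123", durationMs: 42 }));
```

Keeping each event on one line with named fields makes it possible to filter by `level` or `orderId` later instead of grepping free text.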
Error Tracking — Sentry
The single most impactful tool to add to a new Node app.
npm install @sentry/node

import * as Sentry from "@sentry/node";

// Initialize once, as early as possible during startup
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 0.1, // 10% of requests are traced
});

// uncaught errors flow through automatically
// or explicitly:
Sentry.captureException(err);

Sentry records every uncaught exception, plus traces of slow operations, with stack traces, request context, and user info. You’ll often catch bugs before users report them.
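When you capture an error explicitly, it usually helps to attach context. The sketch below assumes the Sentry SDK's `withScope`, `setTag`, `setExtra`, `setUser`, and `captureException` calls. The `reportError` helper itself is hypothetical, and it takes the SDK object as a parameter so that it does not depend on how your app wires Sentry up.

```javascript
// Hypothetical helper: report a handled error with extra context.
// `sentry` is the object imported from "@sentry/node".
function reportError(sentry, err, { tags = {}, extra = {}, user = null } = {}) {
  // withScope keeps the tags and extras local to this one event
  sentry.withScope((scope) => {
    for (const [key, value] of Object.entries(tags)) scope.setTag(key, value);
    for (const [key, value] of Object.entries(extra)) scope.setExtra(key, value);
    if (user) scope.setUser(user);
    sentry.captureException(err);
  });
}

// Usage (chargeCustomer and order are placeholders for your own code):
// try {
//   await chargeCustomer(order);
// } catch (err) {
//   reportError(Sentry, err, { tags: { feature: "checkout" }, extra: { orderId: order.id } });
//   throw err;
// }
```

Tags are indexed and searchable in Sentry, so they suit low-cardinality values such as a feature name. Extras are better for one-off details such as an order ID.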
Metrics
Typical metrics: request count, error rate, p95 latency, queue depth, DB pool size. Common stacks:
- Prometheus + Grafana — standard for self-hosted
- Datadog, New Relic — hosted, batteries included
- Cloud-native — CloudWatch, Google Cloud Monitoring
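Since "p95 latency" appears in the list of typical metrics, here is a small sketch of what it means, using the nearest-rank method over raw samples. The sample values are invented for illustration. Monitoring stacks generally estimate percentiles from histogram buckets rather than sorting raw samples like this.

```javascript
// Nearest-rank percentile: the smallest sample such that
// at least p percent of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Ten made-up request latencies in milliseconds, one of them slow.
const latenciesMs = [12, 15, 11, 240, 14, 13, 16, 18, 17, 19];

console.log(percentile(latenciesMs, 50)); // 15
console.log(percentile(latenciesMs, 95)); // 240
```

The median looks healthy while p95 exposes the slow request, which is why tail percentiles are tracked alongside averages.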
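npm install prom-client

import express from "express";
import client from "prom-client";

const app = express();

// Histogram of request durations in seconds, labeled by method, route, and status
const httpDuration = new client.Histogram({
  name: "http_request_duration_seconds",
  help: "HTTP request duration in seconds",
  labelNames: ["method", "route", "status"],
});

app.use((req, res, next) => {
  const end = httpDuration.startTimer({ method: req.method });
  res.on("finish", () => {
    // Label with the matched route pattern (e.g. "/users/:id") rather than the raw path,
    // so each distinct URL does not create its own time series
    end({ route: req.route?.path ?? "unmatched", status: res.statusCode });
  });
  next();
});

// Expose all registered metrics in the Prometheus text format
app.get("/metrics", async (req, res) => {
  res.set("Content-Type", client.register.contentType);
  res.end(await client.register.metrics());
});

Scrape /metrics with Prometheus or whichever stack you use.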
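A Prometheus-style histogram does not store individual durations. As a rough sketch of the data model, assuming cumulative buckets with upper bounds in seconds: each observation increments every bucket whose bound it fits under, plus a running sum and count. The `TinyHistogram` class and its bucket bounds are hypothetical and only illustrate the idea; prom-client's real implementation differs in detail.

```javascript
// Hypothetical, simplified model of a cumulative-bucket histogram.
class TinyHistogram {
  constructor(upperBounds) {
    this.upperBounds = [...upperBounds].sort((a, b) => a - b);
    this.bucketCounts = this.upperBounds.map(() => 0);
    this.sum = 0;
    this.count = 0;
  }

  observe(seconds) {
    // Cumulative: a 0.05s observation counts toward the 0.1, 0.5, and 1 buckets
    this.upperBounds.forEach((bound, i) => {
      if (seconds <= bound) this.bucketCounts[i] += 1;
    });
    this.sum += seconds;
    this.count += 1;
  }

  snapshot() {
    const buckets = {};
    this.upperBounds.forEach((bound, i) => {
      buckets[`le=${bound}`] = this.bucketCounts[i];
    });
    buckets["le=+Inf"] = this.count; // the catch-all bucket always equals the total count
    return { buckets, sum: this.sum, count: this.count };
  }
}

const histogram = new TinyHistogram([0.1, 0.5, 1]);
[0.05, 0.3, 0.7, 2.0].forEach((seconds) => histogram.observe(seconds));
console.log(histogram.snapshot().buckets);
// { 'le=0.1': 1, 'le=0.5': 2, 'le=1': 3, 'le=+Inf': 4 }
```

Because only bucket counts are kept, memory stays constant no matter how many requests you serve, at the cost of percentiles being estimates whose accuracy depends on the bucket bounds you choose.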
Tracing
Distributed tracing shows the path of a request across services:
[ingress 5ms] → [my Node 80ms]
                 ├─ [Postgres 30ms]
                 └─ [Redis 2ms]
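One way to read the diagram: each box is a span with a duration and a parent. The sketch below uses a hypothetical span shape (not the OpenTelemetry data model) to work out how much time the Node service spent in its own code rather than waiting on its children. It assumes the Postgres and Redis calls ran one after another; calls made in parallel would overlap.

```javascript
// Hypothetical span records mirroring the Node part of the diagram.
const spans = [
  { id: "node", parentId: null, name: "my Node", durationMs: 80 },
  { id: "pg", parentId: "node", name: "Postgres", durationMs: 30 },
  { id: "redis", parentId: "node", name: "Redis", durationMs: 2 },
];

// Self time = a span's own duration minus the time spent in its direct children.
function selfTimes(spanList) {
  const childTotal = new Map();
  for (const span of spanList) {
    if (span.parentId !== null) {
      childTotal.set(span.parentId, (childTotal.get(span.parentId) ?? 0) + span.durationMs);
    }
  }
  return spanList.map((span) => ({
    name: span.name,
    selfMs: span.durationMs - (childTotal.get(span.id) ?? 0),
  }));
}

console.log(selfTimes(spans));
// my Node: 48, Postgres: 30, Redis: 2
```

Here 48 of the 80 ms are spent inside the Node process itself, so the database is not the first place to look for a speedup.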
The standard: OpenTelemetry.
npm install @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node

Auto-instrumentation hooks Express, fetch, pg, ioredis, and dozens more out of the box. Ship traces to any OTel-compatible backend (Jaeger, Honeycomb, Datadog, Grafana Tempo).
Uptime Monitoring
External “is the site up?” checks — completely separate from your infra (so you find out when your infra is on fire).
- BetterStack / Better Uptime
- UptimeRobot
- Pingdom
Have the monitor hit your /healthz endpoint every 30s and page you when it fails.
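A minimal sketch of what /healthz might return, assuming an Express-style `res` object. The `buildHealthReport` and `healthzHandler` names, the `getChecks` function, and the dependency names are all hypothetical; how you actually test each dependency is up to your app.

```javascript
// Turns a map of dependency checks (name -> boolean) into an HTTP status and body.
function buildHealthReport(checks) {
  const failing = Object.keys(checks).filter((name) => !checks[name]);
  const healthy = failing.length === 0;
  return {
    statusCode: healthy ? 200 : 503,
    body: { status: healthy ? "ok" : "degraded", failing },
  };
}

// Express-style handler factory; `getChecks` is a function you supply
// that returns the current state of each dependency.
function healthzHandler(getChecks) {
  return (req, res) => {
    const report = buildHealthReport(getChecks());
    res.status(report.statusCode).json(report.body);
  };
}

// Wiring, assuming an Express `app`:
// app.get("/healthz", healthzHandler(() => ({ db: true, redis: true })));
```

Returning a non-2xx status when a dependency is down matters because most uptime monitors decide pass or fail from the status code rather than from the response body.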
The Pragmatic Minimum
For a small project, start with:
- Sentry for errors (free tier is generous)
- Uptime monitor for liveness
- Platform-native metrics (Render, Fly, Vercel all give you basics)
Add tracing and Prometheus when you actually need them.