I’ve spent a lot of time running Go applications in containers and Kubernetes. Over the years, certain approaches have proven themselves not just useful, but essential for creating systems that are robust, efficient, and easy to manage. The way Go works—as a compiled, static binary—offers unique advantages in a containerized world, but it also demands specific patterns to get the most out of it. Let’s walk through some of these approaches. Think of them as practical steps you can take to make your applications run better.
Let’s start at the beginning: building the container image itself. A common mistake is using the standard golang image as the final base. This creates a bloated image, often over 800MB, containing the entire Go toolchain that your application doesn’t need to run. The better way is a multi-stage build: one image compiles your application, and a second, minimal image runs it.
Here’s a more detailed example. Notice how we explicitly disable CGO and strip debugging symbols. This creates a truly static binary that can run on the most minimal base images.
```dockerfile
# Dockerfile
# Stage 1: The Builder
FROM golang:1.21-alpine AS builder
WORKDIR /app

# Copy dependency definitions first (for better layer caching)
COPY go.mod go.sum ./
RUN go mod download

# Copy the rest of the source code
COPY . .

# Build the application
RUN CGO_ENABLED=0 GOOS=linux go build -a -ldflags='-s -w -extldflags "-static"' -o server ./cmd/api

# Stage 2: The Runner
FROM scratch

# Copy the CA certificates from the builder stage if you make HTTPS calls
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

# Copy our static binary
COPY --from=builder /app/server /server

# The port your application listens on
EXPOSE 8080

# Run the binary
ENTRYPOINT ["/server"]
```
An image built this way can be under 10MB. This is significant. It means faster pod startup times, less network bandwidth for pulling images across your cluster, and a drastically reduced attack surface. There’s simply less stuff in there that could be vulnerable.
Once your application is in a container, it will have resource limits. This is where Go’s memory management needs a nudge. By default, the Go garbage collector (GC) runs when the heap grows by a percentage of the live heap remaining after the previous cycle, controlled by the GOGC variable (default 100). On an unconstrained host, this is fine. But inside a container with a strict memory limit, the heap can overshoot that limit between collections, and Kubernetes will kill your application for exceeding it.
The GOMEMLIMIT environment variable, introduced in Go 1.19, is a game-changer. It tells the runtime the soft memory limit it should target.
```yaml
# deployment.yaml excerpt
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: my-app
        image: my-app:v1
        env:
        - name: GOMEMLIMIT
          value: "384MiB" # Set to ~80-90% of your container memory limit
        resources:
          limits:
            memory: "400Mi"
            cpu: "500m"
          requests:
            memory: "256Mi"
            cpu: "100m"
```
By setting GOMEMLIMIT slightly below your container’s hard limit, you give the GC a clear goal. It will work more proactively to keep memory within that bound, reducing the risk of an Out Of Memory (OOM) kill. You trade a small amount of extra CPU for GC cycles for massively improved memory predictability.
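If you prefer to keep the limit in code rather than the manifest, the same target can be set programmatically with runtime/debug.SetMemoryLimit (also added in Go 1.19). A minimal sketch, where softLimit is a hypothetical helper and the 400MB hard limit is an assumed value matching the earlier excerpt:

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// softLimit derives a soft memory limit in bytes as a percentage of the
// container's hard limit, mirroring the GOMEMLIMIT guidance above.
func softLimit(hardLimitBytes, percent int64) int64 {
	return hardLimitBytes * percent / 100
}

func main() {
	// Assume a 400MB container limit; target 85% of it.
	limit := softLimit(400_000_000, 85)
	debug.SetMemoryLimit(limit) // equivalent to setting GOMEMLIMIT to this byte count
	fmt.Println("soft memory limit set to", limit, "bytes")
}
```

In Kubernetes, the environment-variable route is usually preferable: operators can tune the limit alongside the resource settings without rebuilding the image.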
Now, let’s talk about lifecycle. Kubernetes doesn’t just kill pods; it asks them to shut down politely first with a SIGTERM signal. If your application ignores this, it gets a SIGKILL after a grace period. A graceful shutdown means you stop accepting new connections, finish processing current requests, close database connections, and then exit.
Here is a more comprehensive example using an HTTP server and background workers. It shows how to coordinate the shutdown of multiple components.
```go
// shutdown.go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

func main() {
	// Create a server
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(2 * time.Second) // Simulate a long request
		w.Write([]byte("Request completed"))
	})
	server := &http.Server{
		Addr:    ":8080",
		Handler: mux,
	}

	// Channel to listen for shutdown signals
	shutdownSignal := make(chan os.Signal, 1)
	signal.Notify(shutdownSignal, syscall.SIGTERM, syscall.SIGINT)

	// Simulate a background worker
	ctx, cancel := context.WithCancel(context.Background())
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		ticker := time.NewTicker(10 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				log.Println("Background task executed")
			case <-ctx.Done():
				log.Println("Background worker shutting down...")
				// Perform cleanup here
				time.Sleep(1 * time.Second) // Simulate cleanup
				log.Println("Background worker stopped.")
				return
			}
		}
	}()

	// Start the server in a goroutine
	go func() {
		log.Println("Server starting on :8080")
		if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("Server error: %v", err)
		}
	}()

	// Block until we receive the shutdown signal
	sig := <-shutdownSignal
	log.Printf("Received signal: %v. Initiating graceful shutdown...", sig)

	// Give ongoing requests 25 seconds to finish
	shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 25*time.Second)
	defer shutdownCancel()

	// Stop accepting new connections and drain in-flight requests
	if err := server.Shutdown(shutdownCtx); err != nil {
		log.Printf("HTTP server shutdown error: %v", err)
	}

	// Cancel the background worker context
	cancel()
	// Wait for the worker to finish cleanup
	wg.Wait()
	log.Println("Graceful shutdown complete.")
}
```
This pattern ensures no user requests are dropped mid-process during a rolling update or a scale-down event.
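One caveat: the in-app 25-second timeout only helps if Kubernetes waits at least that long before escalating to SIGKILL. The default grace period is 30 seconds; if you lengthen the in-app timeout, raise terminationGracePeriodSeconds to match. A sketch of the relevant deployment fields, with names assumed to match the earlier excerpt:

```yaml
# deployment.yaml excerpt
spec:
  template:
    spec:
      # Must comfortably exceed the application's 25s shutdown timeout
      terminationGracePeriodSeconds: 30
      containers:
      - name: my-app
        image: my-app:v1
```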
Kubernetes needs to know if your application is alive and ready to serve traffic. This is done through probes. Getting these right is critical. A misconfigured liveness probe can cause a restart loop. A misconfigured readiness probe can cause outages during deployment.
I define them as separate HTTP endpoints with different semantics. The liveness check should be a simple, low-cost operation that verifies the process is running. The readiness check is more thorough; it should verify connections to all critical dependencies (databases, caches, other services).
```go
// probes.go
package main

import (
	"database/sql"
	"net/http"
	"sync/atomic"
)

var (
	isReady atomic.Value
	db      *sql.DB // Assume this is initialized elsewhere
)

func init() {
	isReady.Store(false)
}

func livenessHandler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
	w.Write([]byte("OK"))
}

func readinessHandler(w http.ResponseWriter, r *http.Request) {
	if !isReady.Load().(bool) {
		w.WriteHeader(http.StatusServiceUnavailable)
		w.Write([]byte("Service not ready"))
		return
	}
	// Check database connectivity
	if err := db.PingContext(r.Context()); err != nil {
		w.WriteHeader(http.StatusServiceUnavailable)
		w.Write([]byte("Database unavailable"))
		return
	}
	w.WriteHeader(http.StatusOK)
	w.Write([]byte("Ready"))
}

// Call this function when your app has finished initialization
func setReady() {
	isReady.Store(true)
}
```
The corresponding Kubernetes configuration is just as important. You must set appropriate initialDelaySeconds, periodSeconds, and timeoutSeconds. For applications with slow startup (like those loading large caches), a startupProbe is invaluable to give them time to become ready without being killed.
```yaml
# deployment.yaml probes section
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 2
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 1
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 0
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 30 # Allow up to 150 seconds (30 * 5) to start
```
Configuration in a Kubernetes environment should be dynamic and environment-specific. I avoid baking config into the binary. The twelve-factor app methodology got this right: use environment variables. For Go, this is straightforward with os.Getenv, but using a library like github.com/spf13/viper or github.com/kelseyhightower/envconfig provides structure, validation, and the ability to fall back to config files for local development.
```go
// config.go
package config

import (
	"log"
	"os"
	"strconv"
)

type Config struct {
	ServerPort string
	DBHost     string
	DBPort     string
	CacheURL   string
	DebugMode  bool
}

func Load() *Config {
	cfg := &Config{
		ServerPort: getEnv("PORT", "8080"),
		DBHost:     getEnv("DB_HOST", ""), // required: no default, so validation below can catch it
		DBPort:     getEnv("DB_PORT", "5432"),
		CacheURL:   getEnv("CACHE_URL", "redis://localhost:6379"),
		DebugMode:  getEnvAsBool("DEBUG", false),
	}
	// Validation
	if cfg.DBHost == "" {
		log.Fatal("DB_HOST environment variable is required")
	}
	return cfg
}

func getEnv(key, defaultValue string) string {
	if value, exists := os.LookupEnv(key); exists {
		return value
	}
	return defaultValue
}

func getEnvAsBool(key string, defaultValue bool) bool {
	strVal := getEnv(key, "")
	if val, err := strconv.ParseBool(strVal); err == nil {
		return val
	}
	return defaultValue
}
```
Then, in Kubernetes, you inject configuration via the env field in your pod spec or, for more substantial configuration, via ConfigMaps and Secrets mounted as volumes or environment variables.
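As a sketch, assuming a ConfigMap named app-config and a Secret named app-secrets already exist, the container spec might look like this:

```yaml
# deployment.yaml excerpt
containers:
- name: my-app
  image: my-app:v1
  envFrom:
  - configMapRef:
      name: app-config # bulk-inject every key as an environment variable
  env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: app-secrets
        key: db-password
```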
For logging, forget about writing to log files inside the container. Containers are ephemeral. Write to standard output (stdout) and standard error (stderr). Kubernetes’ kubelet collects these streams. But writing plain text lines makes analysis hard. Instead, write structured logs, like JSON. Every log aggregator (Fluentd, Logstash, Loki) understands JSON.
```go
// logger.go
package main

import (
	"errors"
	"os"
	"time"

	"github.com/rs/zerolog"
	"github.com/rs/zerolog/log"
)

func main() {
	// Use zerolog for simple, structured JSON logging to stdout
	zerolog.TimeFieldFormat = zerolog.TimeFormatUnix

	// Pretty-print for local development
	if os.Getenv("ENV") == "development" {
		log.Logger = log.Output(zerolog.ConsoleWriter{Out: os.Stderr})
	}

	log.Info().
		Str("service", "api").
		Str("endpoint", "/users").
		Int("user_id", 12345).
		Dur("duration_ms", 120*time.Millisecond).
		Msg("request processed")

	// For errors with stack traces (avoid in production for simple errors)
	someError := errors.New("connection refused") // placeholder error for the example
	log.Error().Err(someError).Str("operation", "db_insert").Msg("database operation failed")
}
```
This produces a log line like {"level":"info","service":"api","endpoint":"/users","user_id":12345,"duration_ms":120,"time":1649287312,"message":"request processed"} which can be easily filtered and queried.
A Go-specific optimization involves GOMAXPROCS. This controls the maximum number of operating system threads that can execute Go code simultaneously. In a container with a CPU limit (e.g., 500m, or half a core), the Go runtime might see the host’s total CPUs, not the limit. This can lead to excessive thread creation and context switching. The automaxprocs library solves this automatically.
```go
// main.go
package main

import (
	_ "go.uber.org/automaxprocs" // adjusts GOMAXPROCS on import
)

func main() {
	// automaxprocs has already adjusted runtime.GOMAXPROCS()
	// based on the container's CPU limit.
	// Start your application as normal.
}
```
For microservices, understanding how a request flows through your system is non-negotiable. Distributed tracing provides this visibility. Integrating OpenTelemetry into a Go HTTP service is now quite simple.
```go
// tracing.go
package main

import (
	"context"
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/jaeger"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)

func initTracing(serviceName string) (*sdktrace.TracerProvider, error) {
	exporter, err := jaeger.New(jaeger.WithCollectorEndpoint(jaeger.WithEndpoint("http://jaeger-collector:14268/api/traces")))
	if err != nil {
		return nil, err
	}
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
		sdktrace.WithResource(resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceName(serviceName),
		)),
	)
	otel.SetTracerProvider(tp)
	otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator(propagation.TraceContext{}, propagation.Baggage{}))
	return tp, nil
}

func main() {
	tp, err := initTracing("my-go-service")
	if err != nil {
		panic(err)
	}
	defer tp.Shutdown(context.Background())

	// Wrap your HTTP handler with OTel instrumentation
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Hello, traced world!"))
	})
	wrappedHandler := otelhttp.NewHandler(handler, "root")
	http.Handle("/", wrappedHandler)
	http.ListenAndServe(":8080", nil)
}
```
This instruments your HTTP server and client to propagate trace headers automatically, creating a visual timeline of requests across your cluster.
Finally, embrace statelessness. In a world where pods can be terminated or rescheduled at any moment, storing anything meaningful locally is a path to data loss and user frustration. Use external, durable stores for sessions, caches, and uploads. This principle is what allows Kubernetes to scale your application horizontally by simply adding more identical pods.
An often-overlooked helper is the init container. Your main application might depend on something being prepared first. Perhaps a configuration file needs to be downloaded from a secure vault, or database migrations must run. An init container runs to completion before your main container starts.
```yaml
# deployment.yaml with init container
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      initContainers:
      - name: run-migrations
        image: my-app-migrator:v1 # An image with your migration tooling
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: app-secrets
              key: database-url
        command: ['sh', '-c', 'migrate -path /migrations -database $DATABASE_URL up']
      containers:
      - name: main-app
        image: my-app:v1
        # ... main app spec
```
This cleanly separates the concern of setup from the concern of serving. If the migration fails, the pod won’t start, which is the correct behavior.
Each of these patterns addresses a specific friction point between Go’s design and the container orchestration environment. They are not theoretical; they are the accumulated result of dealing with production issues—memory leaks that weren’t leaks, pods stuck in crash loops during deployments, and debugging sessions that took hours instead of minutes because logs were unstructured. By applying them, you shift the odds in your favor. Your applications become predictable citizens of the cluster, easier to scale, observe, and maintain. That’s the ultimate goal: software that runs reliably, night and day, while you sleep.