Deployment & Infrastructure

Health Check

Last updated: February 16, 2026

A health check is an automated probe that verifies whether a service is running correctly and ready to accept traffic. It typically involves sending an HTTP request to a designated endpoint and evaluating the response status code. A healthy service returns a success code (usually 200), while an unhealthy one returns an error or fails to respond within a timeout.
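As a rough illustration of the probe described above, the following Python sketch sends one HTTP GET and treats any 2xx response within the timeout as healthy. The function name `is_healthy` and the default timeout are assumptions made for this example, not part of any particular gateway or orchestrator.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx responses;
        # refused connections and timeouts surface as OSError subclasses.
        return False
```

A caller might run `is_healthy("http://localhost:8080/health")`, where the host, port, and path are placeholders for wherever the service actually listens.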

Why It Matters

In production deployments of AI assistants, reliability is critical. Users expect prompt responses, and a crashed or stalled gateway process can silently break the entire system. Health checks provide the feedback loop that platforms like Docker, Kubernetes, and Railway use to detect failures and, depending on the platform, restart or replace unhealthy containers. Without them, a failed process may sit idle indefinitely while users see only connection errors.

How It Works

Health checks come in three common flavors. A liveness check confirms the process is running and not deadlocked. A readiness check verifies the service can handle requests, meaning all dependencies like database connections and model provider APIs are reachable. A startup check gives slow-starting services extra time to initialize before liveness probes begin.
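The difference between liveness and readiness can be sketched in a few lines of Python. This is a minimal illustration, assuming each dependency is represented by a zero-argument function that returns True when reachable; the names `liveness_status`, `readiness_status`, and the dependency labels are hypothetical. Startup checks are not modelled here.

```python
from typing import Callable, Dict, Tuple

# Hypothetical type for this sketch: a dependency check returns True when reachable.
DependencyCheck = Callable[[], bool]


def liveness_status() -> int:
    """If the process can run this function at all, it is alive: always 200."""
    return 200


def readiness_status(checks: Dict[str, DependencyCheck]) -> Tuple[int, Dict[str, bool]]:
    """Return (HTTP status, per-dependency results); 200 only if every check passes."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated as an unreachable dependency.
            results[name] = False
    status = 200 if all(results.values()) else 503
    return status, results
```

For example, `readiness_status({"database": lambda: True, "model_provider": lambda: False})` reports 503 along with which dependency failed, while `liveness_status()` still returns 200 because the process itself is responsive.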

For AI gateway deployments, readiness checks are especially important. The gateway may take several seconds to initialize, load configurations, and establish connections to messaging channels. Polling multiple endpoints such as /health, /, and the control UI path accounts for differences across gateway versions.
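One way to poll several candidate endpoints during startup is sketched below in Python. The function name `wait_until_ready`, its parameters, and the default values are assumptions for illustration; the default paths are the two named above, and the control UI path would need to be added by the caller since it varies by gateway.

```python
import http.client
import time
import urllib.request
from typing import Optional, Sequence


def wait_until_ready(
    base_url: str,
    paths: Sequence[str] = ("/health", "/"),
    deadline_s: float = 30.0,
    interval_s: float = 1.0,
    timeout_s: float = 2.0,
) -> Optional[str]:
    """Poll each path in turn until one returns 2xx; return that path, or None on deadline."""
    stop_at = time.monotonic() + deadline_s
    while True:
        for path in paths:
            try:
                with urllib.request.urlopen(base_url + path, timeout=timeout_s) as resp:
                    if 200 <= resp.status < 300:
                        return path
            except (OSError, http.client.HTTPException):
                # Covers HTTP 4xx/5xx, refused connections, and timeouts: try the next path.
                continue
        if time.monotonic() >= stop_at:
            return None
        time.sleep(interval_s)
```

Returning the path that answered, rather than a plain boolean, makes it easier to see in logs which endpoint a given gateway version actually exposes.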

In Practice

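Configure health checks in your Dockerfile with the HEALTHCHECK instruction, or define them in your orchestrator's configuration. Set sensible intervals (every 10-30 seconds), timeouts (5-10 seconds), and retry thresholds (3 consecutive failures before the service is treated as unhealthy). A Dockerfile HEALTHCHECK on its own only marks the container as unhealthy; acting on that status, for example by restarting the container, is left to the orchestrator or other tooling. Always log health check results to aid debugging when deployments fail to stabilize.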
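The retry threshold can be pictured as a counter of consecutive failures that resets on any success. The Python sketch below is an illustration of that idea only, not how any specific orchestrator implements it; the class name `FailureThreshold` and the logger name are hypothetical.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("healthcheck")  # hypothetical logger name for this sketch


@dataclass
class FailureThreshold:
    retries: int = 3               # consecutive failures tolerated before acting
    consecutive_failures: int = 0  # running count, reset by any success

    def record(self, healthy: bool) -> bool:
        """Record one probe result; return True once the threshold has been reached."""
        if healthy:
            self.consecutive_failures = 0
            log.info("health check passed")
        else:
            self.consecutive_failures += 1
            log.warning(
                "health check failed (%d/%d)", self.consecutive_failures, self.retries
            )
        return self.consecutive_failures >= self.retries
```

Because a single success resets the counter, an occasional slow response does not trigger a restart; only an unbroken run of failures does.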
