Observability
What Your Agent Inherits
Every HTTP request your agent handles is automatically traced with OpenTelemetry spans, counted by Prometheus metrics, and logged in structured JSON with request ID correlation. Your agent never has to configure an exporter, register a metric, or wire up a log filter. It focuses on business logic while the chassis takes care of observability.
Three systems work in concert:
- Distributed tracing propagates context across service boundaries, letting you follow a single request through the FastAPI app, its database queries, and any outbound HTTP calls.
- Prometheus metrics count requests, measure latencies, and track body sizes. Everything is exposed on `/metrics` for any scraper to collect.
- Structured logging emits every log record as JSON (or human-readable text in development) with `request_id` and `correlation_id` fields injected automatically through Python context variables.
The payoff is that when something goes wrong in production, the operator can correlate a trace, a metric spike, and a log line back to the same request without touching a single line of application code.
Distributed Tracing
The tracing subsystem sets up a global TracerProvider with the application’s service metadata and ships spans over OTLP to any compatible collector, whether that is Jaeger, Tempo, Honeycomb, or Datadog.
Provider Configuration
```python
def configure_tracing(settings: Settings) -> None:
    """Configure the global OpenTelemetry tracer provider once."""
    global _provider_configured, _httpx_instrumented

    if not settings.otel_enabled or _provider_configured:
        return

    provider = TracerProvider(
        resource=Resource.create(
            {
                "service.name": settings.otel_service_name,
                "service.version": settings.otel_service_version,
                "deployment.environment": settings.otel_environment,
            }
        )
    )
    exporter = OTLPSpanExporter(
        endpoint=settings.otel_exporter_otlp_endpoint,
        headers=_parse_headers(settings.otel_exporter_otlp_headers),
    )
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)
    _provider_configured = True

    if not _httpx_instrumented:
        HTTPXClientInstrumentor().instrument()
        _httpx_instrumented = True
```

Key design decisions:
- Resource attributes like `service.name`, `service.version`, and `deployment.environment` tag every span, so you can filter traces by service and environment in your collector UI.
- BatchSpanProcessor buffers spans and ships them in batches. This keeps per-request overhead negligible.
- HTTPX instrumentation is global and one-shot: all outbound HTTP calls made through `httpx` automatically create child spans.
- Guard flag (`_provider_configured`) prevents double-initialization if `create_app()` runs multiple times in tests.
FastAPI Auto-Instrumentation
```python
def instrument_fastapi_app(app: Any, settings: Settings) -> None:
    """Attach FastAPI instrumentation to an application instance."""
    if not settings.otel_enabled:
        return

    FastAPIInstrumentor.instrument_app(
        app,
        excluded_urls=",".join(
            [settings.health_check_path, settings.readiness_check_path, "/metrics", "/favicon.ico"]
        ),
    )
```

The instrumentor automatically wraps every route handler with span creation. Health checks, readiness probes, the metrics endpoint, and favicon requests are excluded so they don't pollute your traces with infrastructure noise.
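Conceptually, `excluded_urls` is a comma-separated list of patterns that incoming URLs are checked against before a span is created. The sketch below is a simplified stand-in for that matching (the real logic lives inside the OpenTelemetry HTTP utilities and compiles each entry as a regular expression):

```python
import re

def build_exclude_matcher(excluded_urls: str):
    """Return a predicate that reports whether a URL should skip tracing.

    Simplified approximation of OpenTelemetry's excluded-URL handling:
    split the comma-separated string into patterns and match any of them.
    """
    patterns = [re.compile(p.strip()) for p in excluded_urls.split(",") if p.strip()]

    def is_excluded(url: str) -> bool:
        return any(p.search(url) for p in patterns)

    return is_excluded


# Hypothetical paths standing in for the settings values:
is_excluded = build_exclude_matcher("/healthz,/readyz,/metrics,/favicon.ico")
```

With this matcher, `/metrics` is skipped while a business route like `/items/42` is traced normally.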
Database Tracing
```python
def instrument_database_engine(engine: AsyncEngine, settings: Settings) -> None:
    """Attach SQLAlchemy tracing to the engine when enabled."""
    if not settings.otel_enabled:
        return

    SQLAlchemyInstrumentor().instrument(engine=engine.sync_engine)
```

Database queries show up as child spans beneath the HTTP request span. This gives you timing and statement-level visibility for every query the agent executes.
Builder Integration
```python
def setup_tracing(self) -> Self:
    """Configure OpenTelemetry tracing for the application."""
    configure_tracing(self.settings)
    instrument_fastapi_app(self.app, self.settings)
    self.logger.info(
        "Tracing %s",
        (
            "configured successfully"
            if self.settings.otel_enabled
            else "disabled by configuration"
        ),
    )
    return self
```

The builder's `setup_tracing()` method calls both `configure_tracing()` and `instrument_fastapi_app()` as one step in the build chain. Tracing is off by default (`APP_OTEL_ENABLED=false`), and you can activate it with a single environment variable.
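The `return self` convention is what makes the build chain read fluently. A minimal stdlib sketch of the pattern, with illustrative names rather than the actual chassis API:

```python
class MiniBuilder:
    """Toy builder demonstrating the fluent 'return self' chaining style."""

    def __init__(self) -> None:
        self.steps: list[str] = []

    def setup_tracing(self) -> "MiniBuilder":
        self.steps.append("tracing")  # stands in for configure_tracing(...)
        return self

    def setup_metrics(self) -> "MiniBuilder":
        self.steps.append("metrics")  # stands in for metrics middleware setup
        return self

    def build(self) -> list[str]:
        return self.steps


# Each setup_* call returns the builder, so steps compose in one expression:
app_steps = MiniBuilder().setup_tracing().setup_metrics().build()
```

Because every step returns the builder, steps can be reordered, added, or skipped without restructuring the call site.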
Prometheus Metrics
The chassis exposes request-level metrics at `/metrics` through `starlette-exporter`, a lightweight Prometheus middleware built for ASGI applications.
```python
def setup_metrics(self) -> Self:
    """Configure Prometheus metrics collection."""
    if not self.settings.metrics_enabled:
        self.logger.info("Metrics collection disabled by configuration")
        return self

    try:
        from prometheus_client import REGISTRY, Info
        from starlette_exporter import PrometheusMiddleware, handle_metrics
        from starlette_exporter.optional_metrics import request_body_size, response_body_size

        with contextlib.suppress(KeyError):
            REGISTRY.unregister(REGISTRY._names_to_collectors["fastapi_app_info_info"])

        app_info = Info("fastapi_app_info", "FastAPI application information")
        app_info.info(
            {
                "app_name": self.settings.app_name,
                "app_version": self.settings.app_version,
                "python_version": platform.python_version(),
                "fastapi_version": fastapi.__version__,
            }
        )

        self.app.add_middleware(
            PrometheusMiddleware,
            app_name=self.settings.app_name,
            prefix=self.settings.metrics_prefix,
            group_paths=False,
            optional_metrics=[response_body_size, request_body_size],
            skip_paths=[
                self.settings.health_check_path,
                self.settings.readiness_check_path,
                METRICS_PATH,
            ],
            skip_methods=["OPTIONS"],
        )
        self.app.add_route(METRICS_PATH, handle_metrics)
        self.logger.info("Prometheus metrics configured successfully")
    except ImportError:
        self.logger.warning(
            "Prometheus dependencies not installed. "
            "Install with: pip install prometheus-client starlette-exporter"
        )
    except Exception as exc:
        self.logger.exception("Failed to configure metrics: %s", exc)
        raise

    return self
```

Here is what this provides automatically:
- Request count and latency histograms broken down by method, path, and status code.
- Request and response body size tracking through optional metrics.
- Application info gauge that exports the app name, version, Python version, and FastAPI version as labels.
- Noise filtering. Health checks, readiness probes, the metrics endpoint itself, and `OPTIONS` preflight requests are all excluded from collection.
Just like tracing, metrics are off by default (`APP_METRICS_ENABLED=false`) and require no code changes to activate.
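What a scraper actually receives from `/metrics` is the Prometheus text exposition format. The sample payload below is illustrative (real metric names depend on your `metrics_prefix` setting), and the helper is a minimal stdlib parser for that shape:

```python
# Illustrative /metrics payload in Prometheus text exposition format.
SAMPLE = """\
# HELP app_requests_total Total HTTP requests
# TYPE app_requests_total counter
app_requests_total{method="GET",path="/items",status_code="200"} 42
app_requests_total{method="POST",path="/items",status_code="201"} 7
"""

def parse_samples(text: str) -> dict[str, float]:
    """Map each non-comment sample line to its numeric value."""
    out: dict[str, float] = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        name_labels, value = line.rsplit(" ", 1)
        out[name_labels] = float(value)
    return out


samples = parse_samples(SAMPLE)
```

Each sample line carries the metric name, its label set, and the current value, which is all a Prometheus server needs to build rate and latency dashboards.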
Structured Logging
The chassis supports two log formats: JSON for production, which is machine-parseable and compatible with any log aggregator, and text for local development, which is human-readable with color support. A single environment variable controls the format: `APP_LOG_FORMAT=json|text`.
Request Context Injection
Every log record emitted during a request includes `request_id` and `correlation_id` fields, even though application code never passes them explicitly. The mechanism behind this is Python's `contextvars` module:
```python
from contextvars import ContextVar, Token

_request_id: ContextVar[str] = ContextVar("request_id", default="-")
_correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")

type RequestContextTokens = tuple[Token[str], Token[str]]


def get_request_id() -> str:
    """Return the request ID for the current context, or '-' when absent."""
    return _request_id.get()


def get_correlation_id() -> str:
    """Return the correlation ID for the current context, or '-' when absent."""
    return _correlation_id.get()


def set_request_context(request_id: str, correlation_id: str) -> RequestContextTokens:
    """Set request-scoped tracing IDs and return reset tokens."""
    return _request_id.set(request_id), _correlation_id.set(correlation_id)


def reset_request_context(tokens: RequestContextTokens) -> None:
    """Reset tracing context to the previous values."""
    request_id_token, correlation_id_token = tokens
    _request_id.reset(request_id_token)
    _correlation_id.reset(correlation_id_token)
```

The `RequestIDMiddleware` (covered in the Middleware chapter) calls `set_request_context()` at the start of every request and `reset_request_context()` once the response completes. Since `ContextVar` is async-safe, the correct IDs propagate naturally through `await` chains, background tasks, and database callbacks, with no thread-local hacks required.
JSON Log Configuration
```python
def configure_root_logging(settings: Settings) -> None:
    """Bootstrap the root logger with the format specified in settings."""
    level = getattr(logging, settings.log_level.upper(), logging.INFO)
    root = logging.getLogger()
    root.handlers.clear()
    root.setLevel(level)

    handler = logging.StreamHandler(sys.stdout)
    handler.setLevel(level)
    handler.addFilter(RequestContextFilter())

    if settings.log_format == "json":
        from pythonjsonlogger.json import JsonFormatter

        handler.setFormatter(
            JsonFormatter(
                fmt=(
                    "%(asctime)s %(levelname)s %(name)s"
                    " %(request_id)s %(correlation_id)s %(message)s"
                ),
                datefmt="%Y-%m-%dT%H:%M:%S",
                rename_fields={
                    "asctime": "timestamp",
                    "levelname": "level",
                    "name": "logger",
                },
            )
        )
    else:
        handler.setFormatter(
            logging.Formatter(
                fmt=settings.log_text_template,
                datefmt=settings.log_date_format,
            )
        )

    root.addHandler(handler)
```

The `RequestContextFilter` pulls `get_request_id()` and `get_correlation_id()` from the context variables and injects them into every log record. In JSON mode, the output looks like this:
```json
{
  "timestamp": "2026-03-08T14:22:10",
  "level": "INFO",
  "logger": "app.routes.items",
  "request_id": "a1b2c3d4",
  "correlation_id": "x9y8z7w6",
  "message": "Created item id=42"
}
```

In text mode, the same request produces:
```
2026-03-08 14:22:10 | INFO | app.routes.items | request_id=a1b2c3d4 | correlation_id=x9y8z7w6 | items:create:55 | Created item id=42
```

The Three Pillars Together
Traces, metrics, and logs are independent subsystems, but the `request_id` ties them all together:
- A trace captures the full request lifecycle, including the HTTP span, database query spans, and any outbound HTTP call spans, all sharing a single trace ID.
- A metric increments the request counter and records the latency in a histogram, tagged by method, path, and status code.
- A log line records application-level events with the `request_id` and `correlation_id` fields.
When a latency spike shows up in your Prometheus dashboard, you search for the corresponding `request_id` in your log aggregator. That same ID links to the trace in your tracing backend, where you can see exactly which database query or external call caused the delay. There is no manual instrumentation and no boilerplate, just the correlation the chassis provides out of the box.
Best Practices
- Always use the three-pillar approach: traces, metrics, and logs together. Each pillar answers different questions — traces show request flow, metrics show aggregate trends, logs show application-level events. The `request_id` ties them all together.
- Never include health checks, readiness probes, or metrics endpoints in telemetry collection. Infrastructure noise drowns out real application signals and inflates storage costs.
- Always use `BatchSpanProcessor` instead of `SimpleSpanProcessor` in production. Batch processing keeps per-request overhead negligible while simple processing blocks on every span export.
- Prefer structured JSON logging over unstructured text in production. JSON logs are machine-parseable, compatible with every log aggregator, and queryable by field name.
- Always propagate `request_id` and `correlation_id` through Python's `ContextVar`. Context variables are async-safe and propagate naturally through `await` chains without thread-local hacks.
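The batching recommendation is easy to see with a toy exporter: a simple processor pays one export call (in reality, a network round trip) per span, while a batch processor amortizes that cost across hundreds of spans. This is a stdlib illustration of the trade-off, not the real OpenTelemetry processors:

```python
class ToyExporter:
    """Counts export calls, each standing in for a network round trip."""

    def __init__(self) -> None:
        self.calls = 0

    def export(self, spans: list[str]) -> None:
        self.calls += 1


def simple_process(spans: list[str], exporter: ToyExporter) -> None:
    for span in spans:  # one (blocking) export per span
        exporter.export([span])


def batch_process(spans: list[str], exporter: ToyExporter, batch_size: int = 512) -> None:
    for i in range(0, len(spans), batch_size):  # one export per batch
        exporter.export(spans[i:i + batch_size])


spans = [f"span-{i}" for i in range(1000)]
simple, batch = ToyExporter(), ToyExporter()
simple_process(spans, simple)  # 1000 export calls
batch_process(spans, batch)    # 2 export calls
```

For 1000 spans, the simple strategy performs 1000 exports while the batch strategy performs 2, which is why `BatchSpanProcessor` keeps per-request overhead negligible in production.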
Further Reading
- OpenTelemetry Python Documentation
- Prometheus Best Practices — Instrumentation
- Google SRE Book — Monitoring Distributed Systems
- Python `contextvars` Documentation
What the Agent Never Implements
The chassis handles all observability plumbing, so the agent never needs to:
- Create or manage spans. The FastAPI and SQLAlchemy instrumentors handle the span lifecycle automatically.
- Register Prometheus metrics or expose a `/metrics` endpoint. The middleware creates histograms, counters, and gauges from request traffic.
- Configure log formatters or filters. JSON versus text output toggles with a single environment variable.
- Propagate request IDs through async code. `ContextVar` handles propagation transparently.
- Exclude infrastructure paths from telemetry. Health checks, readiness probes, and metrics endpoints are filtered by default.
- Bootstrap the tracing provider or exporter. Setting `APP_OTEL_ENABLED=true` activates the full OTLP pipeline.