Agent Overview

The PMP4PG Agent is a lightweight binary written in Go, deployed directly on each PostgreSQL server you want to monitor. It is the data collection layer of the PMP4PG platform.


Role of the Agent

The agent runs as a background service on the monitored server. It:

  1. Connects locally to the PostgreSQL instance
  2. Collects metrics from PostgreSQL internal views at configured intervals
  3. Sends the collected data to the PMP4PG Backend via REST API
  4. Manages its own registration and authentication with the central platform

The agent requires no changes to your PostgreSQL configuration beyond creating a read-only monitoring user and enabling pg_stat_statements.


What the Agent Collects

| Collector | Source | Frequency | Purpose |
| --- | --- | --- | --- |
| ASH Samples | pg_stat_activity | Every 2 seconds | Real-time session activity |
| Statements Delta | pg_stat_statements | Every minute | Query-level performance (delta) |
| Database Delta | pg_stat_database | Every minute | Database throughput (delta) |
| Database Size | pg_database_size() | Every 10 minutes | Size tracking per database |
| OS Metrics | System | Every minute | CPU, memory, disk context |
| Server Snapshot | Multiple sources | Every 30 minutes | AWR snapshot base data |
| Heartbeat | Internal | Every 30 seconds | Agent liveness signal |
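As a rough illustration, the schedule in the table above could be expressed as data. This is only a sketch: the `Collector` struct, its field names, and the `RunsPerHour` helper are hypothetical and are not taken from the agent's source code.

```go
package main

import (
	"fmt"
	"time"
)

// Collector pairs a collector name with its source and sampling interval.
// The struct and its field names are hypothetical.
type Collector struct {
	Name     string
	Source   string
	Interval time.Duration
}

// defaultSchedule mirrors the collector table in this section.
func defaultSchedule() []Collector {
	return []Collector{
		{"ASH Samples", "pg_stat_activity", 2 * time.Second},
		{"Statements Delta", "pg_stat_statements", time.Minute},
		{"Database Delta", "pg_stat_database", time.Minute},
		{"Database Size", "pg_database_size()", 10 * time.Minute},
		{"OS Metrics", "System", time.Minute},
		{"Server Snapshot", "Multiple sources", 30 * time.Minute},
		{"Heartbeat", "Internal", 30 * time.Second},
	}
}

// RunsPerHour returns how many times a collector fires in one hour.
func RunsPerHour(c Collector) int {
	return int(time.Hour / c.Interval)
}

func main() {
	for _, c := range defaultSchedule() {
		fmt.Printf("%-17s %-20s every %-6v (%d runs/hour)\n",
			c.Name, c.Source, c.Interval, RunsPerHour(c))
	}
}
```

At a 2-second interval, ASH sampling fires 1800 times per hour, which is why it dominates the agent's collection activity.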

How It Works

Delta-Based Collection

For cumulative counters (such as those in pg_stat_statements and pg_stat_database), the agent computes the delta between two consecutive readings and sends only the difference, not the absolute values. This ensures that AWR reports reflect activity during the measured period, not cumulative totals since the statistics were last reset.
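The delta computation can be sketched as follows. The function names are hypothetical, and the handling of a counter that goes backwards (treating the current value as the delta) is an assumption about a reasonable policy, not a description of the agent's actual behavior.

```go
package main

import "fmt"

// Delta returns cur - prev for a cumulative counter. If the counter went
// backwards (for example after a statistics reset), the current value is
// used as the delta. This reset policy is an assumption.
func Delta(prev, cur int64) int64 {
	if cur < prev {
		return cur
	}
	return cur - prev
}

// DeltaMap computes per-key deltas between two readings, keyed by a
// hypothetical identifier such as a query ID. Keys missing from prev are
// treated as new and reported in full; keys with no change are omitted.
func DeltaMap(prev, cur map[string]int64) map[string]int64 {
	out := make(map[string]int64)
	for k, c := range cur {
		if d := Delta(prev[k], c); d != 0 {
			out[k] = d
		}
	}
	return out
}

func main() {
	prev := map[string]int64{"q1": 100, "q2": 40}
	cur := map[string]int64{"q1": 130, "q2": 40, "q3": 5}
	fmt.Println(DeltaMap(prev, cur)) // prints map[q1:30 q3:5]
}
```

Omitting unchanged keys keeps the payload proportional to the activity in the interval rather than to the total number of tracked statements.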

Gzip Compression

All payloads sent to the backend are compressed with gzip, which keeps network traffic low even with high-frequency ASH sampling.
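A minimal sketch of gzip-compressing a JSON payload with Go's standard library is shown below. The `Sample` type and its fields are hypothetical, and the use of JSON as the payload encoding is an assumption; only the `compress/gzip` and `encoding/json` calls are standard.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
)

// Sample is a hypothetical ASH sample; the real payload schema may differ.
type Sample struct {
	PID   int    `json:"pid"`
	State string `json:"state"`
	Query string `json:"query"`
}

// CompressJSON marshals v to JSON and gzips the result.
func CompressJSON(v any) ([]byte, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return nil, err
	}
	// Close flushes pending data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// Decompress reverses the gzip step; included to check the round trip.
func Decompress(b []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(b))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	samples := make([]Sample, 500)
	for i := range samples {
		samples[i] = Sample{PID: 1000 + i, State: "active", Query: "SELECT * FROM orders WHERE id = $1"}
	}
	raw, _ := json.Marshal(samples)
	packed, err := CompressJSON(samples)
	if err != nil {
		panic(err)
	}
	fmt.Printf("raw=%d bytes, gzip=%d bytes\n", len(raw), len(packed))
}
```

Session samples tend to repeat the same states and query texts, so they usually compress well. An HTTP client sending such a body would typically also set a `Content-Encoding: gzip` header so the receiver knows to decompress it.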

Retry with Exponential Backoff

If the backend is temporarily unreachable, the agent retries automatically with exponential backoff, so that data collected during short network interruptions is not lost.
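Retry with exponential backoff can be sketched like this. The function names, the base delay, the cap, and the attempt limit are all illustrative assumptions; the agent's real parameters are not documented here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrUnreachable is a hypothetical error used to simulate a failed send.
var ErrUnreachable = errors.New("backend unreachable")

// BackoffDelay returns the wait before retry number attempt (0-based):
// base doubled attempt times, capped at max.
func BackoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// Retry calls send until it succeeds or maxAttempts calls have been made,
// sleeping with exponential backoff between failures. It returns the
// number of calls made and the last error (nil on success).
func Retry(maxAttempts int, base, max time.Duration, send func() error) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = send(); err == nil {
			return attempt + 1, nil
		}
		if attempt < maxAttempts-1 {
			time.Sleep(BackoffDelay(attempt, base, max))
		}
	}
	return maxAttempts, err
}

func main() {
	calls := 0
	// Simulated backend that fails twice, then accepts the payload.
	send := func() error {
		calls++
		if calls < 3 {
			return ErrUnreachable
		}
		return nil
	}
	n, err := Retry(5, 10*time.Millisecond, time.Second, send)
	fmt.Println("attempts:", n, "error:", err)
}
```

Capping the delay keeps the agent from waiting arbitrarily long after an extended outage. Production implementations often add random jitter to the delay so that many agents do not retry at the same instant; that is omitted here for brevity.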

Worker Pool Architecture

The agent uses a concurrent worker pool to handle metric collection and sending in parallel, without blocking the ASH sampling loop.
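One common way to build such a pool in Go is a buffered channel drained by a fixed number of goroutines. The sketch below is an assumption about the general shape, not the agent's actual code; `Job`, `RunPool`, and `TrySubmit` are hypothetical names. The non-blocking submit shows how a sampling loop can hand off work without ever waiting on the senders.

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a hypothetical unit of work: one payload from one collector.
type Job struct {
	Collector string
	Payload   string
}

// RunPool starts n workers that drain jobs and call handle for each one.
// It returns once jobs is closed and all workers have finished.
func RunPool(n int, jobs <-chan Job, handle func(Job)) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				handle(j)
			}
		}()
	}
	wg.Wait()
}

// TrySubmit enqueues a job without blocking. It reports false when the
// queue is full, so the caller (for example a sampling loop) never stalls.
func TrySubmit(jobs chan<- Job, j Job) bool {
	select {
	case jobs <- j:
		return true
	default:
		return false
	}
}

func main() {
	jobs := make(chan Job, 16)
	var mu sync.Mutex
	handled := 0
	done := make(chan struct{})
	go func() {
		RunPool(4, jobs, func(Job) {
			mu.Lock()
			handled++
			mu.Unlock()
		})
		close(done)
	}()
	for i := 0; i < 10; i++ {
		TrySubmit(jobs, Job{Collector: "ash", Payload: fmt.Sprintf("sample-%d", i)})
	}
	close(jobs)
	<-done
	fmt.Println("handled:", handled)
}
```

The trade-off in this design is that a full queue rejects new work instead of delaying the producer; what to do with a rejected job (drop it, count it, or buffer it elsewhere) is a separate policy decision.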


Agent Lifecycle

┌─────────────────────────────────────────────────────────┐
│                      Agent Startup                      │
│                                                         │
│  1. Load config.yml                                     │
│  2. Check registration (agent_id + api_key present?)    │
│     ├─ No → Enter REGISTRATION mode → exit              │
│     └─ Yes → Enter NORMAL operation mode                │
│                                                         │
│  NORMAL mode:                                           │
│  3. Connect to PostgreSQL                               │
│  4. Start all collectors (goroutines)                   │
│  5. Start heartbeat sender                              │
│  6. Run until stopped                                   │
└─────────────────────────────────────────────────────────┘
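The registration check in step 2 of the diagram can be sketched as a small decision function. The `Config` struct, the `Mode` type, and `SelectMode` are hypothetical; only the `agent_id` and `api_key` field names come from the diagram.

```go
package main

import "fmt"

// Config holds the two registration fields named in the lifecycle diagram.
// The struct itself is hypothetical; the real config.yml has more settings.
type Config struct {
	AgentID string
	APIKey  string
}

// Mode is the operating mode chosen at startup.
type Mode int

const (
	ModeRegistration Mode = iota
	ModeNormal
)

// String returns the mode name as written in the diagram.
func (m Mode) String() string {
	if m == ModeNormal {
		return "NORMAL"
	}
	return "REGISTRATION"
}

// SelectMode returns ModeNormal only when both agent_id and api_key are
// present; otherwise the agent must register first.
func SelectMode(c Config) Mode {
	if c.AgentID != "" && c.APIKey != "" {
		return ModeNormal
	}
	return ModeRegistration
}

func main() {
	configs := []Config{
		{},
		{AgentID: "a-1", APIKey: "secret"},
	}
	for _, c := range configs {
		fmt.Printf("agent_id=%q -> %s\n", c.AgentID, SelectMode(c))
	}
}
```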

Agent vs. Platform

The agent is intentionally simple and keeps no persistent state. It does not store metrics on disk: all data is forwarded to the central platform. If the backend is temporarily unavailable, the agent buffers recent data in memory and retries.
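In-memory buffering of recent data could look like the bounded queue below. This is a sketch under stated assumptions: the `Buffer` type, its capacity limit, and the policy of dropping the oldest payload when full are all hypothetical, and the type is not safe for concurrent use without additional locking.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBackendDown is a hypothetical error used to simulate a failed send.
var ErrBackendDown = errors.New("backend down")

// Buffer is a bounded FIFO of pending payloads. When it is full, the
// oldest payload is dropped to make room (an assumed policy).
type Buffer struct {
	items [][]byte
	limit int
}

// NewBuffer creates a buffer holding at most capacity payloads.
func NewBuffer(capacity int) *Buffer {
	if capacity < 1 {
		capacity = 1
	}
	return &Buffer{limit: capacity}
}

// Push appends p and reports whether an older payload was dropped.
func (b *Buffer) Push(p []byte) (dropped bool) {
	if len(b.items) == b.limit {
		b.items = b.items[1:]
		dropped = true
	}
	b.items = append(b.items, p)
	return dropped
}

// Drain sends payloads oldest first and stops at the first failure,
// keeping unsent payloads for the next attempt. It returns the number sent.
func (b *Buffer) Drain(send func([]byte) error) int {
	sent := 0
	for len(b.items) > 0 {
		if err := send(b.items[0]); err != nil {
			break
		}
		b.items = b.items[1:]
		sent++
	}
	return sent
}

// Len returns the number of payloads waiting to be sent.
func (b *Buffer) Len() int { return len(b.items) }

func main() {
	buf := NewBuffer(3)
	for i := 0; i < 5; i++ {
		buf.Push([]byte(fmt.Sprintf("payload-%d", i)))
	}
	fmt.Println("pending while backend is down:", buf.Len())
	n := buf.Drain(func([]byte) error { return nil })
	fmt.Println("sent after recovery:", n)
}
```

A fixed capacity bounds memory use during a long outage, at the cost of losing the oldest data once the limit is reached.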

The agent does not perform any analysis, aggregation, or alerting; that is the responsibility of the backend.


Resource Usage

The agent is designed to be minimal:

| Resource | Usage |
| --- | --- |
| Memory | ~30–50 MB |
| CPU | < 0.5% average |
| Disk | ~50 MB (binary + logs) |
| Network | ~1–5 MB/hour per server (with gzip) |
