The risk score, explained.
Every server gets a single number, 0 to 100. Higher is worse. The score is a weighted sum of resource pressure, patch lag, reliability, and hardware age — not a black box. This page is the formula.
The formula
The risk score is a weighted average of four normalized sub-scores, each on a 0–100 scale, then clamped to [0, 100].
```
risk = 0.40 × resource_pressure   // CPU, memory, disk
     + 0.30 × patch_lag           // CVE-weighted
     + 0.20 × reliability         // uptime, last-seen, restarts
     + 0.10 × hardware_age        // age + EOL warranty
```
The score updates every minute as new telemetry arrives. A server's score is recomputed in full on each agent push — there's no rolling state, no smoothing, no hidden EMA.
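The whole computation is small enough to sketch directly. This is a minimal illustration of the weighted sum above, assuming the four sub-scores are already normalized to 0–100; the function and dictionary names are ours, not the platform's API:

```python
WEIGHTS = {
    "resource_pressure": 0.40,
    "patch_lag": 0.30,
    "reliability": 0.20,
    "hardware_age": 0.10,
}

def risk_score(subs: dict) -> float:
    """Weighted average of the four 0-100 sub-scores, clamped to [0, 100]."""
    total = sum(WEIGHTS[name] * subs[name] for name in WEIGHTS)
    return max(0.0, min(100.0, total))
```

Because there is no rolling state, this function is the entire model: each agent push supplies fresh sub-scores and the result replaces the previous one.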
What goes into each sub-score
Resource pressure (40%)
The maximum of three normalized utilization values: CPU (5-minute average), memory (current), and the most-pressured filesystem (current). We use the max — not the average — because a single saturated dimension is what causes outages.
- 0–60% utilization → 0–30 score (linear)
- 60–85% → 30–70 (steeper)
- 85–100% → 70–100 (steepest)
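The three bands above form a piecewise-linear curve. A sketch of that mapping and the max-of-three rule (names are illustrative):

```python
def normalize_utilization(pct: float) -> float:
    """Map raw utilization (0-100%) onto the three-segment curve:
    0-60% -> 0-30, 60-85% -> 30-70, 85-100% -> 70-100."""
    pct = max(0.0, min(100.0, pct))
    if pct <= 60:
        return pct * (30 / 60)              # gentle slope
    if pct <= 85:
        return 30 + (pct - 60) * (40 / 25)  # steeper
    return 70 + (pct - 85) * (30 / 15)      # steepest

def resource_pressure(cpu_5m: float, mem: float, worst_fs: float) -> float:
    # Max, not average: one saturated dimension is what causes outages.
    return normalize_utilization(max(cpu_5m, mem, worst_fs))
```

Note how the max rule plays out: a box at 50% CPU, 40% disk, and 90% memory scores as a 90%-pressured server, even though its average utilization is modest.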
Patch lag (30%)
Computed from pending package updates and their CVE severity. Each pending package contributes points based on the highest-severity CVE it patches:
- Critical CVE: 25 points per package
- High: 10
- Medium: 3
- Low or no CVE mapping: 1
Sub-score is clamped to [0, 100]. Note the saturation: four pending critical CVEs already produce a score of 100, and a server with fifteen criticals reads the same on this sub-score as one with four. That's intentional for alerting (any of those is "fix now"), but it means patch_lag isn't a quantity dial above the saturation point — once it's at 100, the dashboard surfaces the count of pending criticals separately so operators can rank.
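The point system and its saturation behavior, as a sketch (the input shape is our assumption: one highest-severity label per pending package):

```python
# Points per pending package, keyed by its highest-severity CVE.
CVE_POINTS = {"critical": 25, "high": 10, "medium": 3, "low": 1, None: 1}

def patch_lag(pending: list) -> float:
    """pending: highest-severity CVE label for each pending package,
    or None for packages with no CVE mapping."""
    raw = sum(CVE_POINTS.get(sev, 1) for sev in pending)
    return float(min(raw, 100))  # saturates: four criticals already hit 100
```

The clamp is why four and fifteen pending criticals read identically here, and why the dashboard carries the raw critical count alongside the sub-score.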
Where the CVE data comes from
Severity classification uses each distribution's security tracker as the primary source — Ubuntu's USN feed, Debian's DSA, RHEL's RHSA, Alpine's secdb, and Microsoft's MSRC for Windows Update — because they reflect the patch the package actually contains, not just the upstream CVE assignment. We fall back to the NVD feed for CVEs not yet in a distro tracker, and to the OSV database for cross-ecosystem coverage.
For air-gapped deployments, every feed can be mirrored: the platform can be configured to read distro tracker URLs from your internal mirror, and an offline NVD/OSV bundle can be loaded via the admin UI. No outbound CVE-feed traffic is required once mirrors are in place. Feed update cadence is configurable per source — defaults are hourly for distro trackers and daily for NVD/OSV.
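An air-gapped setup might look like the following. This is purely illustrative — the key names and paths are hypothetical, not the platform's actual configuration schema; only the defaults (hourly for distro trackers, daily for NVD/OSV) come from this page:

```yaml
# Hypothetical mirror configuration (illustrative key names)
cve_feeds:
  ubuntu_usn:
    url: https://mirror.internal.example/usn/database.json
    refresh: 1h          # default cadence for distro trackers
  nvd:
    bundle: /var/lib/platform/feeds/nvd-offline.tar.gz   # loaded via admin UI
    refresh: 24h         # default cadence for NVD/OSV
  osv:
    bundle: /var/lib/platform/feeds/osv-offline.tar.gz
    refresh: 24h
```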
Reliability (20%)
Combines three signals:
- Last-seen lag — how long since the agent last reported. >5 min late adds points; >1 hr is severe.
- Recent restarts — unplanned reboots in the last 7 days.
- Service flapping — monitored services that have changed state more than 3 times in 24 hours.
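A sketch of how these three signals might combine. The page does not publish the point values, so the constants below are illustrative placeholders, not the platform's real weighting — only the thresholds (5 min, 1 hr, 7 days, 3 state changes) come from the text:

```python
def reliability(last_seen_lag_min: float, restarts_7d: int,
                flapping_services: int) -> float:
    """Illustrative combination; point values are assumptions."""
    score = 0.0
    if last_seen_lag_min > 60:      # agent silent for over an hour: severe
        score += 60
    elif last_seen_lag_min > 5:     # more than 5 minutes late
        score += 20
    score += 15 * restarts_7d        # unplanned reboots in the last 7 days
    score += 10 * flapping_services  # services with >3 state changes in 24 h
    return min(score, 100.0)
```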
Hardware age (10%)
Server age in years (from the BIOS/SMBIOS install date, or first seen by the platform), plus an EOL warranty flag if applicable. Age 0–3 years contributes 0; age 8+ contributes 100. Vendor and warranty status are read from dmidecode on Linux and WMI on Windows.
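Only the endpoints of the age curve are documented; the sketch below assumes a linear ramp between 3 and 8 years, and the warranty bump is an illustrative placeholder:

```python
def hardware_age(age_years: float, out_of_warranty: bool = False) -> float:
    """0-3 years -> 0, 8+ years -> 100; linear ramp in between is assumed."""
    if age_years <= 3:
        base = 0.0
    elif age_years >= 8:
        base = 100.0
    else:
        base = (age_years - 3) / 5 * 100
    if out_of_warranty:
        base = min(base + 20, 100.0)  # illustrative EOL-warranty bump
    return base
```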
Weights, at a glance
| Sub-score | Weight | Why |
|---|---|---|
| Resource pressure | 40% | A saturated server is an immediate outage risk. Highest weight. |
| Patch lag | 30% | Unpatched critical CVEs are the most common cause of compromise on internet-facing hosts. |
| Reliability | 20% | Flapping services and unexplained restarts often precede incidents. |
| Hardware age | 10% | A signal, not a verdict. Old hardware is correlated with failures but rarely the proximate cause. |
How to read the score
- Green: server is operating within expected parameters. Patches current or low-severity only.
- Warning: investigate before it becomes critical. Often high resource use or pending high-severity patches.
- Critical: action needed. Saturation, missed agent check-ins, or unpatched critical CVEs.
The thresholds are configurable per tenant. The defaults above are calibrated against a sample of fleets ranging from 8 to ~400 servers.
Tuning and overrides
You can adjust weights and thresholds in platform settings if your environment has unusual characteristics — for example, a fleet of build agents that run hot by design, or a lab where old hardware is the point.
- Weight overrides: change the four percentages (must sum to 100).
- Per-host suppression: exclude specific hosts from contributing to fleet-wide alerting (still scored individually).
- CVE allow-list: mark specific CVEs as accepted-risk; they don't add patch-lag points.
- Band thresholds: shift the warning and critical cutoffs.
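The sum-to-100 constraint on weight overrides is easy to enforce client-side before submitting. A sketch of that validation (the function is ours, not a platform API):

```python
def validate_weights(w: dict) -> None:
    """Reject overrides that miss a sub-score or don't sum to 100."""
    expected = {"resource_pressure", "patch_lag", "reliability", "hardware_age"}
    if set(w) != expected:
        raise ValueError(f"expected exactly the keys {sorted(expected)}")
    if sum(w.values()) != 100:
        raise ValueError(f"weights sum to {sum(w.values())}, must be 100")
```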
Tuning is logged so you can answer "why is this server green?" in an audit.
Why this design
A single score is reductive on purpose. Operators don't need a chart for every metric on every server — they need to know which server to look at next. The score answers that. The dashboard then surfaces the dominant sub-score so you know whether you're looking at a memory issue or a missed patch.
If you'd rather not collapse signals into one number, every input is also exposed individually — risk score is opt-in, not the only view.