In the past few weeks, I’ve been using generative AI to explore general and special relativity, quantum mechanics and its applications across physics, chemistry, and philosophy, and how these theories intersect and diverge. I took college courses in these subjects, but I never understood them like I do now. Hours of simulated conversation with an interlocutor who can meet me where I am, answer follow-up questions without judgment, go down rabbit holes without annoyance, and generate visualizations on demand have done what years of traditional instruction could not.
What I’ve been learning is dynamical systems theory: the mathematics of feedback, stability, and thresholds. It maps directly onto cybersecurity.
This is a paradigm shift, and most organizations haven’t absorbed what it means. The assumptions underlying “best practice” were formed in a world that no longer exists. The cost of exploration has collapsed. The penalty for not exploring is growing. What follows is my attempt to bring that vocabulary across. To describe how security operations actually works, and how to measure whether it’s working.
Consider the security analyst three hours into her shift, triaging the same categories of alerts she triaged yesterday. Somewhere in the queue, amid the noise, sits a finding that matters. She might catch it. She might not. It depends on many factors: how many alerts arrived before it, how much sleep she got, her attention, her fatigue, the quality of her telemetry. All of these are variables in a system whose behavior emerges from their interaction. None of them alone determines the outcome. The dynamics do.
Her queue has been growing for three weeks. Recovery time after each incident is longer than last month. Something is wrong, but the vocabulary she’s been given (controls, compliance, risk ratings) doesn’t capture it. She knows the system is drifting. She just can’t name where.
Dynamical systems theory provides the name.
The Problem: State-Based Thinking
Most security models pretend posture is a state. A “control” is implemented or not, passing or failing. A SOC 2 Type II report captures a moment. Between that moment and the next audit, configurations drift, services deploy, staff turns over, and the snapshot becomes a historical document about a system that no longer exists.
Within this paradigm, our analyst plays whack-a-mole. Scan, find, patch, repeat. Each vulnerability is treated as an isolated event rather than a symptom of system behavior. She remediates the finding but never addresses why vulnerabilities accumulate faster than she can remediate them. The perimeter she’s defending dissolved years ago. Workloads span providers. Data flows through APIs no one fully controls. Identity determines access, not network topology. Meanwhile the actual dynamics go unexamined.
These are symptoms of the same underlying problem: treating security as a fixed property rather than an emergent behavior.
The Reframe: Your Environment Is a Dynamical System
The concepts physicists use to study systems (attractors, feedback loops, thresholds, coupling, bifurcation) translate directly into how security operations actually work. Not as metaphors. They describe the structure of what security practitioners navigate every day. The goal is to make that structure explicit. To work with it deliberately rather than by feel.
Thresholds are where influence concentrates. In dynamical systems, a threshold is the boundary between basins of attraction. On one side of it, the system recovers from perturbation and returns to a secure equilibrium. On the other, it drifts toward compromise. The threshold is where the system’s fate gets determined, and security teams live there whether they recognize it or not.
Figure 1: Two attractor basins in a projection of security state space. The x-axis captures aggregate exposure accumulation: how fast vulnerabilities, configuration drift, and attack surface grow. The y-axis captures corrective feedback strength: how effectively detection, triage, and remediation push back. Every trajectory spirals toward one attractor or the other. The threshold boundary between them is where the system’s fate is determined.
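The basin picture can be made concrete with a toy one-dimensional model. This is a sketch of the idea, not the equations behind the figure: a bistable system with a “secure” attractor at -1, a “compromised” attractor at +1, and the threshold between their basins at 0. The dynamics, time step, and labels are all illustrative assumptions.

```python
import numpy as np

def simulate(x0, steps=2000, dt=0.01):
    """Euler-integrate the toy bistable system dx/dt = x - x^3.
    Attractors sit at x = -1 ("secure") and x = +1 ("compromised");
    the threshold between their basins is at x = 0."""
    x = x0
    for _ in range(steps):
        x += dt * (x - x**3)
    return x

# Two starting states on opposite sides of the threshold, arbitrarily
# close together, end up at different attractors.
secure = simulate(-0.05)        # converges to -1
compromised = simulate(+0.05)   # converges to +1
```

The point of the toy model is the sensitivity near the boundary: two trajectories that begin almost indistinguishably close diverge to opposite fates, which is exactly why threshold proximity, not current position, is the thing worth measuring.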
Multiple thresholds coexist. Alert volume has one: above it, triage quality collapses. Staffing has one: below it, response capacity can’t keep pace. Vulnerability accumulation has one: beyond it, remediation debt becomes unserviceable. Each operates semi-independently. Each can trigger cascades when crossed.
Critical Slowing Down: The Early Warning
Systems approaching a threshold produce warning signatures before they cross it.
Recovery time increases. Independent metrics start correlating. Variance grows. Queue depths trend upward. Physicists call this critical slowing down. It tells you where the system is heading, not just where it sits.
Our analyst’s queue has been growing for three weeks. Her recovery time after each incident is longer than it was last month. These are the signatures. A system that passes every audit but shows critical slowing down is closer to compromise than one with open findings and fast recovery.
Monitor for threshold proximity, not just state.
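These signatures can be computed from any metric you already log. A minimal sketch in Python, where the window size and the choice of input series (daily queue depth, MTTR, alert volume) are illustrative assumptions:

```python
import numpy as np

def early_warning_signals(series, window=10):
    """Rolling variance and lag-1 autocorrelation over a metric time
    series. Rising trends in either are the classic signatures of
    critical slowing down: the system is losing its ability to damp
    perturbations."""
    series = np.asarray(series, dtype=float)
    variances, autocorrs = [], []
    for i in range(len(series) - window + 1):
        w = series[i:i + window]
        variances.append(w.var())
        # Lag-1 autocorrelation: how strongly today echoes yesterday.
        autocorrs.append(np.corrcoef(w[:-1], w[1:])[0, 1])
    return np.array(variances), np.array(autocorrs)
```

Feed it thirty days of queue depth and plot the two outputs: if both curves trend upward while your point-in-time dashboards stay green, the system is telling you where it is heading.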
Example: A Guitarist Inside the Loop
A guitarist sustaining a note at the edge of feedback lives at the same kind of boundary. Too much pressure kills the note. Too little and it screeches into noise. The skill is maintaining dynamic equilibrium through continuous modulation. And the guitarist is inside the loop: vibration through pickup through amplifier through speaker through air through string through finger. The finger doesn’t observe the feedback. It participates in it.
Security operators participate in their systems the same way. Their attention, fatigue, judgment, and response speed are state variables. They are not outside the system looking in. They are part of what determines where it goes.
This is the difference between a thermostat and a living being. The thermostat regulates: deviation triggers correction, mechanically. The guitarist modulates: staying at the edge, feeling the system, adjusting continuously. You cannot “control” a complex system from outside. You can only tune it from within.
The Eigenvalue: One Number Governs Stability
Behind the observable signatures of critical slowing down is a single mathematical quantity: the dominant eigenvalue.
The eigenvalue governs how fast the system recovers from perturbation. When it’s strongly negative, the system snaps back quickly. Stable, with margin. As it approaches zero, recovery slows, variance accumulates, and correlations spike. If it crosses zero and goes positive, the system diverges: perturbations amplify rather than decay.
This is why all the critical slowing down signatures co-occur: they’re all consequences of the eigenvalue approaching zero.
The operational translation: MTTR is a direct proxy for the eigenvalue. Recovery time τ ≈ 1/|λ|. When your Mean Time to Remediate rises, |λ| is shrinking and λ is drifting toward zero. You’re approaching a threshold. When MTTR is stable and low, you have margin.
This transforms abstract math into something actionable. You’re already measuring MTTR. Now you know what it means.
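As a sketch of that translation, assuming recovery is roughly exponential so that τ ≈ 1/|λ| and therefore λ ≈ -1/MTTR (the sample values below are made up):

```python
import numpy as np

def eigenvalue_proxy(mttr_hours):
    """λ ≈ -1/τ, treating MTTR (in hours) as the recovery time τ.
    More negative means more stable; approaching zero means the
    system is losing its margin."""
    return -1.0 / np.asarray(mttr_hours, dtype=float)

def stability_margin_trend(mttr_series):
    """Least-squares slope of the λ proxy over time. A positive
    slope means λ is climbing toward zero: the system is moving
    toward a threshold even if every individual MTTR value still
    looks acceptable."""
    lam = eigenvalue_proxy(mttr_series)
    t = np.arange(len(lam))
    slope, _ = np.polyfit(t, lam, 1)
    return slope

# A steadily rising MTTR produces a positive slope: shrinking margin.
drifting = stability_margin_trend([4, 5, 6, 8, 10, 14])
stable = stability_margin_trend([6, 6, 6, 6, 6, 6])
```

The trend is the signal, not the level: a SOC holding MTTR at six hours has margin, while one drifting from four to fourteen is approaching a threshold even while each monthly number might pass review.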
FedRAMP 20x: The Infrastructure for Eigenvalue Governance
FedRAMP 20x, read through a dynamical systems lens, is an attempt to make system stability observable and governable.
The framework’s core innovations map directly to the concepts above:
| FedRAMP 20x Concept | Dynamical Systems Translation |
|---|---|
| Deterministic telemetry | State observation — making the system’s trajectory visible |
| Persistent validation | Continuous measurement of dynamics, not periodic snapshots |
| Capability-based assessment | Evaluating attractor health, not just control presence |
| Automated evidence collection | Closing the feedback loop between state and response |
| Risk-based prioritization | Focusing on threshold proximity, not uniform control coverage |
The demand for “deterministic telemetry” is the key. FedRAMP 20x requires that security claims be backed by data, not narrative. This is what’s needed to estimate eigenvalues. The numbers that tell you whether your system is stable, approaching a threshold, or already diverging.
From KSIs to Eigenvalues
FedRAMP environments already generate the data needed for eigenvalue estimation. The Key Security Indicators mandated by continuous monitoring programs are, in effect, time series of state variables:
| FedRAMP 20x KSI | State Variable | What It Measures |
|---|---|---|
| Monitoring, Logging, and Auditing (KSI-MLA-RVL) | Q | Alert queue depth and backlog accumulation |
| Service Configuration (KSI-SVC-ACM) | C | Drift from secure baseline |
| Change Management (KSI-CMT-VTD) | V | Perturbation rate and accumulated exposure from changes |
| Incident Response (KSI-INR-AAR) | τ | Recovery speed (≈ 1/|λ|) |
| Recovery Planning (KSI-RPL-TRC) | R | Capacity to restore equilibrium after disruption |
| Supply Chain Risk (KSI-SCR-MON) | S | Coupling strength to external perturbation sources |
| Identity and Access Management (KSI-IAM-ELP) | A | Access surface and zero trust enforcement |
| Policy and Inventory (KSI-PIY-GIV) | P | Completeness of state observation across resources |
| Cloud Native Architecture (KSI-CNA-MAT) | D | Architectural resilience and decoupling |
| Cybersecurity Education (KSI-CED-RST) | E | Human operator readiness within the feedback loop |
| Authorization by FedRAMP (KSI-AFR-PVA) | Σ | Aggregate system compliance with governance thresholds |
The data exists. The interpretation is missing. Three methods to bridge the gap:
- Track MTTR trends, not just MTTR values. A rising trend signals threshold approach even if the absolute value is still “acceptable.”
- Watch for correlation spikes. When previously independent KSIs start moving together (vulnerability count rises as patch compliance falls as queue depth grows), the system is entering the threshold region.
- Measure recovery after perturbations. When a new CVE drops or a major deployment occurs, how fast do your metrics return to baseline? Slower recovery = closer to threshold.
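The second method, watching for correlation spikes, can be sketched directly. This assumes you can export your KSI metrics as aligned time series; the window size is an illustrative choice:

```python
import numpy as np

def mean_pairwise_correlation(metrics, window=12):
    """metrics: 2-D array with one column per KSI time series, one
    row per observation. Returns the rolling mean of off-diagonal
    correlations. A rising value means previously independent
    metrics are starting to move together: the threshold-region
    signature."""
    m = np.asarray(metrics, dtype=float)
    n_t, n_k = m.shape
    out = []
    for i in range(n_t - window + 1):
        c = np.corrcoef(m[i:i + window].T)
        off = c[~np.eye(n_k, dtype=bool)]  # drop the self-correlations
        out.append(off.mean())
    return np.array(out)
```

Run it over vulnerability count, patch compliance, and queue depth together: each series alone may look unremarkable, but the moment they start moving in lockstep is the moment the system has entered the threshold region.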
Course Correction: Tuning the Eigenvalue
If the eigenvalue tells you where you are, tuning it is how you steer. The goal is to shift the underlying dynamics. Push λ further from zero, widen your stability margin.
Interventions that shift eigenvalues toward stability:
| Intervention | How It Shifts λ | Observable Effect |
|---|---|---|
| Add remediation capacity (staffing, automation) | Strengthens corrective feedback | MTTR drops, queue stabilizes |
| Reduce deployment velocity temporarily | Decreases perturbation rate | Variance drops, recovery improves |
| Improve detection fidelity (reduce false positives) | Increases signal-to-noise ratio | Triage throughput increases |
| Decouple vulnerable subsystems | Reduces cascade risk | Cross-correlations weaken |
| Implement security gates earlier in pipeline | Prevents perturbations from entering production | Introduction rate falls |
Interventions that shift eigenvalues toward instability:
| Intervention | How It Shifts λ | Observable Effect |
|---|---|---|
| Increase deployment velocity without capacity increase | Perturbation rate exceeds recovery rate | Queue grows, MTTR rises |
| Reduce security staffing | Weakens corrective feedback | Recovery slows across all metrics |
| Defer remediation to hit feature deadlines | Accumulates vulnerability debt | V grows, correlations increase |
| Add new integrations without security review | Increases coupling and attack surface | Variance spikes after changes |
The art is reading the eigenvalue proxies and adjusting before you cross a threshold. A system showing critical slowing down (rising MTTR, growing correlations, increasing variance) needs intervention now. Not after the next audit.
The Regulatory Opportunity
FedRAMP 20x opened the door. The framework’s emphasis on continuous evidence and capability demonstration creates the data infrastructure for eigenvalue-based governance. What’s missing is the interpretive layer. The translation from raw KSIs to stability assessment.
This is an opportunity for:
- Cloud service providers to differentiate on demonstrated stability, not just compliance checkboxes
- Authorizing officials to assess risk dynamically rather than annually
- Third-party assessors to evaluate trajectory, not just current state
- The FedRAMP PMO to evolve the framework toward explicit stability metrics
The opportunity extends beyond federal compliance:
Financial services — from stress testing to stability assessment. Rating agencies like Moody’s and S&P already factor cyber risk into credit ratings, but crudely. Eigenvalue metrics make “cyber risk” quantifiable. Private equity and venture capital could conduct due diligence not on whether a company has SOC 2, but on whether it’s stable or drifting toward a threshold.
National security — CMMC is moving toward continuous assessment, but mission assurance is fundamentally about stability under adversarial conditions. The Defense Industrial Base includes thousands of suppliers whose security postures couple to national security outcomes. Supply chain resilience is about understanding which supplier failure cascades furthest. That’s an eigenvalue question.
Insurance — cyber insurers struggle to price risk because they lack good models. Eigenvalue analysis offers granularity that wasn’t possible before: pricing policies based on stability trends rather than checkbox audits, distinguishing organizations approaching thresholds from those with genuine margin.
The same framework applies wherever complex systems meet risk: healthcare (patient safety as a dynamical systems problem), critical infrastructure (power grids, pipelines), manufacturing (production system resilience). The math doesn’t care about the domain. Thresholds are thresholds.
The math exists. The data increasingly exists. The gap is operational adoption. Organizations that close this gap will know they’re approaching a threshold before they cross it, and will have the language to communicate that risk to stakeholders who understand trajectories, not just snapshots.
What to Do Tomorrow Morning
Translating this framework into operational changes does not require wholesale transformation. A security leader (CISO, SOC manager, senior analyst) can start with visibility into the dynamics already present. Pick one item from each category and iterate from there:
- Map one feedback loop from trigger through action to outcome. Note whether it closes (output informs input) or stays open (output goes nowhere). Open loops cannot self-correct.
- Identify one threshold parameter in your environment: staffing level, alert volume, deployment rate. Watch for the signatures of critical slowing down: rising recovery times, increasing correlations, growing queues.
- Measure one eigenvalue proxy: track MTTR trend over 30 days, or measure how long your metrics take to stabilize after a significant CVE or deployment.
- Calibrate one security gate based on observed data: what does it catch, what escapes, what gets bypassed? Adjust based on what the evidence shows, not what policy prescribes.
- Model one coupling: if vendor X is compromised, what is the blast radius in your system? Prioritize monitoring on strongly coupled subsystems.
The full picture emerges through iteration, not a single assessment.
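The coupling-mapping step above reduces to a reachability question over a dependency graph. A minimal sketch, where the topology and component names are entirely hypothetical:

```python
from collections import deque

def blast_radius(coupling, start):
    """coupling: dict mapping each component to the components it can
    directly affect. Returns every component reachable from a
    compromised start node via breadth-first search: the blast
    radius."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nbr in coupling.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen - {start}

# Hypothetical topology: vendor_x feeds CI, which deploys to prod
# and staging; prod touches customer data; logging is isolated.
deps = {
    "vendor_x": ["ci"],
    "ci": ["prod", "staging"],
    "staging": [],
    "prod": ["customer_data"],
    "logging": [],
}
```

Even this toy version makes the prioritization concrete: `vendor_x` reaches four components while `logging` reaches none, so monitoring effort should concentrate on the former.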
Our analyst is still at her desk. The queue is still growing.
But now she has language for what she already knows. She’s approaching a threshold. The rising MTTR, the correlations between her metrics, the growing queue. These are signals. The system is announcing where it’s heading.
The loops need to close. The eigenvalue needs to shift. And she’s not outside the system looking in. She’s the finger on the string, the threshold where the system’s fate gets determined.
Tomorrow morning, she’ll start mapping the feedback loops.