Most resilience conversations start with prevention: how do you stop the outage, stop the breach, stop the misconfiguration from going out?
And while those questions matter, we think the most important one is being left out: what do you do after something has already gone wrong?
At Cisco Live Amsterdam in February, Patrick Quirk, President of Opengear, joined leaders from BlueCat and IP Fabric on stage to answer direct questions about agility, security, and what happens when networks fail. The conversation landed in a more useful place than most panels do.
The question worth building on
The panel opened with the tension between security and agility, and it is a real tension. Security teams slow things down to manage risk. Operations teams need to move fast. That conflict is not manufactured.
The conversation was most valuable when it shifted to what comes next: not just how you prevent failure or detect it, but what you actually do when it happens. Because it will.
The IP Fabric perspective was about evidence. You cannot automate safely against network state you cannot verify. The BlueCat perspective was about trust. Security controls that drift from operational reality stop being trusted, and that is when the manual workarounds and approval queues pile up.
Patrick’s perspective was about control. Specifically, about what kind of control survives failure.
His argument, stated plainly: your management plane needs to be orthogonal to your data plane. If your ability to access and act on your network runs through the same infrastructure that just broke, you are not in control. You have awareness. Those are not the same thing.
Why this matters more now
Agentic AI is not a future state. Organizations are deploying it in network management today. Patrick said as much in Amsterdam, and the room nodded.
The difference between a human operator and an agentic AI is not capability. In many ways the AI is more capable. The difference is that an agent will never stop. A human operator hits a wall: they cannot reach the device, they cannot get a console session, the VPN is down. They stop and escalate. An agent keeps looking for a path. It tries to fulfill its objective. It will find vectors a human would not.
That is mostly good. Until it is not.
An AI agent that has been instructed to remediate a routing problem and cannot reach its target through the primary network will find another way in, if one exists. An AI agent that has been given security enforcement responsibilities will enforce them, including, potentially, locking out the recovery path for the human trying to fix what the agent flagged.
This is not a hypothetical. It is a predictable consequence of deploying autonomous systems on networks that were not designed with the assumption of failure.
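One concrete guardrail follows from that: make the recovery path structurally off-limits to automation. Below is a minimal sketch in Python with hypothetical addressing (the reserved prefix and the agent_may_touch check are illustrative, not a product feature): every agent-initiated action is screened against the address space reserved for the out-of-band control plane before it executes.

    # A minimal guardrail sketch (hypothetical addressing): deny any
    # agent-initiated action against the independent recovery path.
    import ipaddress

    # Assumption: the OOB control plane lives in its own reserved prefix.
    PROTECTED_NETS = [ipaddress.ip_network("10.255.0.0/16")]

    def agent_may_touch(target_ip: str) -> bool:
        """Allow agent actions only against targets outside the OOB prefix."""
        addr = ipaddress.ip_address(target_ip)
        return not any(addr in net for net in PROTECTED_NETS)

    assert agent_may_touch("10.20.30.40")      # production device: allowed
    assert not agent_may_touch("10.255.1.10")  # OOB console server: blocked

The enforcement point matters less than the invariant: no autonomous objective, however well-intentioned, should be able to close the path a human will need in order to recover.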
Design for failure is not a defensive posture
This is the part of Patrick’s closing remarks worth repeating.
He did not say “plan for failure” as a risk management note. He said design for it. From day one. As a foundational assumption.
The distinction matters. Planning for failure means you have a runbook. Designing for failure means the network itself (the architecture, the management stack, the access paths) was built on the assumption that something will go wrong, so the question becomes whether you retain control when it does.
Out-of-band management is not a new idea. Serial consoles have been around for decades. What has changed is the threat surface and the operational model. When your primary network fails today, it may not be a hardware fault. It may be a ransomware lateral move. It may be a misconfiguration pushed by an automation script. It may be a security policy that did exactly what it was supposed to do and locked out the wrong thing. In any of these cases, your ability to recover depends on whether you have access to the devices through a path that is independent of what just went wrong.
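What “a path that is independent” means can be tested directly rather than assumed. Here is a minimal sketch with hypothetical hostnames: the in-band address rides the production network, the console server sits on a cellular or otherwise separate circuit, and the check treats the two as unrelated questions.

    # A dual-path reachability sketch (hypothetical hostnames): is the device
    # reachable in-band, and is the OOB console server reachable over a
    # circuit that shares nothing with production?
    import socket

    def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
        """True if a TCP connection to host:port succeeds within the timeout."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    primary_ok = tcp_reachable("rtr01.corp.example.net", 22)  # in-band path
    oob_ok = tcp_reachable("oob01.cell.example.net", 22)      # cellular OOB path

    if not primary_ok and oob_ok:
        print("Primary is down; control retained over the independent path.")
    elif not oob_ok:
        print("WARNING: no OOB reachability. You have visibility at best.")

A successful TCP handshake is a floor, not proof of independence; the two paths only count as orthogonal if they share no circuits, no power, and no upstream dependencies.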
Tier IV in data center design means your management plane is fully independent of your production plane. The same logic applies everywhere networks run today: distributed enterprise, edge, maritime, financial services, healthcare. The network gets more distributed. The stakes get higher. The need for an independent control plane does not go away.
What the panel got right about the partnership
A session like this works when three vendors with different capabilities each add something the others don’t.
IP Fabric’s contribution is ground-truth visibility: what is the network actually doing, not what the documentation says it should be doing. BlueCat’s contribution is intelligent DNS and network services management, reducing the drift between security policy and operational reality. Opengear’s contribution is the access layer that remains operational when everything else has failed.
Visibility. Policy. Control. Each one is necessary. None of them is sufficient alone.
Patrick’s framing for this was direct: “Resilience is an outcome, not a product.” The three companies on that panel do not sell resilience. They sell the building blocks that, when combined with thoughtful architecture, produce it.
The one thing to take away
The moderator asked each panelist for a single takeaway at the close of the session.
Patrick’s was this: assume your network is going to fail, and build everything (your controls, your operations stack, your access architecture) from that assumption. If you start there, you are building resilience in from day one rather than bolting it on after the first incident teaches you the hard lesson.
The corollary, which he stated earlier in the session, is what this post is titled after.
Visibility without control is just awareness.
You can see what is happening. You can know what went wrong. You can have dashboards and telemetry and a fully correlated incident timeline. None of that helps you if you cannot act on what you are seeing. And if your management path runs through the network that just failed, you cannot act.
The practical question
When was the last time your team validated that out-of-band access actually works, not in a scheduled maintenance window where someone manually tested the console server, but in a real scenario where primary access was unavailable and OOB was the only path?
Most organizations that have OOB deployed have not tested it under realistic failure conditions. Most organizations that have not deployed it are waiting for the first serious incident to make the case internally.
The design-for-failure principle says you do not want to be making that case during an incident. You want to have already made it.
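Making that case ahead of time is easier when OOB validation is routine rather than ceremonial. A hedged sketch, assuming a hypothetical inventory: walk every console server on a schedule and surface failures the day they appear, not the day you need the path.

    # A periodic OOB validation sketch (hypothetical inventory). TCP-level
    # reachability is a floor, not a full drill; a real exercise should also
    # establish a console session with primary access deliberately disabled.
    import socket
    import time

    CONSOLE_SERVERS = [  # hypothetical inventory
        ("oob01.site-ams.example.net", 22),
        ("oob02.site-fra.example.net", 22),
    ]

    def check_once() -> list[str]:
        """Return the console servers that failed the reachability check."""
        failures = []
        for host, port in CONSOLE_SERVERS:
            try:
                with socket.create_connection((host, port), timeout=5):
                    pass
            except OSError as exc:
                failures.append(f"{host}:{port} unreachable ({exc})")
        return failures

    if __name__ == "__main__":
        while True:
            for failure in check_once():
                print(f"ALERT: {failure}")  # wire into real alerting
            time.sleep(3600)               # hourly; tune to your environment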
Opengear’s Smart Out of Band® management platform and the Lighthouse® centralized management platform exist to make that independent control plane deployable at scale, across distributed environments, with the automation and visibility to know it is working before you need it.
That is the part that does not fit neatly into a panel. The panel gives you the argument. The implementation is where it becomes real.