OT Threats, Penetration Testing, and Resilience
Achieving OT security isn’t just getting a tick in a box. “Good” pen tests in a plant aren’t as simple as an IT pen test but with a new badge.
21/01/2026 Podcast
Industrial control environments are full of systems that were built to run reliably, safely, and for a long time. That’s exactly why taking a traditional IT penetration testing approach and simply “aiming it at OT” often misses the point.
In part two of Felix’s conversation with Emily, a Principal Industrial Cyber Security Consultant, the discussion moves beyond compliance and into real-world threat behaviour: how attackers actually get in, how long they stay, and why resilience matters more than neat dashboards.
On paper, many OT networks look beautifully segmented: VLANs, firewalls, “levels,” and carefully drawn diagrams. In practice, Emily points out that those diagrams often hide the messy reality—firewall bypasses, overly permissive rules, and operational workarounds that accumulate over time.
Even when the firewall configuration is decent, the real surprises are frequently on-site: unmanaged outbound communications, “temporary” remote access, or vendor connectivity (often required by service contracts). These aren’t always malicious decisions—they’re usually made to keep production moving. But if you’re not monitoring, you may not know what routes into the environment exist today versus what was intended months (or years) ago.
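To make that concrete, here is a minimal sketch of the kind of rule review described above: take a firewall rule export and flag anything overly permissive. The CSV layout and field names are assumptions for illustration; real exports vary by vendor.

```python
# Minimal sketch: flag overly permissive "permit" rules in a firewall export.
# Assumed CSV columns: name, src, dst, port, action (vendor formats differ).
import csv

ANY = {"any", "0.0.0.0/0", "*"}

def flag_permissive(path):
    findings = []
    with open(path, newline="") as f:
        for rule in csv.DictReader(f):
            if rule["action"].lower() != "permit":
                continue
            loose = [k for k in ("src", "dst", "port") if rule[k].lower() in ANY]
            if loose:
                findings.append((rule["name"], loose))
    return findings

for name, loose in flag_permissive("ot_firewall_rules.csv"):
    print(f"{name}: wide open on {', '.join(loose)}; justify it or tighten it")
```

Even a crude pass like this tends to surface the rules nobody remembers approving.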
A key shift Emily advocates is threat-based OT security: instead of pentesting “because the standard says so,” you scope testing around real attack paths that have been documented in the wild.
They reference notable examples and frameworks that help define plausible scenarios, such as attacks on safety systems and OT-focused tooling designed to pivot across PLC networks. The practical point: if your facility has safety systems, engineering workstations, SCADA servers, vendor remote access, or interconnected PLC zones, you can map those to threat reports and ask: Could this happen here? If so, what would the path look like?
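As a sketch of that mapping exercise, the toy Python below pairs common OT asset types with the documented campaigns discussed in this episode. The mapping is a deliberate simplification for illustration, not a finished threat model.

```python
# Toy starting point for threat-based pen test scoping: map the assets a
# site actually has onto publicly documented OT attack scenarios.
THREAT_SCENARIOS = {
    "safety_system": ["TRITON: IT -> engineering workstation -> safety controller firmware"],
    "engineering_workstation": ["TRITON: pivot point for safety-system tampering"],
    "scada_server": ["CrashOverride/Industroyer: control of grid equipment via SCADA"],
    "plc_zone": ["PIPEDREAM: framework designed to pivot across PLC networks"],
    "vendor_remote_access": ["Ransomware and initial access via third-party connectivity"],
}

site_inventory = ["safety_system", "scada_server", "vendor_remote_access"]

for asset in site_inventory:
    for scenario in THREAT_SCENARIOS.get(asset, []):
        print(f"{asset}: could this happen here? ({scenario})")
```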
The episode highlights a crucial contrast: some ransomware groups move incredibly quickly (hours, sometimes less), while sophisticated OT-targeting actors may maintain access for months before acting. In that time, they may come to understand your environment better than you do.
Emily describes that the strongest OT penetration tests aren’t “parachute-in” engagements. They work best when there’s trust, time, and strong operator involvement, plus careful risk assessment to avoid impacting production. And even in very mature environments with strong controls, a skilled team may still find a route to meaningful control-system impact, which is why technical controls alone aren’t the whole story.
Felix (00:09)
Hello and welcome back to the You Gotta Hack That podcast. This is part two of my conversation with Emily, a principal industrial cybersecurity consultant. In the last episode, we were chatting about the complexities of the 62443 standard, but this time we’re moving from compliance into combat, I suppose. We’re talking about real-world threats, the scary reality of dwell time, and what resilience actually looks like. Emily.
One thing that comes up for me a lot is that people try to apply traditional IT pen testing to OT networks, but I don’t think that’s worthwhile as-is. Is that something you feel is accurate?
Emily (00:46)
Time and time again, what you’d see is that, on paper, if you look at the network diagram, you go: wow, you have network segmentation, all these VLANs and firewalls, and the, you know, the Purdue model of levels. Wow.
But then you’d actually look into the network, and you’d find there were all kinds of bypasses around the firewall. You look at the actual rules in the firewall and you go, what’s that? No, that’s just SMB to the file store in IT, okay?
Felix (01:16)
I mean, it’s fair to say there has never been a single vulnerability in the SMB stack. No, never.
Emily (01:23)
Which has not dramatically impacted significant parts of the UK. Not.
So that’s the point we’ve been making. I think some people do get a bit focused on this kind of, yeah, we’ve got these networks, but then it’s fit-and-forget. That’s why you need something that’s looking at your network, a proper pen test that’s actually scoped to test just how secure your firewall rules are and just how locked down they are. Because chances are, unless you’re really doing a good job of monitoring them as well, there’ll be some other stuff going on.
And again, you know, even if the firewall is okay, I always find there’s room for improvement. Some are worse than others, but you do sometimes find some pretty open rule sets, shall we say. But even if they were pretty good, you’ll go to the plant and you’ll find that 4G modem, or that outbound comms. Again, it was done for all the right reasons. I understand, you’re there to make money, you’re there to do operations.
A lot of vendors these days, particularly for things like turbines and complex machinery, it’s in the contract, the service contract, that they must have remote connectivity. But again, there’s a variety of reasons why this other stuff exists, and if you’re not monitoring, how do you know?
So your company could be exposed to all kinds of bad days that you just have no visibility of. So if you’re not doing a test, if you’re not having someone working on-site looking at it, if you’re not monitoring the network with some network visibility product, then, yeah.
Felix (03:05)
I don’t know whether you’ll know this off the top of your head, but are there any bits within 62443 that indicate how, to what level of quality, and in what style you should apply pen testing to an OT environment? Are there any bits that stick out to you where, if you were trying to do this to a standard, this is what the standard says, even in really broad terms?
Because there are so many standards out there that have a “must do cybersecurity and include pen testing”, and that is the entirety of the advice. I wonder whether in 62443 there’s actually anything about that. And the reason I ask is: A, because I’ll be honest, I don’t know off the top of my head, but B, it’s the question that comes up so often: well, what is a pen test in OT if it’s not an IT pen test in an OT environment? And can you even do one?
So many organisations won’t allow you to touch a live environment for pretty good reason. What does that imply? What do you do to compensate for that? How do you know you’re doing something that is not just the opinion of Felix or the opinion of Emily? How do you know you’re doing something that is considered rational and thought through?
Emily (04:24)
I don’t recall any specific areas that cover that, but again, I can’t pretend I know every single element of the standard, [unclear]. They keep adding new stuff, and I know there’s one about evaluations, so maybe they will start to move towards that.
And again, the question is, is it the right place? Because it’s consensus-based, changes take a very long time, because you’ve got a whole load of different committees that have to agree.
So in terms of that, you could use stuff like zones and conduits. If you’ve drawn that out, and you’ve got a network diagram as you should, with your zones and conduits in it, then actually that’d be a great place to start to go and scope out a pen test, right? And go, well look, we’ve got this kind of connectivity to this zone, this kind of connectivity down here. Maybe that safety system, interestingly, there is connectivity up here and through two different routes, maybe we can pivot to that.
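To illustrate that scoping idea, here is a minimal sketch that treats a zones-and-conduits drawing as a graph and enumerates candidate routes to the safety zone. The zone names and conduit list are invented; note that it finds exactly the “two different routes” situation Emily describes.

```python
# Treat zones as nodes and conduits as edges, then enumerate attack routes
# from the enterprise side to a target zone. All names are invented.
from collections import deque

conduits = {
    "enterprise_it": ["dmz"],
    "dmz": ["scada"],
    "scada": ["control", "safety"],
    "control": ["safety"],
    "safety": [],
}

def routes(graph, start, target):
    paths, queue = [], deque([[start]])
    while queue:
        path = queue.popleft()
        for nxt in graph.get(path[-1], []):
            if nxt in path:
                continue  # no cycles
            if nxt == target:
                paths.append(path + [nxt])
            else:
                queue.append(path + [nxt])
    return paths

for p in routes(conduits, "enterprise_it", "safety"):
    print(" -> ".join(p))
# Prints two distinct routes to the safety zone: both belong in the pen test scope.
```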
But yeah, I think the thing here is to start with threats that we’ve seen in real life. So TRITON, the attack on the Triconex safety system: the route went through the IT into the engineering workstations, to then manipulate the firmware. That’s been documented; people have actually explained how it happened. The attacks on the energy sector in Ukraine back in 2015 and 2016, those are quite well explained too.
And, you know, PIPEDREAM is another good one. Yeah, okay, it’s a term that we use, but it was a bit of malware that was found before it was actually deployed. A framework, kind of like a Metasploit for OT. And it had different modules in it, but it was designed to pivot across different PLC networks.
So often in complex plant, you’ll have a PLC that’s kind of the central PLC, and then a whole load of individual skid units and stuff with their own PLCs that are sort of linked up, and they’re often in different network zones. And essentially what one of these modules did was use that middle PLC as a bridge and just jump between the networks. And often that’s a design feature, so you’d do updates, you’d do firmware updates, and a number of the vendors natively support that anyway.
But getting back to threats, we’ve got a number of documented things that have actually happened. PIPEDREAM, TRITON, CrashOverride. I’d be going to have a look at those, depending on which sector you’re in.
Felix (07:19)
Interesting. That’s a fascinating thought, I quite like that.
Emily (07:22)
Yeah, so we use threats. That’s what all our work is about, actually being threat-based as opposed to just deploying controls for their own sake. Because again, if the threat you’re worried about is a ransomware group spreading ransomware from your IT to your OT, well, there’s been a number of documented cases of that, and there are some really basic things you need to do to reduce the risk. Don’t have a shared domain across IT and OT, for example.
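As a toy illustration of why the shared-domain point matters, the sketch below flags OT hosts joined to the corporate domain. The inventory structure and domain names are invented for illustration.

```python
# Flag OT-zone hosts joined to the corporate AD domain. If IT and OT share
# a domain, credentials stolen by ransomware in IT work directly in OT.
hosts = [
    {"name": "eng-ws-01", "zone": "ot", "ad_domain": "corp.example.com"},
    {"name": "scada-01",  "zone": "ot", "ad_domain": "ot.example.local"},
    {"name": "hr-pc-07",  "zone": "it", "ad_domain": "corp.example.com"},
]

CORP_DOMAIN = "corp.example.com"

for h in hosts:
    if h["zone"] == "ot" and h["ad_domain"] == CORP_DOMAIN:
        print(f"{h['name']}: OT host on the corporate domain; separate it")
```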
Felix (07:51)
Yeah. I think just the very concept of not relying on a third party to say, yes, you’re doing the right thing, but instead doing it the other way around and saying, well, we have determined these are our problems, and that’s why we’re doing the pen testing that way, or the controls that way, or whatever. I think that’s quite liberating for some organisations.
I guess there’s maybe a tendency for people to want a pen test because it’s a checkbox thing, so they’ve kind of finished the due diligence, and adding more steps makes things harder and less attractive. But hopefully we won’t find that people skip the step altogether just because doing it with a bit more validity takes more effort.
Emily (08:42)
One of the first things we do is ask for any previous assessments you’ve had, because it gives us a good starting place. You’ve got people who really know the issues, but they’re trying to explain them to their senior leadership, so they might get a pen test company in to help say: it’s all red. I told you it was all red, but here you go, someone else has said it’s all red as well. I think we both see that. Sometimes we’re there for exactly that, basically helping people articulate the issues.
But equally, if it’s not scoped well, you can end up with a pen test report that doesn’t really explain your risk and maybe only has an amber or two, and people go, well, what’s the problem? It’s all good. And that’s the scary thing, a bad pen test.
Felix (09:31)
Yeah, I mean, unfortunately those concepts don’t just apply to OT, do they? They apply across the board. And as the pen testing market, if that’s the right way of putting it, grows, particularly in the UK, what I see is a lot of work that is, I hesitate to say poor quality, because I’m not sure that’s necessarily true, but just too focused on a particular bit. It’s not holistic enough to understand the concerns in the aggregate, in the wider space, and to put those risks in context. And context is king, right? It’s so important for all of these things.
Emily (10:11)
So the dwell time for ransomware groups can be very, very short. From initial access to [unclear], it can be hours, maybe days. There have been documented cases of literally less than an hour, which is just insane.
Some of the most sophisticated actors targeting OT, particularly the ones looking to gain persistent access, are trying to be stealthy with that initial compromise and their persistence. And the trade-off is, you know, six or twelve months of dwell time before they actually do what they came to do. So they’ve got a lot of time to understand the network and what other systems are out there. They probably have a better understanding of your network than you do, because they’re pretty much monitoring it.
Yeah, that’s it. It’s a good thing to think about when the pen test comes in. I think some of the best pen tests that we’ve done (it’s not something I do, it’s a different team) have been where we’ve worked with the client for quite a long time. We have a really good understanding of them and we’ve built up the trust. We’ve also got really knowledgeable people on their side who are able to get us permission, and we really do the risk assessments that we need to do so that we can actually start to go down into some of the OT.
And again, you can do it, you just really need to understand the risk. You need to have someone on the operator side who really understands the risk, and then also there’s things you can do to mitigate it.
Felix (11:49)
I’ve really enjoyed this bit of the discussion because obviously it plays into exactly what I do for a living as well. There are a lot of people who ask questions about OT cybersecurity and pen testing and that kind of stuff, and there are just no real answers out there. And it can feel a little bit like I’m just making stuff up when you don’t have a big crowd of people saying the same thing.
And I don’t feel OT cybersecurity necessarily has that overt crowd yet. There’s bits of that coming. You look at some of the more established conventions around the world and there are OT-specific ones, but even in the more mainstream hacker conferences, if that’s a thing, there are now elements dedicated to OT and IoT, or cyber-physical effects, and all this wonderful buzzword soup. I don’t know, the weight of the voice hasn’t got behind it yet, in my opinion.
So it’s good to hear from someone who’s completely objective in this space, as far as I’m concerned anyway. I come to similar conclusions, if not the same.
Emily (12:57)
You know, again, why not just pull up a threat report? There are literal threat reports that explain in lots of technical detail how stuff’s happened. Pick one of those up during scoping and say, hey, we have safety systems, could something like this happen to us? Is there a plausible scenario?
Because I was thinking, there was one pen test I heard about, and I’d done a lot of work with this client, and I knew they had a really robust architecture. They had quite a large team as well. They were doing a really, really good job across virtually all domains. If there was a pinnacle client, it would be this one.
But even then, once the pen test team were given initial access into the OT (because, to save time, you’ve got to scope it, and yes, they probably could have found a way in themselves), they were able to jump between the engineering supervision system, the SCADA servers, and down into the controllers, and they would have been in a position to send commands to the controllers. Because again, a lot of these things don’t have authentication.
And that was with really robust controls. Even the assets, everything’s whitelisted. There’s all kinds of different security products, network monitoring across virtually the whole network. They had loads of separation between various different trust zones. It was a really impressive setup. But even after all of that, the pen test team could still find a way through. So it goes to show that even when you’re there, there’s other stuff you need beyond just technical controls.
Felix (14:52)
In 2026, the You Gotta Hack That team has two training courses. On March the 2nd, we start this year’s PCB and Electronics Reverse Engineering course. We get hands-on with an embedded device and expose all of its hardware secrets, covering topics like defeating defensive PCB design, chip-to-chip communications, chip-off attacks, and the reverse engineering process. On June the 8th, we launch the Unusual Radio Frequency Penetration Testing course. We dig into practical RF skills so that you can take a target signal and perform attacks against it in a safe and useful way.
Both courses are a week long. They are a deep dive, they’re nerdy, and we provide everything you need other than your enthusiasm. As the Unusual RF Penetration Testing course is brand new, you can be one of our beta testers and get £1,000 off. There’s more information available on our website at yougottahackthat.com/courses, and we recommend booking straight away as we have to limit the spaces to ensure the best learning experience.
But for now, let’s get back to today’s topic. All right, so as it happens, Emily and I were just having a bit of a wrap-up and realised we hadn’t talked about resilience. So, go.
Emily (16:00)
Okay, so for me, one of the things that often comes up is people are trying to get to grips with how to understand what risks they’re willing to take. They understand they need to set their risk appetite, and it’s covered all over the place. What does it mean? Who knows how to calculate it? You go online and there’s stuff that has likelihood and threat and all that, and you end up with a risk number or whatever.
I have massive issues with that. It makes a nice little dashboard with “this is our risk”, but in my experience it doesn’t actually articulate the reality of the risk to the organisation. If you compared a risk register against the reality, you’d go, wow, I don’t think the executives understand the risks they’re exposed to: their entire business wiped out by ransomware.
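For readers who haven’t met these risk matrices, here is a worked toy example of the likelihood-times-impact arithmetic Emily is sceptical of. The scales and scenarios are invented, and that is rather the point: the single number hides everything that matters.

```python
# Classic risk-matrix arithmetic: score = likelihood x impact, both 1-5.
def risk_score(likelihood, impact):
    return likelihood * impact

scenarios = {
    "phishing email reaches an office PC": (4, 3),
    "ransomware spreads from IT into OT, production down for weeks": (3, 4),
}

for name, (l, i) in scenarios.items():
    print(f"{risk_score(l, i):>2}  {name}")
# Both scenarios score 12 on the dashboard; the realities are nothing alike.
```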
Felix (16:32)
It’s like a bit of management porn.
Emily (16:57)
You look at their risk register and it doesn’t quite appear that way. And one of the challenges with likelihood in OT is that, although you can now start to look at ransomware events that have spread into OT, and there’s actually a great number of those, and reporting has helped with that, likelihood can still be quite a hard number to calculate.
So like true OT attacks, you know, there’s only been, what, 12 or something like that, not many. So I like to look at resilience and just go, as an organisation, as a company, as a site, what kind of things do we want to be resilient against?
So, is it in our appetite to be resilient against IT ransomware spreading to our OT? Now, I would say most senior leaders would say, yeah, I want to be resilient against that. It would be mad not to want that; it would tank my business.
And you can start to think about certain cases. So, I want to be resilient against a third party being compromised and impacting my OT. And it just takes luck out of the equation and starts to go, let’s just think as a business, what do we actually want?
The problem is, of course, it’s less pretty in a dashboard. But there can be some pretty simple statements: we want to be resilient against these kinds of attacks.
Felix (18:20)
Have you found people want to try and quantify it, you know, apply numbers to their level of resilience, and effectively just swap the word likelihood for resilience?
Emily (18:29)
So it’s not necessarily a bad thing, and the people doing it might understand the nuance. The problem is when those dashboards end up elsewhere, or people leave or whatever, and the nuance is forgotten, and the numbers start to be adopted as gospel. Because those numbers might represent something plus nuance, but by themselves they aren’t necessarily meaningful.
But what you can do is start to factor in inbuilt resilience. Like, whether your safety systems, your emergency shutdown systems, things like that, are networked or genuinely are not networked, and all those physical safeguards; I know, I know, we laugh, but they’re there.
A lot of OT industrial processes have physical safeguards, pressure relief valves, emergency actuators, and various systems for that because they’re all about functional safety. Now don’t get me wrong, you don’t necessarily want an issue to have got to that point, because by the time physical safeguards are in play it’s a pretty concerning situation, but they are there ultimately to protect life and try to prevent catastrophic loss.
But again, you can start to build some of that in. So you might realise that certain attack paths just aren’t possible. Let’s say changing the amount of pressure in a vessel because you’ve opened a number of valves and turned on some pumps. Well, if there’s a pressure reducing valve, if there’s a pressure relief valve, then ultimately that isn’t going to end up in a catastrophic scenario. Or mixing chemicals, there are likely to be functional safety safeguards in place.
You can factor all of that into your resilience, which is a level of detail a lot of the risk models haven’t really gone into. For anyone who works in functional safety, this is standard stuff, thinking about failure modes; except that cyber introduces failure modes that, in my experience, haven’t really been delved into in some of the process hazard analysis type work.
And you can do things: if you can control individual controllers, you can bypass some of the safeguards that some of the functional safety systems have.
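As a toy model of the inbuilt resilience Emily describes, the sketch below shows a purely mechanical relief valve bounding the consequence of an attacker forcing pumps on. All numbers are invented, and, as she notes, this only holds for safeguards the attacker cannot reach from a controller; it is no substitute for a real process hazard analysis.

```python
# Toy model: attacker forces pumps on against a closed outlet; an
# independent mechanical relief valve caps the vessel pressure.
RELIEF_SETPOINT = 10.0   # bar: relief valve lifts here (invented)
BURST_PRESSURE = 15.0    # bar: vessel would fail here (invented)

pressure = 5.0
for minute in range(60):
    pressure += 0.5                            # pumps held on by the attacker
    pressure = min(pressure, RELIEF_SETPOINT)  # relief valve dumps excess

print(f"after an hour: {pressure} bar (vessel bursts at {BURST_PRESSURE} bar)")
# The mechanical safeguard bounds this attack path short of catastrophe,
# which is exactly the kind of thing a resilience view can take credit for.
```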
Felix (20:52)
I feel like there’s a whole episode in talking about the failure modes that cybersecurity problems introduce. A couple of things came up in my head as you were talking about that. There are examples of not exactly attacks, but circumstances around operational technology that just weren’t expected and ended up in catastrophic events.
I can’t remember the name of the plant, but it was a British incident, and it must have been handling flammable fluids of some description, because it was reported that cars driving along the road next to the plant suddenly accelerated, they had a lot more “go”, because essentially they were drawing in air that was pre-filled with flammable gases of some description, so they effectively had more petrol. And I believe one of the things that happened in that one was that the last-stop safety had kicked in, so rather than there being a big contained explosion, it was releasing gas, leaking all this stuff all over the place.
However, what happens next? Well, in that particular circumstance, if I remember the story correctly, somebody followed a procedure, then they hit a switch and the whole place went boom.
Sometimes there are unknowns or unexpected consequences or conditions that you need to take into account. And I feel that the complexity around an intentional cyber attack against these things is far more than most people can cope with, in terms of designing likelihood or impact, or maybe even resilience.
How do you design a system to be resilient against an intentional act of this sort of thing where it’s complicated and nuanced, and only in very specific circumstances does it actually go boom?
I spent a bit of time having a look at CVSS version 4, because it’s starting to include concepts like safety and physical impact on systems and that kind of stuff. And that sounds great. I still don’t actually like CVSS version 4, because it doesn’t feel like it’s got much control over how the score is raised to a significant level. It just feels rigid compared to [unclear], despite the fact that in theory it’s got way more nuance for weird situations like physical threats.
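For context on that point: CVSS 4.0 did add safety-related metrics, including a supplemental Safety metric, though supplemental metrics do not change the numeric score, which is part of the rigidity being described (the environmental metric group can also mark subsequent impacts as safety impacts, and those do feed the score). An illustrative vector, with invented metric choices for a hypothetical OT vulnerability, might look like this:

```python
# Illustrative CVSS 4.0 vector for a hypothetical OT vulnerability; the
# metric choices are invented. "S:P" is the supplemental Safety metric
# (safety impact Present); supplemental metrics carry context, not score.
vector = ("CVSS:4.0/AV:A/AC:L/AT:N/PR:L/UI:N"  # adjacent access, low complexity
          "/VC:N/VI:H/VA:H"                     # vulnerable system impact
          "/SC:N/SI:H/SA:H"                     # subsequent (process) impact
          "/S:P")                               # supplemental: Safety = Present
print(vector)
```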
But anyway, that was really interesting. Thank you for sharing that with us. I’m glad we picked that up before we ended the show today. Have you got any other thoughts on that, or anything that I’ve just said?
Emily (23:29)
I have one final thought. Go to the US Chemical Safety Board’s YouTube channel; it’s called USCSB. It’s really cool because it goes into the investigations of various industrial accidents across all kinds of different sectors, and it tells you a lot about unintended consequences and unexpected failure modes.
Because yeah, a lot of these systems are designed to fail safe and things, but there’s a zillion ways in which they don’t. It’s funny, the incidents where, as a result of a safety valve opening, you prevent the pressure vessel exploding, but then you end up dumping to atmosphere, and then obviously it ignites and you still get a catastrophic failure. It is just crazy, but very interesting.
Felix (24:11)
Really, really interesting. The resource you pointed out sounds epic. Thank you very much, everybody. Catch you next time.
Emily (24:14)
Thank you. Cheers.