SDP 6: Failsafe Defaults


About this episode

In this episode, we go back to the Security Design Principles series; this time we are discussing Failsafe Defaults.

Failsafe defaults simply means that the default condition of a system should always be to deny access.

An example of a failsafe default is the Security Reference Monitor (SRM) that has been implemented in Windows operating systems since Windows NT. The SRM mediates actions like logging on, accessing a file, or printing, and denies them unless the user's access token proves they should have access to that file or action.
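As a rough sketch of how deny-by-default access checking works (this is illustrative Python, not the actual Windows SRM API; the ACL layout and names are invented):

```python
# Hypothetical sketch of a deny-by-default (failsafe) access check.
# The ACL structure and function name are illustrative, not the real
# Windows Security Reference Monitor interface.

ACL = {
    "payroll.xlsx": {"alice": {"read"}, "bob": {"read", "write"}},
}

def is_allowed(user: str, resource: str, action: str) -> bool:
    """Grant access only when an explicit permission exists.

    Any lookup failure (unknown resource, unknown user, or an action
    not listed) falls through to the default answer: deny.
    """
    try:
        return action in ACL[resource][user]
    except KeyError:
        return False  # failsafe default: no entry means no access

print(is_allowed("bob", "payroll.xlsx", "write"))     # True
print(is_allowed("mallory", "payroll.xlsx", "read"))  # False
```

The key property is that every error path ends in a denial; access is only granted on the one path where an explicit permission was found.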

There will always be two choices for failsafe defaults: to fail closed or to fail open. The DoD and other government organizations tend toward failing closed, while commercial companies generally prefer to fail open.

There will always be a tension between security and operations. More security means slower operations and more inconvenience, while prioritizing operations means accepting weaker security. Where the balance sits depends on your organization and its goals.

Understanding failsafe defaults and other security design principles will help you become a better analyst and produce more secure, robust, and functional systems.

What You’ll Learn

  • What are failsafe defaults?
  • What are some examples of failsafe defaults?
  • What is the Security Reference Monitor?
  • What is the difference between failing closed and failing open?

Episode Transcript

Kip Boyle:
Hey everybody, guess what? It’s time for Your Cyber Path. My name’s Kip Boyle, and with me here is Jason Dion. Hey, Jason, how are you today?

Jason Dion:
Nice to see you again. Great to be here. Back on the podcast, doing another wonderful episode.

Kip Boyle:
Yeah, thank you very much for being here. Really appreciate your commitment to helping our audience figure out what they need to do to get into the cybersecurity career, or if they want to accelerate their cybersecurity career, they want a promotion, they want more responsibility. Of course, greater compensation comes with greater responsibility, so that’s good too. And anyway, that’s what we’re all about, everybody.

So welcome to the episode. What we're going to talk about today is we've reached the halfway point in our series on something called Security Architecture and Design Principles. And this goes all the way back to the early to mid-1970s. I'll spare you the recap of the paper that we're drawing these principles from, but you can always go back and check out our prior episodes where we really break it all down and tell you where this is coming from.

Today what we want to do is we want to talk about the next design principle. Last time we talked about Work Factor, go back and listen to that episode. Today we’re going to talk about something called fail-safe defaults. And this is the first of the last five principles that we’re going to talk about. Now, why do we even talk about principles? Because if you use them in your job, you’re going to sound like an excellent analyst, you’re going to sound super smart, and you’re going to make higher quality decisions on the job if you use these principles to guide your work. So Jason, what is a fail-safe default? I mean, what does that even mean?

Jason Dion:
Yeah, so if we go to the pure definition back from the paper, it says the protection mechanism should deny access by default and grant access only when explicit permission exists. So really what we're talking about here is that the best condition, the fail-safe condition, the default condition, is always going to be locked. And so if you think about it in the physical world, say I have a security gate in my neighborhood: if the power goes out, what should that gate do? Well, if we want the best security, we want it to lock in the shut position so nobody can get in and nobody can get out. Now, that can be a dangerous thing in the physical world, because we might lock you into a neighborhood where there's a fire going on or something like that, and we wouldn't want that. But in the security world, we do want that. We want things to lock down. And so anytime you have something that can make a choice and there's an error, the error should always land in a lockdown condition. That's really what we're talking about here with fail-safe defaults, at a very macro level.

Kip Boyle:
What this reminds me of, you have a background of working in the Navy, and when I heard you speaking just now about hey, locked gates, they should fail close, although that can be dangerous. Well, that happens all the time on ships and submarines, doesn’t it?

Jason Dion:
Yeah, exactly. We have a lot of different things that we use that are mechanical systems, and a lot of them are controlled by computers. For example, on a submarine, you want to be able to open the valve and take water in, which then allows you to get heavier and sink down, which is what a submarine wants to do because it's always trying to maintain buoyancy. And if it needs to rise, it would push that water out by using air, and it pushes it out of that bellows system that holds all that water. But you're right, we would want that valve to always fail closed so that we're not taking on more water in the case of an issue, and that way we can then try pumping out that water and get ourselves back to the surface.

Kip Boyle:
And what about, and I don't know, so I'm asking seriously, what about the idea of watertight compartments? In movies, you see this all the time where a ship comes under attack and the order goes out and all the doors are sealed. And if there's crew on the wrong side of a door, well, that's kind of too bad.

Jason Dion:
Well, that is the decision that has to be made. And so the big difference here is when you talk about those sealed doors on a navy ship, they're not done automatically. We physically go and lock those doors. And the reason is exactly for what you said, to be able to hopefully get the people onto the right side of the door before we shut it. But if we have to make the decision of losing a ship of a thousand people or losing one sailor who's on the wrong side of the compartment when the order goes out, we're going to choose the thousand over the one. It's the needs of the many over the needs of the few.

Kip Boyle:

Jason Dion:
But you’re right. It’s that fail close mentality where we have to start shutting these doors. And recently I was on a cruise. I love going on cruises. And if you ever go on a cruise ship, as you’re walking down the halls, you’ll see these electric doors, they’re called fire safe doors, and they’re held open by a magnet. If they lose power, those doors will all lose the magnetism and they will shut. And when that happens, that is a fail-safe default to prevent fire from going from one compartment to another on a large ship like a cruise ship. They simply don’t have enough crew in all these different spaces to shut these doors manually in time when there’s an issue. And that’s why they have these automated systems to do it.

Kip Boyle:
On a 16-deck cruise ship, I can just imagine everybody fanning out to lock doors manually. That ain't going to happen. You're right. I want to give one more real world example, and then I think we should turn our attention to how fail-safe defaults work in computer systems. But some of the examples we've given so far are a little bit esoteric. If you've never been on a ship or a submarine, you're just imagining what it's like or thinking about Hollywood depictions of it. But here's a fail-safe default that we use all the time. Almost every one of us has used this. Some people use it multiple times a day, and that's an elevator. Elevators have a fail-safe default, which is if the elevator stops functioning, say the cable becomes frayed and breaks, the car that you're standing in is going to grip the rails and not go anywhere. It will not go rushing to the bottom like so many trapped Jedis on a spaceship.

That’s not the way it works. And thank God for that, right? Because we do want a fail-safe default like that on our elevator systems. So anyway, I just thought I’d throw that out there in case people were wondering like, well, will I ever actually encounter a fail-safe system? Yes, in the real world, you will.

Now, let's talk about how this works in the computer setting. So every modern multi-user operating system has something that will allow you to protect resources, some kind of a mechanism, and you can rarely see it directly, but you can see it working. So anytime you've ever set permissions on a file, you actually have manipulated this protection mechanism. So in Windows, this thing is called the Security Reference Monitor, or the SRM, and it's been in there since Windows NT. What was the first version? Windows NT 3? 3.1? Something like that.

Jason Dion:
Well, it was Windows NT 3.1, and then NT 3.5 and 4.0. They called it NT because it was new technology.

Kip Boyle:
Okay. So Windows NT has a security reference monitor. So does Windows 2000. Every Windows-

Jason Dion:
I feel old now, I’m sorry. Now that I remember that, the fact that I knew that makes me feel really old. Sorry.

Kip Boyle:
I saw your brain go into overdrive for just a moment as you were trying to retrieve that record. But every Windows since Windows 2000 has this Security Reference Monitor deep, deep, deep inside of it. And what it's supposed to do is whenever somebody tries to log on, or access a folder or file, or get to a printer share or a file share, or send a print job to a printer, anything that has an access control list, the Security Reference Monitor is supposed to check to make sure that you have permission to do that before the system allows you to do it. And this is a form of a fail-safe, I can't even say it, right? Fail-safe default, because it's supposed to not give access by default unless you can present a token that says, no, I should have access to it. Does this come up in any of the certification exams that you teach, Jason?

Jason Dion:
Yeah, so a little bit in Security+, and definitely we talk about it in CISSP, C-I-S-S-P. So it's definitely on the higher side when we start getting into security. And a lot of times we talk about what the ramifications are if something fails, and should it fail safe or not. And there are really two ways something can fail: it can fail closed or it can fail open. And we talked about this with those valves, and we talked about the real world, bringing the water into the submarine or pushing the water out. The fail-safe could be that when something fails, we push all the water out so the submarine comes back to the surface and the people can get off. But that would be really bad in the middle of a war, because you come to the surface, and now they sink you, and all that kind of stuff.

So these are design decisions that have to be made. And you'll notice we didn't say fail open or fail closed as the default. We said fail-safe. Now what is safe? That depends on your particular system and how you want to do it. And that's one of the big points that CISSP will make, especially when we start talking about a firewall, right? Because one of the ways that an attacker can overcome your firewall or your intrusion prevention system or even your next-gen firewall is they can simply overwhelm it. If we can flood that state table with lots and lots of requests and the processor on the firewall can't keep up, it has two options. One, it can fail closed, which means it's just going to start blocking everything that it doesn't have time to process, which means all of our users are no longer going to be able to access the network. Or it can fail open, in which case we've overwhelmed the firewall, and now all of those protections go away and we have a wide-open network for us to just jump into. And as an attacker, that's a great thing.

And again, you can configure it either way. It really depends on what you want to do. And that’s the point that CISSP makes is you’ve got to decide is security more important or is operations more important? And if security’s more important, you fail shut. If you think that operations is more important, you fail open. And those are those design decisions you make in your network there.
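The fail-open versus fail-closed trade-off Jason describes can be sketched in a few lines of code. This is a toy model, not any real firewall's API; every name and field here is invented for illustration.

```python
# Toy model of a packet filter's behavior when overwhelmed.
# All names are illustrative, not a real firewall interface.

def filter_packet(packet, allow_rules, overloaded, fail_open=False):
    """Return True to forward the packet, False to drop it."""
    if overloaded:
        # The design decision: fail open favors operations,
        # fail closed favors security.
        return fail_open
    # Normal path: deny by default unless a rule explicitly allows the source.
    return allow_rules.get(packet["src"], False)

allow_rules = {"10.0.0.5": True}
attacker_pkt = {"src": "203.0.113.9"}

# Overwhelmed and configured to fail open: attacker traffic gets through.
print(filter_packet(attacker_pkt, allow_rules, overloaded=True, fail_open=True))   # True
# Overwhelmed and configured to fail closed: everything is dropped.
print(filter_packet(attacker_pkt, allow_rules, overloaded=True, fail_open=False))  # False
```

Note that the `fail_open` flag only matters on the overload path; the normal path stays deny-by-default either way, which is the failsafe-defaults principle itself.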

Kip Boyle:
And I see this commonly split: in the military, at high levels of classification, you're going to want to default to fail closed, because you're going to absorb the operational hit rather than lose control of those secrets. In the private sector, you're probably going to default open, because when you default closed, if you are an e-commerce company or something like that, you're not going to be able to serve customers. You're not going to be able to book revenue. It's going to really hurt for you to fail closed. And so that's a big difference that people sometimes run into. They do cybersecurity work in the military, in the DoD, and then they get a job in the private sector, and the whole thing is just flipped upside down from what they're used to.

Jason Dion:
And actually, speaking of CISSP, I'll give you an example. Going back to when I was running the theater network operations and security center, my physical security manager, who had been doing physical security for 10, 15 years, was a super, super smart guy. One of the domains on the CISSP is physical security. Well, he needed to get his CISSP for the position he was in, because he got promoted and now he needed that certification. We kept sending him to classes, he kept studying, and the area he kept failing, he failed four times before he passed the exam, was physical security. Now why was that? Because we do things differently in the military than what they do in the civilian world.

And one of the examples of this is one of the questions I remember from the exam, and don't worry, ISC2, this was from over 10 years ago. I'm sure it's not in your database anymore, so I can share this question. Essentially it was a question that said, Hey, you walk into the server room, you see there's smoke, you see there's fire, what do you do? And there were four choices. One was grab the backup tapes, one was alert personnel in the area that there's a fire and make sure they get out, one was lock the doors so the fire can't spread, and one was initiate the fire suppression system, or whatever it was. And the answer in this case was get the people out. And the military guy, his mindset was, no, protect the data. Make sure you lock the door and you grab the backup tapes. And it's just one of those things, the way the military would do it versus the way the civilian sector would. In the civilian sector, it's always people first.

So anybody who's taking their CISSP, and you come from a military background, remember people first, because that will bite you in the butt. And he literally kept failing that one section over and over. After his third failure, I sat down and I said, "What are you doing?" And we started going through it, and I was like, "Let me give you some sample questions." I gave him one. Then he's like, "Oh yeah, you lock the door." I'm like, "No." And once we identified that was the problem, I just had to recalibrate him and say, "Okay, forget everything from the military. All you need to remember is people first." And once he did that, he passed the next time.

Kip Boyle:
Oh, that's so important. And another way of saying that is you have to think about the test authors: what answer are they looking for? And I think CISSP was really designed by people who worked in the private sector. It wasn't designed by people who work in DoD or in defense contractors. I remember when I took that test, a big part of passing it was asking myself, okay, what did the test writers think was the right answer, as opposed to what I thought was the right answer? Because at the time that I took it, the test writers were people that were way more senior in their careers than I was, and they had worked on minicomputers and mainframes, and I'd never worked on any of those systems. I was working in distributed systems, Unix and Windows. So yeah, I remember when I went to these prep courses, that's what they were telling me: please be careful here. And I don't know if it's still that way.

Jason Dion:
Yeah, that's one of the things I tell my students all the time. If you take any of my courses, when you get to lesson three, I have a lesson called exam tips, and it changes from exam to exam, but I kind of bring out in three or four minutes, like, "Hey, before we get started studying for the next 20, 30, 40 hours for Security+ or whatever certification we're doing, here are the things you need to remember." And one of the big points I always make is, look, you may be a really, really smart person, and you've been doing this for 5, 10, 15 years, and you come from a military background or a civilian background or whatever background. Forget all that when it comes to the exam. You have to remember what I teach you in this course, because what I'm teaching you is based on the exam objectives.

It's based on what CompTIA or ISC2 or PeopleCert or whoever the provider is who writes it thinks the right answer is. And you can argue with me all day long, and I get students who email me every day, "Jason, you said this and that's not true." I had one today where they were focused on one word in the question, and I'm like, "No, read the next word over. That's what explains it." And it goes back to what does the textbook say and how do they define it? Because some people define threats and vulnerabilities differently than the way we do on this podcast and the way that we do in industry. And that can cause problems when you go to the exam. So one of the things I remind people of is this is why you take a good course or you read a textbook: so you can understand the language that that company uses, because their language is going to be different, and that can make a difference on the exam.

Now, if you go with CompTIA's definition versus the way you learned it at work, well, the reason we have to go with CompTIA is because they have to choose a standard, and they've chosen the standard that we're all going to use, because we all use different things in the workplace. And you've probably seen this, I mean, you've had a lot of jobs over the last 20 years. If you're working with a bank versus an insurance company versus an entertainment complex, they probably use different language in those areas, and you have to learn to speak their language. And the exam isn't going to take that into account. They're just going to take into account the standard that they want to use.

Kip Boyle:
That's right. That's right. Absolutely. So that's a great exam-taking tip. And that one's actually stood the test of time, because I took my CISSP a long, long time ago. Another example that I want to give everybody about fail-safe defaults, and how things are not what they may seem, is let's say you unpack a computer, and it doesn't matter if it has Windows on it or Linux. These are discretionary access control systems. There's a little Orange Book lingo for you there. But when you open these things up, by default, you are either the admin or the root user. By default, you have control over everything. And so these things come out of the box with a completely open demeanor. So it's actually failing open. Now, why does it default open? Well, one of the reasons is because the people who made the computer don't want to have to sit on the phone with you and explain how to set up security. They figure if you need more security than what you're going to get out of the box-

Jason Dion:
That’s for you.

Kip Boyle:
So it’s kind of a support issue for computer makers, but also I think it recognizes the fact that not everybody needs a lot of security, or maybe they don’t need any security. So anyway, and this is one of the reasons why in the private sector anyway, right now, we continue to talk a lot about restricting the ability of ordinary people who are just using computers to get things done from being admin on their machine. Because when malware gets on the machine, if it can, it’ll inherit your permissions, the rights that you have on the machine. And if you’re not the admin, then either that malware is not going to work or it’s going to work with a more limited effect, or it’s going to have to contain logic to escalate its privilege in order to gain admin and then do what it needs to do. This is a really interesting phenomenon that I remember encountering at work and just sort of wondering about it. Why don’t these things come out of the box with more security? And over the years, I kind of detangled all this and realized why this was, and anyway.

Jason Dion:
Well, yeah. I mean, one of the things I always talk to my students about is that there's always this challenge between security and operations, and it's this teeter-totter effect. If I have more security, I end up getting decreased or slower operations, and if I get more operations, I have decreased or lower security. For example, if I wanted Kip to come over to my house and clean my house today, if I leave the door unlocked, it's really easy for him to walk in and clean. So operations is achieved very easily. He doesn't need a key, doesn't need a code, he just walks right in. But if I start adding things like he's got to have a badge, or he's got to have a thumbprint, or he's got to have a PIN, that creates a whole lot of stuff that I have to do ahead of time to secure this.

And that creates a lot more challenges for him to get here. So if he wants to get to my house, he's got to go through the security gate, he's got to then unlock my front door. He's then got to go up to my server room and unlock that. And he's got to go through all these different places where there are points to stop him, and it adds more friction. And so more friction means more security and less operations, is the way I always look at this, and it is always a trade-off. And we as security folk have to realize that it's not all about security all the time. We do security for the benefit of our systems and for the benefit of our end users, for them to be able to accomplish a goal. And if you think about this in the organization you work in, this is going to depend on where you work.

If you work for, I don't know, a manufacturing plant, they really don't care about IT except in the fact that it allows them to sell more widgets or produce more widgets in the factory, because they're using ICS and SCADA systems. And your whole job is to get that operations and that security to a point that can get them the most bang for their buck. Alternatively, if you're working for a cybersecurity firm and you're running a security operations center, security should be your number one choice. And we've talked about this when we talked about passwords and authentication and things like 1Password and LastPass. You shouldn't have a bunch of extra features. You should only have the security things you need, because your job is security over operations. And the last thing I wanted to talk about before we wrap up this episode is one more real world example to bring this fail-safe default to life.

And this is something I think everyone that listens to us can probably recognize in their own life, because most of us these days drive cars, and most of us who drive cars drive an automatic. Few people these days actually drive sticks. So think about when you get in your car in the morning to go to work. You turn it on, and it's in park, right? And if the engine is running and you try to turn it off without being in park, it won't let you do that, and it won't let you take out the key. That is a fail-safe default: the engine must be off and the car must be in park before you can remove the key. Another one is if you want to start the car. When you go to start the car, you have to push down on the brake pedal.

And the reason is we don't want that thing to accidentally be in neutral or drive so that when you turn on the car, it starts rolling away. But if you have your foot on the brake pedal, it goes, okay, you're behind the wheel, it's safe for me to turn on this car and let you shift out of park. And those are the kinds of things we talk about with this fail-safe: making sure you start from a known, safe state that you can then move forward from.
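The ignition interlocks described above amount to a set of preconditions that must all hold before the system leaves its safe state. A minimal sketch, with invented function names:

```python
# Hypothetical interlock checks modeled on the automatic-transmission example.

def can_start_engine(gear: str, brake_pressed: bool) -> bool:
    # The engine only starts from park (or neutral) with the brake held down.
    return gear in ("park", "neutral") and brake_pressed

def can_remove_key(gear: str, engine_on: bool) -> bool:
    # The key only comes out when the car is in park and the engine is off.
    return gear == "park" and not engine_on
```

Every check defaults to False; the car stays in its current safe state unless every condition for the transition is explicitly satisfied.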

Kip Boyle:
Exactly. Well, there’s probably a lot of other real world fail-safe defaults that you can think of. And so I hope we’ve given you a few, and maybe as you’re going about your day, you might encounter one or two more, but as we wrap up this episode, the thing that we really want you to take away from this is all these security design principles can be helpful to you on the job because guess what? You’ve seen this already if you’re already working in cybersecurity or technology. Things change all the time. That’s probably the one thing we can count on is that everything’s going to change all the time, but these security design and architecture principles, they don’t change that much. So we can continue to rely upon them in an ocean of change. And so that’s why we wanted to bring them to you. Any last words, Jason?

Jason Dion:
Yeah. Not on fail-safe, but the one thing we did want to mention is I wanted to thank the audience for helping us with the Akylade Certified Cyber Resilience Fundamentals exam that came out last month and it’s gone through beta and now it’s in live production. We’ve had a lot of people taking that exam. And if you want to take that exam as well, you can. Before we talk about that, Kip, what is the CCRF? Can you give us kind of the ten-second elevator pitch?

Kip Boyle:
Yeah, absolutely. And by the way, congratulations on saying fundamentals because that’s not the word you wanted.

Jason Dion:

Kip Boyle:
I appreciate that.

Jason Dion:
I’m getting used to it though. It’s growing on me.

Kip Boyle:
Thank you. So if you get the fundamentals certification from Akylade, what that's going to mean is that you are going to have a basic vocabulary and understanding of the knowledge and the concepts around cyber resilience, based on the NIST Cybersecurity Framework. And you're going to be ready to join a team of other cybersecurity people and be productive right away in helping your organization become more cyber resilient and stay cyber resilient. That's what this certification was designed to do. And there you go. There's your ten-second elevator pitch.

Jason Dion:
And this is the first level in a two-exam series. So the first level is the fundamentals, as Kip just made fun of me for, and the reason he's making fun of me is because when we were brainstorming the names, I kept going for "foundations" and he liked "fundamentals". So we ended up going with fundamentals. So it is the Certified Cyber Resilience Fundamentals, and the second level is the Certified Cyber Resilience Practitioner, which comes right after fundamentals. Once you pass fundamentals, you can be a team member; once you pass practitioner, that qualifies you to be a consultant, somebody going in and helping organizations and leading a team of cybersecurity analysts to get their organization onto sure footing as they move forward with cyber resiliency. And both of those come from the textbook, which is called Mastering Cyber Resilience, which you can find on Amazon or any of your favorite places to buy books.

And you can find that if you go to Akylade.com and just click on books; you'll see that textbook there, or the certifications if you want to learn more about them. It's a really low cost certification, a great way for you to learn a lot of information about the NIST Cybersecurity Framework and how to use it practically, so it's not just a knowledge-based certification. When you get those two, it shows you can do this on the job and act as a consultant, doing the stuff that Kip and his team do on a daily basis, and have for 10-plus years, over at CRO, Cyber Risk Opportunities. So we hope you find it valuable.

We work with a lot of hiring managers to make it valuable and so that it’s recognized in the industry. We’ve gone through all the long arduous steps that took forever to go through so it could become an internationally recognized certification. But we did that because we want to make sure it’s valuable to you, and it’s not just a “money grab”, as I’ve seen some certifications do. So anyway, that was my pitch on CCRF. If you’re interested in it, you can find that at Akylade.com, A-K-Y-L-A-D-E.com. And yeah, you can learn more about it there and decide if it’s right for you.

Kip Boyle:
Well, thanks Jason for mentioning that as you knew, I forgot completely to mention it, so thank you very much. Okay, well listen everybody, thanks for being here, and as we wrap up, I just want to remind you that with every episode that we publish, we publish a full transcript as well as any notes that you might need to take full advantage of the information that we have provided to you. And it’s really easy to find the episode. You just go to YourCyberPath.com/ and then the episode number, which you can get straight out of the episode description in your favorite podcast player. You can also go to YourCyberPath.com to sign up for Mentor Notes.

Now, Mentor Notes is something that I write every other week, and it's about 500 words, so it's quick and easy for you to read, and I try very hard to make it as actionable as possible. My goal is to give you tips and tricks to either start your cybersecurity career or accelerate the cybersecurity career that you're already in. And look, give it a try; if you don't like it, the unsubscribe function is super easy. No harm, no foul. We're not going to get bent out of shape if you don't like this thing. So give it a try. You really don't have anything to lose. Go to YourCyberPath.com and sign up. Well, listen, have a great week everybody. Thanks for being here. We'll see you later.

Jason Dion:
Thanks all.


YOUR HOST:

    Kip Boyle
      Cyber Risk Opportunities

Kip Boyle serves as virtual chief information security officer for many customers, including a professional sports team and fast-growing FinTech and AdTech companies. Over the years, Kip has built teams by interviewing hundreds of cybersecurity professionals. And now, he’s sharing his insider’s perspective with you!

YOUR CO-HOST:

    Jason Dion
      Dion Training Solutions

Jason Dion is the lead instructor at Dion Training Solutions. Jason has been the Director of a Network and Security Operations Center and an Information Systems Officer for large organizations around the globe. He is an experienced hiring manager in the government and defense sectors.


before you go…

Don’t forget to sign up for our weekly Mentor Notes so you can break into the cybersecurity industry faster!