
Boring On - Security, Walking the Talk
A deep-dive conversation about cyber, security, and technology topics on a relaxed walk.
Boring On - Episode 1: Unforgivable Vulnerabilities
In this first episode we test out the walking, conversational format for exploring security issues and discuss the NCSC's guidance on forgivable and unforgivable vulnerabilities.
Boring On, Walking the Talk in Security. Episode 1 Unforgivable Vulnerabilities
James:Welcome to the first experimental Boring On, Walking the Talk in Security. This is Chris Bore, I'm the host, James Bore, and we're going to be chatting today, on our pleasant little walk through the woodland, about the new NCSC research paper that's out on forgivable and unforgivable vulnerabilities, what it actually means and why it caused such a kerfuffle in the security landscape.
Chris:It's sort of a metaphor, isn't it? We're walking through the woods, and can we see the wood for the trees, or the trees for the wood?
James:That's one, yes. We can, but it has caused a lot of fuss, and I think most of that comes down to the terminology and the fact that people haven't really read the research and understood what it's doing. So first of all, the terminology talks about forgivable and unforgivable vulnerabilities, and it's important to note that the research actually mentions three types. The third type is non-exploitable vulnerabilities. It doesn't discuss those at all, other than to note them, but it does mean that both the forgivable and the unforgivable types are exploitable vulnerabilities.
Chris:Yeah, which means that they can actually be exploited. So a non-exploitable vulnerability is almost not a vulnerability at all, is it?
James:No, not really. It's, well, it's still a weakness in the system. Yes. But the whole point of a vulnerability is that it's something that can be exploited by a threat. Yes. So if you can't exploit it, then it's not an issue. I think they're using non-exploitable not just to mean there is no way to exploit it, but also that there are other controls in place that prevent it. For example, if it's something that needs local access, then it's non-exploitable if it's inside your building, in theory.
Chris:Yes, okay. So, let's address the forgivable, unforgivable thing, because you told me that's what's caused the controversy at the moment, that people are enraged by the idea of there being forgivable vulnerabilities.
James:Yes, and the issue is that people have gone, that means they just won't be fixed, which isn't true, though it's true they're not all fixed already. But there's also a misunderstanding about what the research applies to. People have taken it as talking about vulnerabilities in all systems. Now, the research itself is talking exclusively about software development, and it's talking about the fixing of these vulnerabilities by the developers during the creation of the system.
Chris:Exactly, not a retrofit. And they specifically talk about retrofitting, don't they? How much more expensive it is, how much more difficult, and therefore that's outside the scope of this. This is to do with developing and building a system.
James:Yes, this is secure by design, or what should be secure by design.
Chris:Yeah.
James:And so...
Chris:To address the unforgivable: an unforgivable vulnerability, as I see it, is something which no coder or designer or build engineer should ever let through. In other words, it's an absolute epic fail, and it's addressed in Software Development 101. So, for example, it's failing to check a pointer for null before dereferencing it, that kind of thing, which for decades we've known about and should avoid, and which has somehow, amazingly to me, crept back in. Because it was, like, in your DNA. But I suppose it wasn't in the DNA; it was enforced by company-wide standards, rulebooks, style guides, and checking, which in those days was done by another programmer. Now it can be done by standard tools, like SAST, is that the one that does static analysis at compile time? And DAST, which is runtime analysis. Those can pick up the kind of rookie errors that creep into systems. And I have to stop myself, because I'm in awe of the fact that people still make these really, really basic mistakes that we know about, and I think that's what they're addressing. So the unforgivable is literally unforgivable. It's like we should be really, really angry about those.
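As a concrete illustration of the class of mistake Chris is describing (a sketch of ours, not an example taken from the NCSC paper), here's the classic unchecked-pointer bug in C and the one-line check that static analysis tools expect to see:

```c
/* Illustrative sketch only: the null-pointer dereference Chris mentions,
 * and the check a SAST tool would flag as missing. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *copy_name(const char *name) {
    char *buf = malloc(strlen(name) + 1);
    /* Unforgivable version: calling strcpy(buf, name) here regardless,
     * which dereferences buf even when malloc has returned NULL. */
    if (buf == NULL) {
        return NULL;        /* let the caller handle the failure */
    }
    strcpy(buf, name);
    return buf;
}

int main(void) {
    char *n = copy_name("boring on");
    if (n != NULL) {
        printf("%s\n", n);
        free(n);
    }
    return 0;
}
```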
James:And we should. I mean, these are known; we've known about these specific vulnerabilities for decades. We know that you can't program that way, and that if you do, it causes problems. And so the forgivable...
Chris:The forgivable is where I think the terminology is weak, because it invites you to think that those are okay, and I don't think that can possibly be the intention. It's like we've set a bar so low that any programmer from the 1980s would have skipped over it easily, but now people are tripping over it all of the time. And that's possibly because there are many more programmers, and we haven't got the same sort of enforced learning of the basic rules in difficult languages that actually tripped you up all the time if you weren't careful. But a forgivable vulnerability isn't necessarily forgiven. It's just one that you might not fix, and part of the reason for that is that it might be extremely hard to fix. And that, I think, is part of the metric they've established. Because, as I understand it, they've adopted a metric which originated, I think, in the health sciences, although we used to use it in software engineering and robotics years ago. It's called cost-effectiveness analysis, and that's essentially looking at the cost of an intervention, and the knowledge and understanding that you've had in order to be able to address it.
James:And what was the third aspect? It was, so, cost, knowledge and understanding, and there was the impact. Impact, yeah.
Chris:And so those three things are the way in which you measure how much it costs you to do this. Cost can be a direct cost, like buying different software. It can be an indirect cost, like the time it takes you to actually go through the coding or go through code reviews, or the cost of applying tools to analyse your code and see where you might make improvements to it. That's the kind of metric that I like, because it gives you a formal way of assessing something. And the way that MITRE have done it, as I understand it, correct me if I'm wrong, is that they have essentially gone through, apparently by committee, they don't make their methodology quite clear, but they've gone through the most common vulnerabilities and exploits, and then for each of them they've noted down three areas and three scores against those areas, which gives you a three-by-three matrix, and that matrix gives you a set of scores. Then they've categorised those cumulative scores as easy, medium or hard. And if it's easy, you should just do it. So easy means, well, you're brain dead if you don't do this. It doesn't mean it's easy; it means that it's standard engineering practice. Nobody should not do this, you should be taught about it, and if you don't do it, there should be an intervention and remediation. Hard means that maybe it's technically unfeasible, or there's an extraordinarily high actual cost to dealing with it, and those hard ones, I think, are the ones they've called forgivable. It doesn't mean that we're going to forgive them; it just means that you've got to think about them. It's setting a low bar, and it's saying that everybody should apply these basic standards. I sort of go along with that, actually. I think the controversy is missing the point. Maybe we should forget the word forgivable and focus on the fact that some things are unforgivable. What that really means isn't addressed directly in the report, I think, because it just tells you which of the vulnerabilities it's unforgivable not to remediate or mitigate against. But the elephant in the room is, how do you enforce that? And I'm out of date on these things, but in my day, as the old man, we used to have standards and guidebooks and programmer code reviews, so that you would look at these things and you knew what you were doing, and then people would tell you if it was wrong, or it would be enforced by the code book. So those kinds of things, I think, are what comes out of this, and that is good, safe engineering practice. Yeah. We've got those standards for manufacturing; we've got all sorts of things. GAMP, Good Automated Manufacturing Practice, tells you the standards you should always adhere to if you're manufacturing.
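To make that scoring idea concrete, here is a purely hypothetical sketch. The NCSC and MITRE do not publish a formula like this; the factor scales, weighting and examples below are invented for illustration only, just to show the shape of the idea: score a mitigation on cost, required knowledge and impact, then bucket the total.

```c
/* Toy sketch only: not the NCSC/MITRE methodology. It illustrates
 * scoring a mitigation on three factors and bucketing the total. */
#include <stdio.h>

typedef struct {
    const char *mitigation;
    int cost;        /* 1 = cheap, 3 = expensive (assumed scale)      */
    int knowledge;   /* 1 = widely known, 3 = specialist (assumed)    */
    int impact;      /* 1 = low disruption to the product, 3 = high   */
} score_t;

static const char *bucket(const score_t *s) {
    int total = s->cost + s->knowledge + s->impact;   /* invented weighting */
    if (total <= 4) return "easy (unforgivable not to do)";
    if (total <= 6) return "medium";
    return "hard (the 'forgivable' end of the scale)";
}

int main(void) {
    score_t examples[] = {
        { "parameterise SQL queries",        1, 1, 1 },
        { "retrofit a memory-safe language", 3, 3, 3 },
    };
    for (size_t i = 0; i < sizeof examples / sizeof examples[0]; i++)
        printf("%-32s -> %s\n", examples[i].mitigation, bucket(&examples[i]));
    return 0;
}
```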
James:So good software engineering practice is all we're talking about. Yes, we do have some of those, but, for example, there's Microsoft's SSDLC, the Secure Software Development Lifecycle. Now, the SSDLC can be boiled down to saying you should have these checks in place.
Chris:Yes.
James:But with manufacturing, of course, it's more like the full framework, with quality assurance, with change control, with corrective actions. Yes. Now, we do have frameworks like that, but they're much more on the compliance side, whereas developers like to keep that bit out of it, so we end up instead with SSDLCs or SDLCs where it's just the software development lifecycle and there's a note that these checks need to be in place. But there isn't the same management system around it.
Chris:Yes.
James:Because management sits outside of the development team, and they don't like to engage, and vice versa. They're speaking different languages.
Chris:That's like you have development guidelines, but no system in place to ensure that you're actually meeting those guidelines.
James:In good manufacturing practice, if there was something on the production line causing a severe flaw in all of your product, that would go outside of production to be dealt with. It wouldn't be the same team marking its own work. But when we look at software development lifecycles, a lot of the time these things stay inside the team, because it's not as visible.
Chris:Yes, okay. And so software engineers, or should I say programmers, might be very resistant to outside controls because, I was going to say we, they are experts in their own domain and know things. And believe it or not, I have known and worked with software engineers who refused to follow the company standards because they knew better, and that's remarkable self-confidence, shall we say. The way to address those things is to question those practices and engage with changing them. But in these cases, the guidelines are actually quite clear, and in a sense you can derive the guidelines from the examples that they give. So what MITRE seem to have done, correct me if I'm wrong, is to have identified a whole list of common vulnerabilities and then classified them with a binary classifier as either forgivable or unforgivable. The unforgivable ones are the ones you must address, and so there's a list of them. And you can go through those item by item, but really it's the software engineering practices that allow those vulnerabilities through. The vulnerability list is like building yourself a to-do list, and that's fine, but that's very item by item, ticking things off. What you really want is a standard working practice that ensures you don't fall into needing those to-do lists. I mean, it's not a tick-box exercise; it's a way of working that needs to be formalised. I do wonder, to a certain extent, if it comes in through the abuse of things like agile. And I really like agile development, and I understand the compromises it makes compared to waterfall and the benefits it brings us, but it can become an excuse for just not really checking anything. Like, an agile stand-up can be just, yeah, I did this, I did this, oh great, everyone, it's wonderful, you know, get on with it, team. It should be a case of checking things and calling people to account.
James:account. So in short, this is one that I've encountered a lot. In various different companies that agile when you've got the agile standups, it's all about what new features you put in place always, because that's the exciting stuff. That's what people want to do. And you've always got a backlog. So you prioritize what you can get people to do. And at that point you end up with all of these exciting new features, which themselves are introducing new quality and security flaws. And the backlog of security flaws just grows. But on top of that, generally these security flaws, unless you've got the right tools and the right checks in place, they're only picked up at the point of delivery, because they're usually picked up by a separate team. And When it's healthy, that will be the normal test QA team. When it's unhealthy, that will be the security team, and it be at the point before it goes to production, when you will finally get your chance to run the security checks on it. And what that ends up meaning is, oh, we've got a crow friend.
Chris:You like feeding the crows. Yeah, well, they're always handy.
James:There we go. Crows are more intelligent than some coders. Anyway, so you get these massive backlogs building up of just general issues, quality issues, and you get the security issues kept back until, really, it's too late to do anything about them.
Chris:So, we impose unit tests, which in a sense should be putting in your test vectors, looking at all the cases, finding your edge cases, seeing where things could go wrong. Why don't those already solve this issue?
James:Right, now this is my pet peeve, because you know I do a lot of work with threat modelling and I really like it as a tool. Now, what threat modelling should be used for at that early stage in development is to define your unit tests for security. And a lot of the time there's also the issue of only testing for success. When you're testing for success, you are looking at whether the system does what you want it to. Security is all about the system doing stuff you don't want it to, so you've got to test for failure. You've got to test for abuse cases. But we know the issues that arise that cause security flaws. You can run through threat modelling, say what the system is meant to do, and that will allow you to identify what you should be testing for. Yes. And that should be part of your unit tests.
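As a hedged sketch of what testing for failure can look like, here is a tiny, invented example: is_valid_id() and the abuse cases are hypothetical, but each rejected input stands in for a finding you might pull out of a threat model.

```c
/* Hypothetical sketch: abuse-case unit tests derived from threat
 * modelling, checking that bad input is rejected, not just that good
 * input is accepted. Nothing here comes from the NCSC guidance. */
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Invented validator: an "id" must be 1 to 10 digits, nothing else. */
static int is_valid_id(const char *s) {
    size_t len = strlen(s);
    if (len == 0 || len > 10) return 0;
    for (size_t i = 0; i < len; i++)
        if (!isdigit((unsigned char)s[i])) return 0;
    return 1;
}

int main(void) {
    /* Success case: what the system is meant to accept. */
    assert(is_valid_id("42"));

    /* Abuse cases, each traceable to a threat-model finding. */
    assert(!is_valid_id(""));                       /* missing input     */
    assert(!is_valid_id("1 OR 1=1"));               /* injection attempt */
    assert(!is_valid_id("99999999999999999999"));   /* oversized value   */
    assert(!is_valid_id("42; DROP TABLE users"));   /* stacked query     */

    puts("all abuse-case tests passed");
    return 0;
}
```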
Chris:tests. And the test vectors you put in, that's the inputs essentially that you put into the system, should be broad enough to cover all kinds of eventualities. I have an anecdote from my very earliest days in commercial software development, which was developing educational software for very young children, for eBray software, which was sold under the brand of good housekeeping software. It was loaded from cassette tape on things like Commodore 64s and 64s and ZX Spectrums and so on, but we had reports from users that if their children sort of slumped on the keyboard, software broke. And the software engineering team said there was no possible way of testing for that. So I employed my sister's boyfriend, who was quite technically savvy but not a not a computer expert. So, that's what I did. Um, to press every key on the keyboard, and then every combination of two keys on the keyboard, and then every combination of three keys on the keyboard, it was probably the most boring job ever invented. But it was a very good introduction to software testing, and I refuse to believe that you can't test a very wide range of eventualities in terms of your input data. So yes, you're right, if you're thinking success, you're saying, does this system what I want it to do? With the inputs it expects, but the edge cases are where the input is not what you expect, and so you need to do that. There's an example they give, isn't there? In SQL. Yeah, so you cast to an intval to make sure it's an int. I would say that you want to check if it is an intval and maybe throw something that says maybe I should look at this because something's not quite as expected here, but certainly enforcing those standards so that you force the input to be what you expect. can deal with safely. Surely that's I
James:I mean, is that an automatic thing that you do? If you look at manufacturing, because I think manufacturing is much more mature than us in this area. So yes, let's say you are making chairs out of wood. Yeah. You've got a lot of machinery to do it. Now, at the start of your multi-million-pound manufacturing line, your assembly line, are you going to have something that checks just in case there's a nail in there that would cause one of your machines to break down and cost millions of pounds in repairs? Well, yes, because it's very easy to do. But in programming we don't bother to test the inputs a lot of the time, and that's one of the reasons injection is so common. That could be a whole other episode.
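A minimal sketch of the "check it really is an integer" point, assuming C and the standard library's strtol; parse_int_strict() is our own illustrative helper, not something from the guidance.

```c
/* Sketch: parse strictly instead of casting blindly, and flag anything
 * unexpected so it can be logged or rejected. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 and writes *out on success; returns 0 for anything that is
 * not a clean, in-range decimal integer. */
static int parse_int_strict(const char *s, long *out) {
    char *end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0') return 0;   /* empty or trailing junk */
    if (errno == ERANGE) return 0;            /* out of range           */
    *out = v;
    return 1;
}

int main(void) {
    const char *inputs[] = { "123", "12abc", "", "99999999999999999999" };
    for (int i = 0; i < 4; i++) {
        long v;
        if (parse_int_strict(inputs[i], &v))
            printf("ok:     %ld\n", v);
        else
            printf("reject: \"%s\" (not what we expected, worth logging)\n",
                   inputs[i]);
    }
    return 0;
}
```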
Chris:That goes as well to why programming seems not to be subject to the same standard controls that so many other areas are. In health systems, you know, I was involved in medical imaging right from the beginning, with MRI. That was my PhD, and then radio wave and ultrasound imaging to detect breast cancer. In those kinds of systems, failure is unforgivable because it's literally life-threatening. In the case of a prostate probe, for example, the probe is driven into the rectum of the patient in order to produce a spiralling 3D scan. It's a very uncomfortable and messy procedure, and one of the things you really want to avoid is the probe going in too far. And so you automatically think about the damage you could cause. I'm sorry, that's a rather revolting example. No, but I think it's an important one, because you have to think about what's going to happen. So if the system dies while that probe is going in, you need to have a watchdog timer and other mechanisms, probably mechanical mechanisms, to prevent the harm. And it's automatic. I don't think there's anyone designing that kind of system who wouldn't think to do that. You would automatically spend your time considering what might go wrong. When they brought out the Philips Streamium, which was one of the first streaming media players, back in the days when personal video recorders were starting to come in, I was very impressed talking with a bunch of their engineers. Supposedly I was training them, but actually the most interesting conversation was one they had themselves, and that was to do with the over-the-air update. The conversation was about what happens if the update fails. And then it's like, okay, so we save the state of the system beforehand. But what if that state of the system doesn't save properly? There are multiple levels that you go to, thinking about what might go wrong and what you might do about it, and it's part of your development process. You're not trying to build a product that just does whizzy stuff; you're trying to build a product. I used to have a PowerPoint slide that showed a Windows blue screen error, you know, when the system crashed, and when I was talking about consumer electronics I would say, in consumer electronics, this would never be acceptable. You'd never have your TV just crash. Well, how we laughed.
James:And how wrong we were. I mean, there's something I'm going to jump on here, because we've had instances of certain electric car manufacturers issuing software updates which have broken the car. And that's something where you just have to think, the wrong people are in charge of this sort of testing. In what world should a software update be able to brick your car? It's bad enough when it's your phone, particularly nowadays, but your car is really serious. And I think the issue is that the software evangelists, the ones who like to move fast and break things, are finally discovering what happens when you break things that matter to people. And consumers have become so complacent about it. They won't always be so.
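As a hypothetical sketch of the multi-level "what if the update fails?" thinking Chris described for over-the-air updates, here is an A/B-slot style flow; every function below is an invented placeholder, not any vendor's real update API.

```c
/* Hypothetical sketch: keep the known-good image, verify the candidate,
 * and only commit once it has proven itself; otherwise fall back. */
#include <stdbool.h>
#include <stdio.h>

typedef enum { SLOT_A, SLOT_B } slot_t;

static slot_t active = SLOT_A;   /* the known-good firmware slot */

/* Placeholders standing in for real operations. */
static bool download_to(slot_t s)          { (void)s; return true; }
static bool checksum_ok(slot_t s)          { (void)s; return true; }
static bool boots_and_reports_ok(slot_t s) { (void)s; return false; /* simulate a bad update */ }

static void apply_update(void) {
    slot_t candidate = (active == SLOT_A) ? SLOT_B : SLOT_A;

    if (!download_to(candidate) || !checksum_ok(candidate)) {
        printf("update rejected before switching; still on the old image\n");
        return;   /* level 1: never overwrite the good image */
    }
    if (!boots_and_reports_ok(candidate)) {
        printf("new image failed its self-test; rolling back\n");
        return;   /* level 2: watchdog/self-test fallback */
    }
    active = candidate;   /* only now commit the switch */
    printf("update committed to the new slot\n");
}

int main(void) {
    apply_update();
    return 0;
}
```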
Chris:So, one of the things they said in the MITRE work was to do with the number of vulnerabilities introduced per thousand lines of code. You've got this measure, the KLOC, a thousand lines of code, and the metric is vulnerabilities per KLOC. And they say that it's remained static at one vulnerability per thousand lines of code. Presumably that's an average. But it's remained static from 2014 to 2019, so that's only five years.
James:Yes. Before that, it was originally measured at 0.5 and then at 0.75. Now, we don't really have much information about how that was measured, so it's not entirely reliable, but it's notable that it has increased, and that's a doubling. If you take those numbers as accurate, it means the rate has doubled over that time, and it suggests it's going to continue to rise. And on top of that, we have the estimate that lines of code are doubling on a regular basis.
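A quick back-of-the-envelope illustration of how those two doublings compound; the codebase size is invented, and the densities are simply the figures quoted above.

```c
/* Illustrative arithmetic only: expected vulnerability counts at the
 * quoted densities, for a hypothetical 2,000 KLOC (2M line) codebase. */
#include <stdio.h>

int main(void) {
    double kloc = 2000.0;                       /* hypothetical codebase size */
    double densities[] = { 0.5, 0.75, 1.0 };    /* vulns per KLOC, as quoted  */

    for (int i = 0; i < 3; i++)
        printf("at %.2f vulns/KLOC: %.0f expected vulnerabilities\n",
               densities[i], kloc * densities[i]);

    /* If the codebase doubles while density has also doubled, the total
     * expected count quadruples relative to the starting point. */
    printf("doubled codebase at 1.0 vulns/KLOC: %.0f\n", 2.0 * kloc * 1.0);
    return 0;
}
```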
Chris:So one method of reducing your vulnerabilities would be to reduce your number of lines of code, which is actually good software engineering practice anyway, because going through your code and refactoring it, looking at what you're doing and so on, is an important step that's often neglected.
James:Well, it's optimisation. I mean, we come across that when we're looking at governance systems, ISO 27001, where people have piled in policies and controls and bits and pieces, and no one's ever sat down and said, which bits don't we need any more? No one's ever trimmed it. Everyone loves to add new stuff, but looking at it and saying, what can I throw out, is possibly even more important.
Chris:Well, I was wrong when I said that about health systems not being subject to that, because I worked with a major manufacturer of ultrasound scanners, and we were trying to reverse engineer their system so that we could adapt it. During the discussions it became apparent that there were a lot of fixes. In fact, they'd moved from an analog system to a digital system, and there were filters involved. Analog filters work differently from digital filters; a digital filter approximates what an analog filter does. But as you drilled back through it, what you found was that they'd replaced the analog filter with a digital one, and it wasn't quite as good. So what they did was add another digital filter to correct it, and that wasn't quite good enough, so they added another filter. In other words, instead of addressing the root cause of the issue, which was to design the filters properly or to revert to some form of analog processing, because that is desirable in some circumstances, they just kept on adding things. And so it was very difficult. And the interesting thing was, there was no audit trail to say what had been done. The only way to get there was to go back and reverse engineer the system step by step and then question people, and someone would say, oh yes, we did that because... or, I think so-and-so did that because... Now, that's wrong, and this goes back to good development practice. If there isn't an audit trail, you have no idea what's happening.
James:We should probably end there, because we're nearly at the end of the walk and it's time for a coffee, but that is a brilliant analogy for what we see in cyber security generally. People put in these systems which have weaknesses, they layer on technology to fix them, and then they find there are weaknesses in those. We've seen that with firewalls: you now have defense in depth with multiple providers because they keep on building firewalls with vulnerabilities. Security tools are built which compromise everything. We've seen it with CrowdStrike, where they broke a large part of the world. We've seen it with SonicWall firewalls. We've seen it with just about every firewall manufacturer. We've seen it with various management software. So constantly it's putting these things in place and then saying, oh well, you're irresponsible if you don't cover up our mistake with this new, expensive solution. Yes, okay, so don't introduce new stuff all the time. Yeah. Think about what's already there and whether you fix it or whether you can just get rid of it to solve the problem.
Chris:So we're just coming out of the woods now. We went through a clearing and we were in the clear for a bit, then we lost our way, but we're finding our way out of the woods. And I think, in summary, what I think is that the recommendation, it's a National Cyber Security Centre recommendation, isn't it, and the work was based on the MITRE vulnerabilities database, is really good in its idea of the unforgivable. There are some things that are unforgivable in cyber. Whether other things are forgivable or not is a matter for debate, but there are basic security 101 practices that you must adhere to, and I think that's a message I can fully endorse. Thank you for listening. Remember, you can subscribe at podcast.bores.com, and watch the video version on YouTube with the handle Boring On Podcast.