
Compiling my own kernel to prevent security vulnerabilities.

Published

Tags: security vps server tech linux

This week has honestly been a doozy. I'm not just talking about the... "socio-economic climate"1 we find ourselves in; this week we've also seen two local privilege escalation bugs, DirtyFrag and CopyFail.

Whenever you hear of a security vulnerability, you've got to first assess whether you're impacted at all. That ought to decide how quickly you patch and mitigate it.

You can't treat every bug as a disaster, because some bugs require specific conditions. These two security vulnerabilities are truly awful, but thankfully not for me. If I were administering a large shared box, like a tilde instance where I couldn't trust my users, then I'd be panicking!

I might not be so lucky with the next vulnerability. Imagine a bug gets into a program I host and somehow leads to a user getting a shell: from there it wouldn't be hard to chain multiple security vulnerabilities together to completely pwn my box. So what steps can I take to prevent these security bugs from impacting me?

Ancient Linux kernel feuds and design decisions.

To give you slightly more context, I'll explain how the Linux kernel, and more importantly its modules, work. The Linux kernel is one of the most widely used kernels out there, especially on servers, because you're not gonna want to pay Microsoft to license a crappy server OS.2 Linux was originally a monolithic kernel, which means that all of its drivers and functionality are literally baked into the kernel. This makes it easier to develop a kernel, but it's terrible for security and reliability because a single bug in a single subsystem (like networking for instance) can lead to the entire kernel going up in flames.3

A much better approach would be something like a microkernel, where each individual system feature is its own program. If that feature or driver crashes, it won't bring down the entire system or poison and infect other parts of the kernel, which is much better for both reliability and security! By the way, it's not like we didn't have microkernels at the time Linux was being developed. Minix used to be one of the most well-known microkernels; it's largely dormant now because hardly anyone runs it as their OS. Linux won out in the end, but the creators of both these kernels famously feuded over which approach is better.

Ok, ok, that's all in the past. Linux won against Minix and now we live in a world where every driver must be baked in and we can't EVER do anything cool like inserting new drivers or removing them while the kernel is running... Right? Wrong! I only said Linux was originally a monolithic kernel. It has changed, because having everything baked in is honestly terrible. I mean, imagine having to restart your computer after every update, imagine not being able to use a device driver right away!4

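In actuality, Linux is a modular monolithic kernel (some people loosely call it a hybrid). That means it doesn't have the security and reliability improvements of microkernels, because modules still run inside the kernel with full privileges, but it does let us load in drivers, features and literally any module.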

Ok cool, Linux is slightly modern, how do modules work then?

Modules can do literally anything: they're used to provide drivers, networking features, interfaces to programs, security enhancements. Literally anything! That means they are extremely powerful, and since Linux is an open system, anyone (including you, dear reader) can start writing Linux kernel modules to do anything! Unlike with Windows, where you need to pay heavy driver signing fees to Microsoft to get other Windows computers to accept your kernel drivers, you can write your own Linux driver, distribute it to everyone, and not have to give a single penny to Linus Torvalds or the Linux Foundation.

Modules can be loaded and unloaded at run-time, and here's how that works. For these demonstrations, I will use miku_module as a placeholder module name; you can just imagine it as providing something we really need (like a built-in Pocket Miku accessible via /dev/dsp, for instance).

Loading a kernel module directly is done with the insmod (insert module) command.

insmod /path/to/miku_module.ko

Notice the .ko file extension? You can't load any old program as a kernel module. Kernel modules are special programs with their own file extension and build process; ko naturally means "kernel object", a program built for the Linux kernel to load in.

Unloading kernel modules is done with rmmod (remove module), which takes the name of a loaded module rather than a file.

rmmod miku_module

Yes, it's kinda bothersome that you have to provide the full path to the kernel module when inserting it with insmod, and neither command knows anything about dependencies between modules. This is why hardly anyone uses them directly; instead we use a smarter command named modprobe, which looks for kernel modules inside the /lib/modules directory (and loads whatever they depend on) so we can save keystrokes!

Here's inserting and removing kernel modules with modprobe

# Loading kernel modules (modprobe takes the module name, without .ko)
modprobe miku_module

# Unloading kernel modules
modprobe -r miku_module
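modprobe looks modules up by name, not by file name. As a rough sketch of how the two relate (the helper name module_name_from_path is made up, and the path is a placeholder): the name is the file's base name with the .ko suffix and any compression suffix removed, and modprobe treats - and _ in module names as the same character.

```shell
# Derive the module name modprobe would use from a .ko file path.
# Hypothetical helper: strips the directory, the .ko suffix (plus a
# compression suffix such as .xz, .gz or .zst), then maps '-' to '_'.
module_name_from_path() {
    basename "$1" | sed -e 's/\.ko\(\.[a-z0-9]*\)\{0,1\}$//' | tr '-' '_'
}

# Placeholder path, not a real module.
module_name_from_path /path/to/miku-module.ko.xz
```

On a real system the files live somewhere under /lib/modules/$(uname -r), so you rarely need to think about this mapping yourself.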

If you'd like to learn more about how to use modprobe, then kindly RTFM5 for modprobe with man modprobe (or just google your question, there's no shame in it)

You can see a list of currently loaded kernel modules by running lsmod (list modules). Keep in mind that modern computers (even servers and phones) usually need a couple hundred kernel modules. It won't surprise you to learn that modern computers are featureful AF.6

user@pony.biz:~$ lsmod
Module                  Size  Used by
xt_recent              24576  0
nft_reject_inet        12288  1
nft_reject             12288  1 nft_reject_inet
xt_MASQUERADE          16384  1
bridge                389120  0
stp                    12288  1 bridge
llc                    16384  2 bridge,stp
xfrm_user              69632  1
xfrm_algo              16384  1 xfrm_user
[... 113 lines trimmed for your sake]

user@pony.biz:~$ lsmod | wc -l
123

Yes, my server uses 122 kernel modules7, and my desktop system uses about 181! That's a lot. Do we even need this many?8 Or is it just increasing the attack surface of the kernel?9
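The subtract-one-for-the-header step from footnote 7 can be left to the shell. Here is a minimal sketch; the function name count_modules is mine, and the sample input is just a few lines in the same shape as the listing above.

```shell
# Count modules in lsmod-style output read from stdin,
# skipping the "Module  Size  Used by" header line.
count_modules() {
    tail -n +2 | wc -l | tr -d ' '
}

# Demo on a small sample, so this runs even where lsmod is unavailable.
sample='Module                  Size  Used by
xt_recent              24576  0
nft_reject_inet        12288  1
nft_reject             12288  1 nft_reject_inet'

printf '%s\n' "$sample" | count_modules
```

On a real machine you would pipe the real thing in instead, as in lsmod | count_modules.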

Anyway, how in the world does the system know which modules to load? And when does it actually do that?

Loading kernel modules at startup is such a simple task that it can basically be boiled down to reading a list of kernel modules and inserting them one at a time. (Plenty of other modules are loaded on demand later, when a device shows up or a program asks for a feature.) Naturally, like with anything in Linux, there are several competing standards and tools, and most systems will be using a combination of the two following methods.

Firstly, you probably have a folder located at /etc/modules-load.d. This folder is dead simple: it holds plain-text files that are honestly just lists of which kernel modules to load. It's typically read by your init system (most commonly systemd), which basically just runs through the list and inserts each module the way modprobe would, nothing too complex.
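To make the "it's just a list" point concrete, here is a small sketch that reads a file in that format, assuming the layout systemd documents for modules-load.d: one module name per line, with blank lines and lines starting with # or ; ignored. The function name and file contents are made up for illustration.

```shell
# Print the module names listed in a modules-load.d style file:
# one name per line, skipping blank lines and comments (# or ;).
list_modules_to_load() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" | grep -v -e '^$' -e '^[#;]'
}

# Demo with a throwaway file; miku_module is a placeholder module name.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Load the Pocket Miku driver at boot
miku_module

; another comment style
bridge
EOF

list_modules_to_load "$conf"
rm -f "$conf"
```

A real file would live at something like /etc/modules-load.d/miku.conf, and the init system would do the loading for you.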

What if you want to block some kernel modules from ever being loaded? Or what if you want to specify options for a kernel module? The previous system is far too simplistic for that, so your machine will usually include an entirely different mechanism located at /etc/modprobe.d, which lets you do much more advanced things like blocking modules, overriding how they get installed, and so on.
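As a rough illustration of what such a file can contain, here are the three directives I'd expect to matter most: blacklist, install and options. Everything here is a sketch: miku_module is a placeholder, the volume parameter is invented, and in practice you would pick one blocking approach per module rather than stacking them.

```shell
# Print an example modprobe.d-style configuration to stdout.
# A real file would live in /etc/modprobe.d/ and end in .conf.
example_modprobe_conf() {
    cat <<'EOF'
# Stop automatic (alias-based) loading; an explicit "modprobe" still works.
blacklist miku_module
# Run a command instead of loading the module, so every load attempt fails.
install miku_module /bin/false
# Pass a parameter whenever the module is loaded ("volume" is made up).
options miku_module volume=11
EOF
}

# Show only the active directives, without the comments.
example_modprobe_conf | grep -v '^#'
```

The difference between the first two is worth remembering: blacklist only stops automatic loading, while the install trick makes modprobe fail for that module no matter who asks.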

Now with the context we've got, we might be able to learn more about how to prevent security bugs like CopyFail and DirtyFrag.

What do both CopyFail and DirtyFrag have in common?

Note: I'm not a cybersecurity specialist, I am not even gonna try to explain how these exploits work in any detail because I will get a detail wrong lol.

I'd like to know specifically which elements these two bugs are actually exploiting to gain root access (the root cause, basically), so that I can take a cleaver and smash the entrance so it'll no longer affect me ever again.

In the case of CopyFail, the algif_aead module is largely to blame. Its normal function is to provide access to "AEAD algorithms" to programs running on the computer. In theory, any program running on an affected machine can use the interface provided by this module, so there's no simple way to limit a program's access to it, except by completely disabling it.

What even is "AEAD"? Wikipedia tells me AEAD means "Authenticated Encryption with Associated Data". Finding another resource, we can learn that "the AEAD cipher API is used with the ciphers of type CRYPTO_ALG_TYPE_AEAD (listed as type “aead” in /proc/crypto)". If we read a bit more through the cryptospeak, we can make out "IPSec".

Here's a question, are you still confused? Because I still have no idea what this is meant to do, I know this is tangentially related to IPSec, I have a faint idea of what IPSec is, and I know I definitely don't use anything like that. I am also guessing you probably don't use it! Because if you did, then you would've known what it was, and you would've known it was related to IPSec. I am willing to bet the majority of Linux servers out there affected by this bug don't actually even use IPSec, it's just that for some reason algif_aead is available and can be loaded in by the kernel automatically when a program requests it...

The fact that it's available by default is what makes it so terrible, it's the reason why CopyFail was able to reliably infiltrate so many different Linux systems in the first place. If it were just a random kernel module you would need to authorize, then this attack wouldn't be that big of a deal because it wouldn't impact anyone at all. But no, the ease of access to algif_aead contributed to this disaster.
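If you want to know whether a module like this is still up for grabs on your own box, one rough check is to look for a blocking line in your modprobe configuration. This is only a sketch: is_blocked is a name I made up, it only recognises the common "install <module> /bin/false" (or /bin/true) pattern, and the sample input stands in for the real files under /etc/modprobe.d.

```shell
# Succeed if the modprobe.d-style config on stdin redirects loading of
# module $1 to /bin/false or /bin/true through an "install" line.
is_blocked() {
    grep -Eq "^[[:space:]]*install[[:space:]]+$1[[:space:]]+/bin/(false|true)([[:space:]]|\$)"
}

# Demo on sample input instead of the real configuration files.
if printf 'install algif_aead /bin/false\n' | is_blocked algif_aead; then
    echo "algif_aead: blocked"
else
    echo "algif_aead: loadable"
fi
```

On a real system you could feed it cat /etc/modprobe.d/*.conf, or ask modprobe directly with its dry-run and verbose flags (modprobe -n -v algif_aead), which shows what a load would do without doing it.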

Ok, ok, that's CopyFail. What about DirtyFrag? Well, the original announcement for this bug included the following mitigation instructions:

# Disable esp4, esp6, rxrpc from ever loading.
echo "install esp4 /bin/false" | sudo tee /etc/modprobe.d/dirtyfrag.conf
echo "install esp6 /bin/false" | sudo tee -a /etc/modprobe.d/dirtyfrag.conf
echo "install rxrpc /bin/false" | sudo tee -a /etc/modprobe.d/dirtyfrag.conf

# Unload esp4, esp6 and rxrpc right now.
sudo rmmod esp4 esp6 rxrpc 2>/dev/null

I've re-written it to be slightly less ugly and more readable; I have no idea why the original mitigation script was so ugly in the first place. Anyway, we now know the modules that this vulnerability is acting on are esp4, esp6 and rxrpc. What's so special about these modules? What do they do?
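Before looking them up, a quick aside: since the rmmod line throws its errors away, it's worth checking that the modules are really gone afterwards. A minimal sketch, assuming the /proc/modules format where each line starts with the module name followed by a space (loaded_among is a made-up helper, and the sample file stands in for the real /proc/modules):

```shell
# Report which of the given modules appear in /proc/modules-style input.
# $1 is the file to read (normally /proc/modules); the rest are module names.
loaded_among() {
    file=$1
    shift
    for mod in "$@"; do
        if grep -q "^$mod " "$file"; then
            echo "$mod: still loaded"
        else
            echo "$mod: not loaded"
        fi
    done
}

# Demo with a fake /proc/modules so the example runs anywhere.
fake=$(mktemp)
cat > "$fake" <<'EOF'
esp4 28672 0 - Live 0x0000000000000000
bridge 389120 0 - Live 0x0000000000000000
EOF

loaded_among "$fake" esp4 esp6 rxrpc
rm -f "$fake"
```

On the actual server that would be loaded_among /proc/modules esp4 esp6 rxrpc.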

Rather than search "esp4" on Google or DuckDuckGo, which just yields endless results about the vulnerability itself and nothing about the kernel module, I decided to search "esp4" on a website specifically dedicated to kernel configuration. There I found out that ESP4 is a module dedicated to adding support for Encapsulating Security Payload in IPSec on IPv4, and ESP6 does the same but for IPv6. Finally, RXRPC adds support for the RxRPC network protocol, which I honestly couldn't understand the purpose of.

It's the same pattern as with CopyFail: both bugs go through kernel modules I've never heard of, let alone used. Honestly, the real problem is not that these modules had security flaws, because realistically all kernel modules are bound to have some. The problem is that these kernel modules are available by default for programs to use, even if there is no reason for them to be available in the first place. Why is that the case?

The dilemma of distro makers

It sounds stupid to say this out loud, but there is a lot of hardware out there, and there are so many features that the kernel has to support. I don't use IPSec, but someone out there does, and so the kernel has to add support for it. But why does this have to affect me when I don't use IPSec? Because the distribution I'm using decided to support IPSec everywhere. Not that there's anything wrong with that: they have to make these features available on all computers, otherwise they risk not shipping a feature that users expect, or missing a device driver that some users really need.

The problem isn't with the Linux kernel itself, but with the dilemma of Linux distributions: unless they are made for niche hardware, they will try to include as many kernel modules as possible in installed systems, even when those modules will never be used. This is why esp4, esp6, rxrpc and algif_aead were available by default on my system. Not because I was using them, but because someone else running the same distribution as me might use them, and so they are automatically available on every installation.

Distributions are heavily incentivized towards including all device drivers. Obviously they don't compile them all into the kernel, but they compile them as kernel modules and bundle in as many as possible, because you really don't want users to be inconvenienced or to have to do anything complex.

The right thing to do would be to split kernel modules up in such a way that they'll only ever be loaded when the user actually requests them. How do we decide that?

What's the ideal solution to this problem?

In an ideal world, distributions would put their heads together and come up with a universal solution that suits them. Users ought to be able to easily enable the features they want, with the exact kernel modules that are needed, and anyone else who doesn't use these features won't be bothered by security vulnerabilities affecting them.

How do we accomplish that? We could invent a hypothetical "package manager" for kernel features, something like apt but for kernel modules. So we could have an ipsec package that automatically installs all the relevant kernel modules (algif_aead, esp4, esp6 and so on). At a technical level, we could install all the available kernel modules in a folder, and then simply only ever enable the ones we need.

This means users could get the benefit of always having all the possible drivers and features available, but without the security downsides of having those things available all the time. They would just need to run a single command in order to load the module (or make it loadable).
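To be clear, no such tool exists as far as I know, but the idea is simple enough to sketch. Everything below is hypothetical: the feature name, the function names, and the mapping itself (which just reuses the ipsec example from above). It only prints the commands it would run instead of touching the kernel.

```shell
# Hypothetical mapping from a user-facing feature to kernel modules.
feature_modules() {
    case "$1" in
        ipsec) echo "algif_aead esp4 esp6" ;;
        *)     return 1 ;;
    esac
}

# Print (rather than run) the commands that enabling a feature would need.
enable_feature() {
    mods=$(feature_modules "$1") || {
        echo "unknown feature: $1" >&2
        return 1
    }
    for mod in $mods; do
        echo "modprobe $mod"
    done
}

enable_feature ipsec
```

A real version would also need the reverse operation, and some way of keeping everything that isn't enabled blocked by default.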

What's the best solution right now?

Well, we don't live in an ideal world, we live in a world that is figuratively and literally on fire. So how do I just get this over with as quickly as possible? The answer is to just block all the modules you don't use, that's it. That's the answer that has been staring us in the face all this time, and you could do it yourself if you wanted to.

But let's be honest, this blog post would be so boring if I told you the best solution, so how about I look for the most overkill solution, and deploy it? :)

What's the most overkill solution right now?

You could block all the modules you don't use with the /etc/modprobe.d folder, that would work and that would be a sensible solution. The painful part is figuring out which modules you use, and blocking the rest, but again, I am not into boring solutions.
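For completeness, the boring route can be sketched in a few lines. This is an illustration under assumptions, not a tested hardening script: block_unused is a made-up helper that takes two plain lists of module names (available and currently loaded) and prints a blocking line for everything in the first list that isn't in the second. A module that isn't loaded right now may still be needed later (a USB driver, a filesystem), so the output needs a human review before it goes anywhere near /etc/modprobe.d.

```shell
# Given a file of available module names and a file of loaded module names
# (one per line), print an "install ... /bin/false" line for every module
# that is available but not loaded. '-' and '_' are treated as the same.
block_unused() {
    avail=$(mktemp)
    used=$(mktemp)
    tr '-' '_' < "$1" | sort -u > "$avail"
    tr '-' '_' < "$2" | sort -u > "$used"
    # comm -23 keeps the lines that appear only in the first (sorted) file.
    comm -23 "$avail" "$used" | sed 's|.*|install & /bin/false|'
    rm -f "$avail" "$used"
}

# Demo with tiny sample lists instead of a real system.
all=$(mktemp)
loaded=$(mktemp)
printf 'bridge\nesp4\nesp6\nrxrpc\n' > "$all"
printf 'bridge\n' > "$loaded"

block_unused "$all" "$loaded"
rm -f "$all" "$loaded"
```

On a real machine the first list would come from the file names under /lib/modules/$(uname -r), and the second from the first column of lsmod.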

Gentoo is a Linux distribution known for taking customizability to the next level by having you compile every application yourself. Yes, I really do mean every single application, everything from the kernel to the browser that you use.

In theory, this means limitless customization: you can really build your Linux machine exactly how you want. The downside is obviously having to set aside an entire day a week just to recompile your entire system for updates. There's a reason why Gentoo fans tend to be the same types of people who have a lot of time to spare; every other Linux user has no time to perfect their build, they've gotta actually do work.

Why focus on Gentoo? Well, compiling and customizing your own kernel is such a big part of Gentoo that they have some of the best documentation on doing this. I can't do exactly as they do, because I use Debian and I have no interest in switching to Gentoo, so I've got to violate one of the core tenets of Debian stability: I've got to make a FrankenDebian, here is what I did.

Note: I am not actually going to ever run this, for reasons that will become obvious. Read the very last section if you want to know why.

The gates of hell open for me.

Here is a bit of an open secret in most Linux distributions: you're not getting the raw source code of what you install. Unless you use Arch Linux or Gentoo, where there is a heavy focus on the latest and greatest, you're likely downloading programs modified by your distribution of choice. Most Linux distributions fall on a spectrum based on how much they change the programs they ship. On one side there's Arch Linux and Gentoo, which try to be as hands-off as possible, and on the other side there's Debian and RHEL, where you rarely get the raw source code as-is and the program you're installing will usually have been modified to work specifically with that system.

This is, in general, a good thing because it means the programs that you install from the distribution itself have the highest chance of working, but it does mean that there has to be a huge process in the background specifically to patch things and keep the entire system working with few outages. Debian does a really good job at being stable and reliable specifically by patching programs to play nicely with the distribution, and this includes the Linux kernel.

What this means is that we can't just download the source code for Linux and start compiling it; we have to download Debian's version and compile that instead. Thankfully this is easy, since apt provides a bunch of "source" commands for fetching build dependencies and the exact source code used to build packages. Here's what I ran to get the build dependencies for the Linux kernel (it's a lot), and also Debian's Linux source tree.

sudo apt build-dep linux
apt source linux

Another thing, by default the resulting kernel is a debug build and also only compiled using one thread. Having the kernel be a debug build means it occupies way more space, and only using one thread for compilation is just begging the compilation process to be slow. We fix this by running the following:

export MAKEFLAGS=-j$(nproc)
export DEB_BUILD_PROFILES='pkg.linux.nokerneldbg pkg.linux.nokerneldbginfo'

The hard part now is actually configuring the kernel. There are tools that can help with this, such as modprobed-db, which keeps a record of the kernel modules that have actually been loaded on your machine over time. It helps you compile only the stuff you actually need instead of compiling all modules, like every distribution out there does.

If you have a modprobed-db database at $HOME/.config/modprobed.db, then you can run the following command to make a kernel configuration that uses only those modules.

make LSMOD=$HOME/.config/modprobed.db localmodconfig

Make sure to double check the configuration you end up with, by running the interactive configuration menu.

make nconfig
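Another quick sanity check, besides scrolling through menus, is to count how many options ended up built in (=y) versus built as modules (=m) in the resulting .config; after localmodconfig the module count should be a small fraction of what a distribution config has. A minimal sketch, where config_summary is a made-up helper and the sample file stands in for the real .config in the kernel source directory:

```shell
# Summarise a kernel .config: options built in (=y) versus modules (=m).
config_summary() {
    builtin_count=$(grep -c '^CONFIG_[A-Za-z0-9_]*=y$' "$1")
    module_count=$(grep -c '^CONFIG_[A-Za-z0-9_]*=m$' "$1")
    echo "built-in: $builtin_count, modules: $module_count"
}

# Demo on a tiny sample config so the example runs anywhere.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
CONFIG_MODULES=y
CONFIG_BRIDGE=m
# CONFIG_INET_ESP is not set
# CONFIG_AF_RXRPC is not set
EOF

config_summary "$cfg"
rm -f "$cfg"
```

The "is not set" lines are the interesting ones here: CONFIG_INET_ESP and CONFIG_AF_RXRPC are the options behind esp4 and rxrpc, so seeing them disabled means those modules won't even be built.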

At last, when you've made sure everything is fine, you can start building your kernel.10

make -j$(nproc)

As for actually installing the kernel, this is so wildly different between systems that I have no idea how to explain it to you. You'll just need to find a way that works for you, and make sure you have backups, because it's very likely that your first custom kernel will just fail.

Bait and switch, I won't be using this kernel.

Ok, ok, I built and configured a kernel for your entertainment, does this mean I will actually use it and deploy it? No! Deploying a custom kernel is an overkill solution, and I only have one server, I do not want to risk getting kernel panics and then being locked out of my own server. There is so much at stake here that I am not going to risk it all just for a single blog post.

Besides, deploying my own kernel means I am the maintainer, and I must update the kernel myself and also make sure to update the configuration. This task is monumental for any system administrator, and it is much easier to just block the modules you will never use instead of deploying your own minimal, hardened kernel. For me, the choice is clear, it was fun to explore configuring my own kernel but there is no way I am ever going to actually use this in a live system.

What's the moral of the story this time around?

I guess I want this blog post to just serve as a message to all the Linux distributions out there: please put your heads together and solve this problem. It's frankly unreal that a user can be affected by kernel modules that they never had any intention of loading. Cybersecurity experts preach preventing new security flaws from occurring over just reducing them, and an easy way to block this entire category of security flaws would be to limit the modules that can be loaded until they're confirmed by the user.

There will obviously be a cost in terms of workflow and convenience, but it's a small cost compared to the many security flaws that occur through unused, but still loadable modules. Do the right thing, and prioritize our security by making this a reality instead of just a pipe dream.

That's it, ses..


  1. I love using the term "socio-economic climate" to refer to our world burning up in flames. ↩︎

  2. I don't like insulting people but which asshole at Microsoft decided to make this an actual product? It's simultaneously so terrible and so genius, it's the main vendor lock-in for governments and institutions, because they can't migrate off of Active Directory, which is an actually useful feature that is heavily baked into Windows Server. ↩︎

  3. Ok, not literally. But apparently your printer can catch fire, so there is that. ↩︎

  4. Intensely eyeing Windows. ↩︎

  5. By the way, whenever I say RTFM, I don't actually mean it. I'd never let my snobbiness snoot down to that degree, and I'd never gatekeep knowledge behind difficult-to-use commands. Plus, manpages are generally just awful and difficult to read, searching never works for me either. I can't understand why people praise manpages when they suck so bad. ↩︎

  6. Stands for "as fudge" by the way, I would never curse in my blog. Except for that one time I said "bitch" (*audible gasp*) at the start of my oldest blog post. ↩︎

  7. lsmod prints a column header first, so we gotta subtract 1 from the line count to actually get the number of kernel modules. ↩︎

  8. Subtle foreshadowing. ↩︎

  9. Not-so-subtle foreshadowing. ↩︎

  10. Yes I know we did enable parallel compilation earlier, but why not do it here too? ↩︎