The Pragmatic Engineer
How Linux is built with Greg Kroah-Hartman

Greg Kroah-Hartman, a longtime Linux kernel maintainer, breaks down the inner workings of Linux development, from its unique trust model to the benefits of open-source contribution.

Stream the Latest Episode

Listen and watch now on YouTube, Spotify and Apple. See the episode transcript at the top of this page, and a summary at the bottom.

Brought to You By

WorkOS — The modern identity platform for B2B SaaS.

Vanta — Automate compliance and simplify security with Vanta.

In This Episode

Linux is the most widespread operating system in the world, but how is it built? Few people are better placed to answer this than Greg Kroah-Hartman: a Linux kernel maintainer for 25 years, and one of the three Linux Foundation Fellows (the other two are Linus Torvalds and Shuah Khan). Greg manages the Linux kernel’s stable releases, and is a maintainer of multiple kernel subsystems. He is also the author of the books Linux Kernel in a Nutshell and Linux Device Drivers.

We cover the inner workings of Linux kernel development, exploring everything from how changes get implemented to why its community-driven approach produces such reliable software. Greg shares insights about the kernel's unique trust model and makes a case for why engineers should contribute to open-source projects. We go into:

  • How widespread is Linux?

  • What is the Linux kernel responsible for – and why is it a monolith?

  • How does a kernel change get merged? A walkthrough

  • The 9-week development cycle for the Linux kernel

  • Testing the Linux kernel

  • Why is Linux so widespread?

  • The career benefits of open-source contribution

  • And much more!

Takeaways

1. Linux is the most widespread operating system globally. Linux runs on 4 billion Android devices, a number next to which everything else is “a rounding error”. Even so, Linux is also the most popular operating system for servers and embedded devices. It’s used on many smart TVs, in air traffic control systems, and even on the International Space Station. Fun fact: Linux even runs inside many iPhones, as it is the firmware used for the Qualcomm 5G modems inside these devices!

2. Getting a change merged into the Linux kernel is surprisingly straightforward. Create the change (called a patch), test it locally, and send it to the right maintainer for review. The patch needs to go through a hierarchical tree of maintainers accepting it before it can make it into the kernel. We go through a specific change being merged up this tree.

3. Linux won because devs being “selfish” works! Developers contribute to Linux in a "selfish" way, to solve their own problems. It turns out that many devs have the same problems, so every contribution makes Linux a better fit for other devs to use. Kernel maintainers only accept contributions that make sense for the whole project. For example, embedded device vendors helped make the Linux kernel more efficient, and this efficiency later helped Linux become the best choice of kernel for a mobile OS like Android.

4. The Linux kernel is run in a unique way because the project itself is unique. The Linux kernel has 4,000 contributors per year and releases strictly every 9 weeks, yet it has practically no meetings and no project managers, and it runs on email and git. This setup works because project management happens outside of the Linux kernel: contributors bring completed work. The kernel team also invests heavily in automation, for example for triaging. And it turns out that email scales really well, at least for this group! (Note that other projects built on top of the Linux kernel, such as Linux distributions like Red Hat or Debian, all work differently. What works for the Linux kernel thanks to its unique circumstances won’t work for those projects.)

5. Git was created as a solution for the Linux kernel’s source control needs. We talked about this with Greg outside the podcast: it’s a fascinating story of how git was built and open-sourced after the Linux kernel group was unhappy with existing source control solutions.

Amusingly, git has become the de facto source control tool across tech thanks to products like GitHub and GitLab, yet the Linux kernel does not use GitHub. Don’t forget, they already solved their source control workflow problems by writing git!
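
The review path from takeaway 2 can be pictured as a walk up a tree of maintainers. The sketch below is only an illustration of that idea: the `Maintainer` class, the `review_path` helper, and the three-level tree are hypothetical, and real subsystem trees differ in names and depth.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Maintainer:
    """One node in a hypothetical maintainer tree."""
    name: str
    parent: Optional["Maintainer"] = None  # None marks the top of the tree


def review_path(first_reviewer: Maintainer) -> List[str]:
    """Return the names a patch passes through, from first reviewer to the top."""
    path = []
    node: Optional[Maintainer] = first_reviewer
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path


# Illustrative tree only; these roles are placeholders, not real assignments.
top = Maintainer("Linus Torvalds")
subsystem = Maintainer("subsystem maintainer", parent=top)
driver = Maintainer("driver maintainer", parent=subsystem)

print(" -> ".join(review_path(driver)))
```

Each maintainer on the path has to accept the change before it moves one level up, which is why trust between adjacent levels matters so much.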

Timestamps

(00:00) Intro

(02:23) How widespread is Linux?

(06:00) The difference in complexity in different devices powered by Linux

(09:20) What is the Linux kernel?

(14:00) Why trust is so important with the Linux kernel development

(16:02) A walk-through of a kernel change

(23:20) How Linux kernel development cycles work

(29:55) The testing process at Kernel and Kernel CI

(31:55) A case for the open source development process

(35:44) Linux kernel branches: Stable vs. development

(38:32) Challenges of maintaining older Linux code

(40:30) How Linux handles bug fixes

(44:40) The range of work Linux kernel engineers do

(48:33) Greg’s review process and its parallels with Uber’s RFC process

(51:48) Linux kernel within companies like IBM

(53:52) Why Linux is so widespread

(56:50) How Linux Kernel Institute runs without product managers

(1:02:01) The pros and cons of using Rust in Linux kernel

(1:09:55) How LLMs are utilized in bug fixes and coding in Linux

(1:12:13) The value of contributing to the Linux kernel or any open-source project

(1:16:40) Rapid fire round

A summary of the conversation

The Linux kernel

  • The Linux kernel is around 40 million lines of code. The core kernel, the part every Linux platform runs, is about 5% of this. The remaining code supports diverse hardware, drivers, devices, architectures, and chips.

  • A typical laptop runs approximately 2 to 2.5 million lines of kernel code, a server around 1.5 million, and a mobile device around 4 million.

  • The role of the kernel: abstract away underlying hardware and present a consistent interface to user space programs. This allows the applications to run on different hardware without modification.

  • A monolithic kernel

    • Drivers in Linux are part of the kernel

    • This is a monolithic architecture: all code, including drivers, operates in the same address space

    • The monolithic approach allows for more refactoring options and more code-sharing opportunities between drivers. This results in Linux drivers being, on average, one-third smaller than drivers in other operating systems because common functionalities can be identified and consolidated.

  • Do not break userspace. The core principle of Linux kernel development is to never intentionally break user space. This guarantee ensures that users can upgrade their kernel without fear of their existing applications crashing. Accidental breakages are treated as faults and are promptly addressed.

Linux kernel development process

  • Fixed 9-week cadence.

    • Following a release by Linus Torvalds, a two-week merge window opens.

    • During this merge window, maintainers submit all the new features that have been pending and proven to work in their respective development trees to Linus.

    • rc1: after the two-week merge window, Linus issues the first release candidate (rc1).

    • For the subsequent seven weeks, only bug fixes are accepted. No new features are introduced during this stabilization period; the focus is on regression fixes and reverting problematic changes.

  • Hierarchical structure of maintainers.

    • Around 4,000 developers contribute code every year

    • They send changes via email to maintainers responsible for specific kernel subsystems

    • Kernel subsystem maintainers then forward collections of accepted changes up the chain

    • Ultimately, these changes reach Linus for inclusion in the main kernel tree.

  • Trust is key in Linux kernel development. When a maintainer accepts code from a developer, they implicitly take responsibility for it. For critical parts of the kernel, maintainers need to have a high degree of confidence in the developer and the quality of their work, as the maintainer becomes accountable if the original developer disappears.

  • Email and git. These are the two tools used during development.

  • Linux Next: a separate development tree that integrates, on a daily basis, all the changes destined for the next kernel release. This allows for continuous testing and identification of potential integration issues.
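
The 9-week cadence described above is simple date arithmetic: a two-week merge window followed by seven weeks of bug fixes. Here is a minimal sketch that takes those two figures at face value. The `release_cycle` function, its dictionary keys, and the starting date are hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta

MERGE_WINDOW_WEEKS = 2   # new features are merged during this window
STABILIZATION_WEEKS = 7  # only bug fixes and reverts after rc1


def release_cycle(previous_release: date) -> dict:
    """Project the milestones of one cycle from the previous release date."""
    rc1 = previous_release + timedelta(weeks=MERGE_WINDOW_WEEKS)
    next_release = rc1 + timedelta(weeks=STABILIZATION_WEEKS)
    return {
        "merge_window_opens": previous_release,
        "rc1": rc1,
        "next_release": next_release,
    }


# Arbitrary example date, not a real kernel release date.
cycle = release_cycle(date(2025, 1, 5))
for milestone, day in cycle.items():
    print(f"{milestone}: {day.isoformat()}")
```

The two constants add up to the nine weeks (63 days) between releases, so the next release date is always known as soon as the previous one ships.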

QA and stable releases

  • Linux Next: automated testing. This includes building and booting the kernel across various architectures and virtual machines.

  • KernelCI: a project that provides a more extensive continuous integration infrastructure, running tests on a wider range of real hardware contributed by different labs.

  • The testing process involves a mix of automated tests and real-world usage by developers and testers. The "zero-day bot" automatically tests patches submitted to mailing lists.

  • Stable kernel releases: these are maintained independently of the main development branch. After each major kernel release by Linus, a stable branch is forked.

    • Greg and Sasha Levin maintain these stable branches. They issue new stable releases weekly, incorporating bug fixes that have first been merged into Linus's tree. This ensures that stable branches do not diverge from the main development line.

  • Long-term stable (LTS) kernels: Greg picks one kernel per year and supports it for an extended period, initially two years, sometimes up to six years. Android phones, for instance, often run on these older LTS kernels, which still receive backported bug fixes. Greg and Sasha concurrently maintain multiple LTS kernels.

  • Maintaining older codebases is more challenging. This is due to the ongoing evolution of the kernel. Changes made in newer versions to fix bugs might be difficult to backport to older, significantly diverged code. Context is often lost over time, making even seemingly simple backports complex.
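
The rule that stable releases only take fixes already merged into Linus's tree can be expressed as a simple filter. This is a sketch of the rule, not of the actual stable tooling: the `Fix` type, the `stable_candidates` function, and the commit identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Fix:
    commit_id: str  # hypothetical identifier for the fix
    summary: str


def stable_candidates(proposed: Iterable[Fix], mainline_ids: Set[str]) -> List[Fix]:
    """Keep only fixes whose commit is already present in the mainline tree."""
    return [fix for fix in proposed if fix.commit_id in mainline_ids]


# Made-up data: one fix is already in mainline, the other is not yet.
mainline = {"abc123"}
proposed = [
    Fix("abc123", "fix null pointer dereference in example driver"),
    Fix("def456", "fix only present in a vendor tree"),
]

for fix in stable_candidates(proposed, mainline):
    print(fix.commit_id, fix.summary)
```

Filtering this way is what keeps a stable branch from diverging: every fix it carries also exists in the main development line.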

Contributors

  • About 80% of kernel contributors are paid – by their employer! Companies invest in Linux development because it's often more cost-effective to contribute features and fixes than to develop their own operating systems.

  • Contributing to the Linux kernel is a valuable way for developers to also invest in their careers. It demonstrates the ability to collaborate, work with existing codebases, and solve real-world problems.

  • Core maintainers meet annually to discuss and refine the development process.

Rust support?

  • Most of Linux is written in C, but Rust is gaining momentum. Approximately 25,000 lines of Rust code are already in the kernel, primarily for bindings but also for some functionality like generating QR codes on kernel crashes.

    • Introducing Rust aims to improve memory safety in certain parts of the kernel. However, writing drivers in Rust presents challenges due to the need for bindings to the extensive C codebase and the different memory management models of C and Rust.

    • Memory safety in Rust primarily refers to the safety of object lifecycles and memory ownership, not necessarily the elimination of all bugs. Logic errors and even memory unsafety can still occur in Rust code.

    • The adoption of Rust is also driving improvements in the existing C codebase, as the need to create Rust bindings encourages a re-evaluation of C code for better safety and clarity.

  • Will the Linux kernel add Rust support?

    • There is resistance to introducing new languages from some core kernel developers, who prefer to maintain a single-language codebase.

    • Efforts are underway to write more drivers in Rust, including experimental GPU drivers. Rust can be particularly well-suited for simpler hardware drivers.

    • Governments increasingly mandate the use of memory-safe languages, which is another factor driving the adoption of Rust in Linux.

    • That said, the Linux kernel community is also actively working on improving the safety of existing C code through techniques like bounds checking and compiler extensions.

Why contribute to Linux, and how?

  • Building and testing the kernel locally is a prerequisite for submitting changes.

  • Contributing, even a single patch, offers significant professional benefits. It strengthens a developer's resume by demonstrating the ability to collaborate and work with complex, established codebases.

  • Contributing provides valuable learning opportunities, exposing devs to different perspectives, coding practices, and challenging technical problems.

  • Newcomers can find entry points by working on less critical parts of the kernel, such as fixing coding style issues or removing dead code in older drivers. A good place to start is Kernel Newbies.

Where to find Greg Kroah-Hartman:

• Social: https://social.kernel.org/gregkh

• Website: http://www.kroah.com/log/about/

Mentions during the episode:

• Linux Kernel Foundation: https://www.linuxkernelfoundation.com/

• International Space Station: https://www.nasa.gov/international-space-station/

• Raspberry Pi: https://www.raspberrypi.com/

• GitHub: https://github.com/

• Kernel CI: https://kernelci.org/

• Linus Torvalds on LinkedIn: https://www.linkedin.com/in/linustorvalds/

• Engineering Planning with RFCs, Design Documents and ADRs: https://newsletter.pragmaticengineer.com/p/rfcs-and-design-docs

• A guide to the Kernel Development Process: https://docs.kernel.org/process/development-process.html

• Rust: https://www.rust-lang.org/

• The Linux Kernel Maintainer Summit: https://events.linuxfoundation.org/linux-kernel-maintainer-summit/

• Linux Braille Console: https://www.kernel.org/doc/html/v4.16/admin-guide/braille-console.html

• Code Complete: A Practical Handbook of Software Construction: https://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670

• Kernel Newbies: https://kernelnewbies.org/

Production and marketing by Pen Name. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.
