The Pragmatic Engineer

The Pulse: Antigravity 2.0 takes ‘IDE’ out of its new IDE

Gergely Orosz — Thu, 21 May 2026 17:01:39 GMT

The Pulse is a series covering events, insights, and trends within Big Tech and startups. Notice an interesting event or trend? Hit reply and share it with me.

Today, we cover:

Antigravity 2.0 takes the ‘IDE’ out of its new IDE. Feedback about the redesigned IDE is overwhelmingly negative due to bugs, poor UX & model support, and eating through Gemini tok…

Why Rust is different, with Alice Ryhl

Gergely Orosz — Wed, 20 May 2026 16:22:00 GMT

Stream the latest episode

Listen and watch now on YouTube, Spotify, and Apple. See the episode transcript at the top of this page, and timestamps for the episode at the bottom.

Brought to You by

• Antithesis – if you write Rust code, check out Hegel-Rust: a free, open-source property-based testing library for Rust, built by the team behind Hypothesis. It’s now maintained by Antithesis. If you try Hegel and like it, your Hegel tests will run in Antithesis as written, as well.

• Sentry – application monitoring software built by developers, for developers. I’ve used Sentry for 10 years now, starting back at Uber. It shows you the full context on issues: stack traces, user actions, environment details, and more. A new, recent feature: Seer, their debugging agent — I like it! Check out Sentry.

• Craft Conference: meet myself, Kent Beck, Hillel Wayne and other standout speakers on 4 June, in Budapest, Hungary, at the conference dedicated to the art and science of software delivery craft. See details.

In this episode

Rust is one of the most admired programming languages around – and also one of the hardest to learn. What makes developers stick with it?

In this episode of The Pragmatic Engineer Podcast, I sit down with Alice Ryhl, a software engineer on Google’s Android Rust team, and a core maintainer of Tokio, which is the most widely-used async runtime in Rust.

We discuss what makes Rust different from other languages like TypeScript, Go, and C++, and why so many developers say that “once it compiles, it works.” We go deep into memory safety, ownership, borrowing, unsafe Rust, and Cargo.

We also cover how Rust is governed by RFCs, feature flags, its six-week release cycle, how engineers get paid to work on the language, and also look into how Rust’s use inside the Linux kernel is progressing.

My observations from the conversation with Alice

Here are 12 of my most interesting takeaways from talking with Alice:

1. Open source contributions can open career doors. Alice landed her Google job not by sending in an application, but by spending years answering questions on Rust forums, and contributing to documentation and code. She also became a maintainer of Tokio while a college student. One day, an email arrived in her inbox, asking: “do you want to work on the Android Rust team?”

2. Reliability is the backend pitch for choosing Rust over TypeScript. Alice is adamant that Rust is a backend language and not a TypeScript replacement on the frontend. Rust’s design for minimizing errors and maximizing reliability can make it a better choice than TypeScript on the backend.

3. Rust over C++ for development is a strong choice for avoiding a host of security vulnerabilities. In C++, a trivial off-by-one error in an array can become a massive security vulnerability. In contrast, Rust’s memory safety eliminates an entire class of such bugs, unless you are brave – or foolhardy – enough to use an “unsafe” keyword, that is!

4. Rust was designed to turn implicit failures into compile errors. Where other languages allow you to forget something, Rust makes an omission into a compilation error for things like null checks, uninitialized variables, or error propagation with the ‘?’ character. If you mess something up, it’s almost certain your program will not compile. If it does, at the very least you should see a lint warning. We previously covered how Andrey Breslav, creator of Kotlin, purposely made null safety an important part of Kotlin.

5. The hardest part of learning Rust is not syntax, but data structure design. Alice observes that newcomers reflexively build cyclic object graphs, like a Book object referencing Page objects that refer back to the Book. Such cyclic graphs are possible in Rust, but its ownership model makes this hard, meaning that Rust novices end up battling the compiler. Obvious solutions are to use structs and understand how ownership works in Rust.

6. Refactoring in Rust is safe and easy, thanks to the compiler. Alice: “I change a return type or struct field, then just fix the compiler errors until the compiler stops shouting. And then once I’ve done that, I’ve updated every place I need to update.” Rust’s focus on correctness makes refactoring it more straightforward than dynamically-typed languages and Java-style typed ones are to refactor.

7. Rust may be optimal for AI agents because of the compiler’s high-quality feedback loop. Alice’s refactoring trick of just doing what the compiler says also applies to agents: they can talk to the compiler, be told what to fix, and iterate. Combined with Rust blocking entire bug classes by design, this makes it one of the better languages for agent-generated code.

8. “Editions” allow Rust to make breaking changes without ‘breaking’ anyone’s code. Rust editions (2015, 2018, 2021, 2024) can be mixed freely across crates. A library on the 2021 edition works seamlessly with a binary on the 2024 edition. This is how Rust evolves syntax (like adding async/await as keywords) without forcing an ecosystem-wide migration.

9. Rust’s governance precludes a “benevolent dictator for life”. Unlike with Python and Linux, teams in Rust self-organize and delegate to each other. Tough questions are hashed out at in-person events like ‘Rust All Hands’. It’s a good illustration that open source projects can thrive across different structures.

10. Rust in the Linux kernel has graduated from “experimental.” At December 2025’s Linux Kernel Maintainer Summit, the kernel community agreed Rust is no longer experimental. Combined with US Department of Defense regulations pushing agencies away from non-memory-safe languages, this means we should see more Rust in the Linux kernel and everywhere else, too.

11. AI code review may matter more than AI code generation in safety-critical codebases. The kernel community is experimenting with AI bots that review mailing-list patches. Maintainers reportedly find them impressive, especially for kernel code; an area where quality and reliability has always been more important than quantity. It’s interesting that AI might be helpful as an extra quality gate.

12. Risk of AI-assisted Rust: false fluency. Since Rust’s compiler is so strict, code that compiles can be assumed to be correct. However, Alice describes AI agents adding Rust versions of C build flags with no purpose! She also cautions that junior engineers using AI to learn Rust run the risk of not understanding why the compiler accepts the code they produce.

The Pragmatic Engineer deepdives relevant for this episode

• The past and future of modern backend practices

• How Kotlin was built with Andrey Breslav

• How Swift was built with Chris Lattner

• How Linux is built with Greg KH

Timestamps

(00:00) Intro

(04:09) Tokio: an overview

(05:11) What Alice likes about Rust

(12:48) Rust for TypeScript engineers

(13:51) Moving from C++ to Rust

(14:34) Memory safety

(18:12) Garbage collection tradeoffs

(21:46) Ownership, references, and borrowing

(26:59) Unsafe in Rust

(31:21) Crates and Cargo

(35:55) Language design and RFCs

(43:02) Building new features

(46:30) Editions vs. versions

(49:47) Getting paid to work on Rust

(51:27) Contributing to Rust

(53:03) Rust in the Linux kernel

(55:45) AI use cases for Rust

(1:01:35) Learning Rust

(1:03:54) Book recommendation

References

Where to find Alice Ryhl:

• LinkedIn: https://www.linkedin.com/in/aliceryhl

• Website: https://ryhl.io

Mentions during the episode:

• Rust: https://rust-lang.org

• Tokio: https://tokio.rs

• Minecraft: https://www.minecraft.net

• Rust Users Forum: https://users.rust-lang.org

• Null’s creator regrets inventing it: https://news.ycombinator.com/item?id=12427069

• PHP: https://www.php.net

• Go: https://go.dev

• TypeScript: https://www.typescriptlang.org

• C++: https://en.wikipedia.org/wiki/C%2B%2B

• Pip: https://pypi.org/project/pip

• Why Cargo Exists: https://doc.rust-lang.org/cargo/guide/why-cargo-exists.html

• Linus Torvalds: https://en.wikipedia.org/wiki/Linus_Torvalds

• Rust Week: https://2026.rustweek.org

• Inside Amazon’s Engineering Culture: https://newsletter.pragmaticengineer.com/p/amazon

• How Linux is built with Greg Kroah-Hartman: https://newsletter.pragmaticengineer.com/p/how-linux-is-built-with-greg-kroah

• Community Grants Program: https://rustfoundation.org/media/tag/community-grants-program

• Zulip: https://forge.rust-lang.org/platforms/zulip.html

• The Linux Kernel Archives: https://www.kernel.org

• Linux Plumbers Conference: https://lpc.events

• Gemini: https://gemini.google.com/app

• The Rust Programming Language: https://doc.rust-lang.org/book

• Rust for Rustaceans: Idiomatic Programming for Experienced Developers: https://www.amazon.com/Rust-Rustaceans-Programming-Experienced-Developers/dp/1718501854

• Rustlings: https://rustlings.rust-lang.org

—

Production and marketing by Pen Name.

AI’s impact on software engineers in 2026: key trends, Part 2

Gergely Orosz — Tue, 19 May 2026 16:43:10 GMT

Earlier this year, we asked The Pragmatic Engineer subscribers about the AI tools you use, how you use them, and, crucially, what you think of them. We received more than 900 responses, and with this article we conclude our analysis of that data, covering:

Tradeoffs of AI tooling. Positive and negative sentiments, like how AI tools often mean less time spent on tedious, repetitive work, but that it often leads to unrealistic business expectations.
Adopting AI at scale is hard. Companies are struggling to achieve adoption that feels productive at the team and org levels. Our survey finds the benefits of AI heavily depend on the engineering culture that was in place before.
Impact on codebase quality. Codebase quality seems to be decreasing, but management at most places does not care. Meanwhile, maintenance duty is falling upon a shrinking number of engineers who still understand increasingly complex codebases.
Less experienced engineers and AI. For these folks, AI seems to be less helpful and they rack up higher AI token bills. Maybe they need more space to learn, mentorship, and support.
AI tooling ‘addiction’. Using AI agents “feels like a slot machine” encouraging “just one more prompt”-type behavior, while some folks think that the pricing of plans is built in a way to “lure” them to prompt more and more.
Changes since 2024. Fewer devs are negative about AI, but there’s not all that much more positivity. The models have become much higher quality, and better tooling improves trust.
Where are we headed? The concept of code ownership seems to be eroding, and collaboration within teams becomes less important.

Previously in this series, we covered:

AI tooling for software engineers in 2026: a detailed summary of survey responses, covering the most-used AI tools, trends, AI agent usage, company size and usage, and tools engineers love.
The impact of AI on software engineers in 2026: key trends. Part 1. Concerns about mounting AI costs, more engineers hitting usage limits, and AI tools having uneven effects upon different types of engineers

Full subscribers can access a more detailed report.

1. AI tooling tradeoffs

What can be said about the impact of AI tooling on engineers this year? Well, based on the responses that readers sent in to our survey, there are some common patterns:

Code and output tradeoffs

Productivity tradeoffs

Research published two weeks ago (5 May) by Microsoft claims AI expands the pool of people who can do high-value work. Our survey found similar, while MS’s findings are based on Microsoft 365 Copilot chat usage.

“AI used to do more high-value work” Source: Microsoft

Obviously, Microsoft has an interest in showing the benefits of its AI tools, but in this specific area, it’s telling that both studies agree AI does allow devs and non-devs to expand the type of work they do.

Software engineering tradeoffs

With AI tools, their ‘mileage may vary’ among individuals. For example, some respondents told us they can spend much more time in a “flow state” thanks to AI tools, as they don’t have to wait for input from peers, can keep unblocking themselves, and have fewer interruptions.

But others say the opposite: that because they can start so many more tasks in parallel with AI tools, they keep context switching which knocks them out of a flow state!

The positives and negatives of these tools seem dependent on the environment respondents work in, individuals’ personality traits, and where users are on the AI learning curve.

2. Adopting AI at scale is hard

One pattern in the responses is that adopting AI at company-scale remains challenging, including:

Costs: a growing concern covered in depth in Part 1
Usage: getting people to use AI tools continuously is not always straightforward
Onboarding and education: at larger companies, there needs to be support to help devs make the most of the tools
Reviewing AI-generated output: code review is a particular pain point
Integrating with internal systems: AI tools are more helpful when they are seamlessly integrated with internal systems, hence why many larger companies use in-house, deeply embedded coding agents

Adopting AI at company-level doesn’t lead to a magical fix for engineering problems, while the benefits of AI in an organizational sense seem to depend on what was in place before.

Case in point, AI seems to amplify pre-existing engineering culture. AI doesn’t change the underlying quality of an organization’s engineering culture. Teams with strong engineering practices get more positive benefits out of AI than those without. Teams that see benefits from AI tools already had:

Guardrails: testing and automation around the codebase and deployments
Documentation: they recorded their architectural decisions and engineering practices
A quality codebase: AI agents will replicate patterns already in a codebase

A few quotes on this from respondents to our survey:

“AI is an amplifier, not a fixer. Good software engineering practices get multiplied. So do the bad ones. Embedding this properly in teams is exciting and important”. – Staff+ engineer at a large company in Europe
“I feel like AI allows both faster prototyping and increased velocity on iterations to production software; it relies on existing best practices / project templates our team already have”. – Solutions Architect at a small company in the US

A workflow that makes one dev “10x” more productive may not work for another. This is another reason why rolling out AI tools doesn’t seem to magically make everyone more productive. A senior engineer working at a large company in Canada told us:

“It feels like AI workflows are very idiosyncratic in that some people derive (I hate this framing, but…) 10x more productivity benefit from them than other apparently equally clever, educated, and diligent developers. It feels like finding a workflow that clicks with your own habits and heuristics is more important than finding a global optimum for everyone”.

AI amplifies individuals differently, so the team impact is messy to figure out. A US-based principal engineer at a large company reflected on feeling disconnected from colleagues because of how they use AI differently:

“I use AI in what I think is probably a more sophisticated way than most of my colleagues, so there can be a disconnect between my work and theirs, which is not good news because I am “The Principal” on the team”.

“The tool that works for you” approach can lead to tooling chaos, even at a team level. While it’s empowering to allow devs to choose the AI tools they feel are the most helpful, over time it becomes chaotic when teams can choose their own tech stacks, and when at large companies there are dozens of different technologies. A staff+ engineer at a 200-person business in the Middle East wrote in their response:

“We’re still trying to figure out how to deal with tooling consistency on a team level. It’s one of our biggest struggles, but possibly more due to company structure than anything else. Everyone is using different tools with little coherence. It’s been rough.”

Some companies have briefly rolled back AI to deal with the negative effects first. From an engineering lead at a 10,000+ person company in Europe:

“Since the AI boom, the quality of technical writing and reasoning from senior engineers in my org has significantly deteriorated. There’s an overwhelming volume of low-quality work product that is generated entirely or in part by AI, which has made it very difficult to conduct meaningful review of RFCs or code. We’ve also seen costly production incidents caused by code written and/or approved by AI, and – while my employer initially bought heavily into the hype – we have now rolled back some of our AI tools to deal with the drop in quality.”

3. Impact on codebase quality

A concrete pattern in our survey data is that codebase quality is decreasing due to AI. The contributing factors are not surprising:

“AI slop”: more low-quality code generated, such as duplicated, verbose code, and poor abstractions
Too many code reviews, which means review quality slips
More bugs: due to faster code output and less strict reviews, more bugs sneak into codebases

We discuss the degradation of products and codebases in a recent deepdive, Are AI agents actually slowing us down?

One CTO at a European startup lists their negatives about increased AI usage:

“A lot of tiny bugs and low code quality if you are not careful, verify carefully, and have good structure and guardrails
AI agents generate too much and repetitive code, making systems harder to maintain
Developers lose understanding of the codebase and become numb to bad architecture and bad developer experience”

According to our survey, management often seems unfazed by decreasing quality, and instead focuses on the higher output. A principal DevOps engineer at a large European company said:

“In our company, we hand AI tools to inexperienced engineers who can’t distinguish good code from bad code and it’s falling on deaf ears in our leadership. They only seem to care about short to mid-term cost savings.”

The maintenance burden of AI-generated code is falling on the fewer engineers who understand and care. A staff engineer at a European company listed the problems that AI-generated code is causing:

“Drive by” contributions are up: many more occasional non-core-engineer contributors adding code but not sharing the maintenance burden
Contributing without adding guardrails: many engineers and most of engineering leadership are not using reasonable guardrails like tests
AI slop from folks who have nothing to do with the codebase: huge volume of slop incoming from people who don’t understand the codebase, but will commit and create PRs without fully understanding what they’re doing
Complexity is exploding: thanks to the above

The maintenance budget is falling upon fewer devs, while the task of refactoring bloated codebases and reducing complexity is left to those still sufficiently in touch with the codebase, thereby making the maintenance burden even worse.

But some leaders “get it.” A CEO at a 20-person company told us:

“While AI has made generating code ‘cheaper’, the monitoring and maintenance worry me; the things that have traditionally cost the most in software. We’re increasing the rate of shipping large amounts of code with less understanding and increasing the unpredictability, so how do we work the predictability back on top?”

There is industry pressure on companies to adopt AI tooling and impose its usage upon engineers, driven by a mix of factors:

Seeing actual benefits of AI and hearing that other teams and companies enjoy success with it
Fear of being left behind by competitors, or becoming less relevant
Anxiety about investor interest if a company is seen as not adopting the latest AI tools.

This often leads to:

Top-down mandates to use AI
Expectation of headcount reduction, with smaller teams doing the same amount or more
Management treating AI productivity gains as a baseline rather than a bonus.

One staff engineer at a 10,000-person US company explained it like this:

“AI is part of almost every work conversation. The entire company expects it to increase productivity and reduce the need to hire people. I keep trying to get better at using it and trying to make it more reliable so I can do more. I do worry about the quality of the work and atrophy of certain skills. It’s unclear to me if those skills even matter anymore.”

Pushing AI adoption blindly triggers red flags. Respondents shared what makes them worry about things going south in their workplaces:

Focus on tracking AI usage, but not the quality of the output. This will likely lead to product regressions and unhappy customers
Pushing for universal adoption. Some companies target 50%, 80%, or 100% AI usage for certain tasks, seemingly blind to how some targets can worsen the quality of output, or simply create wasteful usage. See the trend of tokenmaxxing.
A focus on velocity, but without recognition of quality work. Expecting more velocity and output seems to be the baseline, and there is no recognition for work truly well done.

The “move fast and break things” mantra famously championed by Facebook seems widespread across the industry with AI tools. A senior manager working at a large, European-headquartered company told us:

“I see a trend: move fast and break things, and end up breaking things too often. We have to learn to focus on testing and resiliency a lot more, as with AI-driven development we introduce more bugs than before. But the velocity gain is bigger for now”.

Output over quality is leading to the death of code review at some places. As a lead engineer working at a small company summarized:

“We’re at the death of code review. I used to do very deep code reviews where I’d take the time to understand the architecture and organization and provide feedback on maintainability and efficiency. I have no motivation in spending that time to review a giant PR where it’s clear that even the original author didn’t bother to do that”.

4. Less experienced engineers and AI

The first generation of software engineers who have never developed without the help of AI are now entering the industry. Here is a response from a young engineer working at a startup as an intern:

“I have never worked as a developer without AI. Writing this scares me a bit, actually, but it’s the truth!”

But this will be the new reality for those joining the industry. So, what needs to happen to help a new generation of “AI-native” grad engineers grow professionally?

AI is an amplifier which could amplify the lack of experience. A staff engineer in the US at a large company told us:

“Agentic AI is a fascinating mirror. It can code as well as the user who drives it. If that user is a junior engineer, now you have a faster junior engineer. If the user is a staff engineer, now you have a faster staff engineer.
What agentic AI doesn’t do is magically convert a junior engineer into a staff engineer, because the user driving it still needs enough experience to know what a good solution looks like”.

A junior engineer in Australia shares their experience of how frustrating working with AI tools is:

“I think AI agents are great for vibe coding or prototypes where the code quality and functionality doesn’t matter that much. I think it’s also useful for senior engineers who know what they’re doing.
For junior engineers like myself, these AI tools are stressful to use. I don’t have the experience or knowledge to tell AI exactly what to do or quickly confirm its output, so I spend a lot of time on just triple checking and redoing stuff. I’m overall frustrated, but I’m trying to embrace it as we’ve been asked to by the company”.

Less experienced engineers seem to use more AI tokens and rack up higher bills. Several respondents observed this: director-level folks noted that junior engineers are in the top-spender category in their orgs, and it is junior devs who spend tokens on unproductive use cases.

There should be more space for junior folks to grow because they use AI more, one staff engineer respondent said:

“Companies need to give some breathing room to Junior engineers and help them learn and acquire knowledge using AI tools as a booster and not as a replacement”.

Junior folks seem to be delegated fewer opportunities that could help them grow. This is because senior people can turn to AI for tasks, including those which they would have previously delegated to an intern or new grad. A few responses mention this:

“AI allows me to have work done that I would usually delegate to a junior or pay a SaaS for; e.g., writing drafts, summarizing the news. “ DevSecOps lead at a small company, Europe
“I’ve begun to automate any repetitive task that we previously relied on juniors and offshore contractors for.” - Engineering manager, at a large company, US
“I no longer have to delegate work by writing a very long document and briefing a junior engineer.” - Principal engineer, large company, Europe

Why not consider mentoring junior devs in your organization? It’s clear that less experienced engineers are having a rocky start to their careers, so delegating stuff to them instead of to AI could be of high value for newer generations of talent.

5. AI tooling ‘addiction’

It seems that the rapid feedback loops of AI-assisted development create addictive tendencies, and there’s a noteworthy presence of “addiction lingo” in some responses to our survey:

The Pulse: Forward deployed engineering heats up again

Gergely Orosz — Thu, 14 May 2026 16:09:57 GMT

The Pulse is a series covering events, insights, and trends within Big Tech and startups. Notice an interesting event or trend? Hit reply and share it with me.

Today, we cover:

Forward deployed engineering heats up again. Massive demand for the role at Google, OpenAI, and Anthropic. The latest version of the FDE role looks like the consultant / solution a…

TypeScript, C# and Turbo Pascal with Anders Hejlsberg

Gergely Orosz — Wed, 13 May 2026 17:06:59 GMT

Stream the latest episode

Listen and watch now on YouTube, Spotify, and Apple. See the episode transcript at the top of this page, and timestamps for the episode at the bottom.

Brought to You by

• Antithesis – verify your system’s correctness in the development loop. Build, extend and deploy mission-critical systems: without human review or traditional integration tests, all while avoiding embarrassing outages. Antithesis goes far beyond code review and runs your complete system, exhaustively analyzing its behavior, and exposes bugs as quickly as agents introduce them. Learn more.

• WorkOS – Anders has spent decades thinking in programming languages — WorkOS speaks the same language when it comes to enterprise infrastructure: SSO, SCIM, RBAC, and more, built right so you never have to. See how.

• turbopuffer – a ridiculously scalable, fast and cheap vector and full-text search engine. Cursor moved to turbopuffer after their existing vector database couldn’t keep up with the number of codebases they were adding. This move cut their semantic search costs by 95%, and they have so many good things to say about turbopuffer. Check it out.

In this episode

Anders Hejlsberg is a living legend and one of the most influential programming language designers of all time. He created Turbo Pascal, Delphi, C#, and also TypeScript. As well as that, he spent nearly a decade at the pioneering dev tools company, Borland, and is now in his 30th year of working at Microsoft, where he’s a Technical Fellow.

In this episode, we discuss what it takes to build programming languages that developers love to use, and trace his career from writing his first compiler to creating Turbo Pascal and Delphi, and helping to pioneer modern software development through C# and TypeScript.

Anders details how C# was designed by a small group of experienced language designers who met a few hours each week, and he explains why tooling was just as important as the language for TypeScript’s success, and what he has learned from building languages which stay relevant for decades.

We also look into how Anders uses AI today, which language features suit AI-assisted development, and what he thinks is changing in the craft of software engineering as developers move further away from writing code line by line.

It was special for me as well to talk with Anders: because my first-ever programming language was Turbo Pascal, and my first paying job was at a C# shop, back in the day. And I write most of my backend services using TypeScript, today.

My observations from the conversation with Anders

Here are 14 of my most interesting takeaways from talking with Anders:

1. The first computers had zero abstractions on top of the hardware. Anders started programming on an HP 2100 with 32K of memory and a paper tape boot loader. “Things were so simple. You could see all the way to the bottom. There was just no layering. It was right on top of the hardware,” he says. The past 50 years of computing have been a process of adding layers above the hardware.

2. Turbo Pascal won by being more than “just” a compiler. Anders also built an IDE to edit Turbo Pascal programs in, and a debugger to troubleshoot them with. In later languages which Anders designed, he always focused on the entire developer workflow, not just compilation.

3. “10x better for 1/10th of the price” is a proven winner. This is what Turbo Pascal did: it sold for $49.95 when competing compilers cost $500, and it was faster and more interactive than competitors’ products. Conveniently, the low price tag also killed off piracy

4. The first Turbo Pascal “debugger” was an elegant hack. The compiler printed the program counter address on a runtime error. Then you could re-run the compiler in a mode that stopped at that address, and the IDE would show which line was being syntactically processed at that point. There were no line maps or debuggers built for this debugging experience: Constraints force creativity!

5. C# might have not existed without a famous court case. Microsoft originally hired Anders to architect its Java tools (Visual J++), but the Sun versus Microsoft lawsuit (1997-2001) meant Microsoft could not build on top of Java, as the company that owned Java’s IP (Sun) sued MS for alleged unauthorized changes to the Java language. Microsoft realized it had to build a new language that combined VB’s productivity with C++’s power. This led to C# and .NET.

6. The original C# design team numbered only six. They held three meetings per week which lasted two hours each, during which they debated what to build. Anders believes that designing even large languages does not require a large team; it’s the right experience that matters most. In that group, this meant folks who had built languages before.

7. C# introduced the async/await pattern that many languages later ran with. Anders said this pattern spread to so many other languages because compilers can generate state machines that humans hate writing. Manual state machines require moving all stack states into objects and wrapping logic in a giant case statement. Devs generally loath doing this by hand, and async/await lets developers write sequential-looking code while the compiler does the painful rewrite behind the scenes.

8. TypeScript exists because Anders refused to build Script# for the Outlook.com team. Microsoft’s Outlook.com team asked Anders’ C# team to productize “Script#,” a language to cross-compile C# to JavaScript. Anders and the C# team pushed back, suggesting that a better approach was to fix JavaScript. Anders felt strongly that to be attractive to the best-of-breed developers in the JavaScript ecosystem, you want people to write JavaScript, and not another language like C#.

9. Open development on GitHub made TypeScript much better. TypeScript was open source from the beginning in 2012 on CodePlex, Microsoft’s open source platform. There was not much community activity there, and in 2014, the project moved to GitHub with its large, active community. “Open development” on GitHub is what Anders credits for making TypeScript as good as it has become.

10. The TypeScript compiler breaks many traditional practices. The compiler is built to support lazy evaluations, and deferred imports, and is functional by necessity. For example, with 500 files open, the compiler keeps abstract syntax trees (ASTs) cached for 499, and rebuilds just the one being edited. It only resolves the minimum types needed for the cursor’s current position.

11. Training data volume is what makes AI great at TypeScript and Python. Anders says the language best suited for AI is the one that AI has seen the most in its training set. How well AI agents work in a specific language largely depends on how much of that language exists on the internet. It has less to do with the design of a language.

12. But AI is limited for writing compilers – for now, at least. On Anders’ team, AI is limited in terms of seeing the “big picture”; for example, how types, symbols, binding, and parsing all relate. It’s because LLM training sets don’t contain much about compilers as yet.

13. Reviewing code could be the future of the craft. Anders predicts we’ll all be project managers in the future, managing armies of junior programmers, aka agents, which generate reams of code. Anders admits he is less interested in reviewing code, personally, but reckons code review could be made much more interesting; for example, AI generating commentary that guides reviewers through changes.

14. Designing a programming language is a 10-year play. As Anders puts it:

“Version one is great, but has all sorts of issues. You’ve got to do version two, but it’s not until version three that it really starts to be great. Then you’ve got to convince people to adopt it.”

The Pragmatic Engineer deepdives relevant for this episode

• Microsoft’s developer tools roots

• 50 Years of Microsoft and developer tools with Scott Guthrie

• How Linux is built with Greg Kroah-Hartman

• How will AI change operating systems? Part 1: Ubuntu and Linux

• How Uber uses AI for development: inside look

Timestamps

(00:00) Intro

(02:48) How Anders got into programming

(05:40) Building his first compiler

(07:44) Turbo Pascal

(12:25) Delphi

(14:53) Joining Microsoft

(19:41) Building C#

(29:11) Async/await

(34:01) The rise of JavaScript

(37:52) Building TypeScript

(42:58) How the TypeScript compiler works

(48:30) JavaScript’s strengths and weaknesses

(52:18) How Anders uses AI

(56:03) What language features work well with AI

(1:02:49) How software craftsmanship is changing

(1:07:49) Performance and efficiency

(1:09:29) Anders’ tool stack

(1:11:30) A 30-year career at Microsoft

(1:13:40) Book recommendation

References

Where to find Anders Hejlsberg:

• X: https://x.com/ahejlsberg

• LinkedIn: https://www.linkedin.com/in/ahejlsberg

• GitHub: https://github.com/ahejlsberg

Mentions during the episode:

• Turbo Pascal: https://en.wikipedia.org/wiki/Turbo_Pascal

• Borland: https://en.wikipedia.org/wiki/Borland

• Delphi: https://en.wikipedia.org/wiki/Delphi_(software)

• Visual Basic: https://learn.microsoft.com/en-us/dotnet/visual-basic

• Skype: https://en.wikipedia.org/wiki/Skype

• J++: https://en.wikipedia.org/wiki/Visual_J%2B%2B

• Java: https://en.wikipedia.org/wiki/Java_(programming_language)

• Legal Newsroom Archive: Sun Microsystems Inc. and Microsoft: https://news.microsoft.com/case-archives/sun-microsystems-inc-and-microsoft

• .NET: https://dotnet.microsoft.com

• Roslyn: https://en.wikipedia.org/wiki/Roslyn_(compiler)

• Async/await: https://en.wikipedia.org/wiki/Async/await

• Go: https://go.dev/

• JavaScript: https://en.wikipedia.org/wiki/JavaScript

• Brendan Eich’s website: https://brendaneich.com

• Steve Ballmer on X: https://x.com/Steven_Ballmer

• Luke Hoban’s website: https://lukehoban.com

• TypeScript: https://www.typescriptlang.org

• GitHub: https://github.com

• Language Server Protocol: https://en.wikipedia.org/wiki/Language_Server_Protocol

• Caml: https://caml.inria.fr

• PHP: https://www.php.net

• Algorithms + Data Structures = Programs: https://www.cl72.org/110dataAlgo/Algorithms%20%20%20Data%20Structures%20=%20Programs%20%5BWirth%201976-02%5D.pdf

—

Production and marketing by Pen Name.

Revisiting “No Silver Bullets” in the age of AI

Gergely Orosz — Tue, 12 May 2026 17:10:28 GMT

Before we start, some news: my tech compensation site focused on tech total compensation (TC) in Europe, TechPays has been acquired by Levels.fyi! TechPays was a project I’ve been building on the side with engineering manager Zsombor Erdődy-Nagy for a few years, and both of us are pleased that the site found a new and welcoming home. Read more.

Four decades ago, the writer of ‘The Mythical Man-Month’ (1975), drew on folklore about werewolves to publish a paper about the prospects of a so-called silver bullet for software development that would make professionals much more productive at their craft.

The Werewolf of Eschenbach, Germany, line engraving, 1685. This image appears in the “No Silver Bullets” chapter of Mythical Man-Month (1995)

Frederick P. Brooks published “No Silver Bullet – Essence and Accident in Software Engineering” in 1986, and as the title suggests, it is pessimistic about the existence of any silver bullets. The term refers to a super weapon capable of dropping otherwise near-unstoppable werewolves and other creepy supernatural beings in European folk tales.

Since its release, this paper might have become even better-known than Mythical Man-Month (MMM). In 1995, the second edition of that book included Brooks’ later essay as chapter 17, along with an additional chapter of reflections.

In this article, we look into whether the essay was correct in its disbelief in silver bullets, or whether any did indeed slay the beast of unproductivity for developers over the course of time. Also, how does AI agents generating so much code, as of today, challenge the entire premise – or not?

We cover:

“No silver bullets” – why has it held up? No single new technology or methodological breakthrough by itself introduced magnitudes-of-improvement to the areas that really matter in software engineering. Is that unusual?
Is SRE a silver bullet? Google’s Search team introduced the SRE discipline, and won orders-of-magnitude superior reliability to its competitors. But why only Google Search?
Was open source + GitHub a silent silver bullet? No development had a bigger impact on the wider tech industry than the open source wave since the 2010s. Has it been a silent silver bullet, an overlooked cause?
Could AI be a silver bullet? At first glance, AI generates 100x-or-more code output. But productivity, reliability, and simplicity improvements are a bit unimpressive – at least for now.

Brooks was a computer scientist who led IBM’s System/360 and OS/360 operating systems development, ‘The Mythical Man-Month’ was published in 1975. Last year, we did a deepdive into this engineering classic (Part 1, Part 2, Part 3, Part 4), delving into its predictions and legacy.

1. No silver bullets?

The paper delves into folklore for its motif, a ‘silver bullet,’ and uses it to pose the question of whether there would be any “silver bullets” on the horizon (in 1986) that could be similarly fatal to software engineering complexity. From the paper (emphasis mine:)

“Of all the monsters who fill the nightmares of our folklore, none terrify more than werewolves, because they transform unexpectedly from the familiar into horrors. For these, one seeks bullets of silver that can magically lay them to rest.
The familiar software project has something of this character (at least as seen by the non-technical manager), usually innocent and straightforward, but capable of becoming a monster of missed schedules, blown budgets, and flawed products. So, we hear desperate cries for a silver bullet, something to make software costs drop as rapidly as computer hardware costs do.
But, as we look to the horizon of a decade hence, we see no silver bullet. There is no single development, in either technology or management technique, which by itself promises even one order of magnitude improvement in productivity, in reliability, in simplicity.
Skepticism is not pessimism, however. Although we see no startling breakthroughs, and indeed, believe such to be inconsistent with the nature of software, many encouraging innovations are under way. A disciplined, consistent effort to develop, propagate, and exploit them should indeed yield an order-of-magnitude improvement. There is no royal road, but there is a road.
The first step toward the management of disease was replacement of demon theories and humor theories by the germ theory. That very step, the beginning of hope, in itself dashed all hopes of magical solutions. It told workers that progress would be made stepwise, at great effort, and that a persistent, unremitting care would have to be paid to a discipline of cleanliness. So it is with software engineering today.”

In 1995, Brooks revisited his idea that silver bullets weren’t real in the software domain. From the Mythical Man-Month’s anniversary edition:

“No Silver Bullet” asserts and argues that no single software engineering development will produce an order-of-magnitude improvement in programming productivity within ten years (from the paper’s publication in 1986). We are now nine years into that decade, so it is timely to see how this prediction is holding up.
Whereas The Mythical Man-Month generated many citations but little argument, “No Silver Bullet” has occasioned rebuttal papers, letters to journal editors, and letters and essays that continue to this day.
Most of these attack the central argument that there is no magical solution, and my clear opinion that there cannot be one. Most agree with most of the arguments in “NSB,” but then go on to assert that there is indeed a silver bullet for the software beast, which the author has invented. As I reread the early responses today, I can’t help noticing that the nostrums pushed so vigorously in 1986 and 1987 have not had the dramatic effects claimed.”

Brooks re-concluded that there had been no technological breakthroughs of the type postulated in NSB.

But motivation can also have silver bullet-like effects and always has had, he found via more research into scientific evidence that motivation can boost productivity. In his own words:

“Since “NSB,” Bruce Blum has drawn my attention to the 1959 work of Herzberg, Mausner, and Sayderman.
They find that motivational factors can increase productivity. On the other hand, environmental and accidental factors, no matter how positive, cannot; but these factors can decrease productivity when negative. “NSB” argues that much software progress has been the removal of such negative factors: stunningly awkward machine languages, batch processing with long turnaround times, poor tools, and severe memory constraints.”

Today, it’s a long time since the mid-nineties; with the benefit of hindsight, were there any silver bullets flying between then and 2022, which fit the bill as slayers of unproductiveness? I suggest a few, below. If you can name other silver bullets since the launch of Windows 95, please do so in the comments!

Version control: (late 1990s.) CVS, Subversion, and later, Git. Version control allowed engineers to collaborate much more fluently, leading to more teamwork and – in some cases – less full-on solo labor.
IDEs: (early 2000s). Modern IDEs like Visual Studio, IntelliJS, and others make context-rich editing easy and fast. They also allow for faster, less error-prone refactoring and more efficient debugging.
CI/CD and automated testing: (mid-2000s). CI systems started to spread during the 2000s with the likes of CruiseControl (2000s) → Jenkins and SaaS CI solutions from the 2010s (e.g., Travis CI, CircleCI, GitLab CI, GitHub Actions).
Open source and package managers: (2010s). Open source has been around for decades, but GitHub’s rapid adoption made it easier to create and discover, coupled with package managers in the Node, Python, and other language ecosystems to build on top of open source solutions.
StackOverflow: (2010s). The popular programming Q&A site made it easier to get unstuck by finding solutions to common problems, with the capability to ask questions and get responses from the large user community within hours. By 2025, the site was pretty much dead.
Cloud: (early 2010s). AWS launched in 2006, then Azure and Google Cloud in 2008, and they went mainstream in the 2010s. Today, cloud is everywhere, and increasingly more infrastructure startups build on top of hyperscalers (the biggest public cloud platforms), such as Vercel / Netlify (platform-as-a-service), Supabase / Turbopuffer (databases) and others.
Platform teams & DevEx teams: (mid-2010s). Mid-sized and larger tech companies created dedicated platform teams to own infrastructure and internal platforms. In the 2020s, larger tech companies have created ‘Developer Experience’ teams to build better internal dev tools and workflows. We previously covered Uber’s program/platform split, and Uber’s developer experience evolution.
SRE: the Site Reliability Engineer (SRE) profession grew during the 2010s, much influenced by Google.

These technologies increased developer efficiency and productivity, but none by itself was a productivity accelerator in isolation.

Obviously, by 2022 the craft of building software had developed greatly since ‘No Silver Bullets’ came out; and was more efficient, faster, and more collaborative than ever. One highly anecdotal way to identify this is via the disappearance of cake from some tech workplaces. Back in the day, cake was distributed at work for major product milestones being hit: the shipping of a new product was often marked with awards and tasty baked treats – at least on teams building browsers, like the IE and Firefox teams.

But by the 2010s, shipping frequency had increased by so much and was an everyday, unremarkable occurrence at some places, according to Matt Brubeck, a former engineer on the Firefox team:

“Back when Firefox 2 was released (six years ago this week!), the Internet Explorer team started a friendly tradition of sending Mozilla a cake as congratulations. This continued for Firefox 3 and Firefox 4. After Firefox switched from major releases once or twice a year to incremental updates every six weeks, they sent us a cupcake for the next few updates instead. :)” Mozilla engineer, Matt Brubeck

Fruits of success: Cake for Mozilla Firefox team in 2006 after shipping Firefox 1

Today, Firefox ships a stable version about once a month, as does Chrome. In this context, marking each release with more cake could inadvertently cause some health issues on the team – too much cake, that is! From this September, Chrome will switch to shipping every two weeks.

Agile and Scrum is worth a mention; not as a technology, but a methodology: Scrum encourages teams to move in smaller cycles and deliver more frequently, via sprints that typically range from a week to a month. In the early 2000s, this methodology spread quickly and brought efficiency improvements to many tech companies. However, by the early 2020s, many startups and some of Big Tech had moved on, as covered in How Big Tech runs tech projects and the curious absence of Scrum:

“Scrum got in the way of shipping on a daily basis. The whole idea of Scrum revolves around Sprints, of committing to tasks at the beginning of the sprint, working on these during the sprint, and demoing what we did at the end.
The process felt unnatural and like it had been forced on a fast-moving web team. We soon moved to a more fluid way of working, taking the Kanban approach. We stopped caring about sprints, and dropped most rituals that come with Scrum. We just cared about knowing what we’re working on now, and what it was we’d get done next.”

Basically, Scrum worked and still does so for teams wanting to shorten shipping cadence from months to weeks. But for teams shipping daily, it often gets in the way.

One area that improved significantly has been the pace of shipping incremental software. In 1975, shipping software several times per day with elements like version control, CI/CD, feature flags and engineers being oncall, might have sounded far-fetched. Back then, software delivery was measured in months and years. In this way, we’ve perhaps made improvements overall in the regions of 10x to 100x over the years.

But that came via combinations of new tools like version control and CI/CD, new approaches & methodologies, and testing – and also from shifting constraints; for example, it’s now possible to revert backend changes rapidly, and code shipped in binaries can be controlled by feature flags in many cases.

Even so, improvements were mainly in iteration speed and not necessarily in the complexity of the software shipped. With all that progress, shipping complex and high-quality software still takes comparable time, often years, as 50 years ago. A prime example is the upcoming video game, Grand Theft Auto VI, probably by now the most highly-anticipated game ever, which is set to launch in November, after at least six years – and potentially 12 – of total development time:

Initial planning started in 2014 (12 years of development)
Development started in earnest in 2020 (circa six years of full-on development)
The studio, Rockstar, confirmed development was underway in February 2022 (at least 4.5 years of full development)

GTA VI: more ambitious but also slower

The video game development timeline is as long as it ever was, and even longer, as developers strive to meet players’ expectations on things like graphics, lighting, and physics. GTA 6 looks like being the most complex installment in the long-running series. So, perhaps there’s not been much change in software delivery timelines because when we have more capabilities to work with, the goals get more ambitious and the bar for “standout” software keeps rising.

2. Is SRE a silver bullet?

Brooks’s definition of a silver bullet:

“A single development, in either technology or management technique, which by itself promises even one order of magnitude improvement in productivity, in reliability, in simplicity.”

In simplicity and productivity terms, I struggle to name a single approach that delivered a 10x-or-more improvement by itself. But in the area of reliability, one company that has pioneered novel approaches since the 2000s is Google. Google.com is probably the single most reliable piece of internet software of all. In the last 15 years, Google Search has suffered a single outage, on 8 August 2022, which lasted around an hour. Otherwise, there have been no global outages (of course, there have been several for other Google services).

In 2003, Google created the ‘Site Reliability Engineering’ (SRE) role. SRE veteran, Dave O’Connor, shared with us:

“The fervent belief of Google’s founders was that speed and reliability mattered more than features. This belief was coupled with the understanding that we couldn’t achieve it traditionally, which made it an existential issue. The level of investment in building out all layers of the serving stack was a case of “because we can”, but also “because we have to, as nowhere else does what we need”.
There was never a question of whether traditional ‘ops’ would work at Google. We needed a specialized role, staffed by folks familiar with the problem space and engineering methods required to make it work.
In 2003, the SRE role was born. Ben Treynor Sloss had been tasked with building Google’s “production team” and in his own words, he built “what happens when you ask a software engineer to design an operations team.” This turned into the birth of the SRE function at Google. From the outset, SRE was staffed in varying measures by systems/operations experts and software engineers. A large part of the remit of the team was to build the tools and practices required to operate Google’s fleet.”

Over time, the rest of the industry caught on to SRE and DevOps. From our SRE deepdive:

“Eventually, other companies caught onto the scaling issues, especially the hyperscalers. Each had their own approach, but over time, the notion grew industry-wide that making things reliable was a real-life engineering discipline, not simply ‘ops’.
This step saw a number of terms coined to describe this engineering, including ‘DevOps’. At its core, this was the notion that the disciplines and practices of reliability engineering should be ingrained into the overall engineering organization. At places other than Google, this mostly took the form of combined developer/operations roles (i.e. “you build it, you run it”), which differed from Google’s implementation, but the practices were similar.
Around this time, Google started opening up about SRE, eventually publishing the first SRE book, and follow ups. Conferences such as USENIX SRECon, Devops Days, and other movements have solidified reliability engineering as a discipline that scales well beyond Google. Indeed, the company has become a consumer of many state-of-the-art developments.”

So, at Google Search, the SRE role could be described as a genuine silver bullet for the tech giant. The company’s obsession with reliability helped it build what is probably the most reliable public-facing service of all. On the assumption that SRE plays a significant role in the approach, I would feel comfortable with calling SRE a silver bullet for Google Search.

SRE, as a concept, is commonplace across Google, but the reliability of its other services is not so impressive. For example, Google Cloud has had many outages, and Gmail also goes down every now and then. I’m sure that without SRE, reliability would be worse, but in general, Google services’ availability these days is probably a magnitude higher than the availability of most online services in the 2000s.

Similarly, GitHub has an SRE role but the service is at zero nines of availability, partially explained by a 3.5x increase in load in two years. But in other ways, the zero nines is likely self-inflicted.

This makes me wonder if the existence of silver bullets depends greatly on teams and individual contexts. SRE seems like a good case to consider:

a “silver bullet” for Google Search
… but not for other Google services
… and definitely not for the broader industry

Could it be that when implemented in the right place, in the right way, and with the correct investment, then SRE – and an incredible focus on reliability – will yield a 10x-or-higher increase in reliability?

My hunch is that Google Search has such standout reliability not just because of SRE, but because Search might be the only organization in Google with reliability as a founding value, embedded in the team’s culture, with unmatched investments of time and money.

Google has published several books that explain their techniques and practices, but for other teams to get those results, they would need to invest similarly in reliability.

3. Was open source + GitHub a silent silver bullet?

Perhaps there’s a silver bullet which is easily missed: open source. In the first-ever Pragmatic Engineer Podcast episode, I asked software engineer Simon Willison what the biggest “productivity leaps” have been during his career. He named open source:

The Pulse: Did capacity shortages turn Anthropic hostile to devs?

Gergely Orosz — Thu, 07 May 2026 16:45:31 GMT

The Pulse is a series covering events, insights, and trends within Big Tech and startups. Notice an interesting event or trend? Hit reply and share it with me.

Today, we cover:

Did Anthropic turn hostile on devs because capacity was running low? For the past few weeks, Anthropic has continually upset devs with its “dumber” model, and by removing Claude Code access from some paid accounts. After securing lots of compute from SpaceX, could the reason have been to conceal capacity issues?
Amazon finally allows Claude Code and Codex usage. The online retail giant wanted to improve its own coding agent, Kiro, so banned other AI coding tools. But that ban is now lifted.
Meta forcefully assigns engineers to data labelling ahead of job cuts. In several teams, 20-40% of engineers are given menial, data labelling work. Could that actually boost people’s job security – for now?
New trend: small “AI-forward” teams. Meta and Amazon’s CEOs say teams of 5-10 devs do better work than teams of 50. There are important caveats: it’s unclear what they’ll do with the “excess,” and if it’s limited to “mechanical” work like rewrites.
Industry Pulse. Why Meta tracks employees’ computer activity, OpenAI starts to move off Datadog, Apple lets slip it uses Claude Code, GitHub → Xbox transfers at Microsoft, VS Code inserted “coathored by Copilot” even when Copilot did nothing, analysis of the Coinbase layoffs.

Before we start, last week I covered big pricing changes coming from GitHub, but it seems I underplayed how big they will be. Reader Julien has helpfully clarified the actual impact (thank you!)

The multipliers will be increased for Pro and Pro+ plans on annual renewal (roughly a ~3x increase on average)
With other plans (Pro, Pro+, Business, Enterprise) it’s more drastic; they will adopt API token-based pricing and drop request-based pricing.

This is indeed a massive change; GitHub has heavily subsidized its usage, relative to API billing, and this change will make pricing unpredictable as of 1 June – at least initially. As Julien put it:

“We are basically waiting to see how much we will be able to use our actual subscription with the new pricing 😇”

1. Did Anthropic turn hostile on devs because capacity was running low?

Designing Data-Intensive Applications: The Cloud & Doing the Right Thing

Gergely Orosz — Tue, 05 May 2026 16:46:00 GMT

In 2016, Martin Kleppmann published ‘Designing Data-Intensive Applications’, which quickly became a go-to book for those of us building backend applications and distributed systems. In it, Martin combined his experience as a startup founder with observations from his time at LinkedIn, and invested years of rigorous, fulltime research in the title.

Nine years later, he felt the time was ripe for an updated edition, with cloud computing much more widespread than in 2016. So, Martin teamed up with software engineer and investor, Chris Riccomini, a former colleague at LinkedIn and the author of The Missing README, for a full refresh of the book which brings it right up to date for the present day.

My copy of the new edition

Martin was recently on The Pragmatic Engineer Podcast, where we discussed this updated volume and many related cloud computing matters. We also looked into some topics that have become less relevant over time, like details on MapReduce.

I asked Martin if this newsletter could share an excerpt of the updated edition of the book about a timeless, important topic, and he generously agreed. So, today we cover:

Cloud versus self-hosting tradeoffs
Doing the right thing as a software engineer

These excerpts are only part of the book; the first edition has been on my shelf for years and is now in well-worn condition. I jumped at the chance to get the second edition, and if you’re interested in building resilient systems, I recommend it as an excellent resource.

Get the second edition of the book

My usual disclaimer: as with all my recommendations, I was not paid for this article, and none of the links are affiliates. See my ethics statement for more.

The excerpt below is from “Designing Data-Intensive Applications” second edition, by Martin Kleppmann and Chris Riccomini. Copyright © 2026 Martin Kleppmann, Chris Riccomini. Published by O’Reilly Media, Inc. Used with permission.

1. Cloud versus self-hosting tradeoffs

This excerpt is from Chapter 1: “Trade-Offs in Data Systems Architecture”

For anything that an organization needs to do, one of the first questions is whether it should be done in-house or outsourced. That is, should you build or should you buy?

Ultimately, this is a question about business priorities. A common rule of thumb is that things that are a core competency or a competitive advantage of your organization should be done in-house, whereas things that are non-core, routine, or commonplace should be left to a vendor [20]. To give an extreme example, most companies do not fabricate their own CPUs, since it is cheaper to buy them from the semiconductor manufacturers.

With software, two important decisions to be made are who builds the software and who deploys it. The spectrum of possibilities is illustrated in Figure 1-2. At one extreme is bespoke software that you write and run in-house; at the other extreme are widely-used cloud services or SaaS products that are implemented and operated by an external vendor and that you access only through a web interface or API.

Figure 1-2. The spectrum of decisions on outsourcing software and its operations

The middle ground is off-the-shelf software (open source or commercial) that you self-host, or deploy yourself – for example, if you download MySQL and install it on a server you control. This could be on your own hardware (often called ‘on-premises,’ even if the server is in a rented datacenter rack and not literally on your own premises), or on a virtual machine (VM) in the cloud (infrastructure as a service, or IaaS). There are more points along this spectrum, such as taking open source software and running a modified version of it.

A related question is how you deploy services, either in the cloud or on premises – for example, whether you use an orchestration framework such as Kubernetes. However, choice of deployment tooling is beyond the scope of this book, since other factors have a greater influence on the architecture of data systems.

Pros & Cons of Cloud Services

Using a cloud service, rather than running comparable software yourself, essentially outsources the operation of that software to the cloud provider. There are good arguments for and against this approach. Cloud providers claim that using their services saves time and money and allows you to move faster compared to setting up your own infrastructure.

Whether using a cloud service is actually cheaper and easier than self-hosting depends very much on your skills and the workload on your systems, however. If you already have experience of setting up and operating the systems you need, and if your load is quite predictable (i.e., the number of machines you need does not fluctuate wildly), then it’s often cheaper to buy your own machines and run the software on them yourself [21, 22].

On the other hand, if you need a system that you don’t already know how to deploy and operate, adopting a cloud service is often easier and quicker than learning to manage the system. Hiring and training staff specifically to maintain and operate the system can get very expensive. You still need an operations team when you’re using the cloud, but outsourcing the basic system administration can free up your team to focus on higher-level concerns.

Outsourcing the operation of a system to a company that specializes in running it can potentially result in better service, since the provider gains operational expertise from providing the service to many customers. On the other hand, if you run the service, you can configure and tune it to perform well on your particular workload. A cloud service would likely be unwilling to make such customizations on your behalf.

Cloud services are particularly valuable if the load on your systems varies a lot over time. If you provision your machines to be able to handle peak load, but those computing resources are idle most of the time, the system becomes less cost-effective. In this situation, cloud services have the advantage that they can make it easier to scale your computing resources up or down in response to changes in demand.

For example, analytical systems often have extremely variable load. Running a large analytical query quickly requires a lot of computing resources in parallel, but once the query completes, those resources sit idle until a user makes the next query. Predefined queries (e.g., for daily reports) can be enqueued and scheduled to smooth out the load, but for interactive queries, the faster you want them to complete, the more variable the workload becomes. If your dataset is so large that querying it quickly requires significant computing resources, using the cloud can save money as you can return unused resources to the provider rather than leaving them idle. For smaller datasets, this difference is less significant.

The biggest downside of a cloud service is that you have no control over it:

If it is lacking a feature you need, all you can do is politely ask the vendor whether they will add it; you generally cannot implement it yourself.
If the service goes down, all you can do is wait for it to recover.
If you are using the service in a way that triggers a bug or causes performance problems, diagnosing the issue will be difficult. With software that you run yourself, you can get performance metrics and debugging information from the operating system to help you understand its behavior, and you can look at the server logs. With a service hosted by a vendor, you usually do not have access to these internals.
If the service shuts down or becomes unacceptably expensive, or if the vendor changes their product in a way you don’t like, you are at their mercy; continuing to run an old version of the software is usually not an option, so you’ll be forced to migrate to an alternative service [23]. This risk is mitigated if alternative services expose a compatible API, but for many cloud services there are no standard APIs, which raises the cost of switching, making vendor lock-in a problem.
If the cloud provider is in another country and a political conflict arises between that country and your own, you risk being locked out of the service due to imposed sanctions.
The cloud provider needs to be trusted to keep the data secure, which can complicate the process of complying with privacy and security regulations.

Despite all these risks, it has become more and more popular for organizations to build new applications on top of cloud services, or to adopt a hybrid approach in which cloud services are used for some aspects of a system. However, cloud services will not subsume all in-house data systems. Many older systems predate the cloud, and for any services that have specialist requirements that existing cloud services cannot meet, in-house systems remain necessary. For example, very latency-sensitive applications such as high-frequency trading require full control of the hardware.

Cloud-Native System Architecture

Besides having a different economic model (subscribing to a service instead of buying hardware and licensing software to run on it), the rise of the cloud has also had a profound effect on how data systems are implemented on a technical level. The term “cloud native” is used to describe an architecture that is designed to take advantage of cloud services.

In principle, almost any software that you can self-host could also be provided as a cloud service, and indeed, such managed services are now available for many popular data systems. However, systems that have been designed from the ground up to be cloud native have been shown to have several advantages: better performance on the same hardware, faster recovery from failures, being able to quickly scale computing resources to match the load, and supporting larger datasets [24, 25, 26]. Table 1-2 lists some examples of both types of systems.

Table 1-2. Examples of self-hosted and cloud-native database systems

Layering of cloud services

Many self-hosted data systems have simple system requirements; they run on a conventional operating system such as Linux or Windows, they store their data as files on the filesystem, and they communicate via standard network protocols such as TCP/IP. A few systems depend on special hardware such as GPUs (for ML) or remote direct memory access (RDMA) network interfaces, but on the whole, self-hosted software tends to use generic computing resources: CPUs, RAM, a filesystem, and an IP network.

In a cloud, this type of software can be run in an IaaS environment, using one or more VMs (or instances) with a certain allocation of CPUs, memory, disk, and network bandwidth. Compared to physical machines, cloud instances can be provisioned faster and come in a greater variety of sizes, but otherwise they are similar to traditional computers: you can run any software you like on them, but you are responsible for administering it yourself.

In contrast, the key idea of cloud-native services is not only to use the computing resources managed by your operating system, but also to build upon lower-level cloud services to create higher-level services. For example:

Object storage services such as Amazon S3, Azure Blob Storage, and Cloudflare R2 store large files. They provide more limited APIs than a typical filesystem (basic file reads and writes), but they have the advantage that they hide the underlying physical machines; the service automatically distributes the data across many machines so that you don’t have to worry about running out of disk space on any one machine. Even if some machines or their disks fail entirely, no data is lost.
Many other services are, in turn, built upon object storage and other cloud services. For instance, Snowflake is a cloud-based analytical database (data warehouse) that relies on S3 for data storage [26], and some other services, in turn, build upon Snowflake.

As always with abstractions in computing, there is no one right answer to what you should use. As a general rule, higher-level abstractions tend to be more oriented toward particular use cases. If your needs match the situations for which a higher-level system is designed, using the existing higher-level system will probably meet your needs with much less hassle than building it yourself from lower-level systems would. On the other hand, if no high-level system meets your needs, building it yourself from lower-level components is the only option.

Separation of storage and compute

In traditional computing, disk storage is regarded as durable (we assume that once something is written to disk, it will not be lost). To tolerate the failure of an individual hard disk, RAID (redundant array of independent disks) is often used to maintain copies of the data on several disks attached to the same machine. RAID can be implemented either in hardware or in software by the operating system, and it is transparent to the applications accessing the filesystem.

In the cloud, compute instances (VMs) may also have local disks attached, but cloud-native systems typically treat these disks more like an ephemeral cache and less like long-term storage. This is because the local disk becomes inaccessible if the associated instance fails, or if the instance is replaced with a bigger or a smaller one (on a different physical machine) to adapt to changes in load.

As an alternative to local disks, cloud services also offer virtual disk storage that can be detached from one instance and attached to a different one (e.g., Amazon EBS, Azure managed disks, and persistent disks in Google Cloud). Such a virtual disk is not a physical disk, but rather a cloud service provided by a separate set of machines that emulates the behavior of a disk (a block device, where each block is typically 4 KiB in size). This technology makes it possible to run traditional disk-based software in the cloud, but the block device emulation introduces overheads that can be avoided in systems that are designed from the ground up for the cloud [24]. The use of virtual disks also makes the application very sensitive to network glitches, since every I/O operation on the virtual block device is a network call [27].

To address this problem, cloud-native services generally avoid using virtual disks and instead build on dedicated storage services that are optimized for particular workloads. Object storage services such as S3 are designed for long-term storage of fairly large files, ranging from hundreds of kilobytes to several gigabytes in size. The individual rows or values stored in a database are typically much smaller than this; cloud databases therefore typically manage smaller values in a separate service and store larger data blocks (containing many individual values) in an object store [25, 28].

In traditional systems architecture, the same computer is responsible for both storage (disk) and computation (CPU and RAM), but in cloud-native systems, these two responsibilities have become somewhat separated, or disaggregated [9, 26, 29, 30]: for example, S3 only stores files, and if you want to analyze that data, you will have to run the analysis code somewhere outside of S3. This implies transferring the data over the network.

Furthermore, cloud-native systems are often multitenant, which means that rather than having a separate machine for each customer, data and computation from several customers are handled on the same shared hardware by the same service [31]. Multitenancy can enable better hardware utilization, easier scalability, and easier management by the cloud provider, but it also requires careful engineering to ensure that one customer’s activity does not affect the performance or security of the system for other customers [32].

Operations in the Cloud Era

Traditionally, the people managing an organization’s server-side data infrastructure were known as database administrators (DBAs), or system administrators (sysadmins). More recently, many organizations have tried to integrate the roles of software development and operations into teams with a shared responsibility for both backend services and data infrastructure; the DevOps philosophy has guided this trend. Site reliability engineers (SREs) are Google’s implementation of this idea [33].

The role of operations is to ensure that services are reliably delivered to users (including configuring infrastructure and deploying applications) and to ensure a stable production environment (including monitoring and diagnosing any problems that may affect reliability). For self-hosted systems, operations traditionally involve a significant amount of work at the level of individual machines, such as capacity planning (e.g., monitoring available disk space and adding more disks before you run out of space), provisioning new machines, moving services from one machine to another, and installing operating system patches.

Many cloud services present an API that hides the individual machines implementing the service. For example, cloud storage replaces fixed-size disks with metered billing, where you can store data without planning your capacity needs in advance, and you are then charged based on the space used. Moreover, many cloud services remain highly available, even when individual machines have failed.

This shift in emphasis from individual machines to services has been accompanied by a change in the role of operations. The high-level goal of providing a reliable service remains the same, but the processes and tools have evolved.

The DevOps/SRE philosophy places greater emphasis on the following:

Setting up automation; preferring repeatable processes over manual one-off jobs
Using ephemeral VMs and services rather than long-running servers
Enabling frequent application updates
Learning from incidents
Preserving the organization’s knowledge about the system, even as individuals come and go [34]

With the rise of cloud services, a bifurcation of roles has occurred. Operations teams at infrastructure companies specialize in the details of providing a reliable service to a large number of customers, while the customers of the service spend as little time and effort as possible on infrastructure [35].

Customers of cloud services still require operations, but they focus on different aspects, such as choosing the most appropriate service for a given task, integrating services with each other, and migrating from one service to another. Even though metered billing removes the need for capacity planning in the traditional sense, it’s still important to know what resources you are using for which purpose so that you don’t waste money on cloud resources that are not needed. Capacity planning becomes financial planning, and performance optimization becomes cost optimization [36]. Additionally, cloud services do have resource limits or quotas (such as the maximum number of processes you can run concurrently), which you need to know about and plan for before you run into them [37].

Adopting a cloud service can be easier and quicker than provisioning and running your own infrastructure, although you still have to learn how to use the cloud service and perhaps work around its limitations. Integration among services becomes a particular challenge as a growing number of vendors offer an ever-broader range of cloud services targeting different use cases [38, 39]. ETL is only part of the story; operational cloud services also need to be integrated with each other. At present, we lack standards to facilitate this sort of integration, so it often involves significant manual effort.

Other operational aspects that cannot fully be outsourced to cloud services include maintaining the security of an application and the libraries it uses, managing the interactions between your own services, monitoring the load on your services, and tracking down the cause of problems such as performance degradations or outages. While the cloud is changing the role of operations, the need for operations is as great as ever.

2. Doing the right thing as a software engineer

The excerpt below is a section from Chapter 14, “Doing the Right Thing”

In the final chapter of this book, let’s take a step back. Throughout, we have examined a wide range of architectures for data systems, evaluated their pros and cons, and explored techniques for building reliable, scalable, and maintainable applications. However, we have left out a fundamental part of the discussion, which we should now fill in.

Every system is built for a purpose; every action we take has both intended and unintended consequences. The purpose may be as simple as making money, but the consequences may be far-reaching. We, the engineers building these systems, have a responsibility to carefully consider those consequences and to ensure that our decisions do not cause harm.

We talk about data as an abstract thing, but remember that many datasets are about people: their behavior, their interests, their identities. We must treat such data with humanity and respect. Users are humans too, and human dignity is paramount [1].

Software development increasingly involves making important ethical choices. There are guidelines to help software engineers navigate these issues, such as the ACM Code of Ethics and Professional Conduct [2], but they are rarely discussed, applied, or enforced in practice. As a result, engineers and product managers sometimes take a cavalier attitude to privacy and the potential negative consequences of their products [3, 4].

A technology is not good or bad in itself – what matters is how it is used and how it affects people. This is true of a software system such as a search engine in much the same way as it is for a weapon like a gun. The ethical responsibility is ours to bear; it is not sufficient for software engineers to focus exclusively on the technology and ignore its consequences.

In contrast to much of computing, however, the concepts at the heart of ethics are not fixed or determinate in their precise meaning; they require interpretation, which may be subjective [5]. What makes something “good” or “bad” is not well defined, and serious discourse on the subject among computing professionals is lacking [6]. Reasoning about ethics is difficult, but also too important to ignore. What does this entail? “Ethics” are not a checklist with which to comply; it’s a participatory and iterative process of reflection, in dialogue with people involved and accountability for the results [7].

Predictive Analytics

Predictive analytics is a major part of why people are excited about big data and AI. It’s also an area that is fraught with ethical dilemmas. Using data analysis to predict the weather, or the spread of diseases, is one thing [8]; it is another matter to predict whether a convict is likely to reoffend, whether an applicant for a loan is likely to default, or whether an insurance customer is likely to make expensive claims [9]. The latter have a direct effect on people’s lives.

Naturally, payment networks want to prevent fraudulent transactions, banks want to avoid bad loans, airlines want to avoid hijackings, and companies want to avoid hiring ineffective or untrustworthy people. From their point of view, the cost of a missed business opportunity is low, but the cost of a bad loan or a problematic employee is much higher, so it is expected for organizations to be cautious. If in doubt, they are better off saying “no”.

However, as algorithmic decision making becomes more widespread, someone who has (accurately or falsely) been labeled as risky by an algorithm may suffer a large number of “no” decisions. Systematically being excluded from jobs, air travel, insurance coverage, property rental, financial services, and other key aspects of society is such a large constraint on an individual’s freedom that it has been called “algorithmic prison” [10]. In countries that respect human rights, the criminal justice system presumes innocence until proven guilty; on the other hand, automated systems can systematically and arbitrarily exclude a person from participating in society without any proof of guilt and little chance of appeal.

Bias & discrimination

Decisions made by an algorithm are not necessarily any better or any worse than those made by a human. Everyone is likely to have biases, even if they actively try to counteract them, and discriminatory practices can become culturally institutionalized. There is hope that basing decisions on data, rather than subjective and instinctive human assessments, could be more fair and give a better chance to people who are often overlooked or disadvantaged in the traditional system [11].

When we develop predictive analytics and AI systems, we are not merely automating a human’s decision by using software to specify the rules for when to say “yes” or “no”; we are leaving the rules themselves to be inferred from data. However, the patterns learned by these systems are opaque: even if the data indicates a correlation, we may not know why. If the input to an algorithm carries a systematic bias, the system will most likely learn and amplify that bias in its output [12].

In many countries, anti-discrimination laws prohibit treating people differently depending on protected traits such as ethnicity, age, gender, sexuality, disability, or beliefs. Other features of a person’s data may be analyzed, but what happens if they are correlated with protected traits? For example, in racially segregated neighborhoods, a person’s postal code or even their IP address is a strong predictor of race. Put like this, it seems ridiculous to believe that an algorithm could somehow take biased data as input and produce fair and impartial output from it [13, 14]. Yet this belief often seems to be implied by proponents of data-driven decision making; an attitude that has been satirized as “machine learning is like money laundering for bias” [15].

Predictive analytics systems merely extrapolate from the past; if the past is discriminatory, they codify and amplify that discrimination [16]. If we want the future to be better than the past, moral imagination is required, and that’s something only humans can provide [17]. Data and models should be our tools, not our masters.

Responsibility and Accountability

Automated decision-making raises the question of responsibility and accountability [17]. If a human makes a mistake, they can be held accountable, and the person affected by the decision can appeal. Algorithms make mistakes too, but who is accountable when they go wrong? [18] When a self-driving car causes an accident, who is responsible? If an automated credit scoring algorithm systematically discriminates against people of a particular race or religion, is there any recourse? If a decision by your ML system comes under judicial review, can you explain to the judge how the algorithm made its decision? People should not be able to evade responsibility by blaming an algorithm.

Credit rating agencies are a classic example of collecting data to make decisions about people. A bad credit score makes life difficult, but at least a credit score is normally based on relevant facts about a person’s actual borrowing history, and any errors in the record can be corrected (although the agencies normally do not make this easy). Scoring algorithms based on machine learning, however, typically use a much wider range of inputs and are much more opaque, making it harder to understand how a particular decision has come about and whether someone is being treated in an unfair or discriminatory way [19].

A credit score summarizes “how did you behave in the past?” whereas predictive analytics usually work on the basis of “who is similar to you, and how did people like you behave in the past?” Drawing parallels to others’ behavior implies stereotyping people; for example, based on where they live (a close proxy for race and socioeconomic class). What about people put in the wrong bucket? Furthermore, if a decision is incorrect because of erroneous data, recourse is almost impossible [17].

Much data is statistical in nature, which means that even if the probability distribution on the whole is correct, individual cases may well be wrong. For example, if the average life expectancy in your country is 80 years, that doesn’t mean you’re expected to drop dead on your 80th birthday. From the average and the probability distribution, you can’t say much about the age to which someone will live. Similarly, the output of a prediction system is probabilistic and may well be wrong in individual cases.

A blind belief in the supremacy of data for making decisions is not only delusional, but also positively dangerous. As data-driven decision making becomes more widespread, we will need to figure out how to avoid reinforcing existing biases, how to make algorithms accountable and transparent, and how to fix them when they inevitably make mistakes.

We will also need to figure out how to realize the positive potential of data and prevent it from being used to harm people. For example, analytics can reveal financial and social characteristics about personal lives. On the one hand, this power could be used to focus aid and support to help those who need it most. On the other hand, it is sometimes used by predatory businesses seeking to identify vulnerable people and sell them risky products such as high-cost loans or worthless college degrees [17, 20].

Feedback loops

Even with predictive applications with less immediately far-reaching effects on people, such as recommendation systems, there are difficult issues that we must confront. When services become good at predicting the content users want to see, they may end up showing them only opinions they already agree with, leading to echo chambers in which stereotypes, misinformation, and polarization can breed. We already know the impact that social media echo chambers can have on election campaigns.

When predictive analytics affect people’s lives, particularly pernicious problems arise because of self-reinforcing feedback loops. For example, consider the case of employers using credit scores to evaluate potential hires. You may be a good worker with a good credit score, but suddenly find yourself in financial difficulties due to a misfortune beyond your control. As you miss payments on your bills, your credit score suffers, and you will be less likely to find work. Joblessness pushes you toward poverty, which further worsens your score, making it even harder to find employment [17]. It’s a downward spiral due to poisonous assumptions, hidden behind a camouflage of mathematical rigor and data.

As another example of a feedback loop, economists found that when gas stations in Germany introduced algorithmic prices, competition was reduced and prices for consumers went up because the algorithms learned to collude [21].

We can’t always predict when such feedback loops may happen. However, many consequences can be predicted by thinking about an entire system (not just the computerized parts, but also the people interacting with it), in an approach known as “systems thinking” [22]. We can try to understand how a data analysis system responds to different behaviors, structures, or characteristics. Does the system reinforce and amplify existing differences between people (e.g., making the rich richer or the poor poorer), or does it try to combat injustice? Even with the best intentions, we must beware of the possibility of unintended consequences.

Surveillance

The excerpt below is from another section in Chapter 14, “Doing the Right Thing”

As a thought experiment, try replacing the word “data” with “surveillance”, and observe whether common phrases still sound so good [23]. How about this: “In our surveillance-driven organization we collect real-time surveillance streams and store them in our surveillance warehouse. Our surveillance scientists use advanced analytics and surveillance processing in order to derive new insights.”

This thought experiment is unusually polemical for this book, “Designing Surveillance-Intensive Applications”, but strong words are needed to emphasize this point. In our attempts to make software “eat the world” [24], we have built the greatest mass surveillance infrastructure ever seen. We are rapidly approaching a world in which every inhabited space contains at least one internet-connected microphone, in the form of smartphones, smart TVs, voice-controlled assistant devices, baby monitors, and even children’s toys that use cloud-based speech recognition. Many of these devices have terrible security track records [25].

What is new compared to the past is that digitization has made it easy to collect large amounts of data about people. Surveillance of our location and movements, our social relationships and communications, our purchases and payments, and our health data has become almost unavoidable. A surveillance organization may end up knowing more about a person than that person knows about themselves; for example, identifying illnesses or economic problems before that individual is aware of them.

Even the most totalitarian, repressive regimes of the past could only dream of putting a microphone in every room and forcing every person to constantly carry a device capable of tracking their location and movements. Yet the benefits that we get from digital technology are so great that we now voluntarily accept this state of total surveillance. The difference is just that the data is being collected by corporations to provide us with services, rather than government agencies seeking control [26].

Not all data collection necessarily qualifies as surveillance, but examining it as such can help us understand our relationship with the data collector. Why are we seemingly happy to accept surveillance by corporations? Perhaps you feel you have nothing to hide; in other words, you are totally in line with existing power structures, you are not a marginalized minority, and you needn’t fear persecution [27]. Not everyone is so fortunate. Or perhaps it’s because the purpose seems benign; it’s not overt coercion and conformance, merely better recommendations and more personalized marketing. However, combined with the discussion of predictive analytics from the last section, that distinction seems less clear.

We are already seeing behavioral data about driving, tracked by vehicles without drivers’ consent, affecting their insurance premiums [28], and health insurance coverage that depends on people wearing a fitness tracking device. When surveillance is used to make decisions that hold sway over important aspects of life, such as insurance coverage or employment, it starts to appear less benign. Data analysis can also reveal surprisingly intrusive things; for example, the movement sensor in a smartwatch or fitness tracker can be used to work out what you are typing (e.g., passwords) with fairly good accuracy [29]. Sensor accuracy and algorithms for analysis are only going to get better.

Takeaways

Thanks to Martin for writing this book, and to himself and Chris for doing a revamp for the second edition. The volume is now even more relevant to how we build systems in 2026 and beyond. You can purchase a hard copy from the publisher’s website or Amazon.

The first edition has a timeless quality because it focused on the fundamentals of large systems, and the new second edition follows the same approach, as laid out in its preface:

“Although the landscape of technologies for processing and storing data is diverse and fast-changing, the underlying principles endure. If you understand those principles, you’re in a position to see where each tool fits in, how to make good use of it, and how to avoid its pitfalls. This book focuses on those principles.”

Since the first edition appeared nine years ago, some things have changed in the tech industry:

Much greater focus on the cloud. Building large systems on top of cloud infrastructure is more common. This brings lower complexity as cloud primitives hide a lot of implementation complexity, but it also means accepting more risk because when the cloud is down, so is your system.
Systems which AI tools build upon are more relevant. Vector databases, DataFrames (for training datasets), and the processing of large amounts of training data with batch processing systems are relevant to anyone building production AI systems.
Local-first software. Martin focuses on this area in his work, and with AI, we could see more demand for running models locally. Operating systems like Ubuntu are also focusing on this.
Formal methods. The advent of AI-generated code means this topic is getting more attention industry-wide, and the second edition covers it.
Regulation and legal context. Regulations like the EU’s General Data Protection Regulation (GDPR) are something software engineers increasingly need to know about, and the book now covers it.

If I had to summarize the evolution of the book in its second edition, it would be more focus on cloud and AI, and more on local-first software, testing, and how regulations affect engineers. Interestingly, this mirrors how the tech industry has developed over time, too.

I very much appreciate that the book closes with the final chapter focused on “doing the right thing” as a software engineer. Software systems have wide-ranging societal impact, and engineers working on these systems have a great say in what gets built, and how it gets built. As engineers, we owe it the very least to ourselves to consider the broader impact of our decisions — and doing so might also force us to make important ethical choices. There’s less discussion of the ethics angle on software engineering: and I’m glad that Martin and Chris did not shy away from going deeper into this topic.

If you’d like to get more background on the book – and on the hard parts of building large-scale systems – check out our podcast episode with Martin Kleppmann.

The Pulse: AI load breaks GitHub – why not other vendors?

Gergely Orosz — Thu, 30 Apr 2026 14:23:43 GMT

The Pulse is a series covering events, insights, and trends within Big Tech and startups. Notice an interesting event or trend? Hit reply and share it with me.

Today, we cover:

Load from AI breaks GitHub – but why not other vendors? GitHub’s reliability is less than one nine, and getting worse. Prolific open source contributor, Mitchell Hashimoto, is quitting GitHub because he thinks it’s not suited for professional work. GitHub’s leadership blames the 3.5x increase in service load as the cause of degradation – or it might be self-inflicted.
Anthropic’s speedrun to destroy trust. Anthropic could do no wrong until recently, but in the past month, that’s all changed. Silently nerfing Claude Code, banning companies from Claude, and baffling price rises all add to a sense that Anthropic is in its “extraction” era of generating more revenue for the same or worse service.
Industry pulse. Dramatic price increases at GitHub Copilot, explosive growth at Codex, Google scrambling to build a good coding model, Cursor might be bought by SpaceX, AI agent deletes car business, and more.
Mitchell Hashimoto & the “building block economy.” Ghostty’s creator finds that open source “building blocks” are the best way to win massive adoption by software components – but it’s got harder to build a business on top of open building blocks.

The bottom of this article could be cut off in some email clients. Read the full article uninterrupted, online.

1. Load from AI breaks GitHub – but why not other vendors?

GitHub’s reliability has been beyond unacceptable recently: last month, third party measurements pinned it at one nine (right at 90%). This month, reliability has been down to zero nines – 86% – as per a third-party tracker, and last week, things got even worse: a frankly embarrassing data integrity incident, more outages, and a partial explanation from GitHub, eventually.

Data integrity incident

Building Pi, and what makes self-modifying software so fascinating

Gergely Orosz — Wed, 29 Apr 2026 14:30:17 GMT

Stream the latest episode

Listen and watch now on YouTube, Spotify, and Apple. See the episode transcript at the top of this page, and timestamps for the episode at the bottom.

Brought to You by

• Statsig – ⁠ The unified platform for flags, analytics, experiments, and more. Stop switching between different tools, and have them all in one place.

• Sonar — The makers of SonarQube, the industry standard for code verification and automated code review. As AI agents generate extreme volumes of code, verification can’t be optional: SonarQube acts as the independent, zero‑trust, multi-layered verification engine that checks every line of code against your quality, security, and architectural standards, so only safe, reliable, and auditable code reaches production. Try it out for yourself.

• WorkOS – Designing large systems is about tradeoffs. But one thing isn’t a tradeoff: enterprise features. WorkOS gives you APIs to ship enterprise features – SSO, directory sync, RBAC, audit logs – in days, not months. Visit WorkOS.com to learn more.

In this episode

Mario Zechner is the creator of Pi, a minimalist, self-modifying AI coding agent, that is the foundation upon which OpenClaw (created by Peter Steinberger) is built. Meanwhile, Armin Ronacher is the creator of Flask, and a longtime user of Pi. The pair are also friends.

I sat down with Mario and Armin for the latest episode of the Pragmatic Engineer Podcast for an interesting conversation about AI and their reservations about it – even though both are heavily invested in building AI-powered tools.

Mario explains why he built Pi, and gives his take on why it has become so popular. Armin walks us through how he uses AI tools, including building a game with Pi, and why he always puts human judgment firmly at the heart of his approach.

We cover the risks of over-automation, the limits of agentic workflows, and why strong engineers with informed judgment still matter. We also get into the challenges of working with code written by non-engineers, and whether open source can withstand a tidal wave of agent-generated code.

My observations from the conversation with Mario and Armin

Here are 9 of my most interesting takeaways from talking with Armin and Mario:

1. Pi was built because Claude Code became unpredictable. Mario was a big fan of Claude Code at first. But as the team behind it pushed velocity and added features, he found that bugs multiplied and the tool’s behavior started to change. Mario wanted an AI harness that behaves in a stable, consistent way. He observed that the addition of new features caused Claude Code to act unpredictably, so resolved to add as few features as possible to Pi.

2. It should be MUCH easier to build specialized tools for specific tasks. Different projects need different harness types because, as Mario points out, the same hammer is not ideal for every single construction job. As such, Pi is built with the goal of allowing the creation of specialized harnesses. It can modify itself so that a user can create the bespoke harness needed for any task. Mario believes it’s a preview of how self-modifiable software might look in the future.

3. Automation bias is one of the biggest risks of working with AI agents. Once devs confirm that an AI agent can produce acceptable code, they start to review its output less often, even though agents can – and do! – produce slop. Mario advises being far more sceptical with agents, and cautions that the quality of their output isn’t guaranteed, however well they performed previously.

4. AI agents decrease code quality, but this is not on purpose. From talking with 30+ engineering teams, Armin found that code quality is down everywhere, and serious projects are shipping with “vibe slop.” A potential cause of this is that keeping agentic output clean and of high quality takes deliberate effort, but it’s not clear to many devs exactly how to do this. There’s also PR review fatigue and automation bias (the assumption that AI agents invariably generate good code).

5. New trend: AI makes it harder for senior engineers to reject pointless complexity. Historically, senior engineers kept software complexity at bay simply by saying “no” a lot. But Armin observes that these days, more junior engineers and product managers deploy agent-scripted counterarguments when a senior colleague kicks an idea to the curb. This makes decision-making exhausting, and more bad ideas make it into production as a result.

6. Junior engineers > AI agents. Mario points out that, unlike humans, agents don’t retain lessons in the same way, nor feel the pain of bad code. Junior engineers do, and the pain of maintenance teaches them to simplify interfaces and avoid bad abstractions – which are both qualities of an effective senior engineer. In this way, a junior engineer is more valuable than an AI agent!

7. Agents refactor less because they feel no “pain.” Humans rewrite bad interfaces because maintaining them hurts, whereas agents will obliviously churn out and extend a terrible structure, ad infinitum. This is a big reason why AI agents keep adding more tech debt.

8. Frictionless shipping can actually be harmful. Armin notes that some friction is desirable; for example, multi-reviewer approvals on critical services, SLO gates (different gates based on the service level objective offered), and migration checklists. The good thing about friction is that it makes humans stop and think.

9. Does not being in San Francisco help people stay grounded about AI? I asked Mario how he keeps level-headed about AI while building one of the most popular AI agent harnesses. In response, he credits living in Austria, being a father, and enjoying the great outdoors, as his antidotes to all the hype.

The Pragmatic Engineer deepdives relevant for this episode

• The creator of OpenClaw: “I ship code that I don’t read”

• Building great SDKs

• What is inference engineering? Deepdive

• The impact of AI on software engineers in 2026: key trends

• Cycles of disruption in the tech industry

• The AI engineering stack

Timestamps

(00:00) Intro

(07:30) How Mario, Armin, and Peter Steinberger met

(15:15) How 30 dev teams use AI agents: learnings

(21:50) The importance of judgment

(24:26) Challenges when non-engineers write code

(28:30) Downsides of over-automation

(32:18) Pi

(48:09) OpenClaw + Pi

(50:54) “Clankers”

(57:32) Open source and AI

(1:00:22) Complexity as the enemy

(1:02:50) Building an AI-native startup

(1:11:52) “Slow the F down”

(1:16:40) MCPs vs. CLI

(1:25:03) Predictions and staying up to date

References

Where to find Mario Zechner:

• X: https://x.com/badlogicgames

• LinkedIn: https://www.linkedin.com/in/mariozechner

• Website: https://mariozechner.at

Where to find Armin Ronacher:

• X: https://x.com/mitsuhiko

• LinkedIn: https://www.linkedin.com/in/arminronacher

• Website: https://mitsuhiko.at

• Blog: https://lucumr.pocoo.org

Mentions during the episode:

• Python, Go, Rust, TypeScript and AI with Armin Ronacher: https://newsletter.pragmaticengineer.com/p/python-go-rust-typescript-and-ai

• Pi: https://pi.dev

• OpenClaw: https://openclaw.ai

• Flask: https://flask.palletsprojects.com/en/stable

• The creator of Clawd: “I ship code that I don’t read”: https://newsletter.pragmaticengineer.com/p/the-creator-of-clawd-i-ship-code

• Amiga 500: https://en.wikipedia.org/wiki/Amiga_500

• i486: https://timeline.intel.com/1989/meet-the-i486

• Peter Steinberger on X: https://x.com/steipete

• Sentry: https://sentry.io

• Nat Friedman on X: https://x.com/natfriedman

• Chroma: https://www.trychroma.com

• Siemens: https://www.siemens.com

• Y Combinator: https://www.ycombinator.com

• The Final Bottleneck: https://lucumr.pocoo.org/2026/2/13/the-final-bottleneck

• Children’s Learning With Tablet Technology is Often Too Passive: https://news.utexas.edu/2017/08/22/childrens-learning-with-tablet-technology-is-often-passive

• Amp: https://ampcode.com

• OpenCode: https://opencode.ai

• Agent Design Is Still Hard: https://lucumr.pocoo.org/2025/11/21/agents-are-hard

• How Linux is built with Greg Kroah-Hartman: https://newsletter.pragmaticengineer.com/p/how-linux-is-built-with-greg-kroah

• Mario’s post on X about complexity:

• VibeTunnel: https://vibetunnel.sh

• Thoughts on slowing the F down: https://mariozechner.at/posts/2026-03-25-thoughts-on-slowing-the-fuck-down

• StackOverflow: https://stackoverflow.com

• David Cramer on LinkedIn: https://www.linkedin.com/in/dmcramer

• Stainless: https://www.stainless.com

—

Production and marketing by Pen Name.

How will AI change operating systems? Part 1: Ubuntu and Linux

Gergely Orosz — Tue, 28 Apr 2026 14:25:18 GMT

AI is affecting how many of us software engineers build; we’re prompting more code and producing much more of it. The tools are also adapting, with command-line interfaces gradually becoming more popular than IDEs. But what about operating systems? To find out, I reached out to the leading Linux distribution – the team at Ubuntu – and the Windows team, about how AI is changing their operating systems.

Today’s article focuses on Linux and Ubuntu, and we’ll cover Windows in a follow-up issue. Obviously, I reached out to Apple but heard nothing back, unsurprisingly. If you’re reading this and happen to work at Apple, it’d be great to learn more!

Jon Seager is VP of Engineering at Canonical – the company behind Ubuntu – and has provided new details about what the team there has built for AI support, and some new ideas that they’re brewing up. Today, we cover:

Hardware enablement: support for GPUs, NPUs and DPUs. When you turn on a machine with AI accelerators, Ubuntu aims for the hardware to perform at its full potential. This means having proper driver support for PCs and cloud data centers’ computing units.
Hardware partnerships. Working closely with NVIDIA, AMD, and Intel means Ubuntu can support those vendors’ new hardware from release day.
CPU architecture variants. New versions in a CPU family add to, or change, features. An operating system needs to support a new version of the CPU architecture variant in order to fully utilize it. Ubuntu does this for the x86‑64 family, making it a lot more performant on newer CPUs – while still supporting older CPUs.
Local-first bet & plans for agentic workflows. There’s a big focus on running local models and using “inference snaps” which help choose the right model with the right quantization. There is the intention to support agentic workflows at the OS level, one day, which is currently at the early exploration stage.
Developer ecosystem. There’s a plan to add more support for AI dev tools, a focus on sandboxing at the OS level, a push to support ARM64 laptops more, and we touch on the popularity of Windows Subsystem for Linux (WSL).
Engineering culture. A skeptical attitude to AI at Canonical has given way to one where experimentation is encouraged and devs lean into AI tools, but there are no targets for token usage or amounts of AI-generated code.
What other Linux distributions are doing. Arch Linux takes the “DIY your AI setup” approach, Omarchy makes it easy to install AI tools, while Red Hat Enterprise Linux ships with AI integrated into the command-line and support for AI accelerators & popular AI tools.

The bottom of this article could be cut off in some email clients. Read the full article uninterrupted, online.

Read the full article online

1. Hardware enablement: support for GPUs, NPUs & DPUs

Jon mentioned he detects a “Dotcom Boom”-era vibe in the industry, like around when “web 1.0” was created, and indeed, lots of startups today aim to be the Google-style success story of this “AI era”. At Canonical, the team asked: what does that mean for Ubuntu as an operating system?

For instance, should Ubuntu join the competition and try to position itself closer to AI, or keep focusing on what they’ve done for decades: build an operating system? Jon said:

“We need to make sure to remain a relatable and accessible system. I don’t think we should blur the line between application features and the OS itself. So, the most powerful thing we can do is hardware enablement.”

Hardware enablement means that if a computer (typically, a laptop) has AI-related hardware, Ubuntu should allow it to make full use of it. This involves adding support for GPUs, NPUs, DPUs and other types of accelerator cards. Let’s briefly go through each.

GPUs

As is likely widely known by readers, ‘GPU’ stands for Graphics Processing Unit. Originally built for graphics rendering, its #1 use case is no longer in video games but for AI training and inference. GPUs come in two forms:

Integrated GPUs: located on the same die (integrated circuit) as the CPU, like GPUs on Apple’s M-series processors
Discrete GPUs: separate chips on their own board; often for gaming, or in standalone GPU rigs for AI and ML workloads

NVIDIA leads the market in discrete GPUs for rigs with its Blackwell family, and in standalone GPU cards with the NVIDIA RTX series. Other vendors like AMD offer GPUs for data centers (like the Instinct MI300 Series) and for PCs with the AMD Radeon series.

Hands full: NVIDIA CEO Jensen Huang with the Blackwell GPU (left) and GB200 superchip. Source: Forbes

NPUs

Neural Processing Units (NPUs) are also called “AI accelerators.” This is a dedicated block on the System-on-a-chip (SoC), on modern processors especially designed for running AI inference efficiently on‑device. Since 2022, many modern processors have had a dedicated NPU block, including all Apple’s M-series chips (from M1 and up), Intel’s Core Ultra and Core Ultra “Series 2”, AMD’s Ryzen AI 300 series, and also Qualcomm’s Snapdragon X Elite and Snapdragon X Plus.

AMD’s Ryzen AI Pro Series 3000 processors have dedicated NPUs, like most modern laptop processors

A number shared for each NPU is TOPS. TOPS means Tera (trillions) of Operations Per Second, and the said operation is a “multiply-accumulate” (MAC) one, which Qualcomm describes as:

“A multiply-accumulate (MAC) operation executes the mathematical formulas at the core of AI workloads. A matrix multiply consists of a series of two fundamental operations: multiplication and addition to an accumulator. A MAC unit can, for example, run one of each per clock cycle, meaning it executes two operations per clock cycle. A given NPU has a set number of MAC units that can operate at varying levels of precision, depending on the NPU’s architecture.”

How TOPS is calculated: TOPS = 2 × MAC unit count × Frequency / 1 trillion.

“Frequency” refers to the clock speed (cycles per second) at which an NPU and its MAC units (as well as a CPU or GPU) operate, which directly influences overall performance. Processors at higher frequencies allow for more operations, but higher frequencies also mean more energy consumed, heat generated, and battery life decreased. The TOPS number that’s quoted for processors is generally the peak operating frequency.

NPUs are often ideal for low-power, local inference, and for running smaller, local models. They can be useful for things like Local speech‑to‑text (dictation, captions, meeting transcription), video background blur/replacement or auto‑framing, small local language summarization, etc. NPUs are more typical of laptop and PC processors, although some phone processors ship with them like the iPhone (A-series chips) and Google’s Tensor processor in Pixel phones. Basically, NPUs promise to bring efficiently-running local models on laptops one step closer.

DPUs

Data Processing Units (DPUs) are typically found in data centers, moving massive amounts of data fast. NVIDIA’s explanation:

“The CPU is for general-purpose computing, the GPU is for accelerated computing, and the DPU, which moves data around the data center, does data processing.
A DPU is a new class of programmable processor that combines three key elements. A DPU is a system on a chip, or SoC, that combines:
An industry-standard, high-performance, software-programmable, multi-core CPU, typically based on the widely used Arm architecture, tightly coupled to the other SoC components.
A high-performance network interface capable of parsing, processing and efficiently transferring data at line rate, or the speed of the rest of the network, to GPUs and CPUs.
A rich set of flexible and programmable acceleration engines that offload and improve applications’ performance for AI and machine learning, zero-trust security, telecommunications, and storage, among others.”

NVIDIA BlueField-3 DPU

Several major chipmakers manufacture DPUs, of which NVIDIA’s BlueField family is the most widespread. Others include AMD Pensando DPUs (Elba, Giglio), and Intel IPU / DPU cards (E2100, E2200 series).

DPUs are most commonly deployed inside Hyperscale cloud providers (AWS, Azure, GCP, OCI), or in AI and high-performance computing (HPC) data centers, or larger private clouds. DPUs make sense when GPU traffic is huge, or when the network telemetry overhead is so great that it could overwhelm the CPUs processing the data transfer.

2. Hardware partnerships

It’s easiest to add support to hardware by working with leading chip manufacturers, so Ubuntu has relationships with hardware vendors for that reason. As a result, the OS sometimes offers day-one support for cutting-edge AI supercomputers.

Partnership with NVIDIA

In September 2025, Canonical announced it would package and distribute the full NVIDIA CUDA toolkit directly within Ubuntu’s repositories. This deal collapsed into a single standard apt install, something that had previously been a multi-step manual installation process of downloading from NVIDIA’s site, importing GPG keys, pinning a separate APT repo – and praying nothing broke.

Packaging and distributing the CUDA toolkit makes developing with CUDA easier. From Jon:

“One of the trickiest things for developers who have to use this tech is the dance of matching the right version of Python, with the right version of CUDA, with the right driver. Projects end up with different versions of CUDA, and then machines end up breaking because the driver configuration gets inadvertently broken along the way.
The number one thing we can do as an operating system is to make this setup as easy as possible.”

Ubuntu’s strategy of working directly with chipmakers seems to be working. NVIDIA recently discontinued its custom NVIDIA DGX OS — a modified Ubuntu it maintained for years — and now ships plain Ubuntu. Jon:

“Previously, NVIDIA shipped NVIDIA DGX OS for which NVIDIA had an agreement with Canonical where they could take Ubuntu, modify it with the kernel modules and software they needed, do some product-specific optimization, and ship that as NVIDIA DGX OS.
This more recent development sees NVIDIA just shipping Ubuntu as it comes.
When NVIDIA released the DGX Spark, a $4,000 AI workstation with an ARM64 chipset, it shipped running vanilla Ubuntu as the only supported operating system.”

NVIDIA DGX Spark AI supercomputer: one of several NVIDIA DGX servers powered by NVIDIA’s DGX OS

At CES 2026 in January, Canonical announced Ubuntu support for the NVIDIA Vera Rubin NVL72 rack-scale architecture, with day-one platform readiness in Ubuntu, version 26.04 LTS (Long-Term Support: at least 15 years for enterprise customers).

The NVIDIA Vera Rubin NVL72 rack

AMD and Intel

It’s clear Ubuntu and NVIDIA enjoy a strong partnership, but Canonical aims to remain neutral, Jon says:

“We have an amazing partnership with NVIDIA, but we do the same with Intel, the same with AMD, the same with Qualcomm, and the same with MediaTek because in reality there is hardware being released every day, and if we don’t maintain those partnerships, the ecosystem becomes even more fragmented than it already naturally is.”

Last December, Ubuntu announced native support for AMD ROCm, and also ships with Intel’s OpenVINO toolkit. Ubuntu 26.04 LTS will be the first major distribution to natively package all three GPU compute stacks — NVIDIA, AMD, and Intel — with long-term enterprise support. Under Ubuntu Pro, ROCm LTS releases receive up to 15 years of security maintenance.

Security maintenance means that if vulnerabilities or critical incompatibilities are discovered in an LTS version, Canonical will patch them even if the upstream vendor no longer supports those versions and no longer backports security patches.

AMD Instinct accelerators are gaining traction in HPCs and sovereign AI deployments, as enterprises look for alternatives to CUDA-locked hardware. AMD’s SVP and Chief Software Officer, Andrej Zdravkovic, said the partnership would make it “easier for developers and enterprises to deploy AMD solutions on supported systems.”

Chip vendors want to collaborate because it means less work for them to add operating system-level support. Jon:

“It’s a win-win on both ends. Silicon companies are in the business of building the best chips they can, and partnering with Canonical means they have to concentrate on fewer things which are not their core focus. My hope is that partnering with Canonical helps them to focus on what they’re best at, while enabling us to help with what we’re best at: integrating, shipping and maintaining a Linux distribution.”

3. Architecture variants

Modern x86 processors support multiple instruction set generations: x86_64 v1, v2, v3, v4, and v5. ARM has a similar hierarchy. Each generation adds capabilities, such as AVX-512 instructions that accelerate machine learning workloads.

Let’s take the x86_64 instruction set. The instruction set is versioned. These are the versions:

For x86_64: v1, v2, v3, v4, v5…
For ARM: ARM v8.2, v8.3, v9…

Until recently, Ubuntu ran slower on newer CPUs in order to keep supporting older ones. So, when installing Ubuntu compiled for AMD64, the OS supported architecture variants for AMD64 v1.

Supporting v1 has the advantage that the oldest of AMD64 processors can run this Ubuntu version. But if Ubuntu decided to support v2 instructions, then v1 processors could not run the OS! The OS did not use the new instructions; for example, a modern processor with hardware accelerators like AVX-512, didn’t use them.

Canonical has reworked its build infrastructure to produce binaries with specific architecture variant support. So, in the case of running an x86_64 v3 compatible processor, you can download an Ubuntu OS variant that’s compiled specifically for x86_64 v3.

One tradeoff the Ubuntu team had to make was building binaries several times, which takes up more processing time and storage at their end. Then again, the Ubuntu team doing this once means that users don’t need to do recompilation, which made it an easy tradeoff, Jon told me.

Now, Ubuntu supports x86_64 v3 as an architecture variant and plans to do more. Jon says:

“Today, we’ve released x86_64 v3 as a variant, but the capability in our build and delivery pipelines unlocks the ability to add variants for the next RISC-V RVA versions, for ARMv9, ARMv10, ARMv11 and so on.
We will start now onboarding variants to make sure that when you go and buy your latest Snapdragon laptop, your operating system and all of the parts of it are using the silicon to its fullest.”

Adding support for architecture variants was a significant undertaking. Jon explains:

“This work was especially complex because combined with having the hardware physically available in the build farm, Canonical also needed to make the build scheduler aware, and thread the capability through the build systems of Debian packages, Snaps, OCI images, virtual machine images, etc. As it stands, the capability exists for Debian packages, and support for further package types will land shortly.
In addition to the build infrastructure, work needed to be done on downstream package managers (apt, snap, …) and schedulers to ensure they pull the right version of packages, and consideration needs to be given to what happens if a VM containing x86_64 v3 code ends up trying to boot on v1 hardware, and so on.”

4. Betting on local-first & plans for agentic workflows

If you’ve tried to run an LLM locally on your machine, you’ll know it comes with friction. Jon:

The Pulse: AI token spending out of control – what’s next?

Gergely Orosz — Thu, 23 Apr 2026 16:51:01 GMT

Hello from Florida – today and tomorrow, I’m at React Miami. I’ve always wanted to attend this conference, and finally made it happen. If you’re around, say hi!

(L-R): Myself, NeetCode founder, Navdeep Singh, & YouTuber & Twitch streamer, ThePrimeagen at React Miami

Let’s get to today’s topics:

New trend: token spend breaks budgets – what next? In the past 2-3 months, spending on AI agents has exploded at many tech companies, and the ramifications of this are starting to dawn on engineering leaders. We’ve sourced details from 15 companies, including the different ways they are coping with this realization.
New trend: more AI vendors can’t keep up with demand. Related to massively increased spending, GitHub Copilot and Anthropic are starting to limit less-profitable individual users, so they can serve business users whose spend has easily 10x’d in the last few months. The exception is OpenAI and Codex.
Morale at Meta hits all-time low? Business is booming but devs at Meta are furious and worried due to looming layoffs, and an invasive tracking program rolled out to all US employees.

1. New trend: token spend breaks budgets – what next?

Designing Data-intensive Applications with Martin Kleppmann

Gergely Orosz — Wed, 22 Apr 2026 16:19:26 GMT

Stream the latest episode

Listen and watch now on YouTube, Spotify, and Apple. See the episode transcript at the top of this page, and timestamps for the episode at the bottom.

Brought to You by

• Statsig – ⁠ The unified platform for flags, analytics, experiments, and more. Stop switching between different tools, and have them all in one place.

• Sonar – The makers of SonarQube, the industry standard for code verification and automated code review. Sonar helps teams close the “architecture gap” by preventing code complexity and structural decay. Learn how Sonar is empowering the Agent Centric Development Cycle with new architecture management capabilities that ensure both humans and AI agents respect your system’s blueprint.

In this episode

Martin Kleppmann is a researcher and the author of Designing Data-Intensive Applications, one of the most influential books on modern distributed systems. As of this month, the second, heavily updated edition of the book is out.

In this episode of Pragmatic Engineer, we discuss Martin’s career in tech building startups, how he ended up writing this iconic book, and what he’s focused on, these days, after moving from industry, into academia.

We talk about the tradeoffs behind modern infrastructure, how the cloud has changed what it means to scale, and the thinking behind Designing Data-Intensive Applications, including what’s changing in the second edition.

Martin reflects on lessons from building startups like Rapportive, which he sold to LinkedIn, and shares how his experience in both academia and industry shaped his perspective.

We also explore what’s ahead: why formal verification may become more important in an AI-assisted world, the challenges of building local-first software, and his recent research into using cryptography to improve transparency in supply chains without exposing sensitive data.

Key observations from Martin

Here are 12 of my most interesting takeaways from talking with Martin:

1. Seeing Kafka as it was built at LinkedIn heavily shaped the ideas behind the book. Kafka (a popular event streaming platform) was open-sourced while Martin was at LinkedIn. Seeing this large system up close helped Martin build a mental model of how various data systems fit together, what they have in common, and their fundamental principles.

2. Martin wrote the book because he wished he had this resource when they were “drowning” in design decisions at his startup. At Rapportive, they hit database performance problems and were searching in the dark, with no idea what to do, because they lacked foundations. Martin wrote the book, so hopefully others won’t have to learn the fundamentals the hard way that his team did.

3. Knowing system internals as a superpower for application developers. Martin maintains that Designing Data-Intensive Applications is not a book for people who build databases or even infrastructure, but it’s helpful for application developers to develop an intuition for making good design decisions and debugging performance issues they will encounter.

4. Multi-region and multi-cloud are risk/cost trade-offs, not best practices. Martin does not believe that there is a “best practice” in deciding whether to go multi-region or multi-cloud. This decision is a tradeoff between risk and costs. It’s a business decision to be made. Designing Data-Intensive Applications gives engineers the vocabulary to articulate the tradeoffs, not to dictate answers.

5. Scaling down can be as challenging as scaling up. When talking about scaling systems, most engineers associate this with scaling up. But building a system that can operate efficiently and scale down when there’s less traffic is an exciting (and challenging) problem as well! Solutions like Serverless are valuable building blocks for scaling down efficiently.

6. Replication for fault tolerance is more relevant these days than sharding. Though the book has a full chapter on sharding, Martin said that the cloud has reduced the need for manual sharding for the majority of teams. This is also because machines are increasingly bigger, and more workloads fit on a single machine. Sharding across machines is increasingly a specialist concern; replication for fault tolerance, however, is still relevant at every scale.

7. MapReduce might be “dead,” but it is still worth knowing about. The second edition of the book cut most MapReduce coverage because Martin observed that, these days, practically nobody uses it: technologies like Spark and Flink have replaced MapReduce. The second edition of the book has a reference to MapReduce purely as a learning tool, for understanding partitioned batch systems.

8. Distributed systems theory makes deliberately paranoid assumptions: this is on purpose! The theory assumes that there’s no upper bound on how long it might take for a message to go over the network: it might arrive in 100 microseconds or 10 years. Clocks, crashes, and network delays all get similarly worst-case treatment. Occasionally, reality will hit some of these extremes!

9. An engineer’s job is increasingly about surfacing risks — including societal ones — to decision-makers. Martin believes that engineers need to articulate tradeoffs in a way that enables business leaders to make informed decisions. These tradeoffs include reputational and societal risks, not just technical ones.

10. Formal verification was too expensive to use across the industry, and LLMs may change this. Martin said that he never used formal verification in his time in the industry because it was too time-consuming. Now he sees two things happening at once:

LLMs are producing so much code that human review becomes the bottleneck
LLMs are getting good at writing formal proofs as well

Put both together, and we might see more formal verification happening!

11. Building local-first software has difficult engineering challenges. Decentralized access control sounds trivial, but it becomes pretty hard without a single server to arbitrate. For example, a revoked user can make a concurrent edit, and different devices will disagree about what happened. Martin is currently working in this problem space.

12. Industry and academia dismiss each other, and this is not great for either field! The tech industry calls academia “theoretical” and misses useful research. Academia, in turn, often calls industry work just engineering and misses the interesting problems they solve. Martin has worked in both industry and academia, and would like to build better respect in both directions. The best PhD students he works with have a few years of real engineering experience.

The Pragmatic Engineer deepdives relevant for this episode

• Building Bluesky: a distributed social network (Martin is an advisor at Bluesky)

• Inside Uber’s move to the cloud

• The history of servers, the cloud, and what’s next

• The past and future of modern backend practices

• How Kubernetes is built

Timestamps

(00:00) Early career

(05:46) Building Rapportive

(10:47) Working at LinkedIn

(14:09) Writing Designing Data-Intensive Applications

(23:00) Reliability, scalability, and repeatability

(26:24) DDIA: the second edition

(30:50) Tradeoffs of using cloud services

(39:02) How the cloud changed scaling

(42:53) The trouble with distributed systems

(49:02) Ethics for software engineers

(52:45) Formal verification

(1:00:12) Academia vs. industry

(1:03:50) Local-first software

(1:09:50) Computer science education

(1:18:32) Martin’s current research and advice

References

Where to find Martin:

• LinkedIn: https://www.linkedin.com/in/martinkleppmann

• Bluesky: https://bsky.app/profile/martin.kleppmann.com

• Website: https://martin.kleppmann.com

• Distributed Systems lecture series: https://www.youtube.com/playlist?list=PLeKd45zvjcDFUEv_ohr_HdUFe97RItdiB

• Designing Data Intensive Applications, 2nd edition: https://learning.oreilly.com/library/view/designing-data-intensive-applications/9781098119058

Mentions during the episode:

• Selenium: https://www.selenium.dev

• SauceLabs: https://saucelabs.com

• Rapportive on YC’s website: https://www.ycombinator.com/companies/rapportive

• Kafka: https://engineering.linkedin.com/teams/data/data-infrastructure/streams/kafka

• The Log: What every software engineer should know about real-time data’s unifying abstraction: https://engineering.linkedin.com/teams/data/data-infrastructure/streams/kafka

• Materialized View:

• The Missing README: A Guide for the New Software Engineer: https://www.amazon.com/Missing-README-Guide-Software-Engineer/dp/1718501838

• How AWS S3 is built: https://newsletter.pragmaticengineer.com/p/how-aws-s3-is-built

• MapReduce: https://en.wikipedia.org/wiki/MapReduce

• Prediction: AI will make formal verification go mainstream: https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html

• Isabelle proof assistant: https://isabelle.in.tum.de

• Rocq: https://rocq-prover.org

• Lean: https://lean-lang.org

• TLA+: https://github.com/tlaplus

• FizzBee: https://fizzbee.io

• Local-First Software: You Own Your Data, in spite of the Cloud: https://martin.kleppmann.com/papers/local-first.pdf

• How AI assistance impacts the formation of coding skills: https://www.anthropic.com/research/AI-assistance-coding-skills

• Cryptography: https://en.wikipedia.org/wiki/Cryptography

—

Production and marketing by Pen Name.

Learnings from conducting ~1,000 interviews at Amazon

Gergely Orosz — Tue, 21 Apr 2026 12:49:16 GMT

Steve Huynh, formerly Principal Engineer at Amazon, shares observations from Bar Raiser, and an excerpt from his new book, Technical Behavioral Interview

Tech interviews have two parts: the technical interview – with a focus on things like coding, software architecture, problem solving – and the behavioral part – with a focus on past experience, and the situations that show you’d be a good fit at the company you’re interviewing with, along with things like attitude, motivation, culture fit. Technical interviews are going through a big change, thanks to AI tools: some companies are bringing in new, AI-assisted types of interviews, while others are trying to make “pre-AI” type interviews work.

What doesn’t seem to be changing is the second type of interviews: the behavioral ones. I’ve found the topic of behavioral interviews from a software engineer’s perspective somewhat under-discussed – even though this interview carries huge weight in securing an offer and what level you come in at. No matter how strong your technical skills are, especially at mid-sized and larger companies, you are unlikely to get an offer if you are deemed to not be a fit for what the company is looking for.

Steve Huynh was an engineer at Amazon for 17 years – I previously did a podcast episode with him on the reality of being a principal engineer at Amazon. During this time, Steve conducted nearly 1,000 interviews, of which around 600 were Bar Raiser ones. Bar Raiser interviews are unique to Amazon: it’s an interview conducted by someone outside of the hiring team, with the goal of ensuring that the new hire raises the company’s talent bar.

After leaving the e-commerce giant, Steve spent 2 years researching and writing the book Technical Behavioral Interview: An Insider’s Guide.

My copy of Technical Behavioral Interview: an Insider’s Guide

Today, we cover two topics on interviews and behavioral interviews:

1. Learnings from conducting ~1,000 behavioral interviews at Amazon. Steve reflects of major observations from his 17 years at Amazon, covering:

You’re over-prepared for one interview and unprepared for the other
How you deliver the story matters as much as the story itself
The interview is an audition for what it’s like to work with you

2. What companies are looking for during behavioral interviews. An excerpt from Steve’s new book, Technical Behavioral Interview, covering ~75% of a full chapter of the book (out of the 14 total chapters.) We get into:

Understanding fit: role and company
The four dimensions that determine your level
What each level looks like
Reading and calibrating your own level
Researching what companies really value

Longtime readers might remember Steve from my podcast with him a year back: What is a Principal Engineer at Amazon? With Steve Huynh

My usual disclaimer: as with all my recommendations, I was not paid for this article, and none of the links are affiliates. See my ethics statement for more.

With this, it’s over to Steve:

1. Learnings from conducting ~1,000 behavioral interviews at Amazon

A Bar Raiser is a specially trained interviewer whose job is to ensure that every hire raises the average talent level at Amazon. I had veto power over any candidate. I sat on nearly a thousand interview loops across every level from intern to Principal Engineer.

After 50 or so interviews as a Bar Raiser, the patterns became impossible to miss. And this was the biggest one:

The candidates who didn’t get offers seldom failed because they lacked technical skill. They failed because of how they presented themselves.

For sure, technical preparation is crucial, and I’m not telling you to skip it. But most candidates have massive blind spots when it comes to non-technical matters, which is a big problem. Why? Because that blind spot is where most hiring decisions are made.

The Bar Raiser who trained me put it this way:

Technical skills are the ante. They get you into the game. But they’re not what wins you the hand.

I didn’t fully appreciate what that meant until I’d seen candidates who were technically very strong get rejected because of everything else.

Think about it. By the time you’re sitting in a final round of interviews, you’ve already passed at least one technical screen or take-home assignment. The company already knows you could probably do the job. They already know you want to work with them.

But that’s not what the final round is for.

The final round is when the team figures out whether they want to work with you. Being technically proficient is part of it, but it’s not all of it. Can you explain your thinking clearly when you’re stumped? How do you handle it when things go wrong? Can they picture you in a design review or in a tough conversation with a partner team?

Fit.

Fit is what decides most hiring outcomes, yet it’s the thing most candidates spend the least time preparing for. After nearly a thousand interviews, I can tell you exactly where the gap is and how you can close it.

Learning #1: You’re over-prepared for one interview and unprepared for the other

The average candidate preparing for a tech interview probably spends 95% of their time on technical preparation and 5% on everything else. Some spend literally zero on everything else.

I get why. Technical preparation feels concrete. You can grind coding problems and measure your progress. You can study system design patterns and feel yourself getting sharper. There’s a clear input/output relationship. Do more problems, get better at problems.

For most technical interviews, even if you haven’t seen the exact problem before, you can still do a decent job. It’s simply not possible to prepare for every problem, so it’s expected that you can reason through an unfamiliar coding question and pick up on hints the interviewer gives you. You can work through a system design problem by applying fundamentals you already know. It’s expected that you will encounter new questions during an interview, so it isn’t fatal if you’re a competent engineer who can think on your feet.

However, the non-technical rounds are the opposite. You cannot wing them and expect to do well. When an interviewer says, “Tell me about a time something went wrong on a project and how you handled it” and you haven’t thought about that question before, there is no hint they can give you. There’s no reasoning your way through it in real time. You either have a prepared story ready to go, or you’re going to mumble your way through a word salad while the interviewer watches.

I’ve seen this play out hundreds of times. A candidate would crush the coding round, then I would ask them about a difficult decision they made, and they would fall apart. They would pick a half-remembered example, start rambling, backtrack to add context they forgot, in the process losing track of the question. Then, five minutes later, they would land on something like, “So, yeah, it worked out in the end.”

These candidates were often strong coders, but that didn’t matter. At the debriefs, the feedback was always some version of “I couldn’t get a concrete answer about their experience. Every story was vague and unconvincing.” We couldn’t extend an offer when a candidate couldn’t articulate how they worked.

The technical bar was met, but the hiring decision was made in the behavioral round.

Here’s what’s frustrating about this. Non-technical preparation takes a fraction of the time for technical.

If you’re going to spend 80 to 100 hours preparing for an interview cycle, spending a single weekend on your stories might be the highest-leverage investment you make.

Ten hours of story prep can completely change the outcome of your behavioral rounds. Meanwhile, your 80th hour of LeetCode will give you almost nothing you didn’t already have at 60.

The returns on technical prep diminish rapidly. The returns on story prep are exponential because almost nobody does it at all.

What to do: How are you currently splitting your interview prep time? If it’s 99% technical and 1% everything else, you’re over-indexed on the part with diminishing returns and under-indexed on the part where hiring decisions get made. You don’t need to cut your technical prep dramatically. Just reallocate. If you’re planning to spend 80 hours preparing, take 10 of those hours and move them to non-technical preparation. That reallocation will do more for your odds than 10 more hours working on practice problems.

Learning #2: How you deliver the story matters as much as the story itself

You can have the most impressive accomplishment of your career ready for your interview and completely waste it with bad delivery. The most common version of this is what I call the “ramble and stumble.”

The candidate starts talking, and you genuinely can’t tell if they’re figuring out the story as they go or if they’ve simply never said these words out loud before. Or they might give you five minutes of context and then still backtrack to add details they forgot. By the time they reach the outcome, you’ve lost track of how you got there.

Here’s something that’s always struck me as odd. If you had a big presentation at work, you’d spend hours preparing for it, right?. You’d think about the structure, the flow, the key points. You’d rehearse it. You might even do a couple of dry runs with a colleague. Nobody wants to walk into a presentation and wing it.

But in a job interview, where the stakes are arguably higher than any single presentation you’ll ever give? People wing those constantly. They walk in having never practiced their stories out loud. They might have thought about them, but they’ve never spoken the words, heard how they sound, or timed how long they take. Then they’re surprised when the words come out as a mess.

Think about any other high-stakes skill. You wouldn’t expect to be good at golf without practicing at the driving range. You wouldn’t expect to give a great keynote the first time you stepped on stage. Nobody calls a musician fake for rehearsing before a concert.

But for some reason, many people feel that preparing interview stories is inauthentic. As if it’s cheating somehow. As if the “real” version of you is the one that stumbles through an unrehearsed answer under pressure.

It’s not. The real you communicates clearly what you’ve done and what you’re capable of.

What to do: Good delivery doesn’t require a lot of charisma or natural presentation skills, but it does require practice. Start with the two questions that come up in virtually every interview: “Tell me about yourself” and “Why do you want to work here?” Write down your answers. Then record yourself delivering them. Watch the recording and take notes. Where did you ramble? Where did you fill space with filler words? Did you look nervous? Then do it again. And again. Keep going until you watch the recording back and think “That sounds like someone I’d like to work with.”

Once those two are solid, pick stories from your career and do the same thing. This process will be uncomfortable at first. Most people hate watching themselves on camera. Do it anyway. Thirty minutes of this will up-level your interview performance much more than 20 hours of coding exercises could ever do.

Learning #3: The interview is an audition for what it’s like to work with you

Most candidates think the interview is an exam. If you get the right answers, then you’ll pass the test and get the job. That’s simply not how it works. Yes, you are being evaluated, and what you say matters. But there is no answer key. The interviewer doesn’t have a rubric with the “correct” responses to which they compare your answers. They’re forming an impression of you as a person, and that impression is far more nuanced than “right” or “wrong.”

By the time you’re sitting across from the interviewer, you’ve already jumped through some technical hoops. The company already has evidence from your resume that you can code or design systems at the level they need. That bar has been cleared. The final round goes deeper on the technical side, but it’s also trying to answer a completely different question: Would we want this person on the team? Would we trust their judgment in a crisis? Would they make our team’s software better or worse?

As a Bar Raiser, my specific job was to determine whether a candidate would raise the bar, meaning that they would be better than at least 50% of the people already at the company in that role.

The thing most people don’t realize is that the type of coding we asked about in interviews wasn’t what we did on the job. Nobody was writing algorithms on a whiteboard during their workday. The questions we asked tested problem-solving ability in an artificial environment.

But the behavioral questions, the soft questions, those tested situations we dealt with every single day. Navigating disagreements, handling projects that were going sideways, influencing without authority, making tradeoffs with incomplete information. These weren’t hypothetical scenarios pulled out of a textbook. They were just another Tuesday.

So when I asked a candidate to tell me about a time they had to push back on a stakeholder, I wasn’t waiting to hear the right answer; I was picturing them in our next planning and prioritization meeting. When they described how they handled a conflict on their team, I was asking myself whether I’d want to be in that room with them. Every answer was a preview of what it would be like to work alongside that person day to day.

The candidates who treated it like a test tried to figure out what I wanted to hear and then gave me that answer. That’s exactly the wrong approach. They gave polished, rehearsed answers with no rough edges and perfect endings where everything worked out and every decision was the right one. I’d walk out thinking “I have no idea what it would actually be like to work with this person.” And when that uncertainty showed up across multiple interviews in the debrief, it almost always turned into a “No.”

What to do: For each story you’re preparing, stop thinking about what the interviewer wants to hear. Instead, think about what you’d want to hear from someone interviewing to join your team. You’d want to hear how they actually think. You’d want the real version of what happened, including the parts that were hard and the calls that were close. You’d want to walk away feeling like you understood what it would be like to work with them on a tough problem. Give your interviewer that same thing. Be honest and let them see how you think. That’s worth more than any polished answer.

What ~1,000 interviews taught me

After all those interviews, the lesson I keep coming back to is simple.

The people who get hired are the ones who can walk into a room and tell a clear story. This story is about their work and their capabilities, and makes the interviewer think, “I want to work with that person.”

Being able to tell this story is a skill. And like any skill, it gets better with practice. Most people never practice it because they don’t think of it as something you can prepare for, but you can. And a little preparation here goes further than almost anything else you can do for your career.

2. What companies are looking for during behavioral interviews

The below are excerpts from Chapter 2 from Technical Behavioral Interview: An Insider’s Guide. Some sections have been cut out and lightly edited for this article. Copyright © 2026 Steve Huynh. Used with permission.

Technical skills alone don’t determine your offer. Otherwise, those who can solve the coding and system design problems would get the same result. Instead, companies use behavioral interviews to answer two critical questions: Do you fit with both the role and the company? And if you do fit, at what level will you be most effective?

Get both right, and you will receive an offer at the appropriate level. Get the fit wrong, and you’ll be rejected regardless of your skills. Get the level wrong, and you’ll be either down-leveled or rejected for being underqualified.

This chapter explains how companies make their assessments of fit and level by analyzing the signals in your stories. Once you understand these dimensions, you’ll pick better stories and signal the right level.

Understanding Fit: Role and Company

The primary consideration for any tech role is whether you have the technical skills to do the job. Companies will assess this mostly through the technical parts of the interview, for example, coding challenges, system design, or whatever technical evaluation matches your role. If you can’t demonstrate the core technical capability, nothing else matters.

But technical skills alone don’t predict success. Companies learned this the hard way by hiring smart people who couldn’t work effectively in their environment. That’s why behavioral interviews focus on two additional types of fit:

Role Fit: Can you handle the specific challenges and working conditions of this position? A backend role at a fast-growing startup requires different capabilities than a backend role at an established enterprise. The technical skills might be similar, but the role demands will be different.

Company Fit: Will you thrive in the environment in which this organization operates? This goes beyond surface-level culture. They are assessing whether your working style, decision-making approach, and values match with how the company gets things done.

How Companies Detect Fit Through Signals

Companies can’t directly ask the question, “Would you fit here?” What candidate would torpedo their chance of success by answering with a “No”? Instead, companies look for signals in your stories that indicate alignment or misalignment.

Role Fit Signals emerge from how you describe handling situations similar to what the role requires:

If the role requires working with ambiguous requirements, do your stories show comfort with uncertainty?
If the position involves cross-team coordination, do you show an ability to cope with organizational complexity?
If the job needs rapid iteration, do your examples show shipping quickly and adjusting based on feedback?

Company Fit Signals come from the choices you made and how you describe them:

A company that values “bias for action” looks for stories that show you moving quickly despite incomplete information.
An organization that prizes “customer obsession” wants to hear examples of you going deep to understand user needs.
A place that emphasizes “radical transparency” seeks stories that show you sharing information openly, even when you’re uncomfortable.

The same story can send different signals to different companies. You spending three weeks perfecting a solution might demonstrate attention to quality at one company but analysis paralysis at another. Moving fast and fixing issues later demonstrates good judgment at a growth startup but recklessness at an established healthcare company.

Common “Mis-Fits”

Even a talented candidate will get rejected sometimes if they are not a good fit. The same behaviors that are positive at one company can signal poor fit at another.

Independence vs. Collaboration: This covers both how you work and how you make decisions. Some companies need people who pick up a problem, run with it, and come back with a solution. Others expect you to bring the team along at every step. These often go together: companies that want you to work solo also tend to want you to make calls on your own, and companies that want collaborative work also want group buy-in on decisions.

If every story you tell involves going off and building something alone, consensus-driven companies will worry you’ll steamroll people or make choices that won’t stick. Flip it around: if every story involves checking with the group before you act, companies that prize individual ownership will wonder whether you can make a decision without a meeting.

Speed vs. Thoroughness: Startups often need rapid experimentation, where you ship MVPs and iterate based on feedback, while companies in healthcare or finance require careful validation before any release. This tension also shows up in how teams think about code quality: some organizations will happily spend extra weeks on clean architecture, while others want a working solution on deadline even if the code needs cleanup later. Whereas stories about methodical testing might bore a startup, your “ship it and fix it” examples could terrify a medical device company.

Excellence vs. Pragmatism: Some organizations value technical excellence and clean architecture above all else. Others need pragmatic solutions that ship on deadline even if imperfect. Focusing on perfect code fails at deadline-driven companies, just as accepting technical debt everywhere fails at companies maintaining critical infrastructure.

Innovation vs. Stability: Some roles require creating new solutions and challenging existing approaches, while others need you to maintain and optimize proven systems. If you say that you’re constantly reinventing established processes, teams that value stability will not consider you a good fit. Conversely, stories that show you only follow existing patterns will disappoint teams that are looking for creative problem-solving

Direct vs. Diplomatic: Some cultures prize radical candor and want you to say exactly what you think. Others value maintaining harmony and face-saving communication. If you are too blunt, you will not fit in well at a relationship-focused company. If you are not direct enough, you will not like working at a company that values “disagree and commit.”

Data vs. Intuition: Some companies require data to justify every decision (”data-driven” cultures), while others trust experienced judgment and move on gut feel. Showing that you make decisions based on instinct does not impress analytical companies, and telling a company that values experienced judgment that you conduct three A/B tests to choose a button color will get you struck off their list.

Specialist vs. Generalist: Large companies often want deep experts who master one domain, while smaller companies need people who are comfortable wearing multiple hats. Know which sort of company you are walking into.

Once you understand fit, you can pick stories that match the company and the role.

The Four Dimensions That Determine Your Level

Companies assess your level through four dimensions that appear in every story you tell. Each dimension reveals different aspects of your capability. Together, they show the company where you operate most effectively.

Scope (Dimension #1)

Scope provides a measure of the number of people on your team and, extending outward as you advance, whose work was affected by your actions. The greater the number affected, the higher your level for this dimension.

Entry Level: Your work affects your own productivity and starts to help other team members. For example, you might improve how you handle assigned tasks or fix issues that were slowing down a few teammates.

Mid Level: Your work affects aspects of the team and shapes how it operates. You might redesign a process that changes a significant part of how your team works or solve problems that affect most of the team’s effectiveness.

Senior Level: Your work directly impacts your entire team and is beginning to influence at least one other team. Perhaps you create solutions that change how your whole team operates and affect workflows in adjacent teams, or you solve problems that require coordination with other groups. You may also start collaborating more closely with product or design partners on your immediate team’s work.

Staff Level: Your work directly impacts at least two teams and is beginning to have an influence on the broader division or organization. Examples of this include developing technical strategies that change how multiple teams make decisions and solving problems that require buy-in across several parts of engineering. Your influence extends beyond engineering into product, design, and program management as you shape solutions that affect how cross-functional partners work.

Principal Level: Your work affects many teams or changes how large parts of the organization operate. Perhaps you have created technical strategies that have influenced how dozens of teams make decisions. Or you have solved problems that cut across a large engineering organization. At this level, your influence regularly extends into business strategy, shaping decisions alongside product, design, program, and business leadership.

Contribution (Dimension #2)

Contribution captures what you did, not what happened around you. It is important to be precise about the line between “I” and “we.” Companies will expect to see evidence of increasing leadership and ownership as you advance in your career.

Entry Level: You execute assigned work and are beginning to take ownership of small pieces. Examples: implementing solutions designed by others; fixing bugs in existing systems; taking full responsibility for well-defined features within larger projects.

Mid Level: You own complete solutions from problem to implementation while also guiding others. Perhaps you have identified issues, designed the approaches, implemented them, and you have verified that they work, and you have helped your teammates understand the reasons for your decisions.

Senior Level: You lead initiatives requiring coordination. You’re expected to make progress even when the requirements are unclear or the path forward is uncertain. Examples of this include driving technical decisions for your team; mentoring others through complex problems; architecting solutions to be implemented by others; and ensuring quality work outcomes for many people.

Staff Level: You lead cross-team initiatives and establish technical direction, often in situations where the right approach isn’t obvious and stakeholders have competing priorities. This could look like defining technical approaches that are adopted by multiple teams, creating systems that enable other teams to solve problems on their own, or driving agreement on complex technical decisions across several teams.

Principal Level: You create organizational capabilities and establish new ways of working. At this level, you’re frequently operating in highly ambiguous environments where you must define the problem before you can solve it. You might define technical standards that guide dozens of teams, build systems that enable others to solve entire classes of problems, or transform how the organization approaches its hardest challenges.

Impact (Dimension #3)

Impact shows what changed for the better as a result of your work. Companies want to see that your work produced results worth the investment. Strong stories put numbers on the impact and connect technical wins to business or user outcomes.

Entry Level: You improve your personal productivity and are starting to help the team work better. Examples include reducing the time you spend on repetitive tasks, fixing issues that were slowing down teammates, or improving the quality of code in the areas you touch. Even simple measures matter at this level: time saved or bugs prevented.

Mid Level: You improve team effectiveness in specific areas and influence team-wide practices. Perhaps you reduced deployment times for specific workflows, eliminated categories of bugs in your domain, or you created tools that have made the team more productive in particular areas. You can quantify these improvements and connect them to broader outcomes like feature velocity or reliability.

Senior Level: You transform how your entire team works and are starting to have an impact beyond your team. For example, you might have introduced new workflows that changed your team’s capabilities. Or perhaps you eliminated major sources of operational problems, or the improvements that you have created have been adopted by adjacent teams. Your impact extends beyond just engineering metrics to product outcomes, user experience, or operational costs.

Staff Level: You improve how multiple teams operate and drive organizational improvements. These sorts of impact come from achievements such as establishing practices that several teams adopt, solving infrastructure problems that were impeding multiple teams, or creating new capabilities that open up new types of work across teams. Your measurable impact can be tied to business metrics like revenue, customer retention, or time-to-market.

Principal Level: You create organizational capabilities and drive strategic changes. Impact at this level could come from establishing technical foundations that dozens of teams use to build upon, solving problems that were blocking major business initiatives, or creating leverage that compounds benefits across the company. Your impact is measured in business outcomes and strategic capability, not just technical improvements.

Difficulty (Dimension #4)

Difficulty reflects the complexity of problems you’ve tackled, the constraints you have faced, and the trade-offs you have managed. Under this category, solving easy problems with big impacts is less impressive than hard problems solved well.

Entry Level: You work on straightforward problems within established patterns. For example, you might face challenges learning new technologies or debugging unfamiliar code, but the path forward becomes clearer once you understand the problem or ask for help.

Mid Level: You work through challenges and obstacles in your work. The problems you tackle have more moving parts and less obvious solutions. These could be competing requirements or having to work through technical complexity you haven’t seen before. Or perhaps you have had to manage dependencies within your team that affected your timeline or figure out solutions when the approach wasn’t immediately obvious.

Senior Level: You manage constraints and make technical decisions with team-level architectural implications. The problems you solve involve multiple interacting systems and competing concerns. You might have to balance needs across multiple stakeholders with different priorities. Maybe you make architectural decisions that affect how your whole team works, or you have to work around technical limitations that require creative solutions, or solve problems that require you to address both technical and business factors.

Staff Level: You manage competing trade-offs across multiple teams while handling problems with significant technical and organizational complexity. Examples of difficulty at staff level include:

Balancing different technical approaches when teams have genuinely conflicting needs.
Creating solutions that affect how several teams work together.
Making architectural decisions that have to work across diverse contexts.
Getting teams to agree when the technically optimal solution differs for each team.

Principal Level: You handle fundamental trade-offs between competing organizational needs or solve problems where no clear solution exists. The complexity at this level often involves novel problems that lack established patterns or precedents. You might balance technical excellence against delivery speed at organizational scale; work within organizational constraints while maintaining technical integrity; create approaches for entire classes of problems the company hasn’t solved before; or make decisions that affect company strategy and require executive buy-in.

What Each Level Looks Like

Here’s how the same types of accomplishments look across each level. These aren’t templates. They’re meant to help you develop a sense for the difference between a mid-level story and a senior one. Compare adjacent levels and notice what actually changes as you move up and down.

Researching What Companies Really Value

You’ll never have perfect information about what a specific company values, but a little focused research will often reveal surprising insights that most other candidates will miss. The difference between having even partial intelligence and going in blind can be whether or not you emphasize the right things in your stories.

Start With Your Recruiter

Most candidates treat recruiters as gatekeepers to avoid, but if you do this, you will waste your best source of insider information. Recruiters want you to succeed, because their performance is based on the number of accepted offers received by the candidates they put forward. They have prep materials, they know the interviewers’ focus areas, and they understand what they are looking for.

Ask your recruiter directly: “What should I know about this company’s current challenges?” Or “What competencies matter most for this role?” Or “Can you share any interview prep materials?” Many recruiters have documents about interview format, team priorities, or even the specific behavioral competencies they evaluate. The questions that are used as examples in the prep materials have a high likelihood of being asked in the interviews.

Mine Publicly Available Information

When companies repeat certain words when describing job opportunities, they’re telling you what matters. For example, a job posting that mentions “fast-paced” several times signals something different than one emphasizing compliance. Those words are there for a reason.

Where to dig:

Engineering blogs: How do they describe their wins? What problems do they celebrate solving?
Tech talks and conferences: What topics do their engineers present? Speed of delivery? Scale? Innovation?
Open source contributions: What they choose to open source reveals their priorities. If they open source developer tools, this suggests they value community. If they are happy to make internal tools public, this shows transparency.
Technical documentation: The existence of public API docs or technical guides (and the quality thereof) shows how they support both users and their own teams.
Status pages and postmortems: Companies that publish detailed postmortems demonstrate that they value learning from failure. A company that shares their incident response processes likely has a strong operational culture.

Even companies without engineering blogs will leave traces. Product release patterns tell you about their development pace. Technology choices show their priorities: newer frameworks suggest a focus on innovation, whereas relying on proven technologies indicates they prefer stability.

Look for Patterns in Discussions

Glassdoor, Blind, and Reddit contain gold buried amongst rubble. Ignore the rubble (e.g., individual rants). Instead, look for patterns across multiple posts. If five different people mention “lots of process” or “no work-life balance” or “amazing learning culture,” that’s a pattern you will want to know about.

Pay attention to what people complain about and what they praise. Complaints about “too many meetings” may suggest the company has a collaborative, consensus-driven culture, or, alternatively, that productivity within the company is inhibited by an excessive number of meetings. Praise for “autonomy” indicates they trust their people to make decisions without checking in. Both types of comments reveal what behaviors the companies will reward.

Talk to Current Employees

If you know someone at the company, ask them directly what behaviors get rewarded and, conversely, what behaviors will cause people to struggle. Skip surface-level queries about culture, and ask specific questions:

“When someone gets promoted here, what do they do to earn it?”
“What behaviors get negative feedback?”
“How does the team make decisions when there’s disagreement?”
“What surprised you most about working here?”

Current employees will tell you truths the company website never would. Perhaps they’ll tell you that at their company, “customer obsession” really means checking usage data before writing code, or that “ownership” means being available to resolve production issues at two o’clock in the morning.

What You’re Really Looking For

All this research serves one purpose: understanding what stories will resonate at your interview. Think of it as finding the real intersection between your experience and what they care about.

If research reveals they prize speed over perfection, then emphasize stories that tell how you shipped quickly and iterated. If they value technical depth, highlight examples of diving deep to understand root causes. If they care about collaboration, make sure your story focuses on cross-team work rather than solo accomplishments.

The research will also help you decide whether this company is the right place for you. If everything you learn suggests they value the kinds of behaviors you don’t naturally demonstrate or don’t want to develop, then perhaps you don’t need to pursue that particular role.

Putting It All Together

Companies aren’t just evaluating whether you can do the job. They’re also assessing whether you’ll thrive in their specific environment and at what level you’ll be most effective. These two dimensions determine not just whether you will get an offer, but also whether that offer will position you for success.

Understanding fit helps you know which of your experiences will connect most with what the company values. This small company needs someone who ships fast and figures things out alone. That enterprise needs someone who navigates processes and builds consensus. Neither is inherently better than the other. They’re simply different environments that reward different approaches.

Understanding levels helps you position your stories appropriately. The same project can demonstrate entry-level execution, mid-level ownership, or senior-level leadership depending on your actual contribution and how you frame it. Get this wrong and you will either get rejected for overreaching or down-leveled for not properly communicating your capabilities.

The payoff is immediate. You’ll pick better stories, focus on the right details, and make it easier for interviewers to see what you can do. You’ll make better decisions about which roles actually match who you are and what you want to do. The goal isn’t to get any offer. The goal is to get the right offer at the right level at the right company to ensure your success.

Takeaways

Gergely, again. Thanks to Steve for both sharing his learnings, as an interviewer, and for sharing nearly a full chapter from his whole book. The book goes a lot deeper than the above sample chapter. A few of the ones I found helpful:

High-signal storytelling (Chapter 3): a framework for explaining your work in a way that “sticks” with the interviewer
9 competencies with many examples and stories throughout the book: ones like “delivery” (Chapter 6), “earning trust and dealing with conflict” (Chapter 8) and “Strategic leadership and thinking big (Chapter 13)”
Examples of what interviewers typically see as key signals, yellow flags and red flags

If you would like to have a fresh resource to prepare for behavioural interviews at tech companies, the full book offers far more explanations, tactics and exercises to do so:

Get the full book on Amazon

Steve also writes a newsletter titled A Life Engineered: you can sign up to it here.

It’s helpful to understand how and why companies hire, and what they look for. To us engineers, hiring processes often look illogical from the outside. We’ll ask things like:

“Why does the interview process not resemble day-to-day work?”
“I already have open source code I wrote: why does the company need to do a coding interview to confirm what is clear: that I need to code?”
“Why did I get a rejection, even though I did well on all of the interviews?”

It feels to me that there are similarities between hiring and dating: both parties show up with goals and expectations in their head, which are often not communicated. Sometimes there’s a match; sometimes there is not. This phase of a relationship is often about “selling:” as a candidate on the job market, it’s about selling yourself, and convincing the company that you would be a fit for what they are looking for.

Doing your research on the company is underrated, and not all that many candidates do so, in my observation. When I was a hiring manager at Uber, roughly half of the people who got on the call with me did not do any research about the company, and perhaps 1 out of 10 candidates did any research on the team they interviewed for – when we had public blog posts about our work, on the company blog! So those showing up prepared helped them stand out in the “motivation” dimension, from the get go.

It all starts with being able to pass the “technical” interviews – but it’s a mistake to sleep on the “behavioural parts.” To state the obvious: candidates who do not do well on the technical interview rounds will not get offers. But I’ve personally had to say to several candidates who did great on the technical side of things, but turned out to be misaligned with what we were looking for, as confirmed on the behavioral rounds.

And I do believe you can uplevel in doing better on these behavioral rounds: starting with researching what the company’s culture is like, practicing how to present yourself better, and putting yourself in the shoes of the interviewers, understanding what they are looking for.

I know plenty of software engineers who refuse to do any preparation for interviews, staying “if the company doesn’t want me as I am, they don’t deserve me anyway.” This is a valid strategy, and can work for highly in-demand professionals, the same way as showing up to a first date in sweatpants and slippers can still work out for highly attractive and desirable people. For the rest of us not as incredibly in-demand for a position we’re applying for: it’s probably worth putting in additional effort, in hopes for better outcomes during interviews.

The Pulse: ‘Tokenmaxxing’ as a weird new trend

Gergely Orosz — Thu, 16 Apr 2026 16:47:08 GMT

The Pulse is a series covering events, insights, and trends within Big Tech and startups. Notice an interesting event or trend? Hit reply and share it with me.

Today, we cover:

Tokenmaxxing: weird new trend. At Meta, Microsoft, Salesforce and other large companies, devs are purposefully burning tokens (and money!) to inflate their AI usage and hit AI usage metrics which they treat as targets.
Are coding AI-agent subsidies doomed? At the same time as Anthropic stopped subsidizing enterprise plans, Uber managed to burn through its entire 2026 AI token budget in just 3 months. I expect per-engineer AI budgets to be rolled out across more companies soon.
Industry Pulse. The myth of Claude Mythos, Claude’s degradation, Cal.com going open source due to AI threat, Vercel open sources its “agent factories” tool, sensible AI usage guidelines in the Linux kernel, and more.
Cal.com goes closed source – but is it really because of AI? The open source Calendly alternative moved a good part of its code to a closed repo, citing AI and security concerns. But perhaps this was just a business model change that would have happened, AI or not.

1. Tokenmaxxing: weird new trend

The impact of AI on software engineers in 2026: key trends. Part 1

Gergely Orosz — Tue, 14 Apr 2026 16:01:02 GMT

Recently, we ran a survey asking readers of The Pragmatic Engineer how you use AI tools, which tools you use, what does and doesn’t work, and what it’s like working with AI, in general.

For today’s issue, we’ve dug into your 900+ responses to look for trends in AI tool usage among software engineers and engineering leaders. This article surfaces insights that are less about specific tools, and more about the effect these tools have on tech professionals. We cover:

Costs. Unsurprisingly, companies pay for most tool usage, and those responsible for budgets are increasingly nervous that AI-related costs are headed only one way: up.
Usage limits. Around 30% of respondents say they have hit limits. Switching tools, upgrading plans, or moving over to API pricing are common responses.
Impact on “Builders.” Folks who make larger code changes and do “quality-of-life” work are builders, and they’re also dealing with more AI slop. Some also grapple with a loss of professional identity.
AI tools speed up “Shippers.” Engineers who focus more on getting things done are the most positive about AI tools. But they also add tech debt faster and might build the wrong things.
“Coasters:” learning faster while generating AI slop. Less adept engineers can uplevel faster with AI, but they generate a lot of “AI slop” while doing so, which frustrates builders.
Changing software engineer & engineering manager (EM) roles. Engineers have to orchestrate and context switch more often, while engineering managers can be more hands on. It’s interesting to see the engineer and manager roles becoming more similar.
Other impacts on the craft. We’re going from “how” to build to “what” to build, solo devs are seeing improved results, workloads are increasing with AI tools, and more.

See also two other articles in this mini-series:

AI tooling for software engineers in 2026: a detailed summary of survey responses, covering the most-used AI tools, trends, AI agent usage, company size and usage, and tools engineers love.
The impact of AI on software engineers in 2026: key trends. Part 2. Tradeoffs of AI tooling, why adopting AI at company-level is hard, what’s changed in two years, and more.

Full subscribers also have access to a more detailed, 65-page report on our findings.

1. Costs

Concern about the cost of AI tools is a trend throughout the survey, with around 15% of respondents mentioning it in some way.

Tech companies foot the bill for the majority of spending on AI tools. More respondents say their employers pay for AI coding tools than those who say they pay themselves, and predictably, employers fund more expensive packages than what individuals buy personally.

Companies commonly pay for “max” plans with the likes of Claude Code, Cursor, and Codex (around $100-200/month per engineer), although some companies’ budgets only stretch to $20/month per engineer – around the price point of GitHub Copilot, and the cheapest Claude or ChatGPT subscriptions.

The most-mentioned AI tool spending patterns:

When companies pay: ~$200/month plans. Many have enterprise subscriptions, sometimes with subsidies and vendor lock-in. Some companies allow usage-based coverage on top of monthly plans.
When personally paying for tools: ~$20/month or free tiers. This can stack up across different tools. Around 5% of respondents have separate work and personal subscriptions, and free tier usage is widespread for personal use.

For now, companies seem to be in the experimentation phase with AI tools, and several respondents say that they believe their companies have unsustainable AI-tooling budgets. This is likely because businesses are figuring out the best way of leveraging the tools, and the message to engineers at such places is to not worry about price and usage while that unfolds. A CTO at a small, US-based company shares:

“Right now, we’re not sweating the costs because we’re trying to evolve best practices for the tools, but that has resulted in some devs really blowing through budget, so we may start instituting caps on spending.”

Breaking the budget

At small and mid-sized companies, leadership teams seem more comfortable about going over budget, than engineers running out of budget. There are more accounts from C-level folks and founders about racking up large bills than there are from engineers. A CPTO (Chief Product and Technology Officer) at a mid-sized company:

“I ran up several monthly bills of $600 with Cursor. We have the dev team subscribed to ~$100/month plans. We’re now in the process of moving the rest of the team to Claude Code, as we can get more resources for around $100/month in cost.”

Top spenders can be allocated higher budgets. A number of tech businesses have separate, larger budgets for their heaviest AI users. A senior C++ engineer working in the video game industry says:

“I’ve become my team’s AI champion. In theory, my limits are higher than normal, but I keep myself limited to what others can use, so I can show them useful things they can do.”

UK and EU companies worry more about budgets than US-based ones. Most responses which mention finance teams pushing back against spending even $30-50/month per engineer on AI tools are based in the UK and EU. One amusing example is a 10-person, seed-stage startup where the CEO questioned why they were paying as much as £25/month per engineer for one of the cheapest AI tools around.

In general, it feels like European companies want to see clear value-add in order to justify an increase in tooling spend, whereas US companies are more comfortable with investing first and measuring impact later. At present, the impact of these tools is hard to quantify.

A niche approach is that of AI teams educating devs to use cheaper models. Some European companies go as far as offering education to new joiners on using cheaper models. From an AI Enablement Lead at a 1,000+ person, digital transformation company:

“Within our organisation, we’ve had incidents where our Claude users have overshot their limits. We’re now attempting to educate devs in knowing the difference between different models (knowing when to use Claude Sonnet versus Claude Opus).”

Cost trajectory worries

The cost trajectory of AI tools is generally considered unsustainable in our survey. Devs using the tools heavily tend to hit usage limits, and their employers then have to pay more. At places with API-based pricing, usage is increasing. Those in leadership positions who are responsible for budgets are generally concerned about the direction of costs.

Subsidies are keeping things at bay – for now. A common enough pattern in our survey is of heavily-subsidized enterprise plans that come with vendor lock-in. Several responses raise concerns about what will happen when the subsidies run dry. Experienced engineering leaders recall that cloud providers also played the same game of subsidizing for a few years, then raising prices when a customer was fully “locked in.”

The AI hype cycle is dampening awkward conversations about budgets at some places. A principal engineer at a Fintech tells us:

“The AI hype has created a special, generous budget for AI tools, and there’s no effective budget – yet!”

But some finance teams are getting grumpy. A CTO at a sports-tech company says:

“It’s hard to keep our CFO supportive about investing in these tools because the productivity benefits have proven difficult to conclusively prove. The point that resonated the most was the loss of value when people hit daily limits: having to stop work immediately! Surprisingly, our CFO is still pushing back, despite having experience of getting a lot of value through their own AI usage with their spreadsheets.”

Most survey respondents think the price of AI tools will have to rise. If that happens, it would cause problems at several companies – particularly those in Europe:

“I cannot see how the spend on AI tools is fiscally sustainable in its current form; Max 100 with Claude Code is $100 a month. A single small task powered by Kimi K2.5 using OpenCode is $5, mostly in input cost. If we assume that the third party inference providers are doing so at a sustainable price, the much more expensive Opus model cannot be sustainable, never mind profitable at these plan costs.” — Founder at a seed-stage company, Europe.
“From the economic perspective, at some point, these companies will need more funding or profit, I’m curious how much it costs them to have a proper agent, and still become profitable. It feels slow when you run out of credits when working on repetitive tasks.” — Principal Software Engineer at a seed-stage company, Europe

2. Usage limits

Another major trend in our survey results is the topic of usage limits:

Hitting limits: ~30% of respondents. Running out of tokens or hitting reset limits is frustrating and disruptive, especially when you’re working on a task or are in a flow state. The majority of respondents who complain about hitting limits are on cheaper plans (typically $20/month.) But this issue is also mentioned about higher subscription levels.
Under the limit: ~20% of respondents. Avoiding usage caps generally correlates with being on more expensive plans with higher limits, or in roles with enough non-coding work for it not to matter, or when devs do enough work “manually” for AI usage to not be an issue.

Why users of AI tools hit limits

Common reasons cited in the survey:

Being a new AI user or a power user. These are two distinct groups, but an engineering manager at a mid-sized company in Canada says that each one similarly blows through token limits for different reasons:

“We’re mindful of trying to manage costs by setting AI spend limits across the org. We have two subsets of users at odds with each other:
Individuals who are still learning and blow through their credits at an inordinate rate, forcing us to keep limits low.
Power users who hit the limit through regular use and apply pressure to raise the limit.
It’s a tough balance.”

Using Opus for all work. A few engineers mention being careful about how they use Opus because it previously ate up their token budgets. Here’s a software engineer at a mid-sized company in Europe:

“I made the mistake of using Opus in the past and burning through budgets quickly. Now, my routine is to start in ‘plan’ mode with Opus. I then paste the acceptance criteria and description of the issue and let the plan mode figure it out. I then switch to Composer or Sonnet and have the agent take over from there.”

Mistakes that eat up tokens are easy to make. These include starting on work or a problem from the wrong end, using AI directly for a task rather than opting for a simple script, trying some new tool or technique that ends up consuming tokens (OpenClaw and Ralph Loops are cited), and others.

What happens when the limit is hit?

Hitting the limit with an AI tool is inconvenient and happens to many developers, who take a variety of next steps:

Switch the model or tool. Around a quarter of respondents who hit limits mentioned switching. From a software engineer at working at Atlassian:

“In my company, for Cursor and Windsurf we have monthly limits. Our internal coding tool (called codelassian) also has daily prompt and hourly token limits. When I hit a limit in one tool, I switch to the other.”

Upgrade to a pricier plan. When it’s an option, this is a no-brainer at most places, especially as the alternative would be devs twiddling their thumbs waiting for the limit to be reset. A senior engineering manager at a mid-sized company says:

“In my team, we are regularly hitting session limits with Claude. We upgraded some teammates to the Max 20x plan – and on this plan we have not been hitting limits, so far.”

Adopt API-based pricing. This is the easiest way to keep working without abandoning a task you’re knees-deep in. A senior engineer at a large company says:

“The company provides both the Claude and Copilot corporate offerings. When the limits are reached, I tend to use API keys that my teammates give me.”

3. Impact on “builders”

We identified three different types of professional in the survey:

Builders: those who care about quality, good architecture, following good coding practices, and who talk about the craft of software engineering, etc.
Shippers: those who primarily focus on outcomes for a product, features, testing, and experimenting with users. A fair number of leaders, managers, and engineers who were more hands-off with coding before AI tools are in this category, as are product engineers.
Coasters: engineers who are not considered particularly good or great engineers, but they get the work done. They often do this without much taste or concern for quality, and seem to be mostly coasting along and doing what they’re told.

The overall consensus in our survey results is that AI will amplify and multiply tendencies and patterns that existed before, and the impact of the tools varies accordingly among users. Let’s start with the impact we’ve observed upon builders in the responses:

The good and bad of AI tools, as shared by respondents in the Builder archetype

Builders say they get value from AI tools in the following areas:

Larger code changes. Builders generally find AI helpful for work like:

Refactoring
Migrations
Improving test coverage
Carrying out large codebase changes

All these are changes that are laborious, but not very challenging technically. They also require experience in knowing what you want to do and how to do it.

Accomplishing “quality of life” tasks. Builders mention that with AI tools, they get to fix and improve things like nagging bugs that otherwise wouldn’t be “worth it” in time invested, but the barrier to entry is lower with AI.

A good example of this is in last week’s podcast with David Heinemeier Hansson (DHH), the creator of Ruby on Rails, in which he revealed how one of their engineers optimized P1 – the fastest 1% of web requests:

“One of our most agent accelerated people asked: “What about P1? What about the floor? Can we fix the floor?” He found that the floor [of request speed] was 4 milliseconds.
Well, 4 milliseconds can add up if you have a bunch of fast requests. So, he just said: “We’re going to optimize P1. The fastest 1% of requests, we’re going to make them even faster.” He took it from 4 milliseconds to less than half a millisecond. He did this P1 project over a couple of days, like a side project.
He had an intuition that there was something here. He let agents run with it. The work ended up being 12 pull requests, and about 2,500 lines of code changed.
This is exactly why the explosion of the pie suddenly lets us look at problems we would never have contemplated looking at before.”

Typing is no longer a bottleneck. Some builders report falling even more in love with coding with the help of AI and agents, since physically typing out code is no longer a bottleneck for them. They enjoy being able to prompt. From one “builder”:

“For someone who loves to build – but also values code quality, performance, reliability, and security – I ship a lot more quality code faster, if for no other reason than because the AI can read and write 100x faster than me. I get to stay at the conceptual level of shipping a product, and I can dive into debugging with the agent as needed. But if the agent has a good handle on the situation I can give it as much of the tedious parts as I wish.” – Staff Engineer, at a large tech company, US

The negative sides of AI tools, as experienced by builders:

More AI slop. Builders seem to be the most overwhelmed and derailed by reviewing a lot more AI-generated code. They can get frustrated with low-quality code shipped by colleagues which could be categorized as “AI slop.”
More debugging. AI-generated code introduces bugs and issues, and builders tend to spend the most time debugging and fixing those issues.
Identity loss. Some builder-types report a sense of identity loss and even some grief. Much of this relates to no longer doing hands-on coding because they cannot justify it, since AI agents generate pretty decent code faster than someone can type it.

4. AI tools speed up “Shippers”

The “shipper” archetype thrives when they get things to production quickly. This group is by far the most enthusiastic about AI tools in survey responses. They are also the ones who praise – or hype up – the tools because of their personal experiences of shipping much faster with them.

Good and bad things about AI tools for shippers

The biggest upsides mentioned by shippers:

DHH’s new way of writing code

Gergely Orosz — Wed, 08 Apr 2026 17:16:28 GMT

Stream the latest episode

Listen and watch now on YouTube, Spotify, and Apple. See the episode transcript at the top of this page, and timestamps for the episode at the bottom.

Brought to You by

• Statsig – ⁠ The unified platform for flags, analytics, experiments, and more. Stop switching between different tools, and have them all in one place.

• WorkOS – The infrastructure B2B and AI-native companies use to sell to enterprise. It covers everything enterprise security requires: SSO, SCIM, RBAC, Audit Logs, AI governance, and more. Engineering teams ship it in days. Trusted by 2,000+ fast-growing companies, including OpenAI, Anthropic, Cursor, and Vercel. WorkOS.com

• Sonar – The makers of SonarQube, the industry standard for automated code review. Sonar helps reduce outages, improve security, and lower risks associated with AI and agentic coding. See how SonarQube Advanced Security is empowering the Agent Centric Development Cycle (AC/DC) with new capabilities.

In this episode

David Heinemeier Hansson (DHH) is the creator of Ruby on Rails and Omarchy, co-founder and CTO of 37signals (maker of Basecamp and HEY), and the author of several books including the best-seller, Remote: Office Not Required, co-written with Jason Fried.

Six months ago, in an episode of the Lex Fridman podcast, David shared how he doesn’t use AI tools to write code: he types out all his code. But things have changed a lot since then.

In this episode, we discuss his approach to building software, how it’s changed in the last six months, and why he now takes an agent-first approach, and how he barely writes any code by hand. We go into how he uses AI agents: which alter how he builds and explores ideas, but also how his standards of quality and craft remain the same.

We also discuss how 37signals thinks about product development, from the role of designers to the importance of aesthetics and taste. David gets into how he sees beauty and functionality as closely linked, and why strong opinions about design lead to better software.

Finally, we look into the uneven impact of AI which amplifies senior engineers while creating challenges for junior developers, and what this may mean for the role of the software engineer.

Key observations from DHH

Here are 12 of my most interesting takeaways from talking with DHH:

1. His philosophy on AI has not changed, but the available tools very much have. Autocomplete-style coding assistants were genuinely annoying for experienced developers six months ago. Things changed with the shift from tab-completion to agent harnesses, plus the emergence of powerful models like Opus 4.5 – when agents started producing code which DHH does want to merge with little to no alteration.

2. Beautiful code and products aren’t matters of vanity; they’re signals of correctness. Dipping into philosophy, DHH says: “When something is beautiful, it’s likely to be correct.” He argues that Steve Jobs wanted the inside of a computer to be beautiful because people who care about circuit board layout are also those who sweat on the details of the UI.

3. DHH’s development workflow, today: he runs tmux to have two models running, and neovim in the center. Specifics:

One fast LLM running (typically Gemini 2.5) in one split terminal
A slow but more powerful model in another terminal (usually Opus)
NeoVim for reviewing diffs via Lazygit

4. Ruby on Rails seems to be enjoying a Renaissance thanks to AI. Rails is one of the most token-efficient ways of building web apps and is well-suited for agent workflows. Testing is part of the framework, which helps agents write tests and validate their own outputs. It also produces code that humans can read and verify, which matters when reviewing agent output at speed.

5. A big win from using AI agents is tackling stuff that you wouldn’t have before. A senior engineer at 37signals ran a “P1 optimization” project to improve the fastest 1% of requests. They optimized the P1 from being at 4 milliseconds, to under half a millisecond. This is the sort of work that wouldn’t have been considered previously!

6. Running several AI agents feels less like “project management” and more like “wearing a mech suit.” Being a project manager of agents did not appeal to DHH, but now that he’s building with several agents, he feels like he’s in control of the work which is being hyper-accelerated.

7. Senior engineers benefit from AI a lot more than juniors. At 37signals, senior engineers gain more from AI tools as they can validate whether an agent’s output is production-ready. DHH also notes that Amazon reached the same conclusion, and no longer lets junior programmers ship agent-generated code to production without review.

8. 37signals has one designer for every two engineers. The company has around 20 software engineers and 10 designers. Designers do far more than design; they’re also product managers and “implementers” rolled into one. On top of making things look good, they figure out what should be built, how it should work, and often build the first version. DHH compares design at 37signals to jewelry design: “you should know the properties of gold. You should know how it bends.”

9. AI agents could turn 37signals’ “designer model” into the industry standard. AI tools now empower designers to implement more of their vision directly, and DHH suspects the rest of the industry is converging toward what 37signals has always done: working with small teams, where designers are also builders.

10. Command Line Interfaces (CLI) feel like the ultimate AI interface, which validates the Unix philosophy of the 1970s. DHH is building CLIs for all 37signals products because they let agents chain tools together. “GitHub also has a CLI, and Sentry as well,” he says. “You can tie all these things together so an agent can check errors, write a fix, post a PR, and report back to basecamp.”

11. The demise of the two-month product development cycle described in the book ‘Shape Up: Stop Running in Circles and Ship Work that Matters’. The 2019 title by Ryan Signer covered how 37signals worked at the time, and DHH reveals that this methodology now needs rewriting because AI acceleration has made that timeline feel slow.

12. Eight hours of sleep is non-negotiable – even during an AI gold rush! DHH believes the dopamine loop of shipping with agents is intoxicating and can lead to higher risk of burnout. So, he sleeps eight hours and doesn’t use an alarm.

The Pragmatic Engineer deepdives relevant for this episode

• Are AI agents actually slowing us down?

• How Claude Code is built

• The future of software engineering with AI: six predictions

• The AI Engineering Stack

• Mitchell Hashimoto’s new way of writing code

• How Linux is built with Greg Kroah-Hartman

Timestamps

(00:00) Intro

(02:11) Omarchy and Ruby on Rails

(08:25) 37signals overview

(10:12) Launching HEY

(18:38) Building HEY

(22:47) Designers at 37signals

(28:08) The craft of design

(31:52) Why DHH now embraces AI workflows

(39:45) The AI inflection point

(44:23) DHH’s agent-first workflow

(55:09) AI’s impact on junior developers

(1:03:08) Developer experience with AI

(1:16:43) What does AI mean for developers?

(1:23:33) 37signals teams and hiring

(1:38:20) Work-life balance with AI

(1:41:41) Why DHH keeps building

(1:45:24) Closing

References

Where to find DHH:

• X: https://x.com/dhh

• LinkedIn: https://www.linkedin.com/in/david-heinemeier-hansson-374b18221

• Website: https://dhh.dk

• Newsletter: https://world.hey.com/dhh

• Podcast: https://37signals.com/podcast

Mentions during the episode:

• Omarchy: https://omarchy.org

• Linux: https://www.linux.org

• Ubuntu: https://ubuntu.com

• Arch Linux: https://archlinux.org

• Hyprland: https://hypr.land

• Ruby on Rails: https://rubyonrails.org

• Basecamp: https://basecamp.com

• Fizzy: https://www.fizzy.do

• Jason Fried on X: https://x.com/jasonfried

• HEY: https://www.hey.com

• Shape Up: Stop Running in Circles and Ship Work that Matters: https://basecamp.com/shapeup

• Zoltán Hosszú applying to 37signals: https://zoltan.co/37signals

• Daring Fireball: https://daringfireball.net

• Smalltalk: https://en.wikipedia.org/wiki/Smalltalk

• DHH: Future of Programming, AI, Ruby on Rails, Productivity & Parenting | Lex Fridman Podcast #474:

• Homer’s typing Bird:

• Real-world engineering challenges: building Cursor: https://newsletter.pragmaticengineer.com/p/cursor

• Building a best-selling game with a tiny team – with Jonas Tyroller: https://newsletter.pragmaticengineer.com/p/thronefall

• Andrej Karpathy on X: https://x.com/karpathy

• Reflexive AI usage is now a baseline expectation at Shopify:

• Claude Code: https://code.claude.com

• OpenCode: https://opencode.ai

• MacBook Neo: https://www.apple.com/macbook-neo/

• tmux: https://github.com/tmux/tmux/wiki

• Kimi K2.5: https://kimik2ai.com/k2.5

• Agent first, agent native: https://basecamp.com/agents

• Sentry: https://sentry.io

• Moore’s law: https://en.wikipedia.org/wiki/Moore%27s_law

• The Bitter Lesson: http://www.incompleteideas.net/IncIdeas/BitterLesson.html

• Scaling Uber with Thuan Pham (Uber’s first CTO): https://newsletter.pragmaticengineer.com/p/scaling-uber-with-thuan-pham-ubers

• Waymo: https://waymo.com

• Elon Musk: “There will not be a steering wheel” in 20 years: https://www.axios.com/2017/12/15/elon-musk-there-will-not-be-a-steering-wheel-in-20-years-1513304216

• Leopold Aschenbrenner — 2027 AGI, China/US super-intelligence race, & the return of history:

Dwarkesh Podcast

Leopold Aschenbrenner — 2027 AGI, China/US super-intelligence race, & the return of history

Listen now

2 years ago · 89 likes · 16 comments · Dwarkesh Patel

• Terminator 2 Things & Ideas we would have never thought:

• Commodore 64: https://en.wikipedia.org/wiki/Commodore_64

• PlayStation: https://www.playstation.com

• Jevons paradox: https://en.wikipedia.org/wiki/Jevons_paradox

• OpenClaw: https://openclaw.ai

• The creator of Clawd: “I ship code I don’t read”: https://newsletter.pragmaticengineer.com/p/the-creator-of-clawd-i-ship-code

• John Carmack on X: https://x.com/ID_AA_Carmack

• TDD, AI agents and coding with Kent Beck: https://newsletter.pragmaticengineer.com/p/tdd-ai-agents-and-coding-with-kent

• Extreme Programming Explained: Embrace Change: https://www.amazon.com/Extreme-Programming-Explained-Embrace-Change/dp/0321278658

• Smalltalk Best Practice Patterns: https://www.amazon.com/Smalltalk-Best-Practice-Patterns-Kent/dp/013476904X

• From IDEs to AI Agents with Steve Yegge: https://newsletter.pragmaticengineer.com/p/from-ides-to-ai-agents-with-steve

—

Production and marketing by Pen Name.

Cycles of disruption in the tech industry: with software pioneers Kent Beck & Martin Fowler

Gergely Orosz — Tue, 07 Apr 2026 16:27:53 GMT

The recent Pragmatic Summit saw two legends of software development share a stage in what was one of the most popular sessions at our debut live event in San Francisco. In front of a packed audience, Martin Fowler and Kent Beck tackled a range of highly-relevant topics, with me hosting proceedings.

Martin and Kent go back decades, and Martin jokes that his career is “mostly about writing down Kent Beck’s ideas.” They first collaborated in the 1990s, and each has published influential books – ‘Extreme Programming Explained’ and ‘Test-Driven Development’ by Kent, and ‘Refactoring’ and ‘Patterns of Enterprise Application Architecture’ by Martin.

At the Pragmatic Summit, they each shared a wealth of hard-earned learnings and decades-worth of perspective, along with a healthy dose of skepticism. Needless to say, the conversation did not disappoint, and this article summarizes what we discussed in their own words. You can also check out the full recording.

We cover:

Technology shifts similar to AI. The arrival of the microprocessor, introduction of object-oriented languages, the Internet, and agile software development principles were all major changes – but one big difference was that it took time for these technologies to be adopted. Not so with AI.
Agile and AI similarities. With Agile, company incentives were often misaligned, “snake oil” vendors were everywhere, and a “mid pack” of developers who resisted the change saw their career prospects hit. These trends look likely to repeat with AI.
What’s happening inside companies. There’s some confusion – and even panic – at large companies, while AI tools don’t work nearly as well on large and complex codebases as on greenfield projects. Also, a “re-soloing” of software development is inbound.
Avoiding burnout with AI agents. Set and maintain boundaries, and pay attention. Martin suggests to catch when you start producing “negative value”: that’s when to take a break.
Unhealthy performance metrics. Companies are starting to measure things like frequency of pull requests – when they should be looking to quantify outcomes and results.
Lower quality on purpose? It seems every business is optimizing for speed with AI, but quality can get dropped. Also: building features is more obvious with AI, than investing in “futures.”
Test-Driven Development (TDD): tests no longer optional? Kent pioneered TDD, and today it’s more relevant than ever for working with AI.
Thriving in an AI-native industry. Focus on working with agents to express your craft, try to get more enjoyment in understanding your domain, and take on more ambitious work.

Martin Fowler (center), Kent Beck (right), and me at The Pragmatic Summit

Before we start, a programming note: this week, there will be no The Pulse on Thursday — I’ll be attending AI Engineer Europe in London on Thursday and Friday, including doing a fireside chat, and hosting one with Linear CTO Tuomas Artman.

1. Technology shifts similar to AI

Do you recall a tech change as similarly promising and unpredictable as AI?

Martin: “Nothing has hit with the magnitude of AI. This is a whole size different from anything we’ve faced before. On a smaller scale, we were very much involved in the growth of object oriented languages, which scared a lot of people. It didn’t scare us so much because we were part of it.

Looking back, the internet had a huge impact on us all, and of course, Agile software development, too. Agile had a very big impact on a lot of organizations: you could tell by how hard they resisted it. We had to persuade people of the importance of these technological changes; yes, even the internet! It may sound surprising but there were people who didn’t think it was important.

The thing about AI is that today there is no argument about how important it is.”

Martin Fowler (left) speaks at the Summit

Kent: “The other analogy I have is the introduction of the microprocessor. Before that, computers were big boxes; you couldn’t move them around. If you wanted another computer, you’d mortgage your house for it. Having a computer was a big deal.

I was a kid in Silicon Valley with my dad as a programmer when the Intel 4004 hit the market [in 1971]. We went: “Wait a minute, that chip is a computer? Oh my goodness!” The possibilities of computing suddenly expanded thanks to it. If you could figure out how to write software on this chip and figure out how to design hardware around this thing, you could suddenly do things you hadn’t even imagined.

And so I think part of AI is this expansion of imagination. I’m writing projects that are ridiculously ambitious: I’m working on a persistent Smalltalk. I’m writing library-quality code for Rust.”

Kent predicts AI will expand software engineering like the Intel 4004 did. Source: Intel

Balancing skepticism and curiosity

What was the feeling in the industry during those revolutions, and the differences between professionals who thrived back then and those who didn’t?

Martin: “There was a mix of people chasing the hype and those saying, “this new thing is nothing special.” I think you’ve always got to have that balance of skepticism and curiosity, and to be selective about it. I mean, I have been completely skeptical about some big changes: Blockchain was one I was extremely skeptical about.

My skepticism is well-rooted because I’ve seen so much “snake oil” over the years. In fact, my skepticism has to be absolute and total, which means I have to be skeptical about my skepticism! To be that skeptical also requires curiosity: you’ve got to be curious enough to say “how do I probe in order to detect signs of something useful?”

You also need to be aware that your early interactions may not actually be a true signal. When I started playing around with AI, it was with GitHub Copilot a year and a half ago. I was pretty unimpressed; it would give you something wonderful, but most of the time it gave you such garbage that you would just delete it right away. If that had been my only impression of AI, I would’ve immediately flipped the “bozo bit” on it, like I did with blockchain.”

Kent: “Here’s the thing, the capabilities of AI can change week to week. I’ll try something with Gemini one week and it fails miserably. Then Claude Code works pretty well, and then it doesn’t. And then I try Gemini for the same thing and it works, when it hadn’t worked last week!

People want an answer, but the answer’s always changing. In this environment, you can’t possibly have the answer. That’s the bad news, but the good news is that nobody else has the answer either. So, you’re just as smart as everybody else because we’re all equally ignorant.”

2. Agile and AI similarities

In 2001, the ‘Agile Manifesto’ came out, of which you were both co-authors. I think many companies are expecting the same thing with AI as Agile promised: better, faster, cheaper software. But how did Agile adoption really play out?

Full house: The conversation with Martin (left) and Kent (right) drew a large audience

Kent: “It turns out people don’t want faster, cheaper, better! Inside some companies, the incentives are misaligned with actually achieving that. And so as geeks trying to achieve these improvements and saying: “it’s 40% better, 12% cheaper and less fattening,” people will punish you if that doesn’t align with their incentives inside organizations.

In the ideal organization, everybody would care about the same things, but that’s just not the way it works! So, if AI is coming along to promise the same things, we’re going to see the same reaction as before.”

Martin: “An obvious difference is the sheer magnitude and speed there is with AI. Also, I think there will be a big difference between people who use it well and people who use it badly. The trick is figuring out how to use it well and putting the effort in to learn. There will be a big distinction between those two groups.

But I suspect there will still be some similarities with Agile. The core notions behind Agile and extreme programming are solid and good, but a huge snake-oil industry appeared around it – the “Agile industrial complex”, as I refer to it. This is also happening with AI right now, and it’s often hard to see the difference between snake oil and the real stuff.”

AI as an amplifier

Kent: “AI is an amplifier. If you’re young and learning quickly, AI can amplify your learning. I personally think this is the golden age of the junior programmer. I get people coming to me all the time saying things like “my son started his second year in CS and wants to go into something more commercial like art history.” And I’d say, “this is like if you’re a carpenter and they just introduced the circular saw and you think, ‘oh, well, carpentry is over. Anybody can build a house now.’ Well, no! Now, you have more powerful tools. You have less of the crummy work to do.

I think that young people are going to learn faster, and experienced folks who are working effectively are going to work quicker and more effectively.”

Developers stuck in the middle

Kent: My concern is that there’s a “middle” of people who got into programming as a way to make money. If we look back at the Dotcom crash, there was a “mid pack” of such people who ended up going into real estate, more or less. But today, I don’t know where that “middle” will go, and it’s also much bigger now than 25 years ago.”

Martin: “But that middle has also been “flushed out” to some degree by retrenchment in the software industry at the end of the zero interest rate period. So, that’s an interesting difference because we’ve had these things occurring at once: the AI boom, and the economic headwinds of the past 2-3 years.

This is an interesting mix that wasn’t present in the ‘90s with the Dotcom Boom. Back then, it was pretty much all a solid boom.”

Return of “let’s get rid of programmers!”

Kent: “Another interesting confluence of factors is the periodic, “we can get rid of all the programmers, woo-hoo” trend, which started with Cobol in the 1970s. With Cobol, business analysts were supposedly going to be able to write the programs, and the logic was that we wouldn’t need programmers anymore. That comes back repeatedly.

Agile, however, was definitely not a “let’s get rid of programmers” trend. With Agile, we wanted programmers to be more effective in their jobs. And since we started it, and were programmers, we were able to push that agenda pretty effectively.

However, today the “get rid of programmers” trend is repeating. As programmers, it behooves us to think about why they keep wanting to get rid of us. Some of that’s about us as programmers, and some of it not. Still, we should think about why people periodically want to axe us. In the end, this trend amps up the fear factor that everybody’s experiencing.”

In the middle of the discussion

“Re-soloing” of programming

Kent: “A big trend is the “re-soloing” [reduced in-person collaboration] of programming.

A big part of extreme programming (XP) was creating a safe social environment for basically antisocial people. On an XP team, people are talking to each other for hours a day, and are happy to do so because it’s set up to be a positive experience.

Now, I see programmers saying, “I’ve got six agents, so really I’m managing a team.” No, you’re not: you’re using six tools at once, which is fine, but it’s very different from having a conversation with somebody who sees things slightly differently, or has a different energy level from you on the day.

We used to have programmers in individual offices with doors, and you’d shut the door and slide the pizza underneath. That was easy to manage, but then along came this messy, social, complicated, chaotic process of software development, which just happened to produce really good results.

But now, instead of 50 people on my team, I can have five and they don’t have to talk to each other, and each can have 10 agents. Is that the same? No, it’s not.”

Swag: As well as the usual merch at the Summit, there were books by speakers, including Martin and Kent

More effective two-pizza teams & the future of pairing

Martin: “Are we seeing two-pizza teams [of 5-10 people] becoming one-pizza teams because agents don’t eat pizza, or do we see two-pizza teams staying and becoming much more effective and capable? My bet is on more effective two-pizza teams.

We’re beginning to see some interesting feedback in terms of pair programming. With pair programming, is it one human and the genie (AI) programming, or is it two humans and one genie? If it’s two of us, perhaps we can control the genie a bit better, and we also have interaction.

I’ll be very interested in reports of people trying to control genies in pairs, possibly even beyond pairs. There’s also the whole ‘mob programming’ thing, and how that will go with genies. I don’t necessarily think that one person and many genies is the right answer.”

Kent: “My experience of pairing with two humans, plus one or more genies, has been very positive. And the fact the AI is slow is really nice. Every time models come out and are faster, I’m like, “Oh, there’s less time to talk.” When the AI goes away for three minutes, we can talk about our philosophy of naming, or how we express conditionals, or about what we should be doing next. But if it pops back in 15 seconds, you don’t have time for that conversation.”

4. Avoiding burnout with AI agents

Do you find yourself getting close to burnout, especially when spinning up multiple threads? Do you have strategies for managing the mental impact?

The Pulse: Industry leaders return to coding with AI

Gergely Orosz — Thu, 02 Apr 2026 16:29:36 GMT

The Pulse is a series covering events, insights, and trends within Big Tech and startups. Notice an interesting event or trend? Hit reply and share it with me.

Today, we cover:

Founders back coding with AI: Mark Zuckerberg & Garry Tan. The Meta chief is shipping diffs after 20 years, while Garry Tan at Y Combinator is knee-deep in coding, 15 years later. Founders with technical backgrounds being hands-on with AI agents could be a good thing – especially when the “honeymoon” period ends.
A bad week for Claude Code and GitHub. Claude Code’s source code was leaked when a sourcemap file was accidentally uploaded, and revealed that the tool uses anti-distillation to deal with competitors, and also some potential future features such as an always-on background agent. Also: DMCA copyright strikes from Anthropic raise a big question: can a codebase that is fully AI-generated be covered by copyright?
Industry pulse. Meta sets targets for AI-generated code, GitHub’s 6 years of reliability issues, massive job losses at Oracle, GitHub Copilot rollouts and then rolls back ads, RAM prices fall (for now), and more.

1. Founders back coding with AI: Mark Zuckerberg & Garry Tan

Two interesting stories of AI tools encouraging busy founders to start writing code again with AI agents.

Mark Zuckerberg back to landing diffs, 20 years later

Scaling Uber with Thuan Pham (Uber’s first CTO)

Gergely Orosz — Wed, 01 Apr 2026 16:49:59 GMT

Stream the latest episode

Listen and watch now on YouTube, Spotify, and Apple. See the episode transcript at the top of this page, and timestamps for the episode at the bottom.

Brought to You by

• Statsig – ⁠ The unified platform for flags, analytics, experiments, and more.Stop switching between different tools, and have them all in one place.

• WorkOS – Everything you need to make your app enterprise ready. WorkOS gives you APIs to ship enterprise features in days: features like authentication, SSO, SCIM, RBAC, audit logs. Visit WorkOS.com

• Sonar – The makers of SonarQube, the industry standard for automated code review. Sonar helps reduce outages, improve security, and lower risks associated with AI and agentic coding. See how SonarQube Advanced Security is empowering the Agent Centric Development Cycle (AC/DC) with new capabilities like malicious package detection to provide the same rigorous guardrails for AI agents as you would for a human developer.

In this episode

Thuan Pham was Uber’s first and longest-serving CTO, and today he’s the CTO of Faire, a B2B wholesale platform. Back when Thuan joined Uber, it had around 40 engineers and 30,000 rides per day, and the system crashed multiple times a week. Over seven years, he helped rebuild the system, move it from a monolith to microservices, and scaled the engineering organization behind it. I had the privilege of working with Thuan for four of those seven years. Later, the very first issue of The Pragmatic Engineer newsletter was a deepdive into Uber’s Program and Platform split. This episode of the podcast contains a nice “full circle” moment, where Thuan shares even more details about why Uber chose to embrace that structure.

We discuss what it takes to operate and build in that kind of environment. Thuan explains how he divided his time at Uber into three “tours of duty,” from stabilizing a fragile system, to re-architecting it, and scaling the org.

We go deep into the platform-and-program split, the Helix app rewrite, and what it took to launch Uber in China in just five months (the original estimate was 18 months). We also cover Uber’s in-house tools and explain why they were necessary to support rapid growth.

Finally, we discuss his role today as CTO of Faire, how the company is using AI, and how he sees AI changing software engineering.

Key observation from Thuan

14 takeaways from Thuan that I find the most interesting:

1. Your professional reputation is a compounding asset that pays off unpredictably. Bill Gurley recruited Thuan to Uber based on knowing him from a startup a decade earlier. Similarly, when Thuan needed to hire for critical infrastructure teams at Uber, he reached out to engineers at VMware whom he’d previously worked with, and they followed him to the ridesharing app because they trusted him.

2. The program/platform split came before microservices. The concept of cross-functional “program” teams and dedicated “platform” teams became necessary because an org split across backend, frontend and mobile engineers slowed down in execution speed when Uber grew to around 100 engineers. Every feature required negotiating bandwidth across the mobile, backend, and dispatch teams. Thuan, Travis Kalanick, and Jeff Holden literally used color-coded sticky notes with people’s names to reorganize into self-sufficient teams. We cover more about this split in the deepdive, The Platform and Program split at Uber.

3. Microservices at Uber were more about surviving hypergrowth than anything else. Uber needed to decompose its massive monolith called “API.” To do so, a simple rule was applied: anything new needed to be built outside of the monolith so that no team blocked another. Teams started to build microservices, but decomposing the monolith took a good two years. Fun fact: in 2026, Uber has somewhat fewer microservices (around 4,500) than back in 2016 (around 5,000).

4. When retiring a monolith, sometimes it gets even bigger before shrinking. After Uber decided to pull services out of the massive monolith, it still kept growing because the business kept adding features! There was an ugly middle phase before the monolith started to shrink. Keep this in mind if you look into decomposing a monolith.

5. Expect multiple rewrites during hypergrowth. The right architecture depends on how fast a product and company are growing. At Uber, repeated rewrites were common because each one “bought” another window of survival for the company. Thuan’s recommendation is to understand that a rewrite simply means a company is outrunning its existing architecture: this is not necessarily a bad thing!

6. Controversial launch advice: start with the hardest launch first. When Uber rolled out in China, Travis insisted on starting with Chengdu, the largest launch city. Looking back, it was scary but also helpful, as launching in the “hardest” city first gave the team confidence and made subsequent city launches much easier.

7. Travis Kalanick spent 30+ hours interviewing Thuan. This took place over two weeks, as a series of one-on-ones. The sessions became a simulation of working together: disagreeing, aligning, and working things out. I’ve yet to hear of such an intense – and technical! – recruitment process by another CEO.

8. Uber is the only major company that had a “Senior 1” and “Senior 2” level – and Thuan is unapologetic. Thuan introduced the Senior 1 (L5A) and Senior 2 (L5B) levels because the jump from senior (L5) to Staff (L6) became very big, and larger than between previous levels. One problem this split level created was that Uber’s L5B was akin to Google’s and Facebook’s L6/E6. Thuan resisted the title inflation of just renaming L5B to ‘Staff’.

9. Name your services clearly; you don’t work at a “Mickey Mouse shop.” As Uber grew more complex, whimsical service names (like “Mustafa”) made navigating systems more tricky, and onboarding for new joiners more painful. Thuan sent a company-wide email which called for professional-sounding naming conventions, and reminded everyone that Uber was not a “Mickey Mouse shop.” The email didn’t fully solve the issue, but did force the growing org to take itself more seriously.

10. Great engineering talent is global, so bring the opportunity to developers. During Thuan’s time, Uber opened nine engineering offices worldwide in order to access world-class talent. For example, the relatively small Denmark office built and operated core parts of Uber’s infrastructure, such as the trip datastore, Schemaless.

11. What’s the most important part of a CTO’s job? Thuan thinks that it’s to build a high-talent-density team, and to “see around the corner” 18–24 months in advance. As he puts it: “your team handles the six-month problems, while you figure out what the organization needs to look like two years from now.”

12. The hardest use case of AI in software engineering is building new features on legacy codebases. At Faire, Thuan’s team uses “swarm coding” (orchestrated AI agents working in parallel) and some engineers there have doubled their output in three months. But generating greenfield code is easy; the real challenge is dealing with millions of lines of code and building features on top with all those existing dependencies.

13. AI raises the floor, but doesn’t change what makes engineers great. AI enables people who can’t code to produce decent apps, but great engineers are still finding ways to leverage AI tools and accelerate even more. The differentiators remain the same as before AI: curiosity, fearlessness, and a willingness to innovate and learn new things.

14. Thuan’s career advice: think of it in phases. Each segment of your career has different priorities, which Thuan sees like this:

First 5–10 years: seek maximum learning and push yourself hard.
Mid-career as a senior/staff engineer: seek roles to make an outsized impact in, perhaps at a smaller company.
In leadership roles: teach and coach others, and bring them along with you.