The productivity impact of AI coding tools
How much of a productivity boost do GitHub Copilot and ChatGPT give software engineers? Results from a survey with over 170 respondents, and a look into what to expect from AI coding tools.
👋 Hi, this is Gergely with a subscriber-only issue of the Pragmatic Engineer Newsletter. In every issue, I cover challenges at Big Tech and startups through the lens of engineering managers and senior engineers.
A month ago, I asked developers and others working with AI coding tools like GitHub Copilot and OpenAI’s ChatGPT to share their experiences so far. More than 170 respondents provided personal insights. In this issue, we dig into this expert feedback, and I offer my thoughts, too.
Today, we cover:
The survey. An overview of questions and the profiles of respondents.
Comparable productivity gains. The gains which GitHub Copilot and ChatGPT offer are enticing, but not totally unprecedented within tech. Experienced engineers with decades in the business share examples of previous comparable productivity improvements.
GitHub Copilot. Its most common use cases, where the biggest gains are to be found, and when this tool isn’t so useful.
ChatGPT. Most common use cases and when to proceed with caution.
Copilot vs ChatGPT. How the tools compare. When is one better than the other?
The present and future of AI coding tools. Common observations and interesting predictions from survey respondents.
Copilot and ChatGPT alternatives. The number of AI tools is growing; here is a list of other popular, promising options.
“Are AI coding tools going to take my job?” A very common question and source of concern for some engineers.
1. The survey
In the survey, I asked these questions:
What AI coding tool have you been using recently?
How long have you been using the tool for?
How have you been using the tool?
How would you summarize your view of the tool in one sentence?
Does this tool make you more efficient?
What areas have you seen the most productivity gains?
What is a tool that has comparable productivity gains?
What are areas where this tool is not that helpful?
What types of work is this tool the most helpful for?
Any other comments?
In total, 175 people responded with answers which I’ve analyzed. Of this number, 134 say they’ve been using GitHub Copilot, 39 say they use ChatGPT, and 2 responses mention other tools – including someone who uses Raycast AI together with Copilot:
Going through the responses, many people who marked “Copilot” as their answer also use ChatGPT, meaning the separation is not as black-and-white as the above chart suggests.
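For a sense of proportion, the tool split can be worked out from the counts reported above. This is a minimal sketch that uses only those three numbers; the helper name `share_percent` is made up for illustration, and the percentages are rounded to one decimal place.

```python
# Response counts as reported in the survey summary above.
responses = {"GitHub Copilot": 134, "ChatGPT": 39, "Other tools": 2}

total = sum(responses.values())


def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage share of each tool among all respondents.
shares = {tool: share_percent(n, total) for tool, n in responses.items()}

for tool, pct in shares.items():
    print(f"{tool}: {responses[tool]} of {total} ({pct}%)")
```

As the paragraph above notes, many Copilot respondents also use ChatGPT, so these shares describe the primary tool people named, not exclusive usage.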
By role split, the majority of respondents are software engineers:
How long have people been using Copilot or ChatGPT? Over half of respondents say more than 3 months:
Despite the hype around ChatGPT right now, it should be little surprise that usage of it is lower on average than GitHub Copilot, as ChatGPT was publicly released in November 2022, more than a year after GitHub Copilot, which has been available since October 2021.
What about perception of efficiency gains from using these tools? It’s quite positive:
You can browse the 🔒 raw data of all responses here.
2. Comparable productivity gains
Tech workers who’ve been using Copilot or ChatGPT overwhelmingly say the tools do bring productivity gains. So we definitely are seeing a “jump” in developer productivity. But is this increase a new phenomenon, or have similar productivity gains occurred before?
I asked this question in the survey, and engineers with substantial experience in the industry cite several occasions when they experienced productivity jumps similar to those from Copilot or ChatGPT.
1. Using a debugger over printing statements (1970s) One respondent compared the productivity gain of AI coding tools to using a debugger for the first time. Before then, applications were debugged by printing statements to the console log.
2. Usenet groups (1980s) Cloud Solution architect Neil Mackenzie shared how back in the day, print books took ages to arrive and Usenet groups were a big source of support when developing.
3. The World Wide Web (1989) Pre-internet, most software engineers learned coding from books and each other. The arrival of the world wide web was a true game changer. Software engineer Michael Bushe shared that in around 1992 his employer General Electric blocked developers from using the web, so they could not take advantage of communicating on this medium. He said that back then, developers knew each other’s specialisms by the stack of books on their desks.
4. MSDN quarterly CDs (1992) In 1992, Microsoft launched the Microsoft Developer Network. It shipped CD ROMs to subscribers quarterly, containing the latest documentation on Microsoft APIs and libraries, technical articles and programming information. A lot of this content was not available online, and so developers building Microsoft software relied on them for up-to-date documentation. Among all the listed productivity improvements, this feels the least significant. Still, it’s a nice reminder that compact discs were a small productivity boost in the 1990s!
5. Google (1998) A software engineer who’s been coding since before Google launched in 1998, shared that ChatGPT’s boost feels similar to what Google provided when it launched and people could discover programming-related answers online.
I wasn’t coding pre-Google and started at my first workplace in 2008, where colleagues often reminisced about when you couldn’t Google answers and had to rely on books, or knowing which websites to visit. While search engines existed before Google – like Altavista or Ask Jeeves – Google was much better at locating relevant niche content like programming topics.
6. Modern IDEs (1997) Several people mention the introduction of the modern integrated development environment (IDE), with tools like Visual Studio (released 1997) or IntelliJ (released 2001), which made a major difference to coding after simple text editors like Notepad++.
Early versions of IDEs had capabilities like basic autocomplete and one-click commands to build, test or run programs.
7. ReSharper (2004) Several respondents say that ReSharper by JetBrains provided a significant productivity boost akin to AI coding tools today. ReSharper is a suite of intelligent application development tools, including functionality like:
Refactoring support with features like renaming variables/functions/classes, changing method signature, extracting methods, introducing variables
Enhanced navigation to jump around the code
Advanced autocomplete
8. Scaffolding frameworks and tools (2005) Backend engineers note they experienced similar productivity improvements from onboarding to frameworks like Ruby on Rails (2005) or Laravel (2011). These frameworks could generate boilerplate code for app scaffolds and test structures with just a few commands.
9. Web developer tools (2006) Previously, developing on the web was painful, especially when it came to debugging JavaScript code or UI layout. Firebug in 2006 was a major breakthrough in improving frontend engineers’ productivity. In 2008, Google launched “Developer Tools” and the company continues investing in these tools which most frontend and fullstack engineers rely on.
10. Stack Overflow (2008) Near the end of the noughties, searching for programming questions online felt broken. Google returned plenty of results for a query, but few solutions. I recall one site called “Experts Exchange” showed up frequently among the top search results, but with the correct answer locked, visible only to paying members. Worse, the “correct” answer was often wrong.
In 2008, Stack Overflow launched and was an instant success. Engineers could ask questions and usually get high-quality answers, sometimes within hours. Over time, these high-quality answers became easy to find via Google, making it simple to find answers to common programming questions.
11. Smartphones (2007) One respondent said the leap AI coding tools offer feels similar to when they could first search topics on a mobile phone, instead of having to be at home on their desktop computer, or finding a library with internet access.
Interestingly, very few respondents compared AI coding tools’ productivity gains with any tool or approach released in the 2010s.
12. Jupyter notebooks (2011) Simon Willison, co-creator of the Django web framework, tells me that Jupyter Notebook for Python development – a web-based interactive computing platform – felt like a comparable productivity boost.
13. Tabnine (2019) Tabnine is an AI coding assistant similar to Copilot, which was first released in 2019 using OpenAI’s GPT-2 model. Several people say Tabnine offered a similar productivity boost to Copilot.
14. “No comparison.” Some respondents say they’ve never seen a productivity jump like this, including a former software engineer – now an engineering director – who shares that they’ve been in the industry for 30 years, and haven’t experienced any technology which generated gains on this scale.
My take is that there have been plenty of technological “jumps” in productivity:
Information access: driven by the spread of the internet, better search, and dedicated programming Q&A sites
Developer tooling: debuggers, IDEs, and specialized debugging and developer tools for areas like web or mobile development
We are likely seeing another category of productivity improvements, and in ways we’ve not experienced before: a “smart autocomplete” and a “virtual coding assistant” which build on information access and improved developer tooling.
3. GitHub Copilot
GitHub Copilot uses the OpenAI Codex model, released in August 2021. Codex is a descendant of OpenAI’s GPT-3 model and is supposedly less capable than the later GPT-3.5 and GPT-4 models.
The common sentiment among respondents about Copilot is that it increases engineering productivity in almost all cases. It’s most useful for autocomplete, writing tests, generating boilerplate code, and simple tasks. Several engineers refer to Copilot as a “fancy autocomplete” or “autocomplete on steroids.”
However, Copilot can (and does!) produce incorrect code. When using Copilot, you need to assume it produces errors, and verify the code it suggests. The consensus is that Copilot isn’t very useful for complex tasks, or when working in niche domains.
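To make “verify the code” concrete, here is a minimal sketch of checking an autocomplete-style suggestion against a few known cases. The suggested function is a hypothetical example written for illustration, not actual Copilot output, and the names `is_leap_year_suggested`, `is_leap_year`, `KNOWN_CASES` and `failing_cases` are all made up.

```python
# Hypothetical autocomplete-style suggestion: it looks plausible, but it
# ignores the century rules of the Gregorian calendar.
def is_leap_year_suggested(year: int) -> bool:
    return year % 4 == 0


# Version corrected after review: century years are leap years only
# when they are divisible by 400.
def is_leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# A few cases with well-known answers, used as a quick check.
KNOWN_CASES = {2024: True, 2023: False, 2000: True, 1900: False}


def failing_cases(fn):
    """Return the years for which fn disagrees with the known answer."""
    return [year for year, expected in KNOWN_CASES.items() if fn(year) != expected]


print(failing_cases(is_leap_year_suggested))  # the suggestion gets 1900 wrong
print(failing_cases(is_leap_year))            # the corrected version passes
```

The point is not this particular function. A handful of cases with known answers is often enough to catch a suggestion that reads well but is subtly wrong.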
A few quotes from respondents summarizing sentiment about Copilot:
"A useful tool that isn't always correct, but it is correct enough to be useful."
“A really powerful tool in the hands of experienced engineers who have a lot of foundational knowledge, and are starting to code in a new language which they don’t know so well.”
"Good for writing boring scaffolding, but only when you know the language or framework well."
“It's my third hand. Many small tasks are easily done in Copilot.”
"I don’t see it replacing seasoned engineers anytime soon, but those who adopt it will output vast amounts more."
How engineers use Copilot
The most-mentioned use cases: