What limits productivity gains?

In a previous post on whether AI will replace programmers, we touched on the issue of productivity gains from AI and how most businesses have yet to see a significant return on investment. While companies are splurging hundreds of billions per year on AI models and infrastructure, there are fears about a balance sheet mismatch between short-term liabilities and long-term profits. There is, however, still a general faith among the C-suite that AI will transform their businesses and unlock massive gains, if only they can stick it out a little longer. So much money is riding on this belief that we must examine it rigorously, and it doesn’t take much to start poking holes in the argument.

Reaching the limit

Amdahl’s Law, formulated by computer scientist Gene Amdahl in 1967, states that the overall speedup of a system is limited by the portion that cannot be improved. Mathematically:

\[S = \frac{1}{(1 - p) + \frac{p}{s}}\]

where \( S \) is the overall speedup, \( p \) is the proportion of work that can be accelerated, and \( s \) is the speedup factor for that portion. As \( s \rightarrow \infty \), the overall speedup is bounded by \( \frac{1}{1 - p} \).

Consider a task where AI can accelerate 50% of the work. Even if AI completes that portion infinitely fast, the overall speedup is only \( 2\times \). For AI to turn an average programmer into a “\( 10\times \) developer”, it would have to trivialize 90% of their work, since \( \frac{1}{1 - p} = 10 \) requires \( p = 0.9 \). That is a completely unrealistic proportion, even if we believe that tools can handle most of the coding and design (they can’t).
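
To get a feel for how quickly the ceiling bites, here is a minimal Python sketch of the formula above. The function names `amdahl_speedup` and `required_fraction` are ours, chosen for illustration; the inputs are the proportions used in the text.

```python
import math


def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup S = 1 / ((1 - p) + p / s) from Amdahl's Law."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    remaining = (1.0 - p) + p / s  # fraction of the original time still needed
    # p == 1 with s == inf leaves no work at all, so the speedup is unbounded.
    return math.inf if remaining == 0.0 else 1.0 / remaining


def required_fraction(target: float) -> float:
    """Smallest p that allows `target` overall speedup even as s -> infinity."""
    if target < 1.0:
        raise ValueError("target must be at least 1")
    return 1.0 - 1.0 / target


if __name__ == "__main__":
    for p in (0.5, 0.9):
        for s in (2.0, 10.0, math.inf):
            print(f"p={p:.0%}  s={s:>4}  S={amdahl_speedup(p, s):.2f}x")
    print(f"A 10x developer needs p >= {required_fraction(10.0):.0%}")
```

Note how little the speedup factor matters once \( p \) is fixed: a tool that is ten times faster on half the work yields roughly \( 1.82\times \) overall, barely short of the \( 2\times \) ceiling that an infinitely fast tool would reach.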

Expanding the horizon

Amdahl’s Law seems to impose harsh constraints on productivity gains, but there is a way out. Gustafson’s Law says that, though some portion of a task can’t be sped up, we can do more of the part that can, in effect decreasing the proportion that is resistant to acceleration.

This is the distinction between strong and weak scaling. In the former, the problem size stays the same, while in the latter it is increased. Suppose a programmer spends 2 hours a day on administrative work and 6 hours coding a project¹. If an AI agent can write the code in 2 hours, they’ll have done the day’s work in half the time. But if the programmer could spend the other 4 hours coding up two more projects, they’ll have tripled their output in the same amount of time as before².
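
The arithmetic of that workday can be sketched in a few lines of Python. The scaled-speedup formula used here, \( S = (1 - p) + p \cdot s \), is the usual statement of Gustafson’s Law, where \( p \) is the share of the scaled day spent on accelerated work; the function and variable names are ours, for illustration only.

```python
from fractions import Fraction


def strong_scaling_speedup(serial_hours, parallel_hours, s):
    """Same workload, finished sooner (the Amdahl view)."""
    before = serial_hours + parallel_hours
    after = serial_hours + parallel_hours / s
    return before / after


def gustafson_speedup(serial_hours, workday_hours, s):
    """Same workday, filled with more accelerated work (the Gustafson view).

    p is the share of the scaled day spent on accelerated work, and the
    scaled speedup is S = (1 - p) + p * s.
    """
    p = (workday_hours - serial_hours) / workday_hours
    return (1 - p) + p * s


# Numbers from the example above; exact fractions avoid rounding noise.
admin = Fraction(2)            # hours of administrative work, not accelerated
coding = Fraction(6)           # hours to code one project by hand
coding_with_ai = Fraction(2)   # hours to code one project with the agent
s = coding / coding_with_ai    # speedup on the coding portion
workday = admin + coding       # the 8-hour day

projects_per_day = (workday - admin) / coding_with_ai

print("strong scaling speedup:", strong_scaling_speedup(admin, coding, s))
print("projects per 8-hour day:", projects_per_day)
print("Gustafson scaled speedup:", gustafson_speedup(admin, workday, s))
```

Counted in projects, output triples. Counted in hand-equivalent hours (2 of admin plus 18 of coding, done in 8), the scaled speedup is \( 2.5\times \), because the two administrative hours are not multiplied. Either way, the gain depends entirely on having those two extra projects waiting.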

So does this unlock miraculous productivity gains? That depends on whether there is enough work that can be sped up with AI. When it comes to software development, Fred Brooks in The Mythical Man-Month estimates that only a small fraction of time should be reserved for coding:

For some years I have been successfully using the following rule of thumb for scheduling a software task:

\( 1/3 \) planning

\( 1/6 \) coding

\( 1/4 \) component test and early system test

\( 1/4 \) system test, all components in hand
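
Plugging this rule of thumb into Amdahl’s Law gives a rough sense of scale. This is a back-of-the-envelope sketch under two assumptions that are ours, not Brooks’s: that his schedule fractions describe how time is actually spent, and that AI accelerates only the coding share, so \( p = 1/6 \). With an infinitely fast coding tool:

\[S_{\max} = \frac{1}{1 - p} = \frac{1}{1 - \frac{1}{6}} = \frac{6}{5} = 1.2\]

Under those assumptions the ceiling is a 20% gain, and lifting it requires accelerating planning and testing as well.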

Even if the other phases can be automated to some extent, coding remains a minority of most programmers’ days. We tend to caricature other people’s jobs as consisting mostly of straightforward technical work. But a carpenter is not primarily a woodcutter, nor is a programmer a glorified typist. Every non-menial job is dominated by communication, planning, evaluation, design, and so on. The final implementation is typically viewed as trivial, and handed to the most junior employees³.

Balancing the scales

In 1987, economist Robert Solow famously observed that:

You can see the computer age everywhere but in the productivity statistics

Computers were in widespread business use by that time, yet productivity growth was modest, even below trend. This became known as the Solow Paradox: local efficiency gains were obvious, yet macroscopic impact was nowhere to be found.

A number of explanations were proposed for this discrepancy. One argument was that computers were still too small a share of GDP to have much influence. Another was that old productivity measurements weren’t capturing the benefits brought about by the new machines. Still others doubted that computers were all that productive in the first place: lots of ambitious projects using computers failed, and the trumpeted successes were just survivorship bias.

None of these is particularly satisfactory, especially since productivity picked up in the early 1990s, even before the dot-com boom. There’s no strong argument to be made that businesses which had trouble getting value out of computers in 1987 suddenly, and simultaneously, found the trick to doing so in 1993. What unlocked value was not a technological leap, but a political earthquake.

Throughout the Cold War, defense spending consumed a large share of GDP. When the Soviet Union collapsed in 1991, military budgets quickly fell. Money that had been reserved for warships and fighter jets was suddenly available for household consumption. In addition, the fall of the Iron Curtain meant hundreds of millions of Eastern Bloc consumers were suddenly accessible.

The unexpected surge in demand was what allowed the increased output from computer usage to manifest as GDP. Without corresponding demand, more output not only fails to grow the economy, but undermines it. As an example, consider a refrigerator manufacturer that makes 100 fridges a year. If a more efficient production line allows it to pump out 200 fridges, but there aren’t an extra hundred buyers, the result is oversupply, price collapse, disappearing profits, and eventually layoffs. The unemployed workers further subtract demand from the market, so that only 95 fridges can be sold, creating a vicious cycle⁴.

Forcing the issue

Three forces, then, limit the productivity gains AI can deliver. Amdahl’s Law caps any speedup by the share of work that resists acceleration. Gustafson’s Law offers a way past that ceiling, but only where there is more accelerable work to expand into, and most jobs, including software development, are dominated by the parts AI handles least well. Even when output does rise, it cannot register as economic growth without matching demand, as the Solow Paradox revealed: computers were everywhere by 1987, but the gains stayed invisible in the statistics until the post-Cold War demand surge of the early 1990s caught up to them.

History tells us that supply-demand imbalances are rarely resolved by patient adjustment. The Industrial Revolution’s overcapacity gave way to Luddite uprisings, and the industrial expansion of WWI was followed by a sharp contraction in 1920, strikes, and riots. The challenge is sharper today, as economic imbalances have become global.

In an echo of the early 20th century, nations are now engineering demand through defense spending, not least for AI. As a general rule, weapons that are built will be used⁵. Today that means engineered demand; tomorrow, perhaps, engineered war. Let us pray that the productive facilities of AI are not frustrated in every direction but that of conflict.

¹ Most corporate developers wish their day was partitioned like this.
² This, of course, is the outcome companies want to see, as opposed to the programmer going home at 1:00 PM. Claims that AI, or any technology, will help bring about the 4-hour workday are delusional.
³ This is especially true in fields like law or banking, where document review and number crunching are done by recent college grads, while partners earn their keep by hosting cocktail parties for potential clients.
⁴ It may seem obvious that the refrigerator company should simply not produce so many extra units, but that’s only a winning strategy if it has a monopoly. If there are several competing brands, they must each apply the more efficient production technology to protect market share, lest the others squeeze them out. This tragedy of the commons can be seen today in industries like solar panels.
⁵ Even if it’s decades later, in a very different war.
