Indented vs non-indented if-statements: performance difference

To non-developers, discussions about the visual layout of source code can seem somewhat inconsequential. Layout probably ought to be inconsequential, with the choice being based on experimental studies that discovered how source should be visually organised to minimise the cognitive effort expended by developers while processing it.

In practice software engineering is not evidence-based. There are two kinds of developers: those willing to defend to the death the layout they use, and those that have moved on.

In its simplest form visual layout involves indenting code some number of spaces from the left margin. Use of indentation has not always been widespread, and people wrote papers extolling the readability benefits of indenting code.

My experience of talking to developers about indentation is that they are heavily influenced by the indentation practices adopted by those around them when first learning a language. Layout habits from any prior language tend to last a while, depending on the amount of time spent with the prior language.

As far as I know, I have had zero success arguing that the Gestalt principles of perception provide a useful framework for deciding between different code layouts.

The layout issue that attracts the most discussion is probably the indentation of if-statements. What, if any, is the evidence around this issue?

Developer indentation discussions focus on which indentation is better than the alternatives (whatever better might be). A more salient question is the size of the difference in developer performance: is the difference large enough to care about?

Researchers have used several techniques for measuring differences in developer performance, including: code comprehension (i.e., number of correct answers to questions about code the subjects have just read), subjective ratings (i.e., how hard the subjects found the task), and time to complete a task (e.g., modify source, find a coding mistake).

The subjects have invariably been a small sample of undergraduates studying for a computing degree, so the usual caveats about applicability to professional developers apply.

Until 2023, the most detailed work I knew of was a PhD thesis from 1974 studying the impact of mnemonic/meaningless variable names plus none/some indentation (experiments 1, 2 and 9), and a 1983 paper which compared subject performance with no indentation and with indentation of 2/4/6 spaces (contains summary data only). Both studies used small programs.

The 2023 paper "Indentation in Source Code: A Randomized Control Trial on the Readability of Control Flows in Java Code with Large Effects" by J. Morzeck, S. Hanenberg, O. Werger, and V. Gruhn measured the time taken by 20 subjects to answer 12 questions about the value printed by a randomly generated program containing a nested if-statement. The following shows an example without/with indentation (values were provided for i and j):

 if (i != j) {          if (i != j) { 
 if (j > 10) {             if (j > 10) {
 if (i < 10) {                if (i < 10) {
 print (5);                      print (5);
 } else {                     } else {
 print (10);                     print (10);
 }                            }
 } else {                  } else {
 print (12);                  print (12);
 }                         }
 } else {               } else {
 if (i < 10) {             if (i < 10) {
 print (23);                  print (23);
 } else {                  } else {
 print (15);                  print (15);
 }                         }
 }                      }
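
For readers who want to produce similar stimuli, the following is a minimal sketch of rendering one nested if-statement both without and with indentation. It is not the generator used in the paper: the function names (make_if_tree, render), the uniform nesting depth, the comparison against 10, and the 3-space indent are all assumptions made for illustration.

```python
import random

def make_if_tree(depth, rng):
    """Build a random nested if: a leaf is an int to print, otherwise
    a (condition, then_branch, else_branch) tuple. Hypothetical helper."""
    if depth == 0:
        return rng.randint(1, 30)
    var, op = rng.choice(["i", "j"]), rng.choice(["<", ">"])
    cond = f"{var} {op} 10"
    return (cond, make_if_tree(depth - 1, rng), make_if_tree(depth - 1, rng))

def render(node, indent_width, level=0):
    """Return the source lines for node; indent_width=0 gives flat code."""
    pad = " " * (indent_width * level)
    if isinstance(node, int):
        return [f"{pad}print ({node});"]
    cond, then_branch, else_branch = node
    lines = [f"{pad}if ({cond}) {{"]
    lines += render(then_branch, indent_width, level + 1)
    lines.append(f"{pad}}} else {{")
    lines += render(else_branch, indent_width, level + 1)
    lines.append(f"{pad}}}")
    return lines

# Fixed seed so the same program is rendered in both layouts.
tree = make_if_tree(depth=2, rng=random.Random(1))
flat = render(tree, indent_width=0)
indented = render(tree, indent_width=3)

# Print the two layouts side by side, as in the example above.
for left, right in zip(flat, indented):
    print(f"{left:<22} {right}")
```

The two renderings contain exactly the same tokens on each line; only the leading whitespace differs, which is the single variable the experiment manipulated.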

A fitted regression model found that the average response time of 122 seconds (yes, very slow) for non-indented code decreased to 44 seconds (not quite as slow) for indented code, i.e., about three times faster (code+data). This huge performance improvement is very different from most software engineering experiments, where the largest effect is the difference in performance between subjects, with learning producing the next largest effect.
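
As a quick check of the "about three times" figure, the two fitted averages quoted above can be turned into a ratio and a percentage reduction. This is plain arithmetic on the rounded values in the text, not a rerun of the regression model; the variable names are my own.

```python
# Average response times quoted in the text, in seconds.
non_indented_s = 122
indented_s = 44

speedup = non_indented_s / indented_s          # how many times faster
reduction = 1 - indented_s / non_indented_s    # fraction of time saved

print(f"{speedup:.2f}x faster, {reduction:.0%} less time")
# prints "2.77x faster, 64% less time"
```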

This is evidence that indentation is very effective, but nobody doubted this. There has been a follow-up study; more on that another time.

  1. November 14, 2024 08:33 | #1

    Assuming indentation makes such a large difference to programmer performance, might we expect to see higher performance in languages which enforce a particular style?
    I.e., are Python devs more productive? And can indentation explain some of this?
    Conversely, how can the claimed productivity benefits of Lisp be explained? To my eye, Lisp indentation always seemed pretty random.

  2. November 14, 2024 14:37 | #2

    @Allan Kelly
    Your question assumes that developers don’t indent in other languages. This might be true in Basic, and perhaps even a lot of Fortran. Lack of indentation in the early days may have been driven by lack of disk space and screen size (I didn’t indent when I wrote Fortran).
    I assume that these days, everybody indents; ok, perhaps not for assembly language.
    Ah, the ‘claimed productivity’ of using a language that is the favourite of the person making the claim.
    People are great at spotting visual patterns; it just needs a bit of practice.

  3. Michael Tempest
    December 7, 2024 10:11 | #3

    There are holy wars around whether one should use spaces or tabs for indentation. My preference is irrelevant to this comment, but I have experienced the indentation mess (and associated cognitive load) that results when spaces and tabs are mixed and the software maintainer uses a different editor (or different editor settings) to the original author. Yes, code reformatters should make this problem go away. In practice, there exist software development environments where code reformatters are not allowed (and yes, there are reasons for this).

    What I have learned from this is:
    1) Indentation matters
    2) Holy wars aside, it doesn’t matter whether you use tabs or spaces, but it matters that you use them consistently
