Home > Uncategorized > Indented vs non-indented if-statements: performance difference

Indented vs non-indented if-statements: performance difference

November 10, 2024 (2 weeks ago) Leave a comment Go to comments

To non-developers discussions about the visual layout of source code can seem somewhat inconsequential. Layout probably ought to be inconsequential, being based on experimental studies that discovered how source should be visually organised to minimise the cognitive effort consumed by developers while processing it.

In practice software engineering is not evidence-based. There are two kinds of developers: those willing to defend to the death the layout they use, and those that have moved on.

In its simplest form visual layout involves indenting code some number of spaces from the left margin. Use of indentation has not always been widespread, and people wrote papers extolling the readability benefits of indenting code.

My experience with talking to developers about indentation is that they are heavily influenced by the indentation practices adopted by those around them when first learning a language. Layout habits from any prior language tend to last awhile, depending on the amount of time spent with the prior language.

As far as I know, I have had zero success arguing that the Gestalt principles of perception provide a useful framework for deciding between different code layouts.

The layout issue that attracts the most discussion is probably the indentation of if-statements. What, if any, is the evidence around this issue?

Developer indentation discussions focus on which indentation is better than the alternatives (whatever better might be). A more salient question would be the size of the developer performance difference, or is the difference large enough to care about?

Researchers have used several techniques for measuring difference in developer performance, including: code comprehension (i.e., number of correct answers to questions about the code they have just read), subjective ratings (i.e., how hard did the subjects find the task), and time to complete a task (e.g., modify source, find coding mistake).

The subjects have invariably been a small sample of undergraduates studying for a computing degree, so the usual caveats about applicability to professional developers apply.

Until 2023, the most detailed work I know of is a PhD thesis from 1974 studying the impact of mnemonic/meaningless variable names plus none/some indentation (experiments 1, 2 and 9), and a 1983 paper which compared subject performance with indentation of none and 2/4/6 spaces (contains summary data only). Both studies used small programs.

The 2023 paper Indentation in Source Code: A Randomized Control Trial on the Readability of Control Flows in Java Code with Large Effects by J. Morzeck, S. Hanenberg, O. Werger, and V. Gruhn measured the time taken by 20 subjects to answer 12 questions about the value printed by a randomly generated program containing a nested if-statement. The following shows an example without/with indentation (values were provided for i and j):

 if (i != j) {          if (i != j) { 
 if (j > 10) {             if (j > 10) {
 if (i < 10) {                if (i < 10) {
 print (5);                      print (5);
 } else {                     } else {
 print (10);                     print (10);
 }                            }
 } else {                  } else {
 print (12);                  print (12);
 }                         }
 } else {               } else {
 if (i < 10) {             if (i < 10) {
 print (23);                  print (23);
 } else {                  } else {
 print (15);                  print (15);
 }                         }
 }                      }

A fitted regression model found that the average response time of 122 seconds (yes, very slow) for non-indented code decreased to 44 seconds (not quite as slow) for indented code, i.e., about three times faster (code+data). This huge performance improvement is very different from most software engineering experiments where the largest effect is the between subjects performance, with learning producing the next largest effect.

Evidence that indentation is very effective, but nobody doubted this. There has been a follow-up study, more on that another time.

  1. November 14, 2024 (2 weeks ago) 08:33 | #1

    Assuming indentation makes such a large difference to programmer performance might we expect to see higher performance in languages which enforce a particular style?
    I.e. are python devs more productive? And can indentation explain some of this?
    Conversely, how can the claimed productivity benefits of Lisp be explained? To my eye Lisp indentation always seemed pretty random

  2. November 14, 2024 (2 weeks ago) 14:37 | #2

    @Allan Kelly
    Your question assumes that developers don’t indent in other languages. This might be true in Basic, and perhaps even a lot of Fortran. Lack of indentation in the early days may have been driven by lack of disk space and screen size (I didn’t indent when I wrote Fortran).
    I assume that these days, everybody indents; ok, perhaps not for assembly language.
    Ah, the ‘claimed productivity’ of using language that is the favourite of the person making the claim.
    People are great at spotting visual patterns, it just needs a bit of practice.

  1. No trackbacks yet.