Spefik asked Johannes to talk about everything we know about program comprehension, which is a lot, apparently 🙂 There are two routes of reading, bottom-up and top-down (which was also discussed in the previous talk by Igor and Andy).
Even the term "program comprehension" is weird to Johannes! He would prefer "code comprehension", reading the code, because the program is the running thing. If you are comprehending it, you are running the program. Cool observation!
So what now? There are many cognitive processes that are well understood but not used at all in program comprehension, since people in programming do not have a lot of knowledge of psychology. Johannes will change that with this talk 🙂
What is a construct?
A construct is an explanatory variable which is not directly observable, for example personality, intelligence and other latent properties. These are invisible properties, but still things we comfortably talk and reason about. Understanding constructs will help us to reason. The opposite is a manifest variable.
Sometimes people use the wrong tools to measure things, like measuring length with a scale.
Johannes showed us a number of pictures in which we had to find letters and symbols, and there are really big differences between the tests. One of them was the famous Stroop test, in which you have to name the color a word is printed in (not read the word). This is hard because you cannot really turn your reading sense off unless you focus really hard.
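To make the color-naming test (known as the Stroop test) a bit more concrete, here is a minimal sketch of how such trials could be generated. This is not from Johannes's slides: the color list and the names `make_trial`, `make_trials` and `correct_answer` are made up for illustration. Incongruent trials, where the word and the ink color disagree, are the ones where your reading gets in the way.

```python
import random

# Hypothetical color set; any list of color names would do.
COLORS = ["red", "green", "blue", "yellow"]


def make_trial(congruent, rng):
    """Build one trial: a color word shown in some ink color."""
    word = rng.choice(COLORS)
    if congruent:
        ink = word  # word and ink color agree
    else:
        # pick any ink color that differs from the word itself
        ink = rng.choice([c for c in COLORS if c != word])
    return {"word": word, "ink": ink, "congruent": congruent}


def make_trials(n, seed=0):
    """Alternate congruent and incongruent trials, reproducibly."""
    rng = random.Random(seed)
    return [make_trial(i % 2 == 0, rng) for i in range(n)]


def correct_answer(trial):
    """The participant must name the ink color, not read the word."""
    return trial["ink"]


if __name__ == "__main__":
    for trial in make_trials(4):
        print(trial, "-> say:", correct_answer(trial))
```

In a real experiment you would compare response times between the two kinds of trials; the sketch only builds the stimuli.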
So, if we want to measure, for example, the effect of syntax highlighting, we have to take these types of effects into account. What exactly are we optimizing for? Is it a feature search or a conjunction search? What is the task that participants have to do? Maybe we should do something like semantic highlighting! Highlight the things that we can choose rather than the fixed things in the syntax that we can probably read easily. For example, if we have to find the number of characters in a text, this is really useful:
Legibility, Readability, Comprehensibility
In German (and Dutch) there is one word for these, but in English we can be more precise. For legibility, colors and font matter; for readability, keywords matter; and for comprehensibility, domain knowledge matters. These are very different things, and people in programming studies seem to skip over a bit too much detail, says Johannes.
Perception and Cognition
Two other properties are perception (noticing things in the code) and cognition (thinking and reasoning about the code).
Johannes states that readability is not a property of code, but of the tool that presents the code. We cannot separate the tool from the code and measure it separately.
Johannes did a study showing that people are 19% quicker to find a bug when identifiers are words rather than letters and abbreviations. This is a nice example of a study with a precise task: find a bug. The title of the paper talks about the effect on comprehension, which Johannes argues he is at least measuring, because in order to fix a bug you need to understand the code. The study has been replicated with Java, rather than C# as Johannes used originally, but that paper is still under review.
Organization of Knowledge
So, what do we know about knowledge? There is something called dual-coding theory. The simple idea is that some things are easy to draw (like an egg) while some are really hard to draw (freedom). This has an impact on, for example, objects. Some people say that objects are natural, but it is hard to represent abstract notions as objects. It is really still unclear whether objects as in OO are useful.
This was a GREAT talk. Johannes talks way too quickly, so I feel like I only covered 5% of what was there! The takeaways:
PS I am loving this Dagstuhl!!