While working on a codebase, developers spend a lot of time understanding code. We might be unwilling to admit it for fear of sounding dumb, but the amount of time spent making sense of code is staggering. In fact, a conference is held every year just to tackle this problem. The good news is that, with the emergence of online communities, more and more people are openly discussing the challenges of working with large codebases.
The Back Story
About 10 years ago I was working on the Office codebase at Microsoft. Even when implementing really small features, it would take me a lot of time just to figure out how different parts of the code related to one another. As it was one of my first really large projects as a developer, I quietly spent the time understanding the code and hid that from the rest of the team. I was not very experienced then and felt that it was just a personal limitation.
The cause of the problem was not the team or bad code. It was the simple fact that having 100+ developers writing code day in, day out on a project means that there is a lot of code to read – even if you are only focusing on one part of the codebase. After a month or so of working on the codebase, I started to wish that I had better tools to help me. Knowing that Microsoft would pay for a license if I found a tool I was happy with, I went looking for one and was quite upset not to find anything helpful.
In fact, I wanted something so much that I even wrote a tool in my spare time (more on that in another post). As I dove deeper into building something useful, the project grew into a Master’s thesis and eventually a PhD thesis before I felt I had gotten to a useful point. As I researched the topic, I was really surprised to find how many studies had already been done on it.
Results from Studies
Update: We have updated the content below and included it in a table on our website. For an easier to read page of this data see here.
IBM research by Corbi in 1989 found that more than half of a programmer’s effort in accomplishing a task goes towards understanding the system. Later, at Bell Labs in 1992, Davison’s team found that new project members spend 60%-80% of their time understanding code, with the number dropping to as low as 20% as developers gain experience with the code they are working on. Another study in 1997 at the National Research Council in Canada, led by Singer et al., found developers spending over 25% of their time either searching for or looking at code.
More recently, in 2006, a study conducted at Microsoft with 157 developers found that roughly as much time is spent understanding code as on each of the other tasks, such as designing, unit testing, and writing. In 2007, a survey of over 780 developers at Microsoft, conducted by a team led by Cherubini, found that 95% agreed that understanding existing code is a significant part of their job. Further, over 65% indicated that they spend time understanding existing code at least once a day (with over 25% indicating that they do so multiple times a day).
Even Peter Hellam mentions in his blog that more than 70% of a developer’s time is spent understanding code, as it is a prerequisite for testing code, modifying existing code, or even writing new code.
Although many tools that help developers understand code are evolving, their effectiveness in day-to-day work is yet to be determined. Until then, as developers it is our responsibility to write clear and understandable code to make life easier for anyone who will work on our codebase in the future.
Perhaps John Woods says it best: