Law 6: Comments Should Explain Why, Not What
1 The Comment Dilemma: More Harm Than Good?
1.1 The Opening Hook: A Familiar Scenario
Picture this common scenario: You've just been assigned to work on a legacy project, or perhaps you're returning to your own code after several months. As you open the file, you're confronted with a sea of code and comments. Some functions have extensive comments explaining every single line, while others have no comments at all. You spend hours trying to understand the logic, the business rules, and the rationale behind certain implementation choices. The comments that do exist often merely restate what the code is already saying, adding no real value. "Increment counter by 1," one comment helpfully explains next to a line that clearly reads counter++. Another comment provides a detailed explanation of a simple loop structure, yet offers no insight into why this loop is necessary or what problem it's solving.
This scenario is all too familiar for programmers across all levels of experience. It represents a fundamental misunderstanding of the purpose of comments in code. The problem isn't the absence of comments, but rather the presence of unhelpful ones that fail to serve their intended purpose. This chapter addresses this pervasive issue and establishes a clear principle for effective code documentation: comments should explain why code exists, not what it does.
1.2 The State of Comments in Modern Software Development
The landscape of code commenting in modern software development is paradoxical. Despite decades of progress in software engineering practices, commenting remains one of the most inconsistently applied and misunderstood aspects of programming. Studies of open-source repositories and proprietary codebases reveal a wide spectrum of commenting practices, from completely uncommented code to excessively verbose documentation that adds little value.
Research conducted across numerous software projects indicates that approximately 15-30% of source code consists of comments, yet a significant portion of these comments provide little to no explanatory value beyond what the code itself already communicates. A 2019 analysis of 10,000 popular GitHub repositories found that while most projects had some form of documentation, the quality and usefulness of inline comments varied dramatically, with many comments simply restating the obvious or containing outdated information.
This inconsistency stems from several factors. First, programming education often emphasizes the importance of commenting without teaching how to write effective comments. Many developers are told "comment your code" without guidance on what makes a comment valuable. Second, commenting practices are rarely standardized within development teams, leading to inconsistent approaches. Third, the rapid evolution of code often leaves comments behind, creating a disconnect between what the comments describe and what the code actually does.
The consequences of poor commenting practices extend far beyond mere inconvenience. They impact code maintainability, team productivity, knowledge transfer, and ultimately the success of software projects. A study by the Consortium for IT Software Quality found that poor code documentation, including inadequate comments, contributes to approximately 20% of software defects and significantly increases the time required for debugging and maintenance.
As software systems grow in complexity and teams become more distributed, the need for effective documentation practices becomes increasingly critical. The challenge is not merely to add more comments, but to add the right kind of comments—those that provide genuine insight into the code's purpose and rationale.
2 Understanding the Purpose of Code Comments
2.1 Defining "What" vs. "Why" in Code Comments
The fundamental principle at the heart of this chapter is the distinction between comments that explain "what" code does and those that explain "why" it does it. This distinction is crucial for understanding how to write effective comments that truly enhance code comprehension and maintainability.
"What" comments describe the mechanics of the code—the operations being performed, the data structures being manipulated, or the algorithms being implemented. These comments essentially restate in natural language what the code already expresses through its syntax and structure. For example:
// Increment the counter by 1
counter++;

# Loop through all users in the list
for user in users:
    # Process each user
    process_user(user)
These "what" comments add no value because they simply restate what the code already clearly communicates. Any programmer familiar with the language can understand what the code is doing without these comments. In fact, such comments often create maintenance overhead, as they must be updated whenever the code changes, yet provide no additional information.
In contrast, "why" comments explain the rationale behind the code—the reasons for particular implementation choices, the business rules being enforced, the context of the problem being solved, or the constraints that influenced the design. These comments provide insight that cannot be derived from the code itself. For example:
// We need to increment the counter before processing to maintain consistency with the legacy system's expectations
counter++;

# Process users in batches of 100 to avoid memory issues with large datasets (observed in production with >10K users)
for i in range(0, len(users), 100):
    batch = users[i:i+100]
    process_batch(batch)
These "why" comments provide valuable context that helps future developers understand the reasoning behind the implementation. They explain the decisions, constraints, and business logic that influenced the code, which cannot be inferred from the code alone.
The value of "why" comments becomes particularly evident when dealing with complex algorithms, business rules, or workarounds for external system limitations. Consider the following example:
/**
 * Calculates the discount rate for a customer based on their purchase history.
 *
 * Why this approach: We use a weighted average of the last 6 months of purchases
 * rather than a simple average because marketing analysis showed that recent
 * purchases are better predictors of future spending patterns. This was determined
 * during the Q3 2022 analysis of customer retention patterns.
 *
 * Note: The weights decrease from the most recent month to the oldest
 * (0.4, 0.3, 0.15, 0.1, 0.03, 0.02), based on the marketing team's
 * regression analysis of customer behavior.
 */
public double calculateDiscountRate(Customer customer) {
    // Implementation details...
}
This comment explains not just what the method does (calculates a discount rate), but why it uses a particular approach (weighted average based on marketing analysis). This context is invaluable for future developers who might need to modify or extend this functionality.
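To make the documented rationale concrete, here is a minimal Python sketch of the weighted average that the comment describes. The function name `weighted_monthly_spend` and the input shape (six monthly totals, most recent first) are assumptions made for illustration. The text does not say how the average is turned into a discount rate, so that step is left out.

```python
# Weights for the last six months, most recent first. The values are the ones
# quoted in the comment above; they sum to 1.0.
MONTHLY_WEIGHTS = (0.4, 0.3, 0.15, 0.1, 0.03, 0.02)


def weighted_monthly_spend(monthly_totals):
    """Return the weighted average of six monthly purchase totals.

    Why weighted rather than a plain mean: recent purchases were found to
    predict future spending better than older ones, so recent months count
    for more. (Hypothetical helper; the input order is an assumption.)
    """
    if len(monthly_totals) != len(MONTHLY_WEIGHTS):
        raise ValueError("expected exactly six monthly totals, most recent first")
    return sum(w * t for w, t in zip(MONTHLY_WEIGHTS, monthly_totals))
```

Keeping the weights in one named constant, next to the comment that justifies them, means a later revision of the analysis only has to touch a single place.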
The distinction between "what" and "why" comments represents a fundamental shift in how we think about code documentation. Instead of viewing comments as a restatement of the code's mechanics, we should view them as a complement to the code that provides the context, rationale, and reasoning that the code itself cannot express.
2.2 The Psychology Behind Effective Comments
To understand why "why" comments are more effective than "what" comments, we need to examine the cognitive processes involved in reading and understanding code. When developers read code, they build a mental model of what the code does and why it does it. This process involves several cognitive activities:
- Pattern Recognition: Experienced developers recognize common programming patterns and idioms, allowing them to quickly understand what a piece of code does without explicit explanation.
- Inference: Developers make inferences about the purpose and rationale of code based on its structure, naming, and context.
- Mental Model Construction: Developers build a mental model of the system, including how different components interact and what business rules or requirements are being implemented.
"What" comments interfere with this process by adding redundant information that the developer would have inferred anyway. They create cognitive noise that can actually slow down comprehension rather than speeding it up. When a developer reads a comment that simply restates what the code does, they must process this redundant information, which adds to their cognitive load without providing any benefit.
In contrast, "why" comments enhance the mental model construction process by providing information that cannot be inferred from the code alone. They answer questions that naturally arise in the developer's mind: "Why was this approach chosen?" "What problem does this solve?" "What constraints influenced this design?" By providing this context, "why" comments help developers build a more complete and accurate mental model of the system.
Research in cognitive psychology supports this approach. Studies on expertise and problem-solving have shown that experts focus more on the underlying principles and rationale of a solution, while novices focus more on surface features. In the context of code, this means that experienced developers are more interested in understanding why code is written a certain way, while less experienced developers might initially focus on what the code does.
Effective "why" comments bridge this gap by making the rationale explicit, allowing developers at all experience levels to understand the context and reasoning behind the code. This is particularly important for complex systems where the relationship between code structure and business requirements may not be immediately apparent.
Another psychological aspect to consider is the concept of "cognitive load theory," which suggests that our working memory has limited capacity. When reading code, developers must simultaneously process the code itself, remember the context, and build a mental model of the system. "What" comments increase cognitive load by adding redundant information, while "why" comments reduce cognitive load by providing context that helps organize and structure the information being processed.
The temporal aspect of code comprehension is also important. When a developer first reads a piece of code, they may focus on understanding what it does. However, as they continue to work with the code, they need to understand why it was implemented in a particular way, especially when making modifications or debugging issues. "Why" comments provide lasting value throughout the entire lifecycle of the code, while "what" comments are only potentially useful during the initial reading and quickly become redundant.
Understanding these psychological principles helps explain why effective commenting practices focus on explaining "why" rather than "what." By providing the rationale and context that cannot be inferred from the code itself, "why" comments enhance comprehension, reduce cognitive load, and support the mental model construction process that is essential for working effectively with code.
3 The Impact of Poor Commenting Practices
3.1 Code Maintenance Challenges
Poor commenting practices create significant challenges for code maintenance, which constitutes a substantial portion of the software development lifecycle. Research indicates that maintenance typically accounts for 60-80% of the total cost of software ownership, and inadequate documentation, including poor comments, is a major contributing factor to these costs.
When comments fail to explain the "why" behind code, maintenance becomes a process of detective work rather than straightforward engineering. Developers must spend excessive time reverse-engineering the original intent and rationale behind the code, leading to several specific challenges:
Increased Time to Understanding: Without proper "why" comments, developers need more time to understand the code before they can modify it. A study conducted at Microsoft found that developers spend approximately 58% of their time trying to understand code, and poor documentation was identified as a primary factor contributing to this time expenditure. When comments only explain "what" the code does, they add little to this understanding, forcing developers to infer the rationale from the code structure, naming, and surrounding context.
Risk of Incorrect Modifications: When developers don't understand why code was implemented in a certain way, they are more likely to make incorrect assumptions when modifying it. This can lead to the introduction of bugs or the violation of important business rules. For example, consider a piece of code that appears to have an inefficient algorithm but was actually designed that way to handle a specific edge case that occurs rarely but has serious consequences. Without a comment explaining this rationale, a developer might "optimize" the code, inadvertently removing the handling for that edge case.
Accumulation of Technical Debt: Poor commenting practices contribute to the accumulation of technical debt. When code is difficult to understand and modify, developers often take shortcuts or implement quick fixes rather than proper solutions. These shortcuts may address immediate needs but create long-term maintainability issues. Over time, this technical debt compounds, making the codebase increasingly difficult to work with.
Knowledge Silos: In teams where comments don't capture the rationale behind code, knowledge tends to reside with individual developers rather than being shared across the team. This creates knowledge silos where only certain team members understand specific parts of the codebase. When these individuals leave the team or move on to other projects, their knowledge is lost, creating significant challenges for those who must maintain their code.
Regression Bugs: One of the most common consequences of poor commenting practices is the introduction of regression bugs—bugs that occur when changes to one part of the system inadvertently break functionality in another part. Without proper "why" comments explaining the relationships and dependencies between different components, developers may not realize the full impact of their changes.
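The risk of incorrect modifications described above can be illustrated with a small Python sketch. The `EventBus` class is hypothetical. It contains a list copy that looks wasteful, and the comment records the edge case that makes the copy necessary.

```python
class EventBus:
    """Tiny publish/subscribe hub (hypothetical, for illustration only)."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def unsubscribe(self, listener):
        self._listeners.remove(listener)

    def publish(self, event):
        # Why the copy: a listener may unsubscribe itself while handling the
        # event. Iterating over the live list would then skip the listener
        # that follows the removed one. The copy looks redundant, but removing
        # it reintroduces that bug.
        for listener in list(self._listeners):
            listener(event)
```

Without the comment, replacing `list(self._listeners)` with `self._listeners` looks like a harmless optimization, and the skipped-listener bug would only appear when a listener removes itself during dispatch.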
Consider a real-world example from a financial services company. The system had a complex calculation for determining interest rates that had been implemented years ago. The code was uncommented, and the original developers had long since left the company. When new regulatory requirements mandated changes to how interest rates were calculated, the current development team struggled to understand the existing implementation. They spent weeks reverse-engineering the code, consulting business analysts, and reviewing historical documentation to understand the rationale behind the calculation. During this process, they accidentally introduced a bug that caused the system to miscalculate interest rates for certain types of accounts, resulting in significant financial losses and regulatory penalties. This entire situation could have been avoided with proper "why" comments explaining the rationale behind the original calculation.
The impact of poor commenting practices on code maintenance extends beyond just time and cost. It affects developer morale, as working with poorly documented code can be frustrating and demotivating. It also affects the ability to onboard new team members, as the learning curve for a poorly documented codebase is significantly steeper. Ultimately, poor commenting practices contribute to a vicious cycle where the difficulty of maintaining code leads to further shortcuts and deteriorating code quality.
3.2 Team Collaboration and Knowledge Transfer
The quality of code comments has a profound impact on team collaboration and knowledge transfer, which are critical factors in the success of software development projects. In modern software development, where teams are often distributed, members change frequently, and projects span years, effective knowledge transfer is not just beneficial but essential.
Poor commenting practices create several barriers to effective collaboration and knowledge transfer:
Fragmented Understanding: When comments fail to explain the "why" behind code, team members develop fragmented and potentially inconsistent understandings of the system. Each developer may interpret the rationale differently based on their own experiences and assumptions. This fragmentation leads to inconsistencies in how different parts of the system are modified and extended, resulting in architectural drift and increased complexity.
Communication Overhead: Teams working with poorly commented code spend more time in meetings, discussions, and code reviews trying to understand and explain the rationale behind implementation decisions. This communication overhead reduces productivity and can lead to misunderstandings and misinterpretations. A study by the Software Engineering Institute found that teams working with poorly documented code spent up to 30% more time in communication activities related to understanding and clarifying code.
Onboarding Challenges: New team members face significant challenges when joining a project with poor commenting practices. Without "why" comments to provide context and rationale, they must rely heavily on other team members to understand the codebase. This creates a bottleneck, as the time senior developers spend explaining the code is time they could spend on development tasks. It also increases the time it takes for new members to become productive, which can be particularly problematic in projects with tight deadlines or high turnover.
Loss of Institutional Knowledge: When developers leave a team or project, their knowledge of the rationale behind code decisions leaves with them if it's not captured in comments. This loss of institutional knowledge can be devastating for a project, especially when key contributors depart. A study by the Consortium for IT Software Quality found that projects with high developer turnover and poor documentation practices experienced a 40% increase in defect rates and a 25% decrease in development velocity.
Inconsistent Decision-Making: Without "why" comments to document the rationale behind past decisions, teams may make inconsistent or contradictory decisions over time. This inconsistency can lead to architectural problems, duplicated functionality, and increased complexity. For example, one part of the system might handle errors in one way, while another part handles them differently, simply because the rationale for the original approach wasn't documented.
Consider a case study from a large e-commerce company. The company had a platform that had been developed over many years by multiple teams. The codebase had inconsistent commenting practices, with many sections lacking "why" comments. When the company decided to implement a new pricing strategy, the development team struggled to understand how the existing pricing system worked and why certain decisions had been made. Different team members had different understandings of the system, leading to conflicting approaches to implementing the new strategy. The project took twice as long as planned, required extensive rework, and ultimately resulted in a pricing system that was even more complex and difficult to maintain than the original. This situation could have been avoided with proper "why" comments documenting the rationale behind pricing decisions.
The impact of poor commenting practices on team collaboration extends beyond just development efficiency. It affects team dynamics, as frustration with poorly documented code can lead to blame and resentment. It affects code quality, as developers may be reluctant to modify code they don't fully understand. It also affects innovation, as developers may be hesitant to experiment with new approaches when they don't understand the existing system well enough to assess the impact of changes.
Effective "why" comments serve as a form of asynchronous communication, allowing developers to share knowledge and context without requiring direct interaction. They create a shared understanding of the system that transcends individual team members and persists over time. In distributed teams, where members may be working in different time zones or have limited opportunities for direct communication, "why" comments become even more critical for maintaining alignment and consistency.
4 The Science of Good Comments
4.1 Cognitive Load Theory and Comments
Cognitive Load Theory (CLT), developed by educational psychologist John Sweller in the 1980s, provides a valuable framework for understanding how comments affect code comprehension. CLT posits that working memory has limited capacity, and learning is most effective when cognitive load does not exceed this capacity. The theory identifies three types of cognitive load:
- Intrinsic Load: The inherent complexity of the material being learned. In the context of code, this would be the complexity of the algorithm or business logic being implemented.
- Extraneous Load: The load imposed by how information is presented rather than by the material itself; it does not contribute to learning. In code, this would be confusing or redundant comments that add no real value.
- Germane Load: The cognitive resources devoted to processing information, constructing schemas, and automating procedures. In code, this would be the mental effort required to understand the code and build a mental model of the system.
Applying CLT to code comments reveals why "what" comments are problematic while "why" comments are beneficial. "What" comments increase extraneous load by adding redundant information that does not contribute to understanding the code. They force developers to process information that they could have inferred from the code itself, consuming valuable working memory capacity without providing any benefit.
For example, consider the following code with "what" comments:
// Check if user is authenticated
if (user.isAuthenticated()) {
    // Grant access to the resource
    grantAccess(resource);
}
The comments simply restate what the code already clearly communicates. A developer reading this code must process both the code and the comments, increasing extraneous load without providing any additional insight. The comments add noise rather than signal.
In contrast, "why" comments reduce extraneous load and support germane load by providing context that helps developers organize and structure information. They help developers build schemas and mental models more efficiently.
Consider the same code with a "why" comment:
// We must check authentication here rather than relying on the framework's security
// because this endpoint is accessed by both web and mobile clients, and the framework's
// security only applies to web requests (see JIRA-1234 for details)
if (user.isAuthenticated()) {
    grantAccess(resource);
}
This comment explains why the authentication check is necessary in this specific location, providing context that cannot be inferred from the code itself. It helps developers build a more complete mental model of the system, reducing the cognitive effort required to understand the code.
Research in software engineering has applied CLT to understand how various factors affect code comprehension. A study by Busjahn et al. (2015) used eye-tracking to analyze how developers read code and found that comments that explain the rationale behind code help developers navigate the code more efficiently and focus their attention on the most relevant parts. Another study by Fritz et al. (2010) found that developers spend a significant portion of their time trying to understand the context and rationale of code, and that documentation that provides this context can substantially reduce this time.
The implications of CLT for commenting practices are clear:
- Minimize Extraneous Load: Avoid comments that simply restate what the code does. These comments increase extraneous load without providing any benefit.
- Support Germane Load: Provide comments that explain the rationale, context, and reasoning behind code. These comments help developers build mental models more efficiently.
- Manage Intrinsic Load: For inherently complex code, use comments to break down the complexity into manageable chunks and explain the high-level approach before diving into details.
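As a sketch of the third guideline, the Python function below opens with a comment that states the high-level approach before the details begin. The function `merge_intervals` and its task (merging overlapping ranges) are chosen purely for illustration and do not come from the text.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) pairs into a minimal sorted list."""
    # Approach, in two steps:
    #   1. Sort by start, so any interval that overlaps an earlier one ends up
    #      directly after it.
    #   2. Sweep once, either extending the last merged interval or starting
    #      a new one.
    # Sorting first is what makes a single pass sufficient.
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # max() because the new interval may lie entirely inside the
            # previous one, in which case the end must not shrink.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

A reader who has taken in the two-step summary can follow the loop without having to reconstruct the strategy from the individual statements.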
CLT also explains why certain commenting strategies are more effective than others. For example, the strategy of writing comments at a higher level of abstraction than the code itself (explaining the "why" rather than the "what") aligns with the principle of managing intrinsic load by providing a conceptual framework for understanding the code.
Another implication of CLT is the importance of comment placement. Comments that are placed close to the relevant code and are visually distinct from the code itself (e.g., through proper formatting) reduce extraneous load by making it easier for developers to associate the comment with the relevant code and distinguish between the comment and the code.
Understanding CLT provides a scientific foundation for the principle that comments should explain "why" rather than "what." It explains why this approach is more effective from a cognitive perspective and provides guidelines for implementing this principle in practice.
4.2 Documentation as a Living Entity
Effective code comments should be viewed not as static annotations but as living documentation that evolves with the code. This perspective recognizes that code is constantly changing—being modified, extended, refactored, and optimized—and that the documentation must keep pace with these changes to remain valuable.
When comments fail to evolve with the code, they become misleading or even harmful. Consider a comment that explains the rationale behind a particular implementation approach. If the code is later changed but the comment is not updated, future developers may be misled by the outdated comment, leading to confusion and potentially incorrect assumptions. This phenomenon, often referred to as "comment drift," is one of the reasons why some developers argue against extensive commenting.
The solution is not to avoid comments but to treat them as an integral part of the code that must be maintained alongside it. This requires a shift in mindset—from viewing comments as a one-time documentation task to viewing them as living documentation that is updated whenever the code changes.
Several strategies can help maintain comments as living documentation:
Comment-Code Co-evolution: Make it a practice to review and update comments whenever code is modified. This should be part of the standard development workflow, just like testing or code review. When a developer changes code, they should ask themselves: "Does this change affect the rationale or context documented in the comments?" If so, the comments must be updated as well.
Automated Comment Validation: Implement tools and processes that automatically detect potential inconsistencies between code and comments. For example, static analysis tools can identify comments that reference function names, parameters, or variables that no longer exist in the code. While these tools cannot catch all types of comment drift, they can help catch the most obvious cases.
Comment Reviews: Include comment quality and consistency as part of the code review process. Reviewers should check not only that the code is correct and well-written but also that the comments accurately reflect the code and provide valuable context. This helps ensure that comments are maintained as part of the normal development process.
Executable Documentation: Where possible, use executable documentation approaches that combine code and documentation in a way that ensures they remain synchronized. For example, literate programming tools allow developers to write documentation and code together in a single document that can be both read by humans and executed by computers. While this approach is not suitable for all types of code, it can be valuable for algorithms and complex business logic.
Comment Decay Detection: Implement processes to periodically review comments for signs of decay. This could involve automated tools that flag potentially outdated comments based on heuristics (e.g., comments that haven't been modified recently despite changes to the surrounding code) or manual reviews where developers specifically check the accuracy of comments.
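The automated validation idea above can be sketched with Python's standard `ast` and `tokenize` modules. This is a toy check, not an existing tool. It assumes comments refer to functions as `name()`, and it only knows about functions and classes defined in the same source text, so imported names would be reported as false positives. The function name `find_stale_references` is hypothetical.

```python
import ast
import io
import re
import tokenize

# Matches references written as `name()` inside a comment. This naming
# convention is an assumption of the sketch, not a general rule.
CALL_REFERENCE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\(\)")


def find_stale_references(source):
    """Return (line, name) pairs for comments that mention undefined names."""
    tree = ast.parse(source)
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }
    stale = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for token in tokens:
        if token.type != tokenize.COMMENT:
            continue
        for name in CALL_REFERENCE.findall(token.string):
            if name not in defined:
                # token.start is a (row, column) pair; rows are 1-based.
                stale.append((token.start[0], name))
    return stale
```

A check like this only catches the most mechanical form of drift, a renamed or deleted function. It cannot tell whether the rationale in a comment still holds.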
Treating comments as living documentation requires a cultural shift within development teams. It requires recognizing that documentation is not a separate task from coding but an integral part of the coding process. This cultural shift can be facilitated by:
- Leadership Example: Team leads and senior developers should model good commenting practices, including updating comments when they modify code.
- Documentation Standards: Establish clear standards for comments, including expectations for maintaining comments as code changes.
- Tooling Support: Provide tools that make it easy to update comments alongside code, such as IDE features that highlight comments when the associated code is modified.
- Recognition and Incentives: Recognize and reward developers who maintain high-quality documentation, including keeping comments up to date.
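The kind of tooling support and decay heuristic discussed in this section (a comment left untouched while the code beneath it keeps changing) could be sketched as follows. The `BlamedLine` record, the 180-day threshold, and the three-line window are all assumptions. A real tool would have to obtain the per-line dates from version control, which this sketch takes as given input.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class BlamedLine:
    """One source line plus the date it was last changed (hypothetical shape)."""
    number: int
    text: str
    last_modified: date


def flag_possibly_stale_comments(lines, max_lag=timedelta(days=180), window=3):
    """Return line numbers of comments much older than the code below them."""
    flagged = []
    for index, line in enumerate(lines):
        if not line.text.lstrip().startswith("#"):
            continue
        # Only look at the next few non-blank, non-comment lines: that is the
        # code the comment most plausibly describes.
        following_code = [
            other
            for other in lines[index + 1 : index + 1 + window]
            if other.text.strip() and not other.text.lstrip().startswith("#")
        ]
        if not following_code:
            continue
        newest_code_change = max(other.last_modified for other in following_code)
        if newest_code_change - line.last_modified > max_lag:
            flagged.append(line.number)
    return flagged
```

The result is a list of candidates for human review, not a verdict: an old comment above recently edited code may still be perfectly accurate.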
The concept of documentation as a living entity extends beyond just comments to include other forms of documentation such as API documentation, architectural diagrams, and user guides. However, comments are particularly important because they are embedded directly in the code and are therefore most likely to be read by developers working with that code.
When comments are treated as living documentation, they become a valuable asset that continues to provide value throughout the lifecycle of the code. They help ensure that the rationale and context behind implementation decisions are preserved even as the code evolves and the original developers move on to other projects. This preservation of context is critical for maintaining the long-term health and maintainability of software systems.
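One lightweight way to keep documentation and code synchronized, in the spirit of the executable documentation mentioned earlier, is Python's standard `doctest` module. It is a different technique from literate programming: the examples live in a docstring and are checked against the real function. The function `apply_discount` and the reason given for rounding are hypothetical.

```python
def apply_discount(price, rate):
    """Return price reduced by rate, rounded to whole cents.

    Why round here: in this hypothetical system, downstream totals are
    compared against invoice amounts that are stored in cents.

    >>> apply_discount(200.0, 0.25)
    150.0
    >>> apply_discount(19.99, 0.1)
    17.99
    """
    return round(price * (1 - rate), 2)
```

If this were saved as a module, running `python -m doctest` on the file would re-execute both examples and report any that no longer match, so the documented behavior cannot drift silently.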
5 Practical Strategies for Effective Comments
5.1 Commenting Best Practices
Writing effective comments that explain "why" rather than "what" requires both skill and discipline. It involves understanding what information is valuable to future developers and expressing that information clearly and concisely. This section outlines best practices for writing effective comments based on industry experience and research.
Focus on Rationale and Intent: The most valuable comments explain the rationale behind code decisions and the intent of the code. They answer questions like: Why was this approach chosen? What problem does this solve? What constraints influenced this design? What business rules are being implemented?
For example, instead of:
// Set timeout to 30 seconds
connection.setTimeout(30);
Write:
// 30 seconds balances responsiveness against premature timeouts on slow
// networks. The value was determined by user experience testing (see PROJ-456).
connection.setTimeout(30);
Document the "Why" Behind Non-Obvious Decisions: Some code may appear overly complex, inefficient, or counterintuitive at first glance. These are often the places where "why" comments are most valuable. They explain the reasoning behind decisions that might otherwise be questioned or "optimized" by future developers.
For example:
// Using a simple bubble sort here instead of quicksort because:
// 1. The dataset is always small (<10 elements) in practice
// 2. Bubble sort is stable, which is required for our use case
// 3. Benchmarking showed no significant performance difference for our typical datasets
// See performance analysis in docs/performance/sorting.md
bubbleSort(items);
Explain the Context and Constraints: Code often exists within a broader context of business requirements, technical constraints, and historical decisions. Comments that explain this context help future developers understand the code's place in the larger system.
For example:
// This validation logic must match exactly the validation in the mobile app
// to ensure consistent behavior across platforms. Any changes here must be
// coordinated with the mobile team (see MOB-234).
if (!isValidEmail(email)) {
    throw new ValidationException("Invalid email format");
}
Use Comments to Document Assumptions: Code often makes assumptions about the environment, data, or other components. Documenting these assumptions helps future developers understand the conditions under which the code is expected to work.
For example:
// Assumes that the input list is already sorted by timestamp in descending order.
// This is guaranteed by the DataProcessor class (see DataProcessor.process()).
// If this assumption changes, this method will need to be updated.
List<Event> getRecentEvents(List<Event> allEvents) {
    // Implementation...
}
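A documented assumption is more robust when the code can also detect that it has been violated. The sketch below is one possible way to pair the two; the Event type with a timestampMillis field, the RecentEvents class, and the extra limit parameter are hypothetical additions for illustration and are not taken from the original method.

```java
import java.util.List;

final class RecentEvents {
    // Hypothetical event type: only the timestamp matters for this sketch.
    static final class Event {
        final long timestampMillis;

        Event(long timestampMillis) {
            this.timestampMillis = timestampMillis;
        }
    }

    // Assumes allEvents is sorted by timestamp, newest first, so the most
    // recent events are simply the first `limit` entries. The check below
    // makes a violated assumption fail loudly instead of returning stale data.
    static List<Event> getRecentEvents(List<Event> allEvents, int limit) {
        for (int i = 1; i < allEvents.size(); i++) {
            if (allEvents.get(i - 1).timestampMillis < allEvents.get(i).timestampMillis) {
                throw new IllegalArgumentException(
                        "events must be sorted by timestamp, newest first (index " + i + ")");
            }
        }
        return allEvents.subList(0, Math.min(limit, allEvents.size()));
    }
}
```

The check costs a full pass over the list, so on a hot path a team might prefer to keep only the comment, or to move the check into a test; that trade-off is itself worth a "why" comment.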
Provide References to Related Resources: Sometimes the rationale behind code is documented in external resources such as design documents, specifications, research papers, or issue tracking systems. Comments that reference these resources provide a bridge to more detailed information.
For example:
// Implementation of the OAuth 2.0 authorization code grant flow as specified in
// RFC 6749 section 4.1. See https://tools.ietf.org/html/rfc6749#section-4.1
// For our specific implementation decisions, see design document:
// https://wiki.company.com/display/ARCH/OAuth+Implementation
public AuthorizationCode handleAuthorizationCodeRequest(Request request) {
    // Implementation...
}
Write for the Audience: Consider who will be reading the comments and tailor the level of detail accordingly. Code that is likely to be modified by junior developers may need more detailed explanations than code that will only be touched by senior specialists.
Keep Comments Concise but Complete: While it's important to provide sufficient context, comments should also be concise. Avoid unnecessary words and focus on the essential information. A good comment provides the maximum amount of useful information with the minimum number of words.
Use a Consistent Style: Establish a consistent style for comments within a project or team. This includes formatting, terminology, and level of detail. Consistency makes comments easier to read and understand.
Update Comments When Code Changes: Treat comments as living documentation that evolves with the code. When you modify code, review and update the associated comments to ensure they remain accurate and relevant.
Avoid Redundant Comments: Don't write comments that simply restate what the code already clearly communicates. These comments add no value and create maintenance overhead.
Use the Right Type of Comment for the Situation: Different types of comments serve different purposes. Use block comments for detailed explanations of complex code, inline comments for brief explanations of specific lines, and documentation comments (e.g., Javadoc, Doxygen) for API documentation.
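As an illustration of how the three comment types can divide the work, consider the following sketch. The ShippingCalculator class, its rates, and the rule that carriers bill whole kilograms are all invented for this example.

```java
final class ShippingCalculator {
    // Hypothetical rates, chosen only so the arithmetic is easy to follow.
    private static final long BASE_FEE_CENTS = 500;
    private static final long CENTS_PER_KG = 120;

    /**
     * Returns the shipping cost in cents for a parcel.
     *
     * @param weightKg parcel weight in kilograms; must not be negative
     * @return the cost in cents
     * @throws IllegalArgumentException if weightKg is negative
     */
    static long costInCents(double weightKg) {
        if (weightKg < 0) {
            throw new IllegalArgumentException("weightKg must not be negative: " + weightKg);
        }
        /*
         * Block comment: weight is rounded up before pricing because this sketch
         * assumes the carrier bills in whole kilograms. Rounding down would
         * undercharge every parcel that is slightly over a kilogram boundary.
         */
        long billableKg = (long) Math.ceil(weightKg);
        return BASE_FEE_CENTS + billableKg * CENTS_PER_KG; // base fee applies even at 0 kg
    }
}
```

The documentation comment states the contract for callers, the block comment records the rationale for a decision inside the method, and the inline comment flags a single detail on one line.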
Write Comments as You Code: Don't leave commenting as a task to be done later. Write comments as you write the code, when the rationale and context are fresh in your mind. This ensures that the comments capture the original intent accurately.
Review Comments During Code Review: Include comment quality as part of the code review process. Reviewers should check that comments are accurate, relevant, and provide valuable context.
By following these best practices, developers can write comments that truly enhance code comprehension and maintainability. The goal is not to add as many comments as possible, but to add the right comments—those that explain the "why" behind the code and provide context that cannot be inferred from the code itself.
5.2 Tools and Techniques for Better Comments
While good commenting practices rely primarily on developer discipline and skill, various tools and techniques can support and enhance these practices. This section explores tools and techniques that can help developers write more effective comments and maintain them as living documentation.
IDE Features and Plugins: Modern Integrated Development Environments (IDEs) offer features that can facilitate better commenting practices:
- Comment Templates: Many IDEs allow developers to create templates for common types of comments, ensuring consistency and reducing the effort required to write well-structured comments.
- Comment Highlighting: Some IDEs can highlight comments that have not been modified recently when the surrounding code has been changed, helping to identify potential comment drift.
- Documentation Generation: IDEs often integrate with tools like Javadoc, Doxygen, or Sphinx to generate documentation from structured comments, encouraging developers to write more comprehensive comments.
- Code Folding: IDEs typically allow comments to be folded or expanded, making it easier to focus on the code or the documentation as needed.
Static Analysis Tools: Static analysis tools can help identify issues with comments and ensure they remain synchronized with the code:
- Comment-Code Consistency Checkers: Tools like Checkstyle, PMD, and SonarQube can flag some mismatches between documentation comments and code, for example Javadoc @param tags that name parameters which no longer exist.
- Comment Quality Analyzers: Some tools can analyze comments for common issues, such as comments that are likely redundant (e.g., comments that simply restate the method name).
- Spelling and Grammar Checkers: Tools that check spelling and grammar in comments can improve the professionalism and clarity of documentation.
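To give a feel for what a redundancy check might look at, here is a deliberately naive sketch. It is a toy heuristic written for this chapter, not a description of how Checkstyle, PMD, or SonarQube work; the class name, the stop-word list, and the crude plural stripping are all assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class RedundantCommentHeuristic {
    // Words that carry no information about intent in this toy model.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of", "to", "for", "this", "that"));

    // Returns true when every content word of the comment already appears
    // in the identifier, i.e. the comment only restates the name.
    static boolean isLikelyRedundant(String comment, String identifier) {
        Set<String> identifierWords = new HashSet<>();
        // Split camelCase at each lower-to-upper boundary: getUserName -> get, User, Name.
        for (String part : identifier.split("(?<=[a-z0-9])(?=[A-Z])")) {
            identifierWords.add(stem(part.toLowerCase(Locale.ROOT)));
        }
        boolean sawContentWord = false;
        for (String word : comment.split("[^A-Za-z0-9]+")) {
            String lower = word.toLowerCase(Locale.ROOT);
            if (lower.isEmpty() || STOP_WORDS.contains(lower)) {
                continue;
            }
            sawContentWord = true;
            if (!identifierWords.contains(stem(lower))) {
                return false; // the comment says something the name does not
            }
        }
        return sawContentWord;
    }

    // Crude normalization so that "gets" matches "get"; not a real stemmer.
    private static String stem(String lower) {
        return (lower.length() > 3 && lower.endsWith("s"))
                ? lower.substring(0, lower.length() - 1)
                : lower;
    }
}
```

With this sketch, isLikelyRedundant("Gets the user name", "getUserName") is flagged, while a comment such as "Cached because the directory lookup is slow" on the same method is not, because it adds words the identifier cannot express.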
Documentation Generation Tools: Tools that generate documentation from comments can incentivize better commenting practices by making the comments more visible and useful:
- Javadoc/Doxygen/Sphinx: These tools generate documentation from structured comments in the code, creating HTML, PDF, or other formats of documentation.
- Swagger/OpenAPI: For API documentation, tools like Swagger can generate interactive API documentation from an OpenAPI description, which is often derived from annotations or structured comments in the code.
- Literate Programming Tools: Tools like Jupyter notebooks (most commonly used with Python) or Org-mode (in Emacs) allow developers to mix code and documentation in a single document that can be both read by humans and executed by computers.
Version Control Integration: Version control systems can be leveraged to maintain better comments:
- Commit Message Guidelines: Establishing guidelines for commit messages that reference the rationale for changes can complement inline comments.
- Blame Annotations: Version control systems like Git provide "blame" annotations that show who last modified each line of code and when, which can be useful for understanding the context of changes.
- Comment Change Tracking: Some version control systems can be configured to highlight when comments are changed along with code, making it easier to review comment updates.
Collaborative Documentation Platforms: Platforms that facilitate collaborative documentation can complement inline comments:
- Wikis: Team wikis can provide a space for more extensive documentation that complements inline comments.
- Architecture Decision Records (ADRs): ADRs are documents that capture important architectural decisions along with their rationale and consequences. They can be referenced in comments to provide more detailed context.
- Issue Tracking Integration: Linking comments to issue tracking systems (e.g., JIRA, GitHub Issues) can provide additional context and history.
Automated Commenting Tools: While automated commenting tools should be used with caution, some can assist in creating basic comment structures:
- Comment Generation: Some AI-powered tools can generate basic comments from code analysis, which can then be enhanced by developers with the rationale and context.
- API Documentation Generators: Tools that generate API documentation from code signatures can reduce the need for boilerplate comments.
Techniques for Effective Commenting: Beyond tools, certain techniques can help developers write more effective comments:
- Rubber Duck Debugging for Comments: Explaining code out loud (even to an inanimate object like a rubber duck) before writing comments can help identify what aspects of the code need explanation.
- Comment-First Development: Writing comments before writing the code can help clarify the intent and design before implementation.
- Comment Reviews: Including comment quality as a specific focus during code reviews can help maintain high standards.
- Comment Metrics: Tracking metrics related to comments (e.g., comment density, comment drift) can help identify areas for improvement.
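A metric such as comment density can be computed with very little code. The sketch below is a simplified, hypothetical version: it counts only whole-line // comments and ignores trailing comments and /* ... */ blocks, which a real tool would need to handle.

```java
import java.util.List;

final class CommentDensity {
    // Fraction of non-blank lines that are whole-line "//" comments.
    // Trailing comments and block comments are deliberately ignored
    // to keep the sketch short.
    static double of(List<String> sourceLines) {
        int nonBlank = 0;
        int commentLines = 0;
        for (String line : sourceLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            nonBlank++;
            if (trimmed.startsWith("//")) {
                commentLines++;
            }
        }
        // An empty file has no meaningful density; report 0 rather than divide by zero.
        return nonBlank == 0 ? 0.0 : (double) commentLines / nonBlank;
    }
}
```

A number like this says nothing about whether the comments explain "why", so it is best treated as a prompt for closer review (for example, of files with unusually high or low density) rather than as a quality score.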
Training and Resources: Providing training and resources can help developers improve their commenting skills:
- Style Guides: Developing and maintaining a commenting style guide can ensure consistency across a team or project.
- Code Examples: Providing examples of well-commented code can serve as models for developers.
- Workshops and Training: Conducting workshops on effective commenting practices can raise awareness and improve skills.
By leveraging these tools and techniques, development teams can establish an environment that supports and encourages effective commenting practices. The goal is to make it easier for developers to write good comments and to maintain those comments as the code evolves. While tools cannot replace the need for developer judgment and skill, they can significantly enhance the effectiveness and sustainability of good commenting practices.
6 Beyond Comments: Self-Documenting Code
6.1 Writing Code That Explains Itself
While effective comments are valuable, the ultimate goal should be to write code that is so clear and expressive that it requires minimal comments. This concept, often referred to as "self-documenting code," is based on the principle that the best documentation is the code itself. Self-documenting code relies on clear naming, logical structure, and good design choices to make its intent and functionality apparent without extensive comments.
Self-documenting code does not mean code without comments. Rather, it means code that is written so clearly that comments are only needed to explain the "why"—the rationale, context, and constraints that cannot be expressed in the code itself. The "what"—what the code does—should be evident from the code's structure and naming.
Several techniques contribute to writing self-documenting code:
Meaningful Names: The most powerful technique for self-documenting code is using meaningful names for variables, functions, classes, and other code elements. Names should clearly express the purpose and intent of the element they represent.
For example, instead of:
// Process data
int proc(List<Integer> d) {
    // Calculate result
    int r = 0;
    for (int i = 0; i < d.size(); i++) {
        r += d.get(i);
    }
    return r;
}
Write:
int calculateTotalScore(List<Integer> studentScores) {
    int totalScore = 0;
    for (int score : studentScores) {
        totalScore += score;
    }
    return totalScore;
}
The second version requires no comments to explain what it does because the names clearly express the intent and functionality.
Small, Focused Functions: Functions should be small and focused on a single task. When a function does one thing and does it well, its purpose is usually clear from its name and implementation. Large, multifunctional methods are harder to understand and typically require more comments to explain their various responsibilities.
For example, instead of a large function that handles user authentication, session management, and data access, create separate functions for each responsibility:
// Instead of:
void handleUserRequest(Request request, Response response) {
    // Lots of code handling authentication, session management, and data access
}
// Use:
User authenticateUser(Request request) {
    // Authentication logic
}
Session createSession(User user) {
    // Session creation logic
}
Data fetchData(Session session) {
    // Data access logic
}
Consistent Conventions: Using consistent conventions for naming, formatting, and organization makes code more predictable and easier to understand. When developers can predict how code is structured and named based on established conventions, they can understand it more quickly.
Expressive Language Features: Using the expressive features of the programming language can make code more self-documenting. For example, using higher-level constructs like list comprehensions, lambda expressions, or built-in functions can often express intent more clearly than lower-level constructs.
For example, instead of:
// Filter active users
List<User> activeUsers = new ArrayList<>();
for (User user : users) {
    if (user.isActive()) {
        activeUsers.add(user);
    }
}
Write:
List<User> activeUsers = users.stream()
    .filter(User::isActive)
    .collect(Collectors.toList());
The second version expresses the intent more clearly and concisely.
Appropriate Abstractions: Choosing the right level of abstraction can make code more self-documenting. Code that is too abstract can be difficult to understand because it hides important details, while code that is not abstract enough can be bogged down in implementation details that obscure the overall intent.
Design Patterns: Using well-known design patterns can make code more self-documenting for developers familiar with those patterns. When a developer recognizes a pattern, they can leverage their existing knowledge of that pattern to understand the code more quickly.
Error Handling: Clear and consistent error handling can make code more self-documenting. When errors are handled explicitly and meaningfully, it's easier to understand the expected behavior of the code under various conditions.
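As a small illustration, the sketch below handles each failure case explicitly and names the violated rule in the message. The PortParser class and the choice of port parsing as the task are assumptions made for this example.

```java
final class PortParser {
    // Parses a TCP port from user-supplied text. Each failure mode gets its
    // own message so the caller can see which rule the input violated.
    static int parsePort(String text) {
        int port;
        try {
            port = Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            // Keep the original exception as the cause for debugging.
            throw new IllegalArgumentException("port must be a number, got: '" + text + "'", e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port must be between 1 and 65535, got: " + port);
        }
        return port;
    }
}
```

Read top to bottom, the method documents its own expectations: the input is numeric, and it lies within the valid port range. No "what" comment is needed to convey either rule.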
Code Organization: Organizing code logically and coherently can make it more self-documenting. Related functionality should be grouped together, and dependencies should flow in a clear and predictable direction.
Refactoring: Regular refactoring to improve code clarity and structure is essential for maintaining self-documenting code. As code evolves, it can become more complex and less clear. Refactoring helps restore clarity and expressiveness.
Self-documenting code is not achieved through a single technique but through the consistent application of these principles throughout the codebase. It requires discipline and attention to detail, but the benefits are substantial: code that is easier to understand, maintain, and extend.
It's important to note that self-documenting code does not eliminate the need for comments. Even the clearest code can benefit from comments that explain the rationale behind design decisions, the context of the problem being solved, or the constraints that influenced the implementation. The goal is not to eliminate comments but to reduce the need for comments that explain "what" the code does, allowing comments to focus on explaining "why" it does it.
6.2 The Future of Code Documentation
The landscape of code documentation is evolving rapidly, driven by advances in technology, changes in development practices, and new insights into cognitive aspects of code comprehension. Understanding these trends can help developers and teams prepare for the future of code documentation and adapt their practices accordingly.
AI-Assisted Documentation: Artificial intelligence is poised to revolutionize code documentation. AI-powered tools can already generate basic comments and documentation from code analysis, and this capability is expected to improve significantly in the coming years. Future AI tools may be able to:
- Generate high-level summaries of code functionality and rationale
- Identify areas where comments would be most valuable
- Suggest improvements to existing comments
- Automatically update comments when code changes
- Translate comments between languages for international development teams
However, AI tools are unlikely to fully replace human-written documentation, especially for explaining the rationale and context behind code, which often requires domain knowledge and understanding of business requirements that AI may not possess.
Interactive Documentation: The future of code documentation is likely to be more interactive and integrated with the development environment. Instead of static text, documentation may become more dynamic, with features such as:
- Interactive code examples that can be executed and modified
- Visualizations of code structure and data flow
- Links between code and related documentation, issues, and discussions
- Context-sensitive documentation that adapts to the reader's role and experience level
Executable Documentation: The line between code and documentation is blurring, with approaches like literate programming, notebooks, and live coding environments becoming more prevalent. These approaches allow developers to mix executable code with explanatory text. The code examples in such documents stay current because they are executed rather than merely described, although the surrounding prose still needs the same maintenance as any other comment.
Documentation as Code: Treating documentation with the same rigor as code—using version control, automated testing, continuous integration, and code review practices—is becoming more common. This "Documentation as Code" approach ensures that documentation is maintained with the same level of care as the code it describes.
Augmented Reality (AR) Documentation: As AR technology matures, it may offer new ways to visualize and interact with code documentation. Imagine being able to "see" the data flow through a system or visualize the impact of a change in three dimensions.
Voice-Activated Documentation: Voice interfaces may become a common way to interact with documentation, allowing developers to ask questions about code and receive spoken explanations. This could be particularly valuable for developers with visual impairments or for situations where reading is not practical.
Social Documentation: Documentation is becoming more social and collaborative, with features like commenting, discussion threads, and collaborative editing. This social aspect helps capture collective knowledge and facilitates knowledge sharing within teams.
Personalized Documentation: Future documentation systems may adapt to the individual needs and preferences of each developer, providing the right level of detail based on their experience, role, and past interactions with the codebase.
Automated Documentation Testing: Just as code is tested for correctness, documentation may be automatically tested for accuracy, completeness, and consistency with the code. Tools may verify that examples in documentation work as expected and that all public APIs are documented.
Cognitive-Adaptive Documentation: Drawing on research in cognitive science, future documentation systems may adapt to how humans learn and process information, presenting information in ways that optimize comprehension and retention.
Ethical Documentation: As software becomes more critical to society, there is growing awareness of the need for documentation that addresses ethical considerations, such as privacy implications, security considerations, and potential societal impacts.
These trends suggest a future where code documentation is more integrated, interactive, intelligent, and adaptive. However, the fundamental principle that comments should explain "why" rather than "what" will remain relevant. Regardless of the technology used, the most valuable documentation will always be that which provides context, rationale, and insight that cannot be derived from the code itself.
As these technologies evolve, developers will need to adapt their skills and practices. They will need to learn to work effectively with AI-powered tools, to create interactive and engaging documentation, and to maintain documentation as an integral part of the development process. They will also need to balance the use of advanced tools with the human judgment and domain knowledge that are essential for explaining the "why" behind code.
The future of code documentation is exciting, but it will require a commitment to continuous learning and adaptation. By staying informed about emerging trends and technologies, developers can ensure that their documentation practices remain effective and relevant in a rapidly changing landscape.
7 Conclusion and Reflection
7.1 Key Takeaways
Throughout this chapter, we have explored the principle that comments should explain "why" rather than "what" in code. This principle is not merely a stylistic preference but a fundamental aspect of writing maintainable, comprehensible software. Let us summarize the key takeaways from our exploration:
The Value of "Why" Comments: Comments that explain the rationale, context, and constraints behind code provide value that cannot be derived from the code itself. They answer questions that naturally arise in the minds of developers working with the code: Why was this approach chosen? What problem does this solve? What constraints influenced this design? By providing this context, "why" comments enhance comprehension, reduce cognitive load, and support the mental model construction process that is essential for working effectively with code.
The Problem with "What" Comments: Comments that merely restate what the code does add no real value and can actually be harmful. They create maintenance overhead, as they must be updated whenever the code changes, yet provide no additional information. They increase cognitive load by adding redundant information that the developer would have inferred anyway. In the worst case, they can become misleading if they are not updated when the code changes.
Cognitive Foundations: The principle that comments should explain "why" rather than "what" is supported by cognitive science, particularly cognitive load theory. "What" comments increase extraneous cognitive load without providing any benefit, while "why" comments reduce cognitive load by providing context that helps organize and structure information. They help developers build mental models more efficiently, leading to better comprehension and retention.
Practical Impact: Poor commenting practices have significant practical consequences, including increased time to understand code, higher risk of incorrect modifications, accumulation of technical debt, knowledge silos, and regression bugs. These consequences affect not only individual developers but entire teams and organizations, impacting productivity, code quality, and ultimately the success of software projects.
Best Practices: Effective commenting practices focus on providing rationale and intent, documenting non-obvious decisions, explaining context and constraints, documenting assumptions, providing references to related resources, writing for the audience, keeping comments concise but complete, using a consistent style, updating comments when code changes, avoiding redundant comments, using the right type of comment for the situation, writing comments as you code, and reviewing comments during code review.
Tools and Techniques: Various tools and techniques can support and enhance effective commenting practices, including IDE features and plugins, static analysis tools, documentation generation tools, version control integration, collaborative documentation platforms, automated commenting tools, techniques for effective commenting, and training and resources.
Self-Documenting Code: The ultimate goal should be to write code that is so clear and expressive that it requires minimal comments. Self-documenting code relies on meaningful names, small focused functions, consistent conventions, expressive language features, appropriate abstractions, design patterns, clear error handling, logical code organization, and regular refactoring. Self-documenting code does not eliminate the need for comments but reduces the need for comments that explain "what" the code does, allowing comments to focus on explaining "why" it does it.
Future Trends: The landscape of code documentation is evolving rapidly, with trends toward AI-assisted documentation, interactive documentation, executable documentation, documentation as code, augmented reality documentation, voice-activated documentation, social documentation, personalized documentation, automated documentation testing, cognitive-adaptive documentation, and ethical documentation. Regardless of these technological advances, the fundamental principle that comments should explain "why" rather than "what" will remain relevant.
7.2 Continuous Improvement
Mastering the art of effective commenting is not a one-time achievement but a journey of continuous improvement. As with any skill, it requires practice, reflection, and a commitment to learning. Here are some strategies for continuously improving your commenting practices:
Reflect on Your Comments: Regularly review the comments you write. Ask yourself: Do they explain "why" rather than "what"? Do they provide valuable context that cannot be inferred from the code? Are they clear and concise? Could they be improved?
Learn from Others: Study the comments written by experienced developers. Look for examples of effective "why" comments in open-source projects or within your own organization. Analyze what makes these comments effective and how you can apply these principles to your own comments.
Seek Feedback: Ask colleagues to review your comments and provide feedback. Include comment quality as a specific focus during code reviews. Be open to constructive criticism and use it to improve your commenting skills.
Stay Informed: Keep up with research and best practices in code documentation. Read books, articles, and blog posts on the subject. Attend conferences, workshops, or webinars on software craftsmanship and documentation.
Experiment with Different Approaches: Try different commenting techniques and approaches to see what works best for you and your team. Experiment with different levels of detail, different styles, and different types of comments.
Measure the Impact: Look for ways to measure the impact of your commenting practices. For example, track how long it takes for new team members to become productive, or how frequently questions arise about code you've written. Use this information to identify areas for improvement.
Share Your Knowledge: Teach others about effective commenting practices. Teaching is one of the best ways to deepen your own understanding. Share examples of good comments, explain the principles behind them, and help others improve their skills.
Adapt to Context: Recognize that different contexts may require different approaches to commenting. A quick prototype may need fewer comments than a critical piece of production code. A complex algorithm may need more detailed comments than a straightforward utility function. Adapt your commenting practices to the specific context.
Embrace Change: Be open to new tools, techniques, and approaches to code documentation. The field is evolving rapidly, and new technologies and insights are constantly emerging. Be willing to adapt your practices as new and better approaches become available.
Balance Principles with Pragmatism: While the principles discussed in this chapter are important, they should be applied with pragmatism. There may be situations where strict adherence to these principles is not practical or necessary. Use your judgment to find the right balance for each situation.
By committing to continuous improvement, you can develop the skill of writing effective comments that explain "why" rather than "what." This skill will not only make your code more maintainable and comprehensible but will also contribute to your growth as a software professional.
In conclusion, the principle that comments should explain "why" rather than "what" is a fundamental aspect of writing high-quality, maintainable software. By focusing on providing rationale, context, and insight that cannot be derived from the code itself, comments become a valuable asset that enhances comprehension, facilitates collaboration, and supports the long-term health of software systems. As you continue your journey as a software professional, let this principle guide your commenting practices, and strive to write comments that truly add value to your code and your team.