Refactoring with Microsoft Visual Studio 2010

By Peter Ritchie

About this book

Changes to design are an everyday task for many people involved in a software project. Refactoring recognizes this reality and systematizes the distinct process of modifying design and structure without affecting the external behavior of the system. As you consider the benefits of refactoring, you will need this complete guide to steer you through the process of refactoring your code for optimum results.

This book will show you how to make your code base more maintainable by detailing various refactorings. Visual Studio includes some basic refactorings that can be used independently or in conjunction to make complex refactorings easier and more approachable. This book will discuss large-scale code management, which typically calls for refactoring. To do this, we will use enterprise editions of Visual Studio, which incorporate features like Application Performance Explorer and Visual Studio Analyzer. These features make it simple to handle code and prove helpful for refactoring quickly.

This book introduces you to improving a software system's design through refactoring. It begins with simple refactoring and works its way through complex refactoring. You will learn how to change the design of your software system and how to prioritize refactorings—including how to use various Visual Studio features to focus and prioritize design changes. The book also covers how to ensure quality in the light of seemingly drastic changes to a software system. You will also be able to apply standard established principles and patterns as part of the refactoring effort with the help of this book. You will be able to support your evolving code base by refactoring architectural behavior. As an end result, you will have an adaptable system with improved code readability, maintainability, and navigability.

Publication date:
July 2010
Publisher
Packt
Pages
372
ISBN
9781849680103

 

Chapter 1. Introduction to Refactoring

 

Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure.

 
 --Martin Fowler

This chapter begins our journey together into refactoring. In this chapter, we'll provide some introductory information on what refactoring is, its importance, and why we'd want to do it. We'll also describe some refactorings and some of the side-effects of not performing regular refactoring.

The following is a list of topics for this chapter:

  • What is refactoring?

  • Why the term refactoring?

  • Simple refactoring

  • Complex refactoring

  • Technical debt

  • The option of rewriting

What is refactoring?

Although the task of refactoring, the term refactoring, and the systematic approach associated with it have all been around for a long time, Martin Fowler et al. were the first to popularize refactoring and begin organizing it more systematically.

Refactoring is a very broad area of software development. It could be as simple as renaming a variable (and updating all the uses of that variable), or it could refer to breaking a class that has taken on too much responsibility into several classes, such as when implementing a pattern. Refactoring applies to all complex software projects that require change over time.

 

A pattern is a [description of] a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice.

 
 --Christopher Alexander

Refactoring is changing the design of a software system without changing its external behavior. Changes to internal structure and interfaces that don't affect the system's external behavior are considered refactoring.

The term refactoring was coined by William Opdyke in what seems to be an evolution of factoring. Factoring was a common term used in the Forth programming language (and mathematics) to mean decomposition into constituent parts. The term refactoring has been around since at least the early 1990s. Few developers would argue that refactoring isn't something they do on a day-to-day basis. Although refactoring is a fairly simple concept, many programmers don't associate changing code without changing external behavior with refactoring.

Refactoring has been an integral part of software development in the Extreme Programming (XP) methodology since Kent Beck introduced it in 1999.

Note

Kent Beck introduced XP in the book Extreme Programming Explained circa 1999.

XP mandates a Test-Driven-Development (TDD) process, where only enough code is written to implement a single feature and to pass at least a single test. The code is then refactored to support implementing another feature and pass all the existing tests. The new feature is then implemented and passes at least a single new test and all current unit tests. Extreme Programming also mandates continuous refactoring. We will discuss the importance of tests when refactoring in Chapter 3.
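The red-green-refactor cycle described above can be sketched with a tiny example (the `PriceCalculator` class and its tax rule are invented for illustration, and plain assertions stand in for a real test framework):

```java
public class PriceCalculator {
    // Extracted into a named constant during the refactor step.
    static final double TAX_RATE = 0.05;

    // Green: just enough code to make the single test below pass.
    public static double totalWithTax(double subtotal) {
        return subtotal + subtotal * TAX_RATE;
    }

    public static void main(String[] args) {
        // Red: this check is written first and fails until the method exists.
        if (Math.abs(totalWithTax(100.0) - 105.0) > 1e-9) {
            throw new AssertionError("totalWithTax(100.0) should be 105.0");
        }
        System.out.println("test passes; safe to refactor and repeat");
    }
}
```

Each new feature repeats the cycle: a new failing test, minimal code to pass all the tests, then a refactoring pass that keeps every existing test green.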

Some of the first tools to support automated refactoring came out of the Smalltalk community. The "Refactoring Browser" was one of the first user interfaces that provided abilities to refactor Smalltalk code. Now refactoring tools are commonplace in almost all programming language communities. Some of these tools are stand-alone, some are add-ins to existing tools, and some refactoring abilities are built into tools and applications whose main purpose isn't refactoring. Visual Studio® 2010 fits into the latter two categories. Visual Studio® 2010 is an integrated development environment (IDE) designed specifically to allow developers to perform almost all the tasks required to develop and deliver software. Visual Studio®, since version 2005, has included various refactoring abilities. Visual Studio® has also included extensibility points to allow third-party authors to write add-ins for it. Some of these add-ins give users of Visual Studio® 2010 specific refactoring abilities. Add-ins such as ReSharper, Refactor! Pro, and Visual AssistX add more refactoring abilities for Visual Studio® 2010 users.

Refactoring has become more mainstream with Visual Studio® users in the past few years because of the built-in automated refactoring abilities of many IDEs, including Visual Studio® 2010. It's now almost trivial, for example, to rename a class in a million-line software project and update dozens of references to that class with a few mouse clicks or keystrokes. Before support for this simple refactoring came about, the task would have involved manually searching for and changing text, or using a brute-force search-and-replace to replace all instances of the class name and then relying on the compiler to tell you which replacements weren't actually uses of the class name. While it's almost unheard of for IDEs to lack some sort of (at least rudimentary) automated refactoring ability, developers will always have to perform many complex refactorings manually (although these may consist of some simple refactorings that can be automated).

Without automatic refactoring abilities and tools, developers feel friction when performing simple refactoring. Changing the order of parameters in a complex million-line software project, for example, is tedious and error-prone. Correcting a spelling mistake in the name of a method that is referenced by dozens of other methods in dozens of other files is time-consuming. Without automated tools, our maintenance problems cause even more friction; simple tasks like changing the order of parameters or fixing spelling mistakes are simply avoided. This prevents a code base from improving in its maintainability, and it becomes even more fragile and even more likely to be neglected (have you ever tried to find a specific method with a text search, only to find out someone had misspelled it?).

It's the ease with which simple refactorings can be performed that has elevated "refactoring" into the lexicon of almost every programmer. Yet there is much more to the act of refactoring than just the simple refactorings that can be accomplished automatically by software.

The common thread in all refactoring is that it is done with a goal. The goal can be as simple as making the code easier to read and maintain, or making the code more robust; or it may be as complex as improving code modularity to make it more decoupled and easier to add new behavior to. But systematic refactoring is the acceptance that the act of writing software is not atomic; it cannot be done in a single step and will evolve over time as our understanding of the problem domain improves and/or our customer's understanding of their needs is discovered and communicated.

 


Why the term refactoring?


So, why bother with a specific term for this type of code modification? Isn't all modifying code simply modifying code? A systematic approach to the different types of editing and writing code allows us to focus on the side-effects we're expecting as a result of changing the code. Making a change that mixes fixing a bug, refactoring, and adding a feature means that if something doesn't work, we're unsure which of our edits caused the problem. The problem could be that we didn't fix the bug correctly, that there was a problem with our refactoring, or that we didn't add the feature properly. Then there's the possibility that one of the edits interacted with another. If we do encounter an adverse side-effect (also known as "a bug") in this process, we have an overly large combination of possible causes to evaluate.

It's much easier to focus on one type of task at a time. Make a bug fix, then validate that the bug fix didn't cause any unexpected side-effects. When we're sure that the bug fix works, we move on to adding new functionality. If there are unexpected side-effects while we're adding new functionality, we know that the side-effect was caused either by the code that we have just added to implement the new feature, or by the way that code interacts with the rest of the system. Once we know the new functionality is working correctly, we can reorganize the code through refactoring. If we encounter any unexpected side-effects through the refactoring, then we know that the problem comes from either the code that was refactored or some way in which the refactored code interacts with the rest of the system, further narrowing down where the unexpected side-effect could exist. This systematic approach reduces the time and work involved in writing software.

It's rare as software designers and programmers that we know at the start of a project exactly what the end-user requires. Requirements can be unclear, wrong, incomplete, or written by someone who isn't a subject matter expert. Sometimes this leads us to make an educated guess at what an end-user requires. We may create a software system in the hope of someone finding it useful and wanting to purchase it from us. Sometimes we may not have direct access to end-users and base all our decisions on second-hand (and sometimes third-hand) information. In situations such as these, we're essentially betting that what we're designing will fulfill the end-user's requirements. Even in a perfect environment, concrete requirements change. When we find out that the behavior does not fulfill the end-user's real requirements, we must change the system. It's when we have to change the behavior of the system that we generally realize that the design is not optimal and should be changed.

Writing software involves creating components that have never been written before. Those components may involve interaction with third-party components. The act of designing and writing the software almost always provides the programmer with essential knowledge of the system under development. Try as we might, we can almost never devise a complete and accurate design before we begin developing a software system. The term Big Design Up Front (BDUF) describes the design technique of devising and documenting a complete design before implementation begins. Unless the design repeats many aspects of an existing design, the act of implementing it will elicit knowledge about the system that clarifies or corrects the design. Certain design techniques are based on this truth. Test-Driven Development, for example, is based on realizing the design as the code is written, where tests validate that the code implements the requirements of the system.

Regardless of the design technique, when a design needs to change to suit new or changing requirements, aspects of the existing design may need to change to better accommodate future changes.

 

Unit testing: the second half of the equation


In order to support the refactoring effort, and to be consistent with changing design and not external behavior, it's important that all changes be validated as soon as possible. Unit tests validate that code continues to operate within the constraints set upon it.

Unit tests validate that code does what it is supposed to do; but unit tests also validate that any particular refactoring hasn't changed the expected behavior and side-effects of that code. They also validate that no new and unexpected side-effects have been introduced.

It's important that refactoring be done in conjunction with unit tests. The two go hand-in-hand. The refactorings described in this book will focus on the details of the refactoring, not on how to unit test the code being refactored. Chapter 11 details some strategies for approaching unit testing in general and in circumstances particular to refactoring.
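The role tests play here can be sketched in a few lines (the `Order` class and its discount rule are hypothetical): the same assertions are run against the code before and after a change, and if both versions satisfy them, the external behavior is unchanged.

```java
public class Order {
    // Before refactoring: one dense expression.
    public static double totalBefore(double price, int qty) {
        return price * qty - (qty >= 10 ? price * qty * 0.1 : 0.0);
    }

    // After an extract-method refactoring: same external behavior.
    public static double totalAfter(double price, int qty) {
        double subtotal = price * qty;
        return subtotal - discount(subtotal, qty);
    }

    private static double discount(double subtotal, int qty) {
        return qty >= 10 ? subtotal * 0.1 : 0.0;
    }

    public static void main(String[] args) {
        // The same checks constrain both versions; a difference would
        // mean the "refactoring" changed behavior.
        double[][] cases = { {5.0, 1}, {5.0, 10}, {2.5, 40} };
        for (double[] c : cases) {
            if (totalBefore(c[0], (int) c[1]) != totalAfter(c[0], (int) c[1])) {
                throw new AssertionError("refactoring changed behavior");
            }
        }
        System.out.println("behavior preserved");
    }
}
```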

 

Simple refactoring


Simple refactorings are often supported by IDEs, IDE add-ins, or stand-alone tools. Many simple refactorings have been performed by programmers since they started programming.

A simple refactoring is generally one that can occur conceptually in one step, usually a change to a single artifact, and that doesn't involve a change in program flow. The following are examples of simple refactorings that we'll look at in more detail:

  • Renaming a variable

  • Extracting a method

  • Renaming a method

  • Encapsulating a field

  • Extracting an interface

  • Reordering parameters

Renaming a variable is a simple refactoring with a very narrow scope, generally limited to a very small piece of code. Renaming a variable is often done manually because of its simplicity. Prior to having automated refactorings, the rename variable refactoring could often be done reliably with search and replace.

The Extract method refactoring is a type of composing method refactoring. Performing an extract method refactoring involves creating a new method, copying code from an existing method into it, and replacing the copied code with a call to the new method. Performing this refactoring can involve local variables outside the scope of the original code, which then become parameters of the new method. This refactoring is useful for abstracting blocks of code to reduce repetition or to make the code more explicit.
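A minimal sketch of the mechanics (all names here are hypothetical): the banner-formatting code has been moved into a new method, and the local variable `title` it used has become a parameter of that method.

```java
public class ReportPrinter {
    // After the extract method refactoring: the banner logic that was
    // repeated inline now lives in its own method, called twice.
    public static String printReport(String title, String body) {
        return banner(title) + body + "\n" + banner(title);
    }

    // The extracted method; `title` was a local variable of the
    // original code and is now a parameter.
    private static String banner(String title) {
        return "== " + title + " ==\n";
    }

    public static void main(String[] args) {
        System.out.print(printReport("Q1", "revenue up"));
    }
}
```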

Another simple refactoring is Rename method. Rename method is a simplification refactoring whose change can impact much of a code base. A public method on a class, when renamed, could impact code throughout much of the system if the method is highly coupled. Performing the rename method refactoring involves renaming the method, then renaming all the references to that method throughout the code.

Encapsulating a field is an abstraction refactoring. It involves removing a field from the public interface of a class and replacing it with accessors. Performing an encapsulate field refactoring may involve simply making the field private and adding a getter and a setter. All references to the original field need to be replaced with calls to the getter or the setter. Once a field is encapsulated, its implementation is abstracted from the public interface of the class and can no longer be coupled to external code, freeing it to evolve and change independently of the interface. This abstraction decreases coupling to the implementation and increases the maintainability of the code.
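A sketch of the result (the `Account` class is invented for illustration): the once-public field becomes private behind a getter and a setter, and the setter is now free to enforce an invariant without touching any caller again.

```java
public class Account {
    // Formerly: public double balance;
    private double balance;

    public double getBalance() {
        return balance;
    }

    public void setBalance(double value) {
        // An invariant that could not be enforced on a public field.
        if (value < 0) {
            throw new IllegalArgumentException("balance cannot be negative");
        }
        this.balance = value;
    }

    public static void main(String[] args) {
        Account a = new Account();
        a.setBalance(100.0); // every former field write becomes a setter call
        System.out.println(a.getBalance());
    }
}
```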

Another simple abstraction refactoring is Extract interface. Performing an extract interface refactoring involves creating a new interface, copying one or more method signatures from an existing class to the new interface, and then having that class implement the interface. This is usually done to decouple use of the class and is usually followed up by replacing references to the class with references to the new interface. This refactoring is often used in more complex refactorings, as we'll see in later chapters.
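A sketch under invented names: the `send` signature is copied from `EmailNotifier` into a new `Notifier` interface, the class implements it, and the caller's reference to the class is replaced with a reference to the interface.

```java
import java.util.ArrayList;
import java.util.List;

// The new interface, holding the signature copied from the class.
interface Notifier {
    void send(String message);
}

// The existing class now implements the extracted interface.
class EmailNotifier implements Notifier {
    public void send(String message) {
        System.out.println("emailing: " + message);
    }
}

// A caller rewritten to reference the interface instead of the class.
class AlertService {
    private final Notifier notifier;
    AlertService(Notifier notifier) { this.notifier = notifier; }
    void raise(String problem) { notifier.send("ALERT: " + problem); }
}

public class ExtractInterfaceDemo {
    public static void main(String[] args) {
        // Any implementation can now be substituted, including one for tests.
        List<String> sent = new ArrayList<>();
        AlertService service = new AlertService(sent::add);
        service.raise("disk full");
        System.out.println(sent.get(0)); // prints "ALERT: disk full"
    }
}
```

The substitution in `main` is exactly the decoupling the refactoring buys: the caller no longer knows or cares which implementation it is given.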

Reordering parameters is a simple refactoring whose process is well described by its name: it involves reordering the parameters of a method. This refactoring is useful if you find that the order of the parameters of a method makes it more prone to error (for example, two adjacent parameters of the same type) or that you need to make the method signature match another (perhaps newly inherited) method. If the method is referenced, the order in which arguments are passed at each call site needs to be changed to match. Although simple, this refactoring could lead to logic errors in code if not fully completed. If the reordered parameters have the same type, all references to the method remain syntactically correct and the code recompiles without error. This could result in arguments being passed to parameters they were not passed to before the "refactoring". If this refactoring is not done properly, it is no longer a refactoring, because the external behavior of the code has changed! Reordering parameters differs from the previously mentioned simple refactorings in that those, if not completed properly, would result in a compiler error rather than silently changed behavior.
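The hazard can be sketched in a few lines (the `transfer` method is hypothetical): after the two same-typed parameters are reordered, a call site that was missed still compiles, but now means the opposite.

```java
public class Transfer {
    // Before the refactoring this was transfer(from, to). After a
    // reorder-parameters refactoring the signature is transfer(to, from);
    // both parameters are Strings, so every old call still compiles.
    public static String transfer(String to, String from) {
        return from + " -> " + to;
    }

    public static void main(String[] args) {
        // A call site missed during the refactoring: it still passes
        // (from, to), so the direction is silently reversed.
        System.out.println(transfer("alice", "bob")); // prints "bob -> alice"
    }
}
```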

Removing parameters is another simplification refactoring. It involves removing one or more parameters from a method signature. If the method is referenced, those references need to be modified to remove the argument that is no longer used. This refactoring is often performed in response to code modifications that leave method parameters unused, which could be the result of a refactoring or of an external behavior change. With object-oriented languages, removing a parameter could cause a compiler error, because it could make the method signature identical to an existing method overload. If there were an existing overload, it would be hard to tell which method references actually referenced the overload and which referenced the method with the newly removed parameter. In cases such as these, it's usually best to revert the change and re-evaluate the two methods.
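The overload collision can be sketched as follows (the `log` methods are invented for illustration): removing the unused-looking `level` parameter from the second method would give it the exact signature of the first, which the compiler rejects.

```java
public class Logger {
    // An existing overload that already has the post-removal signature.
    public static String log(String message) {
        return "INFO: " + message;
    }

    // Suppose `level` turns out to be unused by callers. Removing it
    // would make this signature identical to the overload above: a
    // compiler error, and no way to tell which overload existing call
    // sites meant to invoke.
    public static String log(String message, int level) {
        return "LVL" + level + ": " + message;
    }

    public static void main(String[] args) {
        System.out.println(log("started"));     // prints "INFO: started"
        System.out.println(log("started", 2));  // prints "LVL2: started"
    }
}
```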

These simple refactorings are conceptually easy and aren't difficult to perform manually. Performed manually, however, they could easily involve many lines of code and a lot of repetitive changes. With many repetitive actions and the possibility of introducing human error, many developers tend to avoid performing these refactorings manually, despite their simplicity. Yet these refactorings have the potential to evolve the code in ways that make it easier to maintain and add features to. This makes for more robust code and enables you to respond more quickly to new requirements.

All of these simple refactorings are supported and automated by Visual Studio® 2010. If you rename a method with the rename method refactoring in Visual Studio® 2010 and there are hundreds of references to the method, Visual Studio® 2010 will find all references to it in the current solution and rename them all automatically. This drastically reduces the friction of performing many of these refactoring building blocks.

In Chapter 2, we'll begin performing these simple refactorings with the built-in refactoring tools in Visual Studio® 2010.

Simple refactorings are refactorings that are easy to understand, easy to describe, and easy to implement. They apply to software projects in just about every language; they transcend languages. Refactorings are like recipes or patterns; they describe (sometimes merely by their name) how to improve or change code. The simplicity with which they describe change, and the simplicity of the names by which they are known, has led to several taxonomies of refactorings. Most taxonomies catalogue simple refactorings, but some can be found that include complex or specialized refactorings. Some online taxonomies offer centralized catalogues of refactorings:

http://www.refactoring.com/catalog/

http://industriallogic.com/xp/refactoring/catalog.html

 

Technical debt


As we implement the required functionality and behavior to fulfill requirements, we have the choice of not fully changing the rest of the system to accommodate the change. We may make design concessions to get the feature done on time. Not improving the design to accommodate a single change may seem benign; but as more changes are made to a system in this way, it becomes fragile and harder to manage. Making changes like this is called incurring technical debt. The added fragility and unmaintainability of the system is a debt that we incur. As with any debt, there are two options: pay off the debt or pay interest on it. The interest on technical debt is the increased time it takes to make correct changes to the software system with no adverse side-effects, and/or the increased time spent tracking down bugs as changes made to a more fragile system cause it to break. As with any debt, it is not always detrimental to your project; there can be good reasons to incur it. Incurring technical debt to reach a high-priority deadline, for example, may be a good use of technical debt. Incurring technical debt for every addition to the code base is not.

 

In the software development trenches


I've been a consultant in the software development industry for 15 years. I've worked with many different teams, with many different personalities, and many different "methodologies". It's more common than not for software development teams to tell me of nebulous portions of code that no one on the team is willing to modify or even look at. It performs certain functionality correctly at the moment, but no one knows why. They don't want to touch it because when someone has made changes in the past it has stopped working and no matter how many duct tape fixes they make to it, it never works correctly again.

Every time I hear this type of story, it always has the same footnote: no one has bothered to maintain this portion of code, and no one bothered to try to fully understand it before or after changing it. I call this the "House-of-Cards Anti-pattern": code is propped up by coincidence until it reaches a pre-determined height, after which everyone is told not to go near it for fear that the slightest movement may cause it to completely collapse. This is a side-effect of Programming by Coincidence. Had this nebulous code been constantly refactored to evolve along with the rest of the system or to improve its design, it's quite likely that these regions of code would never have become so nebulous.

Note

Programming by Coincidence is a Pragmatic Programmer term that describes design or programming where code results in positive side-effects but the reason the code "works" is not understood. More information on this can be found here: http://www.pragprog.com/the-pragmatic-programmer/extracts/coincidence

Unlike the financial debt on which this metaphor is based, technical debt is hard to measure. There are no concrete metrics we can collect about what we owe: the increased time it takes to make correct changes to a hard-to-maintain code base without unexpected consequences. This often leads members of the development community to discount keeping a code base maintainable in favor of making a change with complete disregard for the overall consequences. "It works" becomes their catch-phrase.

 

The option of rewriting


I'm not going to suggest that refactoring is the only option for making major changes to a system, even though as developers we "refactor" all the time. Rather than systematically evolving the code base over time, we could simply take the knowledge learned from writing the original code base and re-write it all on the same platform. Developers often embrace re-writing because, for various reasons, they haven't been maintaining large blocks of the code. They feel that over time the changed requirements have caused them to neglect the underlying design, and that it has become fragile, hard to maintain, and time-consuming to change. They feel that if they could just start anew, they could produce a much better quality code base.

While re-writing may be a viable option and will most likely get you from point A to point B, it tends to ignore many aspects of releasing software that some developers prefer not to think about. Yes, at the end of the re-write we might end up with a better code base, and it might be just as good as having refactored over the same amount of time; but there are many consequences to committing the software to a re-write.

Committing the software to a re-write does just that: commits the team to a re-write. Writing software means moving from less functionality to more functionality. The software is "complete" when it has the functionality that the users require. This means that between point A and point B the new software is essentially non-functional. At any given point in time, it can only be expected to perform a subset of the final functionality (assuming it's functional at all at that point). This means the business has to accept the last released version of the software during the re-write. If the software is in such a bad state as to be viewed as requiring a re-write, that usually means the last released version is in a bad state and users are not overly satisfied with it.

Re-writes take a long time. Developers can re-use the concepts learned from the original code and likely avoid some of the analysis that needs to go into developing software; but it's still a monumental task to write software from scratch. Re-writes are like all software development: the schedule hinges on how well the developers can estimate their work and how well they and their process are managed. If the re-write involves the same business analysts, the same developers, the same management, and the same requirements that the original, admittedly flawed, software was based upon, the estimates should be suspect. If all the same inputs that went into the flawed software go into the re-write, why should we expect better quality? After all, the existing software started out with a known set of requirements and still got to its existing state. Why would essentially performing the same tasks with the same process result in better software?

So, we know we are going to get from point A to point B, but now we know the business can't have new software until we get to point B, and we're not really sure when that will be. The time estimates may not be scrutinized at the beginning of the re-write; but if the developers are unable to keep to the schedule, they most certainly will be. If the business has been without new features or bug fixes for an extended period of time, telling them that this timeframe is expanding indefinitely (and that is how they'll view it, because they were already given a timeframe and now it's changed) will not be welcome news.

I know what you're thinking: we can deal with the fact that the re-write means the software can't respond to market changes and new requirements for an unknown and significant amount of time by having two code streams. One code stream is the original code base and the other is the re-write. One set of developers can work on the original code base, implementing bug fixes and responding to market changes by adding features; another set of developers can work on the re-write. And this often happens, especially in response to the business's reaction to the pushed timeframe. If they can't get their re-write right away, they'll need something with new features and bug fixes to keep the business afloat to pay for the re-write. If two code streams aren't created, the software developers are right back where they started, shoe-horning new features and bug fixes into the "pristine" code base. This is one of the very reasons the developers felt the original code base had problems: new features and bug fixes had caused them to neglect the design.

Maintaining two code streams makes the developers feel like they're mitigating the risks involved with having to implement new features and bug fixes into their new design. It seems like everyone should be happy. But, how is this really any different? Why would the business even accept the re-write if they're already getting what they want? How would these new features not make their way into the re-write? This is actually the same as having one code stream; but now we have to maintain two code streams and two teams, and manage change in two code bases in two different ways. So, have we made the problem worse?

It's easy for developers to neglect the design of the system over time, especially if the design process is based on Big Design Up Front. With Big Design Up Front, the software development lifecycle stages are assumed to progress in a way similar to a waterfall: one stage can't continue until the previous stage has completed; for example, development can't begin until "Design" has been completed. This means that adding new behavior to an existing design means going back to the Design stage and coming up with a new design: documenting it, reviewing it, and submitting it for approval.

With a process like this, the management team quickly realizes that they can't make money if every evolution of the software requires a Big Design Up Front and they plead with developers to make small changes without Big Design Up Front to be responsive to customer and market demand despite mandating Big Design Up Front at the start of the project. Developers are almost always happy to oblige because they don't like dealing with processes, and generally like dealing with documentation even less. Going back to analyzing and writing designs isn't what they really want to do, so they're happy to be asked to effectively just write code. But, the problem is that they often fail to make sure the rest of the software can accommodate the change. They simply duct tape it to the side of the existing code (like adding a feature, making sure it passes one or two change-specific use cases, and moving on). It's after months of doing this that the code base becomes so brittle and so hard to manage that no developer wants to look at it because it means really understanding the entire code base in order to make a change that doesn't cause something else to break.

A developer faced with a process like this is almost forced to fall back on a re-write in order to implement any major additions or changes to the software because that's essentially how the lifecycle is organized and what it promotes. They're certainly not presented with a process that guides or even rewards the developer for keeping on top of the entire code base. The reward system is based almost solely on delivery of functionality. It's not the developer's fault that the lifecycle is forcing them into a corner like that. Or is it?

Many developers fall back on management's lifecycle: "I'm just doing what I'm told" or "I didn't define the process". The developer is avoiding any ownership of their work because they haven't been delegated responsibility for part of what goes into that work: the process. This is partially the fault of the developer; they need to take ownership of their work (the code) and understand what it is that only they really have control over. Getting into this situation is also the fault of management (or mismanagement), which assumes that management only manages tasks and time estimates and leaves the details of "software development" to the software developers, yet ties their hands behind their backs by imposing a process they need to follow. Refactoring is a tool to help developers take ownership of the code, especially code they didn't originally author.

Unless we have an almost perfect handle on the code base that we're working with, we're unlikely to know all the implicit functionality of the system. Implicit functionality is unintended functionality that users have come to rely upon or can come to rely upon. This functionality isn't necessarily a bug (but users can come to rely upon bugs as well); it is concrete behavior of the system that users are using. It is never documented and is a side-effect of the way the software was written (that is, not the way it was designed). I've actually worked with end-users who complained when a bug was fixed, because they had communicated amongst themselves a process that included a workaround for the bug. They were annoyed that they had to change this process because they were so used to doing it a certain way (the workaround had essentially become a reflex to them).

The reason a re-write may be appealing is because the developers don't have a complete handle on the code base and therefore cannot have a complete handle on the implicit behavior. A re-write will neglect these implicit behaviors and almost always draw complaints from users. While re-writing is a viable technique, the applicable circumstances are rare. If a re-write is proposed for a project you are on, be sure it is thoroughly evaluated.

 

Working refactoring into the process


Writing software doesn't work well if you don't accept that the industry changes rapidly and that the software needs to respond to these rapid changes and to the changes its users require. If the act of writing a piece of software took a couple of days (or even a couple of weeks) we could get away with not having to evolve existing source to produce changes to software. If it were just industry and requirements changes, software development might be able to keep up with these demands. But our industry contains many competitors, all vying for the same pool of new and existing customers. If a competitor releases software with new features that the software we're writing doesn't have, our software could be obsolete before it even got released. As software developers, we need to embrace change and be as responsive to it as possible. Refactoring helps us achieve this.

It's inevitable: someone will ask for a new feature that the existing design hadn't accounted for. One way of dealing with this, as we've discussed, is to go back to the drawing board with the design. This is too costly due to the consequences of the fallout. The opposite side of the pendulum swing is to shoe-horn the change in, ignore the design, and hope there are no consequences. There's a happy medium that can accommodate changes to the design and still make the customer happy: simply account for refactoring work and inform the customer of the slight increase in time estimates. If this is done consistently, the design is kept up-to-date with what it is required to support and maintains a fairly constant ability to respond to change. Each new feature may require design changes, but now we've spread those design changes over time so that one single new feature doesn't require abandoning the entire design. Each new feature has a constant and limited design-evolution aspect to it.

To clarify with a metaphor: there are two ways of maintaining your car. One is to be entirely reactive: add gas when the gauge reaches E and go to the mechanic when the red light comes on, which means there's trouble. The owner of a car can avoid that red light through regular maintenance. Changing the oil at prescribed intervals, performing routine checks at scheduled times, and so on, go a long way in maintaining the health of your car so that the red light never goes on and you avoid costly repair bills. Software development is much the same; you can neglect regular and preventative maintenance of your code base by ignoring technical debt. Ignoring technical debt could result in a costly "repair bill" if you're forced to pay your technical debt when you haven't scheduled it. "Regular maintenance" comes in the form of constant preventative refactoring. Time should be scheduled for refactoring code, not just in response to adding a new feature. Developers should regularly read the code looking for ways to refactor it and improve its structure. This has two benefits. One is that they're constantly up-to-speed on the code. Even if there are infrequent changes to portions of the code, they're still familiar with it because they're reading and understanding it on a periodic basis.

The other benefit is that the code is being kept up-to-date with the rest of the design and its changes. Keeping the code up-to-date in this way generally means the latest technologies and patterns are incorporated into the code in a timely fashion, keeping it that much more robust and reliable. When customers are informed of the amount of time work entails and this amount of time is consistently met, they're much more accepting of this than to be informed of the amount of time it takes for a re-write.

 

What to refactor


With most software development projects, there's a reasonably constant influx of new requirements or change requests. Depending on the system and the input from your user community, some or many of these requests may require code that is significantly outside what the design currently handles. These requests may be enough to keep your refactoring plate full for a long time.

But, what if your project is already seeing the side-effects of brittle design and hard-to-maintain code? Implementing requirements often introduces bugs in seemingly disparate areas of the code that take non-trivial amounts of time to fix, making estimates of the effort to implement requirements less than accurate. Finding where to make code changes to implement requirements may not be obvious, or the changes end up requiring far-reaching modifications to code across much of the project. Working on a complex software project often means this can be an everyday fact of life. But working on a software team where most changes involve modifying code across much of the code base introduces a friction that makes the time it takes to implement requirements much longer than necessary. Some people may think this is just a part of the software development process; but, because you're reading this book, you don't believe that.

New requirements that clearly don't fit in to the current design offer a clear hint at where to focus our refactoring effort. So, we know that refactoring may produce a more maintainable and robust code base; but, where do we start?

Refactoring to patterns

Many object-oriented code bases are perfectly functional. But these code bases don't consistently attempt to reuse concepts or to use formally accepted patterns. A pattern is a description of communicating objects and classes that are customized to solve a general design problem in a particular context. One way to clean up code is to refactor individual parts of it to use specific, applicable patterns. This is an excellent way of better communicating the intention of the code. Once specific patterns are implemented, the code becomes that much more understandable. Industry-standard terms and structures are then strewn throughout the code rather than project-specific buzzwords. The code becomes easier to consume by newcomers and thus easier to maintain. Chapters 5 through 9 deal specifically with examples of code that use concepts found in several common patterns and show how to refactor the code to make the pattern explicit and thus improve the chances of not repeating code.
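For example, a method that switches on a "shipping type" code in several places might be refactored to the Strategy pattern. The following is a hypothetical sketch (the class and member names are illustrative, not from any particular code base):

```csharp
// Before: each caller switched on a shipping-type code to pick a formula.
// After: the Strategy pattern makes the varying algorithm an explicit,
// industry-standard concept.
public interface IShippingStrategy
{
    decimal CalculateCost(decimal weight);
}

public class GroundShipping : IShippingStrategy
{
    public decimal CalculateCost(decimal weight) { return 1.5m * weight; }
}

public class AirShipping : IShippingStrategy
{
    public decimal CalculateCost(decimal weight) { return 4.0m * weight; }
}

public class Order
{
    private readonly IShippingStrategy shippingStrategy;

    public Order(IShippingStrategy shippingStrategy)
    {
        this.shippingStrategy = shippingStrategy;
    }

    // Callers no longer need a switch statement; the chosen strategy
    // encapsulates the algorithm.
    public decimal ShippingCost(decimal weight)
    {
        return shippingStrategy.CalculateCost(weight);
    }
}
```

A newcomer who knows the Strategy pattern understands the intent of this code immediately, without learning project-specific conventions first.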

Just as refactoring to patterns may make code easier to understand, less likely to repeat itself, and easier to maintain in general; forcing patterns into code for the sake of patterns will make code harder to understand and maintain. It's important to be sure that refactoring to a particular pattern adds value to the readability, understandability, and maintainability of the code. Shoe-horning a pattern where it does not belong will have the opposite effect. Make sure your desire to use a pattern hasn't overshadowed its suitability.

Refactoring to principles

In the race to get products out the door or to meet specific deadlines, programmers can often lose sight of principles in favor of functionality. While this isn't all bad, it could leave the code as a procedural ball of mud, or hard to maintain and understand in many other ways. After deadlines have been met is often a good time to reflect on the code, its design, and its structure as they apply to principles. The code may not be overly object-oriented, for example. The code and the architecture may benefit from a review against the SOLID principles. SOLID is an acronym for the Single Responsibility principle, Open-Closed principle, Liskov Substitution principle, Interface Segregation principle, and Dependency Inversion principle. There are several other object-oriented principles geared towards reducing coupling and complexity in source code, helping to keep software development efforts focused more on adding value and less on responding to issues. Refactoring to principles is discussed in detail in Chapters 5 through 7.
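As a small illustration of the Single Responsibility principle, consider a class that both formats a report and writes it to disk. This is a hypothetical sketch (the names are illustrative): splitting it gives each class one reason to change.

```csharp
using System;
using System.IO;

// The Report class now only holds report data.
public class Report
{
    public string Title { get; set; }
    public string Body { get; set; }
}

// Formatting is one responsibility...
public class ReportFormatter
{
    public string Format(Report report)
    {
        return report.Title + Environment.NewLine + report.Body;
    }
}

// ...and persistence is another. A change to the file format no longer
// risks breaking the formatting logic, and vice versa.
public class ReportWriter
{
    public void Write(Report report, string path)
    {
        File.WriteAllText(path, new ReportFormatter().Format(report));
    }
}
```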

Refactoring to principles and patterns is a design change. Although changes of this nature have positive side-effects (the impetus to implement them) they may also come with negative side-effects. A design change may decrease performance, for example. Any design change should be evaluated to ensure there are no unacceptable negative side effects before being accepted.

Code smells

Kent Beck introduced the term "code smells" (and formalized it along with Martin Fowler) to provide a way of communicating ways to detect potential problems that can occur in code. These problems are generally adverse side-effects of code that effectively works in at least one context. As with real-world smells, some smells may be tolerable by some people and not by others. There are no good smells with code smells, only varying degrees of bad smells.

Note

Kent Beck and Martin Fowler formalized the term code smells in the book Refactoring: improving the design of existing code.

Code smells allow us to easily define common anti-patterns that can be prioritized by opinion or context. Depending on the person or the project in which they work, a code smell can be prioritized quite differently. Some people and teams may feel certain smells are unacceptable (maybe they consider them stenches); others feel the removal of a smell is just a nice-to-have; while still others may feel the same smell is nary an issue.

Code smells are an easily categorized, but personal, means of describing the side-effects that someone discovered (and documented) for a particular type of code. So, the possible code smells, and the degree to which you may find them applicable, are endless. Chapters 2 and 3 go into more detail about code smells and give specific examples of refactoring code in response to common code smells that are generally accepted as having a high return on investment when properly removed.
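One commonly catalogued smell is the Long Parameter List. A hypothetical sketch of removing it with an Introduce Parameter Object refactoring (the names are illustrative):

```csharp
// Smell: a group of parameters that always travel together.
// public void ScheduleDelivery(string street, string city,
//                              string province, string postalCode) { ... }

// Refactored: the related parameters become an explicit concept.
public class Address
{
    public string Street { get; set; }
    public string City { get; set; }
    public string Province { get; set; }
    public string PostalCode { get; set; }
}

public class DeliveryScheduler
{
    public void ScheduleDelivery(Address address)
    {
        // The address now moves through the system as one cohesive object,
        // and adding a field no longer changes every method signature.
    }
}
```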

Complexity

Reducing complexity is always a good reason to change code. Complex code is hard to understand; if code is hard to understand, it's hard for just anyone (or anyone at all) to fix bugs in the code or to add features to it. But how do you focus your efforts on complex code, and how do you find complex code in the first place? After all, you want to fix code, not spend all your time searching for code to fix. Fortunately, there are many tools out there that will tell you how complex your code is.

You can get a gut feeling about how complex some code is by simply reading it. But, if you can't read all your code, code metrics can help. Software metrics are numerical measurements calculated from the structure and content of source code. Many software metrics focus on putting a numeric value on the complexity of code. Metrics like Depth of Inheritance, Class Coupling, and Lines of Code can help tell you how complex regions of code are. More obviously, metrics like Maintainability Index and Cyclomatic Complexity specifically address code complexity.
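To give a feel for how Cyclomatic Complexity is counted, consider this illustrative method. The counting rule sketched in the comments is the common one (1 plus the number of decision points); exact figures can vary slightly between tools.

```csharp
public static class OrderClassification
{
    // Cyclomatic Complexity is roughly 1 plus the number of decision points.
    // This method has three decision points (two ifs and the &&),
    // giving it a complexity of 4 under that counting.
    public static string Classify(decimal total, bool isRushOrder)
    {
        if (total <= 0)
        {
            return "invalid";
        }
        if (total > 1000 && isRushOrder)
        {
            return "priority";
        }
        return "standard";
    }
}
```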

Chapter 6 goes into more detail about specific refactorings in response to specific complexity metrics.

Performance

Focusing on complexity can often indirectly help with the performance of code. But, you often want to focus specifically on performance of code. Fixing performance issues is almost refactoring by definition you want to change the code without changing the external behavior of the code. Many applications have functionally with obvious need of performance improvements; and these are obvious areas of code you should focus upon. But, how do you measure improvement; and if you don't have obvious non-performance features, how do you find areas of code to focus your performance-related refactoring efforts? Luckily, there are many tools out there to gather performance metrics about executed code.
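Whatever tool you use, measuring before and after is what makes a performance refactoring verifiable. A minimal sketch using System.Diagnostics.Stopwatch (the Timing class and DoWork call are illustrative):

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Times a single action; for real measurements you would typically
    // warm up first and average several runs.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}

// Usage (DoWork is a placeholder for the code being refactored):
// TimeSpan elapsed = Timing.Measure(() => DoWork());
```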

Kernel

Many software systems have a core subsystem that performs the majority of the work that the system does. If a subsystem was designed this way, it's likely called a kernel. Depending on the methodology used by the team, a Domain Model may exist that is very similar to a kernel: the code that clearly defines all the domain-specific entities and their logic, which remain constant regardless of the type of front-end, how integration points are implemented, and so on.

Focusing on the core of the system and making sure it's not complex, is easy to understand, easy to change, and easy to add features to goes a long way in improving the responsiveness of a software team. This core of code often deals with proprietary information: the reason the business exists. People with experience with information and logic like this are usually hard to find. You may not be able to simply publish an ad for "C# developers with experience in Telematics" and expect to find many people locally to fill the position. Keeping the kernel simple to understand means you can get people without years of experience in your domain to change and add to the code.

Design methodologies

It's common for source code that has been around for any length of time to have been worked on by team members who have come and gone. Some of those team members may have been influential with regard to the design methodology used (or there may have been some "rogue" developers who deviated from the design methodology accepted by the rest of the team). With some design methodologies, this may not have a noticeable effect on the design of the code; but some design methodologies have fairly distinct effects. Domain-driven design, for example, suggests that domain entities be explicit in the code, usually with an entity-to-class relationship. These entities are often completely decoupled from the rest of the system (user interface, infrastructure, utility methods, and so on). If the design of the system is to remain domain-driven, you may find that some classes need to move to a different place, be decoupled from other classes, and so on. Depending on the level to which domain-driven design has been implemented (or the lack thereof), the code may need to be better organized into layers. Chapter 8 details refactoring to layers. Other techniques attributed specifically to domain-driven design are detailed in Chapters 8 and 10. Specific patterns have been attributed to domain-driven design, and details of those patterns can be seen in the chapters dealing with refactoring to patterns: Chapters 5 through 9.

Unused and highly-used code

Identifying how frequently code is used helps tell you whether a refactoring effort will produce tangible results. One of the easiest refactorings is simply deleting unused code. But, without scouring the code line-by-line, how do you find unused code or focus on highly-used code?

Luckily, there are tools that will tell you how heavily code is used. These tools are called Code Coverage tools. Much like performance metrics tools, they monitor executing code and tell you how frequently lines, methods, and classes are used. Some static analysis tools can tell you about code, methods, and classes that are not referenced by other code, giving you information about unused code. This type of analysis will help focus your unused-code refactoring efforts, but it can't tell you about all unused code. Code, methods, or classes may still be referenced by other code but may never be executed. Code Coverage tools will help tell you about this other type of unused code.

 

Refactoring in Visual Studio® 2010


The Visual Studio® 2010 IDE is much more than a solution/project management system with a text editor. It offers much functionality to aid the user in refactoring their code. Visual Studio® has had features for common refactorings since Visual Studio® 2005, such as Extract Method, Rename, Encapsulate Field, Extract Interface, and so on (more detail about these refactorings can be seen in the following chapters). Visual Studio® 2010 expands the refactoring palette by adding refactorings like Generate From Usage.

Refactoring with Visual Studio®'s built-in automated refactorings doesn't have to be the limit of your refactoring efforts with Visual Studio® 2010. Depending on the edition of Visual Studio® 2010 you're using, you can work with other features of Visual Studio® 2010 to focus, prioritize, or introduce areas of your code to refactor.

Visual Studio® 2010 Premium offers several features that, when coupled with a tangible refactoring effort, will help improve the quality and maintainability of your code base.

Static code analysis

Static code analysis is the analysis of software by an automated tool that does not involve execution of the code. This is most commonly analysis of the source code. The code is analyzed for common simple anti-patterns, logic errors, the structure of individual statements, and so on. Visual Studio® introduced static code analysis in version 2005. The static code analysis in Visual Studio® was based on an internal Microsoft project started by Brad Abrams and Krzysztof Cwalina called FxCop. FxCop was originally written as a means to analyze .NET software and enforce the guidelines detailed in the book Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries. FxCop differed from most static analysis tools available at the time in that it analyzed compiled code; this means it can't analyze code comments. FxCop was eventually subsumed by Visual Studio® and renamed Code Analysis. Although FxCop is still available, it appears that it is no longer being maintained and new functionality added to Code Analysis is not being merged into FxCop.

Note

More information on FxCop can be found at http://go.microsoft.com/fwlink/?LinkId=180978

Visual Studio® 2010 Code Analysis includes 200+ rules for reporting possible design, localization, globalization, performance, security, naming, interoperability, maintainability, portability, and reliability improvements. Every software development team should review the rules available in Visual Studio® and decide what subset (or subsets, depending on types of applications) of rules should be applicable. The rules should then be prioritized so that the team is not overwhelmed with trying to remove many Code Analysis rule violations. The team should decide which rules can be used to help focus the refactoring effort. For example, the rule CA1502 Avoid Excessive Complexity might be corrected by one or more Extract Method refactorings.
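To sketch how Extract Method addresses a CA1502 violation: a long method that validates, calculates, and formats in one body can be split so each concern lives in its own small method, and each extracted method carries only part of the original complexity. The names here are illustrative:

```csharp
using System;

public static class Invoicing
{
    // After Extract Method, this coordinating method is trivially simple;
    // the complexity that triggered CA1502 is now spread across small,
    // individually understandable methods.
    public static string DescribeInvoice(decimal subtotal, decimal taxRate)
    {
        Validate(subtotal, taxRate);
        decimal total = CalculateTotal(subtotal, taxRate);
        return FormatTotal(total);
    }

    private static void Validate(decimal subtotal, decimal taxRate)
    {
        if (subtotal < 0) throw new ArgumentOutOfRangeException("subtotal");
        if (taxRate < 0) throw new ArgumentOutOfRangeException("taxRate");
    }

    private static decimal CalculateTotal(decimal subtotal, decimal taxRate)
    {
        return subtotal * (1 + taxRate);
    }

    private static string FormatTotal(decimal total)
    {
        return total.ToString("C");
    }
}
```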

Code metrics

Code metrics, although another form of static analysis, are a separate feature of Visual Studio® 2010. Code metrics are more commonly called "software metrics" in the software industry. To a certain extent, some of the rules in Code Analysis are based on specific software metrics (like CA1502 Avoid Excessive Complexity, which is based on the Cyclomatic Complexity software metric). Software metrics are specific definitions of how to measure software in a certain way. These measurements generally deal with quality. Quality in software is measured in efficiency, complexity, understandability, reusability, testability, and maintainability.

Most software metrics are not easily actionable. For example, what is specifically actionable about a Maintainability Index of 55? With the Maintainability Index, the higher the metric the higher the estimated maintainability; but deciding what course of action to take based solely on 55 is nearly impossible. Some metrics may be easily actionable; for example, 1000 lines of code in ClassX may mean ClassX needs to be refactored into multiple classes. We'll look more closely at how you can use code metrics to help focus the refactoring effort in Chapter 3.

Software metrics can be used in conjunction with other observations to decide where refactoring should be prioritized. Given ModuleX and ModuleY, both containing code with known performance issues, if ModuleX has a Maintainability Index of 75 and ModuleY has a Maintainability Index of 50, it may be obvious that ModuleY needs more work and could benefit from refactoring first. But, it's a double-edged sword; you could easily decide that because ModuleX is more maintainable (its Maintainability Index is higher), refactoring efforts should focus on code in ModuleX because you may be able to get a faster return on investment.

Chapters 6 and 7 include details about making use of software metrics to help prioritize refactoring.

 

Summary


Software projects are generally about writing and managing multi-release software with no fixed end-of-life that evolves noticeably over time. This chapter has outlined how systematic refactoring efforts help in the evolution and management of software, keeping it from becoming brittle and keeping it robust in the face of change. Refactoring helps software developers be more amenable to accepting new requirements.

Writing software, although a science, is not always done perfectly. Many priorities affect how software is written. Sometimes, software isn't designed or written to use the best technique. Software that doesn't use the best technique (but is otherwise functional) is considered to have acquired technical debt. An increased debt load can lead to bankruptcy. By focusing code improvements on refactoring to reduce debt, software developers can avoid reaching a debt load that requires a re-write.

Refactoring efforts can have a specific impetus, or they may simply be part of a general effort to improve the maintainability of the software under development. There are many aspects that can help focus the refactoring efforts whether the impetus is specific or not.

In some code bases there may be no specific focus on design, with focus almost solely on functionality. Code bases like this have inconsistent design styles and are hard to read, understand, and modify. By refactoring to specific design principles, code can be modified to improve its consumability, robustness, and changeability.

Refactoring to patterns helps in maintaining software that is easily understandable and consumable by peers in the software industry. Use of patterns reduces the number of concepts consumers of code are required to understand before they can grasp the overall intent of the code.

Refactoring code to remove code smells can help focus and prioritize the refactoring effort. By prioritizing standard smells and creating proprietary smells, maintainability of code can be enhanced focusing on improvements with the highest returns.

There are many aspects of how software is written and designed that are orthogonal to its functionality but directly related to its usability. Prioritizing refactoring efforts based on usability metrics like performance will give end users the same functionality, only faster.

There are many formulas for analyzing source code to gather specific metrics about the static structure of the code. These metrics are generally geared towards detecting attributes that are side effects of problems. These problems are often related to complexity. There are many tools that automatically gather metrics about code. Complex code has proven to be a major aspect contributing to hard-to-maintain software projects. Complex code reduces the ability of software to change to implement new features and increases the risk that these changes cause other problems. By gathering and prioritizing these metrics, code improvement can focus on refactoring complex code.

Focusing on improving the code that will result in the highest return is an attractive option. Prioritizing changes to areas of the code that are highly used or highly depended upon can realize more noticeable benefits from the refactoring effort.

Refactoring is about preventative maintenance. As with any maintenance, a systematic approach to it can make the process more tolerable and in many ways it can make it enjoyable. Put some thought into what and where you want to improve your source code and you can avoid drudgery and focus on the value-added parts of your software.

About the Author

  • Peter Ritchie

    Peter Ritchie is a software development consultant. Peter is president of Peter Ritchie Inc. Software Consulting Co., a software consulting company in Canada's National Capital Region specializing in Windows-based software development management, process, and implementation consulting. Peter has worked with such clients as Mitel, Nortel, Passport Canada, and Innvapost, from mentoring to architecture to implementation. Peter has considerable experience building software development teams and working with startups towards agile software development. Peter's experience ranges from designing and implementing simple standalone applications to architecting distributed n-tier applications spanning dozens of computers, from C++ to C#. Peter is active in the software development community, attending and speaking at various events as well as authoring various works including Refactoring with Microsoft Visual Studio 2010.
