Refactoring Legacy Code In Practice – Iteration 3

After building the Golden Master in the previous iteration we are sure that we wont introduce any bugs during our refactoring phase: we built a safety net that will prevent us form falling into the regression trap.

The next refactoring I’m going to practice is Subclass to test followed by replace inheritance with delegation. As you can see here subclass to test is a powerful technique to break dependencies and test only the method we are interested in refactoring , but it’s also considered by some an anti-pattern because the complexity of the code becomes higher and higher making it difficult to apply refactoring techniques later on.

Testing the questions’ initialization

In this exercise I decided to refactor the askQuestion method of the Game class.

The first step in building the test suite is to create a class that inherits from the Game class in order to be able to reach it’s internals without actually making them public.

private class GameForTests : Game
{
}

To test the askQuestions I can tests that it pops the correct questions from the correct category and to make my life a little easier I decided to spy on the values contained by the questions’ linked lists, so the next step was to promote the linked lists to protected and expose them in the test class.

private class GameForTests : Game
{
    public LinkedList<string> PopQuestions { get { return popQuestions; } }
    public LinkedList<string> RockQuestions { get { return rockQuestions; } }
    public LinkedList<string> ScienceQuestions { get { return scienceQuestions; } }
    public LinkedList<string> SportsQuestions { get { return sportsQuestions; } }
}
public class Game
{
// ...
    protected LinkedList<string> popQuestions = new LinkedList<string>();
    protected LinkedList<string> scienceQuestions = new LinkedList<string>();
    protected LinkedList<string> sportsQuestions = new LinkedList<string>();
    protected LinkedList<string> rockQuestions = new LinkedList<string>();
// ...

And then my first test checks that the questions get correctly initialized in the constructor

[Test]

public void BoardInizializzedCorrectly()
{
    var game = new GameForTests();
    Assert.AreEqual(50, game.PopQuestions.Count);
    Assert.AreEqual(50, game.RockQuestions.Count);
    Assert.AreEqual(50, game.ScienceQuestions.Count);
    Assert.AreEqual(50, game.SportsQuestions.Count);

    for (int i = 0; i < 50; i++)
    {
        Assert.AreEqual("Pop Question " + i, game.PopQuestions.First());
        Assert.AreEqual("Rock Question " + i, game.RockQuestions.First());
        Assert.AreEqual("Science Question " + i, game.ScienceQuestions.First());
        Assert.AreEqual("Sports Question " + i, game.SportsQuestions.First());
        game.PopQuestions.RemoveFirst();
        game.RockQuestions.RemoveFirst();
        game.ScienceQuestions.RemoveFirst();
        game.SportsQuestions.RemoveFirst();
    }
}

Why should I care?

To test the askQuestion method it would have been enough to stub out the lists instead of checking that they get correctly initialized.  I’m doing this extra work because I’m aiming at extracting a class that represents the deck of cards and testing that the lists get initialized correctly will help me testing that the deck class gets correctly initialized too.

Second step: test askQuestion’s behaviour

Looking closer at the method askQuestion I notice that it have  one indirect input in the form of a method call to currentCategory, one indirect output as it writes one question to the console and some side effects when it pops the questions from the linked lists.

protected void askQuestion()
{
    if (currentCategory() == "Pop")
    {
        Console.WriteLine(popQuestions.First());
        popQuestions.RemoveFirst();
    }
    if (currentCategory() == "Science")
    {
        Console.WriteLine(scienceQuestions.First());
        scienceQuestions.RemoveFirst();
    }
    if (currentCategory() == "Sports")
    {
        Console.WriteLine(sportsQuestions.First());
        sportsQuestions.RemoveFirst();
    }
    if (currentCategory() == "Rock")
    {
        Console.WriteLine(rockQuestions.First());
        rockQuestions.RemoveFirst();
    }
}

My goal with the test is to check that for any possible indirect input, the indirect output and the side effect are what I expect them to be: for example if currentCategory returns “Rock”, then the method should write the next rockQuestion to  the console and then pop it out of the list.

To do so I want to control the values returned by currentCategory so, first of all I add a public property to GameForTests that I can set during my tests with the value I want currentCategory to return, then I promote the later to protected virtual in order to override it and make it return the value of the new field:

protected override string currentCategory()
{
    return CategoryStub;
}

public string CategoryStub { get; set; }

Then I build a test that asks one question of each category checking that the question gets popped from the linked list and that it’s sent to the console

[Test]

public void DeckReturnsQuestionForCategory()
{
    // change the output stream of console to test
    // the indirect output
    var stream = new MemoryStream();
    var writer = new StreamWriter(stream);
    var reader = new StreamReader(stream);
    Console.SetOut(writer);

    // Initialize the game
    var game = new GameForTests();

    // stub the category and ask the question
    game.CategoryStub = "Rock";
    game.AskQuestion();
    
    // check that the first question was popped out of the list
    Assert.AreEqual(49, game.RockQuestions.Count);
    Assert.IsFalse(game.RockQuestions.Any(x => x.Equals("Rock Question 0")));

    Assert.AreEqual(50, game.PopQuestions.Count);    
    Assert.AreEqual(50, game.ScienceQuestions.Count);
    Assert.AreEqual(50, game.SportsQuestions.Count);

// same thing for other categories
           
    writer.Flush();
    stream.Position = 0;
    // Check that the indirect output is correct
    Assert.AreEqual("Rock Question 0", reader.ReadLine());


    stream.Close();
}

Preparing to extract the class

Now that we have the method tested through “subclass to test”, we want to extract the tested code as a dependency in order to separate the miss-placed behavior. the next move then is to prepare the code to extract a class clearly isolating the parts that could go in the extracted class from those that need to remain.

I start by moving the `Console.WriteLine` calls out of the condition blocks so that the askQuestion first evaluate a temporary string that then gets passed to the indirect output.

protected void askQuestion()
{
    string question = string.Empty;
    if (currentCategory() == "Pop")
    {
        question = popQuestions.First();
        popQuestions.RemoveFirst();
    }
    if (currentCategory() == "Science")
    {
        question = scienceQuestions.First();
        scienceQuestions.RemoveFirst();
    }
    if (currentCategory() == "Sports")
    {
        question = sportsQuestions.First();
        sportsQuestions.RemoveFirst();
    }
    if (currentCategory() == "Rock")
    {
        question = rockQuestions.First();
        rockQuestions.RemoveFirst();
    }
    Console.WriteLine(question);
}

Then I change the questions fields to readonly properties that expose a private fields;

protected LinkedList<string> popQuestions { get { return _popQuestions; } }
protected LinkedList<string> scienceQuestions { get { return _scienceQuestions; } }
protected LinkedList<string> sportsQuestions { get { return _sportsQuestions; } }
protected LinkedList<string> rockQuestions { get { return _rockQuestions; } }

private readonly LinkedList<string> _popQuestions = new LinkedList<string>();
private readonly LinkedList<string> _scienceQuestions = new LinkedList<string>();
private readonly LinkedList<string> _sportsQuestions = new LinkedList<string>();
private readonly LinkedList<string> _rockQuestions = new LinkedList<string>();

And the body of the constructor becomes a call to a method that initializes the deck.

All my tests still pass so I’m good to go;

Extract the Parent class

I then change the Game class so it inherits form a parent class called, you guessed it, GameParent. The purpose of this move is to move all the logic for asking a question out of the game class so it will depend only on a call to a parent method. I still don’t need to change my test.

public class GameParent
{
    protected LinkedList<string> popQuestions { get { return _popQuestions; } }
    protected LinkedList<string> scienceQuestions { get { return _scienceQuestions; } }
    protected LinkedList<string> sportsQuestions { get { return _sportsQuestions; } }
    protected LinkedList<string> rockQuestions { get { return _rockQuestions; } }

    private readonly LinkedList<string> _popQuestions = new LinkedList<string>();
    private readonly LinkedList<string> _scienceQuestions = new LinkedList<string>();
    private readonly LinkedList<string> _sportsQuestions = new LinkedList<string>();
    private readonly LinkedList<string> _rockQuestions = new LinkedList<string>();

    public GameParent()
    {
        InitDeck();
    }

    private void InitDeck()
    {
        for (int i = 0; i < 50; i++)
        {
            _popQuestions.AddLast("Pop Question " + i);
            _scienceQuestions.AddLast(("Science Question " + i));
            _sportsQuestions.AddLast(("Sports Question " + i));
            _rockQuestions.AddLast(createRockQuestion(i));
        }
    }
    public String createRockQuestion(int index)
    {
        return "Rock Question " + index;
    }
    protected string Question(string category)
    {
        string question = string.Empty;
        if (category == "Pop")
        {
            question = popQuestions.First();
            popQuestions.RemoveFirst();
        }
        if (category == "Science")
        {
            question = scienceQuestions.First();
            scienceQuestions.RemoveFirst();
        }
        if (category == "Sports")
        {
            question = sportsQuestions.First();
            sportsQuestions.RemoveFirst();
        }
        if (category == "Rock")
        {
            question = rockQuestions.First();
            rockQuestions.RemoveFirst();
        }
        return question;
    }

}

Yon may notice that it has a protected method called Question that gets a category and returns the question to be asked: This method doesn’t have any indirect inputs nor it has indirect outputs, those remain in the askQuestion method which is now super simple: it retrieves the category, then gets the question form the parent and then sends it to the console.

protected void askQuestion()
{
    var category = currentCategory();
    var question = Question(category);
    Console.WriteLine(question);
}

Extract the dependency

Now that the code for building a deck and retrieving questions form it is isolated inside the parent class we can apply the Replace inheritance with delegation refactoring to break the parent-child dependency.

Making this change requires to first change some tests in order to have them pointing directly to the parent class. This isn’t a breaking change but the Game class looses it’s coverage so I’m also forced to write a new test to be sure that the behavior of the askQuestion method doesn’t change.

I first rename the parent from GameParent to CardDeck and build a child class to test:

private class CardDeckForTests : CardDeck
{
    public LinkedList<string> PopQuestions { get { return popQuestions; } }
    public LinkedList<string> RockQuestions { get { return rockQuestions; } }
    public LinkedList<string> ScienceQuestions { get { return scienceQuestions; } }
    public LinkedList<string> SportsQuestions { get { return sportsQuestions; } }


    public string AskQuestion(string category)
    {
        return Question(category);
    }

}

then change the test methods to test the new CardDeckForTests.

[Test]

public void DeckInizializzedCorrectly()
{
    var deck = new CardDeckForTests();
    Assert.AreEqual(50, deck.PopQuestions.Count);
    Assert.AreEqual(50, deck.RockQuestions.Count);
    Assert.AreEqual(50, deck.ScienceQuestions.Count);
    Assert.AreEqual(50, deck.SportsQuestions.Count);

    for (int i = 0; i < 50; i++)
    {
        Assert.AreEqual("Pop Question " + i, deck.PopQuestions.First());
        Assert.AreEqual("Rock Question " + i, deck.RockQuestions.First());
        Assert.AreEqual("Science Question " + i, deck.ScienceQuestions.First());
        Assert.AreEqual("Sports Question " + i, deck.SportsQuestions.First());
        deck.PopQuestions.RemoveFirst();
        deck.RockQuestions.RemoveFirst();
        deck.ScienceQuestions.RemoveFirst();
        deck.SportsQuestions.RemoveFirst();
    }
}


[Test]
public void DeckReturnsQuestionForCategory()
{
    var deck = new CardDeckForTests();
            
    Assert.AreEqual("Rock Question 0", deck.AskQuestion("Rock"));
    Assert.AreEqual(50, deck.PopQuestions.Count);
    Assert.AreEqual(49, deck.RockQuestions.Count);
    Assert.AreEqual(50, deck.ScienceQuestions.Count);
    Assert.AreEqual(50, deck.SportsQuestions.Count);

    // Other Categories
}

As you can see this test doesn’t need to redefine the Console’s output stream as the Question method returns a string and doesn’t have indirect outputs. The new test that I have to write checks that, for any given category, the askQuestion in the Game class sends the question to the console like before:

[Test]
public void GameAsksQuestionForCategory()
{
    var stream = new MemoryStream();
    var writer = new StreamWriter(stream);
    var reader = new StreamReader(stream);
    Console.SetOut(writer);

    var game = new GameForTests();
    game.CategoryStub = "Rock";
    game.AskQuestion();
    game.CategoryStub = "Pop";
    game.AskQuestion();
    game.CategoryStub = "Science";
    game.AskQuestion();
    game.CategoryStub = "Sports";
    game.AskQuestion();

    writer.Flush();
    stream.Position = 0;
    Assert.AreEqual("Rock Question 0", reader.ReadLine());
    Assert.AreEqual("Pop Question 0", reader.ReadLine());
    Assert.AreEqual("Science Question 0", reader.ReadLine());
    Assert.AreEqual("Sports Question 0", reader.ReadLine());

    stream.Close();
}

now that I’m happy with the status of my tests I’m ready to break the inheritance. Now the game class doesn’t inherits from anyone and it has a CardDeck as a private field that gets assigned in the constructor and the ask Question simply calls the Question method of the CardDeck field and it has no clue of how the questions are stored internally in the CardDeck class.

public class Game
{
    private readonly CardDeck _deck;
    public Game(CardDeck deck)
    {
        _deck = deck;
    }
    // ...
    protected void askQuestion()
    {
        var category = currentCategory();
        var question = _deck.Question(category);
        Console.WriteLine(question);
    }
    // ...
}1

Conclusion

Even if a little bit longer than just asking to ReSharper to extract a class, proceeding with this refactoring gave me the opportunity to make every step in a save and clean way without suddenly breaking the test and having to fix them later. The only time the test got broken was when I needed to change the structure of the code so heavily that I decided to change the test to reflect the behavior that I’m expecting from the change.

What struck me was that at every change it was crystal clear to me what was the next step to be taken in order to improve the code. This doesn’t happen a lot when refactoring code at work when you have less time and have to take shortcuts.

If you liked this article and want more Subscribe here and you wont miss any article.

Refactoring Legacy Code In Practice – Iteration 2 – Building a Golden Master

Building a test suite for legacy code can be daunting, so most of us usually approach legacy code in two different ways:

  1. If it ain’t broke don’t fix it
  2. Refactor without automated tests and hope for the best

But there is a third option that can give us the safety net provided by a good test suite and isn’t as slow as manually building a characterization test: using the Golden Master Technique to build the characterization tests.

Iteration Objective

In the earlier post I discussed that the first iteration of the legacy code retreat in which you have to explore the code and understand it’s purpose. As you may have guessed, the purpose of the second iteration is to build a test suite using the Golden Master technique. It’s a smaller iteration as it should last just 20 minutes, but I ran into some issues and it took me longer to figure out how to build the tests, probably because I’m doing these exercises alone and not pair programming as I should. Anyway, here is the procedure I followed to build the Golden Master:

  1. Find all possible inputs, both direct and indirect
  2. Find all possible outputs, both direct and indirect
  3. Find a way to record every output for any given input
  4. Play all inputs and verify that the output correspond to what has been previously recorded.

If you want to follow along you can find the code here.

Find all the possible inputs

All inputs to the Game object are produced by the System.Random library so we have two ways to approach this. We can either record all the values generated by the Random object or we can find a way to have every time the same sequence of random numbers generated by it.  The former approach has two disadvantages:

  1.  Too many inputs to log
  2. I have to exclude the call to rand.GetNext() when testing

These two aren’t really huge impediments, but there is a way to solve the problem with fewer changes to the code: seeding the random number generator. How does this work? the System.Random class is a pseudo-random number generator and one of its properties is that given a seed, the sequence of numbers returned is deterministic so, if during our tests we seed the generator with the same number used while recording the data we can be sure that the random object will return the same sequence and thus the roll() method will be called with exactly the same values.

// GoldenMaster.cs
// The call to GameRunner
for (int i = 0; i < 100; i++)
{
    args[1] = i.ToString();
    GameRunner.Main(args);
}

// GameRunner.cs
// the 
var seed = int.Parse(args[1]);
//...
var rand = new Random(seed);

The for loop in the code above is where I build the input sequence calling the Main function 100 times with different seeds for the random generator, in this way I have to log just the seed and not every number generated.

Find all possible direct or indirect outputs

To be able to verify that for any given input the program creates the same output I need to record all the outputs of the system. Luckily, the program doesn’t do any call to external libraries and doesn’t write anything to a database or hard drive, it just prints strings to the console, it has just one type of output: Console.Writeline.

Find a way to record every output for any given input

Knowing that all possible outputs are just calls to Console.Writeline simplifies a lot the step of recording the outputs for every given input. As we know Console.Writeline. is a static method and static methods are hard to stub, normally when you encounter a static method that you want to stub you wrap it in an isolated class or in a protected method of the same class and the you stub it in your tests. Isolating all calls to Console.WriteLine can lead you to introduce some bug in the program if you don’t have any test to check what you are doing.

Luckily there is a better way to stub the call to Console.WriteLine and it is to use to change the default output stream so it would write to a log file instead of printing to the console:

var file = File.CreateText("gold.txt");
Console.SetOut(file);

// call to GameRunner.Main

file.Flush();
file.Close();

Play all inputs and check the results

After we changed the output to write to a text file, running our test will record all the inputs and outputs of 100 runs of the game and the resulting file will look like this:

**********
0
----------
Chet was added
They are player number 1
Pat was added
They are player number 2
Sue was added
They are player number 3
Chet is the current player
They have rolled a 4
Chet's new location is 4
The category is Pop
Pop Question 0
Question was incorrectly answered
Chet was sent to the penalty box
Pat is the current player
They have rolled a 4
Pat's new location is 4
The category is Pop
Pop Question 1
Answer was corrent!!!!

....

The category is Science
Science Question 4
Answer was corrent!!!!
Sue now has 6 Gold Coins.
__________

You can notice that in the beginning a game log start with a series of asterisks, then our seed for the random number generator then a series of dashes and then it starts recording all the outputs of the game. When the game ends I close the block with a series of underscores (which are probably useless but, who cares at this point?)

Testing the recorded data

After building our golden master we have to build a test suite that uses the recorded data to test that the system correctly behaves. The test built during this step is going to help us refactor the code giving us the safety net that ensures us that we didn’t break the current behavior while refactoring: we have a good amount of data representing what the behavior of the system should be and at every run, if the tests pass we are sure that the software is maintaining such behavior.

Like while building the golden master, during this test we want to access all the outputs for every possible inputs so we change console’s output stream, this times we don’t need to log anything so we just use a memory stream.

var stream = new MemoryStream();
var writer = new StreamWriter(stream);
Console.SetOut(writer);

Then we read the first input from the log and we execute the program

GameRunner.Main(new[] { "false", currentSeed });

Later we iterate through the log entries checking that the output recorded in the memory stream equals the one recorded in the file.

writer.Flush();
stream.Position = 0;

var reader = new StreamReader(stream);
    
foreach (var line in file)
{
// ...
    Assert.AreEqual(line, reader.ReadLine(), "Seed " + currentSeed);
// ...
}

We iterate through the whole file and if we have no errors the creation of the Golden Master was successful.

The next step

In the next iteration I’m going to have some real fun as I can finally refactor with a little more confidence. I think I’m going to practice Subclass to Test as suggested in this blog post;

In the meantime you can download the source here.

And if you don’t want to miss the next post SUBSCRIBE or a kitten will die somewhere.

Refactoring Legacy Code In Practice – Iteration 1

If you have been comdamned to work on old, ugly, unmantained and untested code, then you probably feel the urge from time to time to clean up a little bit and make your life easier. Let’s face it, cleaning up some mess in the code is always fun, until it’s not: until you have to work in the night to fix some bug you introduced in a core feature just because you wanted to clean up the code.

You know there’s a huge risk in every change you make, but you want to work with better code, so you have two options: you either clean it or you leave your job! Leaving a job is slow and painful and you have no guaranties that you are going to land a job which offers better code to work on, so your only choice is to clean up.

But, where to start? How can you minimize the risks? well, we have written a lot about how to handle legacy code (have a look at our Blog series page) and there are some great books that you can read (go to our Resources page). But what if you want to practice and experiment in a safe environment without the risk of breaking your production code? With some luck you can find some User Group near to you that hosts a Legacy Code Retreat but if you are nowhere near a developers meetup like I am you can always find some ugly source code and try to practice by yourself the same excercices they do in the code retreats. And what better code can you use for this purpose than  one that was specifically designed to be legacy: just go here, download it and start refactoring.

That’s exactly what I did and I wanted to share the process with you in this series of post so you can compare your approach to that of a fellow programmer.

First iteration

Like every legacy project that’s worth its name, there is not enough information about the code you are going to touch. So the first iteration is about discovering how bad its state is and what’t the whole purpose of the code. If you had time you could try to apply some refactoring techniques trying to minimize the risk of adding bugs. After 45 minutes you had to stop and delete everything you have done so far.

After some analysis I discovered that this game is basically a simplified implementation of the Trivial Pursuit. It’s formed by two classes the Game class that has all the business logic of the game and the GameRunner that it’s basically the main entry point of the program. For some obscure reason the game loop is located in the GameRunner.

After starting the game, the game runner adds three players to the game and then starts the game loop that can be described by these steps:

  1. a player rolls a standard 6 faces dice
  2. based on the result and the player’s current status the player moves or remains in the same spot
  3. A question is asked to the player which responds to the question with 1/9 percent change of getting it wrong
  4. the game continues until there is a winner: the first one to reach 6 points.

I still haven’t been able to understand how does the player gains it’s points.

The code

Exploring the code gives you headache: you cannot understand if it was so badly crafted written for the purpose of the exercises or if JB asked a fifteen years old kid who is trying to learn to program to write it. I have tried to build examples of bad code and was never able to create something so beautifully ugly!

By the way let’s have a look at the code:

The game runner starts instantiating a Game object and then it calls and add method with something that looks like player names.

Game aGame = new Game();

aGame.add("Chet");
aGame.add("Pat");
aGame.add("Sue");

The method add in fact adds the player name to a list of strings called players then, it looks like it places the player in the first box with zero points, then it returns true (why returning true? what does it means?).

public bool add(String playerName)
{
    players.Add(playerName);
    places[howManyPlayers()] = 0;
    purses[howManyPlayers()] = 0;
    inPenaltyBox[howManyPlayers()] = false;

    Console.WriteLine(playerName + " was added");
    Console.WriteLine("They are player number " + players.Count);
    return true;
}

Back to the GameRunner class. In the game loop a random number between 1 and 6 is picked and then this number is passed to the roll method. Here there is some duplication and some business logic difficult understand, for example: if inPenaltyBox for the current player is true and the random number (roll) is not even (why on earth check that something is not even? wasn’t it the same checking that it was odd?) a variable called isGettingOutOfPenaltyBox becomes true.

if (inPenaltyBox[currentPlayer])
{
    if (roll % 2 != 0)
    {
        isGettingOutOfPenaltyBox = true;

        Console.WriteLine(players[currentPlayer] + " is getting out of the penalty box");
        places[currentPlayer] = places[currentPlayer] + roll;
        if (places[currentPlayer] > 11) places[currentPlayer] = places[currentPlayer] - 12;

        Console.WriteLine(players[currentPlayer]
                + "'s new location is "
                + places[currentPlayer]);
        Console.WriteLine("The category is " + currentCategory());
        askQuestion();
    }
    else
    {
        Console.WriteLine(players[currentPlayer] + " is not getting out of the penalty box");
        isGettingOutOfPenaltyBox = false;
    }

The askQuestion method checks which category is the player in and pops a question from the question from the queue (smells like bug… what happens if you run out of questions of a certain category? a you get a nice little exception!).

if (currentCategory() == "Pop")
{
    Console.WriteLine(popQuestions.First());
    popQuestions.RemoveFirst();
}

The currentCategory method looks like a really malformed array of strings, but you cannot be sure until you know know how many places there are in the game and which one are actually of “Rock” category, are there some bonus places not considered in this method?

private String currentCategory()
{
    if (places[currentPlayer] == 0) return "Pop";
    if (places[currentPlayer] == 4) return "Pop";
    if (places[currentPlayer] == 8) return "Pop";
    if (places[currentPlayer] == 1) return "Science";
    if (places[currentPlayer] == 5) return "Science";
    if (places[currentPlayer] == 9) return "Science";
    if (places[currentPlayer] == 2) return "Sports";
    if (places[currentPlayer] == 6) return "Sports";
    if (places[currentPlayer] == 10) return "Sports";
    return "Rock";
}

 Refactoring

The first thing I would like to address is the players management, there should be a Player class that handles most of the actions on the player, like it’s points or if it’s in a penalty round. Another improvement could be to isolate is the Board itself in a class that knows places and categories for each one. If you then extract a set of classes that represent the deck of cards and move the game loop inside the game class.

The problem is that these entities are spread across multiple methods and extracting requires changing a lot of places, so it would be better to first have a comprehensive suite of automated tests.

Simple things first

There’s not enough time left so I look for the simplest method to start doing some real refactoring.  I don’t really like currentCategory, there’s a lot of  duplicated code, and a ton of ifs.

private String currentCategory()
{
    if (places[currentPlayer] == 0) return "Pop";
    if (places[currentPlayer] == 4) return "Pop";
    if (places[currentPlayer] == 8) return "Pop";
    if (places[currentPlayer] == 1) return "Science";
    if (places[currentPlayer] == 5) return "Science";
    if (places[currentPlayer] == 9) return "Science";
    if (places[currentPlayer] == 2) return "Sports";
    if (places[currentPlayer] == 6) return "Sports";
    if (places[currentPlayer] == 10) return "Sports";
    return "Rock";
}

The first thing to do is to build the test fixture and to do so I apply subclass to test and promote the method from “private” to “protected virtual” to be able to access it from a child class.

protected string CurrentCategory()

Then I can subclass the Game class and expose the protected method with a public one so I can call it in my tests and the I can start building the test suite.

private class GameForTest : UglyTrivia.Game
{
    public string PublicCurrentCategory()
    {
        return CurrentCategory();
    }

    public int Place
    {
        set { Places[0] = value; }
    }
}

Using subclass to test in often considered an anti-pattern for various reasons one been that with it you test the internal of a class instead of shifting to a higher level of abstraction but in this context is permitted because we just want to preserve the functionality while changing the code and we don’t care if we have to get rid of the test later on.

Building the tests

After having exposed CurrentCategory as a public method we can stub also the places array in order to have fine control over which place a given player is in (in our case player 0). In our case this gives us some advantages like not having roll the dice to move the player arround, just set it’s position do the desired value:

[TestCase(0)]
[TestCase(4)]
[TestCase(8)]
public void CategoryPop(int place)
{
    // we need to set differnt values for places[currentPlayer]
    var game = new GameForTest {Place = place};

    var currentCategory = game.PublicCurrentCategory();
    // I expect it to be

    Assert.AreEqual("Pop", currentCategory);
}

[TestCase(1)]
[TestCase(5)]
[TestCase(9)]
public void CategoryScience(int place)
{
    var game = new GameForTest {Place = place};
    var currentCategory = game.PublicCurrentCategory();
    Assert.AreEqual("Science", currentCategory);
}

[TestCase(2)]
[TestCase(6)]
[TestCase(10)]
public void CategorySports(int place)
{
    var game = new GameForTest {Place = place};
    var currentCategory = game.PublicCurrentCategory();
    Assert.AreEqual("Sports", currentCategory);
}

[TestCase(3)]
[TestCase(7)]
[TestCase(11)]
public void CategoryRock(int place)
{
    var game = new GameForTest {Place = place};
    var currentCategory = game.PublicCurrentCategory();
    Assert.AreEqual("Rock", currentCategory);
}

I wrote four simple parametrised tests, one for each category, and the test’s parameter is the place of which I want to check it’s category. As you can see the tests assert that there are 12 positions and that positions 3, 7 and 11 are Rock category: this is an assumption based on what I’ve learned inspecting the code.

Ok, now that the method is covered I’m free to try different things. First of all I substituted the ifs with a much simpler switch statement, but the conditionals are still there. The tests pass. Then I replaced the switch with an array of strings and promoted it to become a private field of the Game class. The tests still pass. With one more step I convert the array of strings in an array of enum because I don’t like magic numbers; The method still have to return a string because if i change the return I’ll have to make changes in a lot of different places in the code. Again, the tests still pass.

private readonly Category[] _categories = {
        Category.Pop,
        Category.Science,
        Category.Sports,
        Category.Rock,
        Category.Pop,
        Category.Science,
        Category.Sports,
        Category.Rock,
        Category.Pop,
        Category.Science,
        Category.Sports,
        Category.Rock
    };

protected string CurrentCategory()
{
    var place = Places[_currentPlayer];
    return _categories[place].ToString();
}

End of the iteration

I ran out of time after this refactoring, but there is no doubt that my understanding of the code has increased a lot and that the little method I chose to refactor is a lot easier to understand now that there are no conditionals.

If you want to practice your refactoring muscles download the project at JB’s repository, or you can download it form mine where I divided every iteration in a different branch so you can explore what I did.

In the next iteration I’m going characterize the behavior of the Game class building a Golden Master, subscribe if you don’t want to miss it!

7 Steps To Code That Doesn’t Make You Scream

Recently I’ve been maintaining a project with poorly written code that, to make it worst, is the core to most of the activities of its users. The project follows the big ball of mud pattern, almost every method is a huge static method that sometimes take even 20 parameters and that uses the session as a global variable container.

When I started working with this thing I decided to read Working Effectively With Legacy Code by Michael Feathers (which I HIGHLY recommend if you have been working with bad code lately) and it gave me a very good guidance on how to tackle this beast. Thanks to the suggestions on the book I’ve developed a workflow that have given me a way to clean the code one step at a time and reduce the huge methods into more manageable classes, without breaking too many things.

I may need to read again the book to freshen up the ideas and to tweak my refactoring process but, for now, this is how I’m curing this patient.

1. Isolate the Megamoth

As Uncle bob suggests, a huge method oftentimes is just a class in disguise, the idea being that classes should do one thing, and mega methods normally do one or more things.

The first step then is to extract the megamoth to a class with one public function to have all the code in one smaller and more manageable file. Obviously, I have to give a name to this new class but, at this point of the refactoring phase, it doesn’t really matters if the name is ugly, long or not really descriptive: it’s just a placeholder. Of course, I try my best to come up with a good name, but if I don’t understand the class really well I use the name of the method it was extracted from, maybe rewording it a little, just to get things going, I’ll rename it later when I have a better understanding of the purpose of the class.

Another point to notice is that the constructor will not have any parameters: all parameters are going to be passed to the only public method of the class

2. Group together functionality as protected methods

The second step I do is to extract protected methods from the main method in order to group together pieces of functionality. The reason to do so is to have a clear view of what the public method does.

How do I decide on how to group functionality?

  1. first, I look for uses of the keyword region in the code. The previous developers used regions, instead of method extraction, to group functionality inside a method.
  2. second I look for the comments that were used to try to explain what the code does
  3. finally I use try, for and if blocks.

This procedure also helps me when naming the extracted methods. Regions already have a name that explains what they are grouping.

Comments too help me understand the intent of the block of code, or the reason why it’s there, or the subject and date of the email that requested the change… or the name of the guy that made the request…. ok, comments tend to be pretty dumb but at least, sometimes, they give me some help in coming up with a name.

At this point the method signature will have a huge parameter list with both output and reference parameters. I don’t care,  this parameters will be helpful later, so keep on reading.

Finally all method are extracted as protected virtual to be able to override them because of reasons…

3. Isolate dependencies

Once again I extract protected virtual methods but this time I extract just one or two lines of code wrapping calls to methods exposed by external libraries. The end goal of this step is to isolate all the code over which I don’t have any control that hurts the testability of the code.

If I’m very lucky the dependency wont be a call to a static method  but a call to a method of an actual object. In this case, I extract the object and inject it in the constructor. If I’m very very lucky, the class implements an interface, whenever this happens I celebrate with a glass of whisky when I get back home… sadly I haven’t drank any whisky lately.

4. Start building a test harness

Now that I have divided all the code into manageable sizes I start building a test harness for the whole class. By having covered the class with tests I’m able to refactor it later knowing that I didn’t introduce any regression.

I start by extending the class I’m going to test and exposing the method that I’m willing to test with a public version of it. Then I override all of its dependencies and use this overridden methods as stubs or spies depending on the need. I build the characterization tests for the chosen method by forcing the execution of all its branches.

A mocking framework like Moq is really helpful at this stage

5. Improving the methods

After I covered a good part of the class and am confident of not breaking functionality I start cleaning it up a little.

First of all I try to remove all duplication by, you guessed it, extracting private methods and using them in all the parts of the code that were clearly built copy pasting wildly.

Secondly, I start renaming variables and method name to explain better the intent of the code, now that I have built a test harness I also have a better understanding of what the code does and thus am able to come up with better names.

Then I try to improve the readability and the structure of the code by limiting its level of indentation and using LINQ whenever I feel it’s more readable this way.

Output parameters are grouped into DTOs that become the return type of the method.

Internal classes

One final step is to group together methods that have a lot of input parameters in common. If two methods share most of their parameters this means that they share the same state and they belong in a class that encloses this state.

This step has the added benefit of allowing you to delete the parameters from the methods signature and promote them to public fields.

6. Removing Globals

Using global variables has been considered harmful for a long time now, but you can still find some legacy code that uses them and this is certainly the case for the codebase I’m currently maintaining.

Every time the extracted class uses a global variable I extract it so that the refactored code doesn’t directly depends on it. If the new class only uses the value held by the global variable I create a public field in the class and assign this value in the caller method after the class gets initialized.

If the refactored class sets the value of the global variable, i tend to wrap the global variable inside a class whose purpose is to manage crud operations on it and then I pass this wrapper to the extracted class. In this way the dependency on the global variable is hidden and managed in a higher level of abstraction and eventually, over time, I’m going to be able to get rid completely of the them.

7. Renaming everything and commenting

Now that I have a better understanding of the extracted class I can rename the methods and the classes accordingly.

During this phase I also add the documentation comments to help future maintainers of this code base.

Working with really huge methods

Sometimes I find myself dealing with a really big method that is so huge that just one small section of it could become a class all by itself. Or maybe worst: a single method that have the same 200 lines of code repeated over and over again.

In this cases I tend to extract classes from just this pieces of functionality and apply to then all the steps above. This give me the advantage of a more focused and faster refactoring, meaning that I can refactor smaller pieces of code more often maintaining a good rate between implementing new stuff and cleaning the old one.

Other quick wins

If I don’t have time to invest in a complex refactoring like the one I described earlier, I try, anyways, to make little improvements to the code by checking in cleaner code that the one I checked out. Some simple refactoring are:

  • renaming variables and methods to clarify its behavior
  • deleting commented out code

Wrapping-up

Working with messy code can be difficult, but if you were condemned by your manager to maintain a legacy application you have nothing to do but to bite the bullet and try you best to simplify your future work. The steps described above can help you to cut the complexity of the system safely but if you have no time at all do a complete refactoring of a method you could still do the first steps and leave the tests and more aggressive refactoring to another moment when you’ll have more time.

The important thing to remember is that all new code should be tested and that it’s important to clean the code constantly and, if necessary, without asking permission.

If you share the pain of working with megamoths and want to get tips on how to clean up your code don’t lose a second more and subscribe to our mailing list!

How Not To Get Fired For A Refactoring

Friday morning.

I get to the office some minutes before 9AM to check the e-mail before starting what I thought was going to be a very productive and very relaxing Friday. The release went live that night but for the first time in months all the check-ins were done weeks in advance to give the testers all the time they needed. No bugs were filed during the testing phase. The day was going to be a fun day.

I was wrong

At 9:08  a super-high manager enters the office and ask what was happening in  production. He looks alarmed. His face says: “We are losing thousands of euros every minute”.

My boss asks me if we changed anything in the functionality that wasn’t working and I replied that we just fixed a minor bug and that the change couldn’t possibly be the source of the error. But as I stepped through the check-in history my heart stopped for a moment. I did not JUST fixed a minor bug, I also cleaned up the function a little bit.

That refactoring could be the cause of thousands of euros lost.

What I did

I had this piece of functionality that was super ugly, it was inside a monolithic method and I was having difficulty understanding what it did. First of all I did an Extract Method refactoring to separate the ugliness and then I started analyzing it. It looked something like the code below, just with some more comments and conditions. I simplified it a little for this post.

private List<BoxItem> UglyOrderBy(List<CompleteBox> boxes)
{
    var result = new List<BoxItem>();
    // Put first those with FlgDefault equals to "S"
    foreach (var x in boxes.Where(cas => cas.FlgDefault == "S"))
    {
        result.Add(new BoxItem(x.Description, x.Code, x.SubCode));
    }
    foreach (var x in boxes.Where(cas => cas.FlgDefault == "N"))
    {
        result.Add(new BoxItem(x.Description, x.Code, x.SubCode));
    }
    return result;
}

Basically, the two for are there to order the list based on the value of FlgDefault (I know it can only have two values: “S” and “N”). So I decided to change it using Linq to improve its readability:

private static List<BoxItem> NiceOrderBy(List<CompleteBox> boxes)
{
    return 
        boxes.
        OrderByDescending(cas => cas.FlgDefault == "S").
        Select(x => new BoxItem(x.Description, x.Code, x.SubCode)).ToList();
}

The refactoring was pretty straight forward and I also run in debug mode to check that the behavior was correct but now I wasn’t sure of having tested every possible combination.

It turns out that neither QA wasn’t able to test thoroughly, and now that we had this bug in production I was trying to understand whether it was my fault or not.

Luckily, after some debugging I was able to verify that the refactoring was correct after all, and that the bug was introduced be a change in a different part of the application (global variables were involved). But the main problem (the one that almost made me faint) was that I wasn’t sure that my simple refactoring didn’t change the behavior of the system.

I could have avoided all this

All that sweat and that auto-accusation, all that “I’m going to get fired because of this” could have been avoided simply by doing some categorization tests. It would have taken me probably an hour to cover that piece of functionality but, after that, I would have been able to refactor safely and not think about it anymore: I would have never thought that the bug was my fault.

Not only, the code coverage would have increased and with it the overall confidence in the code base and, believe me, that code base could use some confidence.

Final thoughts

The problem with working in a legacy system is that there’s never time to improve it, you spend a lot of time trying to add functionalities and fixes without breaking everything and there’s never time to refactor properly.

After this experience, though, I promised myself to never change again the implementation on even the most simple functionality without first writing a test harness that will make sure that the current behavior of the code doesn’t change.

I would love to know your horror experiences and what you did after

To know more about our adventures in legacy code and to get tips on how to clean up your code subscribe to our mailing list!

RULE 1: All new code is tested code

Picture this, you are working on a brownfield project (hard to picture, right? Is there anything else anymore?). You hate it. It’s a mess, and you would like to clean it, but… There’s always a but. You are afraid, you are afraid to change anything, beacuse the last 10 times you changed something, something else broke.

There are not test to save you, and I’m not talking only about those pretty automated unit tests you read about on the web, I’m talking about anything, even manual tests could help here, and the document with the users requirements? Well that hasn’t been updated in 2 years, so it’s useless. Even if you wanted to test the whole system at every change, there is no way to be sure if it is behaving as expected.

There is one rule you need to know when you work in an environment like this. Is the only rule that can make a dent in this kind of software. It isn’t simple, it isn’t fast. But if you follow this rule, with time you will find yourself working in a more pleasant project.

All new code WILL be tested by automated tests

It’s hard to admit it to yourself, but that mess I was talking about, well… we wrote that mess. To complete a deadline, to appease a client, to get home in time. We didn’t write tests, we didn’t update the documentation, we didn’t refactor once we had “working” code.

Now you will tell me that no, this isn’t your fault, you didn’t write this mess. You just started working on it a week ago, once the main developer changed jobs. Well, but you wrote a mess in your previous job and somebody else is maintaining it. KARMA!

It doesn’t matter if you wrote this particular instance of mess, what matters is that you are responsible of it, you are maintaining it, and you will be doing it for the foreseeable future. Now might be acceptable to say “This is a mess, the previous developer was a noob and made a big pile of mud, instead of a software”, but in 6 months, 1 year or 2 years, when you will still be working on it, there’s no way you can say that. And if you do your colleagues should ask you “And… what have you done to improve it?”.

“Nothing” is not an answer.

You had time.

Time is all you need, and time is what you have. Start small. Every little change to the software is new software, it is your little greenfield project. Treat it like that. Work on a new class, write tests for that class, refactor that new beautiful (little) class, and then go back to the mudball and make a one line change calling your new wonderful perfect class.

OK, at first it probably won’t be the most beautiful piece of software ever written, but, it will be tested, you will be able to refactor it, you will be able to improve it, and keep working on it. You won’t be afraid of it, and nobody should.

The beauty of this is that little by little you will carve your way out of the mudball. More and more parts of it will be sane, will be tested. And with proper refactoring techniques you will be able to isolate and test also some of that horrible code. And once it is covered, you can refactor it.

I’m sorry for you. I’m sorry that I can’t sell you the magic silver bullet that will fix all your problems in a week (in that case I could be rich), but this is the way, this is the only way to improve broken software, start small, keep going and sooner or later (OK later), you will be working on good software again.

Just keep in mind:

All new code WILL be tested by automated tests

 

What is legacy code?

vintage-cars-336674_640

Legacy is a term commonly used with a positive meaning, if I’m talking about property law I may have inherited a wonderful house, or a big pile of money. In Italy the legacy of the roman empire are really beautiful buildings. But at work when I say I work on legacy code, everyone knows why I’m sad.

Nobody expect me to mean that I’m working on the Colosseum of code. A beautiful, well crafted piece of art, a bit aged, but wonderful to look at. No, they know I’m probably working on a big pile of mud. With no clear design, no documentation, and no tests.

So what is legacy code?

There are many definitions of it. It depends on who you ask, but in my opinion there are two main factor that indicate that you are working on legacy code:

  1. It is code that is in production
  2. It is code that you are afraid to change

If this piece of code is not in production, nobody uses it, it can’t be called legacy code, it’s just old code, retired. Nobody cares about it. One important thing to remember is that legacy code is code that is working, and the users need it to continue working, either because they need it directly or because some other system depends on it.

The second point is the reason that we hate legacy code. We hate it because we are scared of it. This code is fragile, is complicated, is a mess. Legacy code is code that resists change, it takes an extraordinary effort to change something small and simple. This is because we probably didn’t write it, so we don’t know all its little secrets, or we forgot most of them. A small change in a component might break one or two other components that shouldn’t depend on each other.

Legacy code is code without tests

Michael Feathers

This quote comes up a lot when talking about legacy code, and many people criticize it. They said essentially that it isn’t true, you can have code that is not legacy without unit test.

What I think they are missing is that given enough time, code without unit tests will become legacy code (if not retired). The amount of work required to keep a testless codebase sane, is orders of magnitude higher than a properly unit tested codebase.

That’s why you can say that code without unit test is legacy code. Because maybe not today, maybe not tomorrow but quite soon it will become legacy code. Sooner or later the business won’t keep pouring money at it, developers will have just enough time to make quick changes and the code will rot.