30 Sep 2008

Using NDepend to help guide Refactoring

In my other blogs entries I mention that I have been looking into building a Team Foundation Server Data Warehouse Adapter.  After I got my initial proof of concept version working, I started to extend it to get a list of available builds from TFS and then import the output of Source Monitor – a popular free code metrics tool.

Now I must state that I am not a natural .NET coder. My normal day-to-day work is based around the SQL Server technology stack. I do have some .NET coding skills but generally just enough for what is need in and around SQL Server. So I asked my colleague, Howard van Rooijen who is one of our top .NET coders, to have a look at my code and give me some hints.

This provide me with a great opportunity to learn from Howard about some more advance Design Patterns,  Visual Studio tips and tricks, and how to effectively use some tools I’ve not used before to help me write better code. The first thing Howard did was install the following on my machine:
  • ReSharper – to help me refactor the code, fortunately we have just rolled out ReSharper licenses to all .NET Developers!
  • StyleCop – to help remind me of the Microsoft C# Coding Conventions
  • TDD.NET  to ease the pain of running my unit tests
  • NDepend – a code metrics tool
Howard wanted to integrate NDepend into my TFS Data Warehouse Adapter and suggested that the best way to understand the tool would be to use it to analyse my TFS Adapter code and see if it would be able to give the direction on where to focus our refactoring efforts.   

NDepend  comes with 82 code metrics out of the box, but it’s far more than a standard code metrics tool. It comes with a bundle of best practice design rules which are written using Code Query Language – an idea so simple and elegant that I bet the FxCop team are kicking themselves for not thinking of it first. CQL allows you to write statements like:
// Methods poorly commented
WARN IF Count > 0 IN SELECT METHODS WHERE PercentageComment < 20 AND NbLinesOfCode > 15  ORDER BY PercentageComment ASC

You can even embed CQL directly into your code via attributes. NDepend also has features like being able to compare different builds (Patrick Smacchia, NDepend’s developer demonstrated this feature on the recent .NET Framework 3.5 SP1 release), manage complexity, dependencies and monitor the health of your build process.

We started by running my code through NDepend and the initial results were that my code has high instability and not at all abstracted.  I couldn’t really affect the instability as my code was just a thin wrapper around the TFS API.  It also highlight that some of my old best practices that needed updating as many of them are now redundant with C# 2.0 features (generics, static classes etc). It also highlighted performance gains I could make.

Howard said that he was having some difficulties understanding what the intention of my code was, as he was unfamiliar with building Data Warehouse Adapters. He asked me if I could write some unit tests that would call the code and verify the output – so he could use the “Test with... Debugger” feature of TDD.NET to see what was going on with the code.

Once I had written some tests, Howard wrote two more: one that passed in a null value and another that passed in an empty string; these, he explained were two valid edge cases for handling bad data and to my surprise both tests failed! Once I fixed both failing tests, he asked me to run the whole test suite again and this time I was really surprised as all of my previously passing tests now failed! The issue was with two helper methods that convert string based representations of number within the Data Warehouse into valid number primitives.

NDepend pointed out lots of issues with these methods and Howard said that a simple Regular Expression could replace a lot of the parsing logic I had written, so I set about refactoring the code.

   1:  public static string CleanInt(string number) 
   2:  { 
   4:      string result = string.Empty; 
   5:      bool foundNonDigit = false; 
   6:      int index = 0; 
   8:      if (number == string.Empty) 
   9:      { 
  10:          return "0";
  11:      } 
  13:      while (!foundNonDigit) 
  14:      { 
  15:          if (char.IsDigit(number[index])) 
  16:          { 
  17:              result = string.Format("{0}{1}", result, number[index]); 
  18:              index++; 
  19:          } 
  20:          else 
  21:          { 
  22:              foundNonDigit = result.Length > 0; 
  23:          } 
  24:      } 
  25:      return result; 
  26:  } 
  28:  public static string CleanFloat(string number) 
  29:  { 
  30:      string result = string.Empty; 
  31:      bool foundNonDigit = false; 
  32:      bool foundDecmialPoint = false; 
  33:      int index = 0; 
  35:      if (number == string.Empty) 
  36:      { 
  37:          return "0"; 
  38:      } 
  39:      while (!foundNonDigit) 
  40:      { 
  41:          if (char.IsDigit(number[index]) || (number[index] == '.' && !foundDecmialPoint)) 
  42:          { 
  43:              result = string.Format("{0}{1}", result, number[index]); 
  44:              foundDecmialPoint = number[index] == '.'; 
  45:              index++; 
  46:          } 
  47:          else 
  48:          { 
  49:              foundNonDigit = result.Length > 0; 
  50:          } 
  51:      } 
  52:      return result; 
  53:  }

   1:  public static string CleanInt(string number) 
   2:  { 
   3:      int resultFromTryParse; 
   4:      // check for empty string after cleaning string 
   5:      if (number.Trim() == string.Empty) 
   6:      { 
   7:          return "0"; 
   8:      } 
   9:      // if try parse works return the number in string format 
  10:      if (int.TryParse(number, out resultFromTryParse)) 
  11:      { 
  12:          return resultFromTryParse.ToString(); 
  13:      } 
  15:      // else return results of clean GetCleanNumberString 
  16:      return GetCleanNumberString(number, "Integers"); 
  17:  } 
  19:  public static string CleanFloat(string number) 
  20:  { 
  21:      float resultFromTryParse; 
  22:      // clean the number and check to see if it's empty 
  23:      number = number.Trim(); 
  24:      if (number == string.Empty) 
  25:      { 
  26:          return "0"; 
  27:      } 
  28:      // try parse the number if works the return the number 
  29:      if (float.TryParse(number, out resultFromTryParse)) 
  30:      { 
  31:          return number; 
  32:      } 
  33:      // returns result of Get clean number 
  34:      return GetCleanNumberString(number, "Floats"); 
  35:  } 
  37:  private static string GetCleanNumberString(string number, string numberType) 
  38:  { 
  39:      // Create a regualr patten with name groups do a macth on the number entered 
  40:      Regex regex = new Regex(@"(?<Floats>(?<Integers>\d+)+\.\d+)|(?<Integers>\d+)", RegexOptions.Singleline); 
  41:      Match match = regex.Match(number); 
  42:      string results = string.Empty;  
  44:      // if match groups are found the return the string 
  45:      if (match.Groups != null) 
  46:      { 
  47:          results = match.Groups[numberType].Value; 
  48:      } 
  49:      return results; 
  50:  }

Now all my tests passed and NDepend was happy too.

Next NDepend highlighted some of my methods that had too many lines of code. Howard said he was having a bit of difficulty understanding what the code was doing and explained to me the notion of Programming by Intention.

Next he paired with me and went through the method reformatting the code. He said that by reformatting, reorganising and removing redundant variables you start to see patterns forming within the code, that clumps of logic naturally come together and how you can then use ReSharper’s Extract Method refactoring to extract that clump of code into a separate method. Now instead of having a mass of hard to understand code – the method simply listed a series of steps that occurred – the intention of the method had been made clear. Howard was happy because he now understood what the method was doing and NDepend was happy as the method was now shorter and a great deal less complex.

In my original proof of concept I had used partial classes as a mechanism to extend the main TFS Data Warehouse Adapter and implemented another partial class in order to integrate Source Monitor.  Howard pointed out that this approach wouldn’t work if we wanted to integrate a second tool.

He said that now we had cleaned up the code and broken down the few large methods into a series of smaller methods we had surfaced the behaviour required to integrate an external code metrics tool into the Data Warehouse Adapter and that we could use ReSharper’s Extract Interface refactoring to move the definitions of this behaviour into an interface called ICodeMetricAdapter and then use this interface to implement any other 3rd party code metrics tool.

Howard said that we had now essentially implemented the adapter pattern. Next we refactored the existing code to use the interface instead of the concrete type and we extracted the Source Monitor code into a new file called SourceMonitorAdapter which implemented the ICodeMetricAdapter interface. We added a List<ICodeMetricAdapter> to the parent object and then refactored the code to loop through this list and then call the current ICodeMetricAdapter.

We managed to get this all done in one afternoon, just as I was packing up Howard asked me “Now, what do you need to do to add an NDepend Adapter?” I created a new class called NDependAdapter and made it implement ICodeMetricAdapter. Howard showed me that I could use the Implement Members quick fix by selecting the interface and hitting ALT-ENTER. Then all I had to do was add a new instance of the NDependAdapter to the List<ICodeMetricAdapter> and the new NDependAdapter is automatically called!

Below is the before and after class diagrams of the solution:





No comments: