Encapsulate your classes and avoid exposing references to mutable private fields !

Unfortunately, there are many articles (and code samples on the internet) which are ignoring the object-oriented concept of encapsulating the private details of your classes. You should avoid to expose a direct reference to a private mutable field. It is way too common to consider encapsulation as something as easy as declaring a field as private. When a private field is mutable (i.e. when it can be modified) and a direct reference is exposed through a public method (or .NET property) then declaring the field itself as private does not help very much regarding the encapsulation. You have probably (too often) seen this kind of code:


// below, the return value of "getLines()/Lines" 
// is a reference to a private mutable collection 
// field which is exposed through a public method/property

// Java
Order o = new Order();
o.getLines().add(...);

// C#
Order o = new Order();
o.Lines.Add(...);

It is not uncommon to see the above kind of bad code examples in articles (even in sources you should expect to be of better quality, e.g. MSDN articles or other kind of code samples provided by Microsoft or others). When developers are frequently feeded with this kind of bad example code, then how will they ever learn to do better in their real life application development?


// The code below is better than the code above:
Order o = new Order();
o.Add(...); 
// the Add method will store the 
// order content in a private field 

If you want to expose a readable collection (without the intention of modifying it) then you can for example use the .NET class 'ReadOnlyCollection<T>' or the Java method 'Collections.unmodifiableList' (and you can see Martin Fowler's PDF article Data Access Routines for an example of 'unmodifiableList' in figure 7). However, it should also be noted that since both above implementations (ReadOnlyCollection and unmodifiableList) are providing the client code with mutable methods (which will throw exceptions) as defined in their interfaces, they are both violating Liskov substitution principle. This means that the interface clients will not be able to treat all implementations in a transparent way without considering whether an interface implentation support the mutator (modifying) methods or will throw an exception.

By the way, regarding Fowler, he has also written a partially related article named "GetterEradicator", though that article is more focused at the allocation of behavior, e.g. applying the GRASP "Information Expert" concept (defined by Craig Larman in the book 'Applying UML and Patterns') or the "Tell Don't Ask" principle (by the pragmatic programmer's Andy Hunt and Dave Thomas). IMHO, it is unfortunate that this frequently read article (as all Fowler articles are) does not make a clear distinction between mutable and immutable data. The article implies that it is not too essential to ask yourself the question "am I exposing data" but I wish he would be more negative about situations when you are "exposing mutable data" and let external clients modify private fields directly. The modification should preferably be controlled by the aggregating object.

The rest of this article will focus on one particular example regarding bad encapsulation. I will show an example of what a better approach could have looked like.

I have found the below example in the article "Exploring The Continuum Of Test Doubles" from MSDN Magazine, the September 2007 issue. This magazine is published by Microsoft, and I really think that Microsoft should be able to do better code reviewing of their articles. I mean this is a huge company with enormous resources and they have a part of their website named "Microsoft patterns & practices" which they define as "Microsoft's recommendations for how to design, develop, deploy, and operate architecturally sound applications for the Microsoft application platform. (quoted 2007-10-08)

In my opinion, an "architecturally sound application" should not let the classes ignore the object-oriented concept of encapsulation, which is applicable on Microsoft's platform (as well as Sun's/Java platforms).

Below is a test method as written in the original test code (from the above mentioned MSDN article), which is testing the content (including order) of a private mutable collection field exposed through a public property. The bold code below are the bad parts under discussion.

(I have made some unsignificant changes to the original code, because I used the NUnit framework instead of the test framework used in the article. To be more specific, I made this changes: [TestFixture] instead of [TestClass], and [Test] instead of [TestMethod], and Assert.AreEqual instead of Assert.AreEqual<decimal> )


Code snippet 1
   [TestFixture]
   public class OrderTest {
      ...
      [Test]
      public void CalculateLineTotalsUsingFake() {
         FakeShopDataAccess dataAccess = new FakeShopDataAccess();
         dataAccess.Products.Add(new Product(1234, 45));
         dataAccess.Products.Add(new Product(2345, 15));

         Order o = new Order(9, dataAccess);
         o.Lines.Add(1234, 3);
         o.Lines.Add(2345, 2);

         Assert.AreEqual(135, o.Lines[0].Total);
         Assert.AreEqual(30, o.Lines[1].Total);
      }

The first bold code rows above are mutating a private field through a LoD (Law of Demeter) violation. Also, in the above assertion code, direct access is used to the same private collection field (see code snippet 2 below), and here we are also making assumptions about the order of the elements in the private field. Now, let's refactor the code by introducing an Add method into the Order class. The bold code below is the new code (i.e. not existing in the code in the original MSDN article) in the Order class.


Code snippet 2
   public class Order
   {
      // ...
      private OrderLineCollection orderLines_;        
      // ...
      public Order(int orderId, IShopDataAccess dataAccess)
      {
         // ...
         this.orderLines_ = new OrderLineCollection(this);
      }

      // this public exposure should eventually become eliminated
      public OrderLineCollection Lines 
      {
         get { return this.orderLines_; }
      }

      public void Add(int productId, int quantity)
      {
         this.orderLines_.Add(new OrderLine(productId, quantity, this));
      }

Now, after the refactoring in code snippet 2 above, we can use the method Order.Add instead of Order.Lines.Add. This will also mean that the code below is more LoD compliant, though we still have some test code violating LoD. The bold code below are the changed parts under discussion, compared with the original code as in the MSDN article (see code snippet 1 above).


Code snippet 3
   [TestFixture]
   public class OrderTest {
      // ...
      [Test]
      public void CalculateLineTotalsUsingFake() {
         FakeShopDataAccess dataAccess = new FakeShopDataAccess();
         dataAccess.Products.Add(new Product(1234, 45));
         dataAccess.Products.Add(new Product(2345, 15));

         Order o = new Order(9, dataAccess);
         o.Add(1234, 3); // LoD compliant !
         o.Add(2345, 2); // LoD compliant !

         Assert.AreEqual(135, o.Lines[0].Total);
         Assert.AreEqual(30, o.Lines[1].Total);
      }

This modified test code above works fine (i.e. the execution of the test will still be green) when using the new 'Order.Add' method. Now we can try to change the implementation of the Order class. The bold code below are the changed parts under discussion, compared with the code in code snippet 2 above.


Code snippet 4
   public class Order
   {
      // ...
      private OrderLineCollection orderLines_;        
      // ...
      public void Add(int productId, int quantity)
      {
         this.orderLines_.Insert(0, new OrderLine(productId, quantity, this));
      }

Now, the assertions code in code snippet 3 will fail of course, because the assumptions about the order of the elements in the private field is no longer correct since the private implementation has changed. Actually, there is no obvious good reason for modifying the code as above (as in code snippet 4) but nevertheless the test code and the code under test should not be written is such a way that the test will fail just because we add the ordering of elements in the privately declared field. (at least in this domain it does not make sense to consider the ordering of the fields relevant)

The problem is that the Order class does not properly encapsulate the data, and the test code (and other code!) is allowed to access the private implementation directly, and should not do this kind of assumptions. Below, the test code, and the code under test, is rewritten with better encapsulation. The bold code in the test class (OrderTest) below are the changed parts under discussion, compared with the code in code snippet 3 above, and the bold code in the Order and OrderLineCollection classes are the new parts.


Code snippet 5
   [TestFixture]
   public class OrderTest {
      // ...
      [Test]
      public void CalculateLineTotalsUsingFake() {
         FakeShopDataAccess dataAccess = new FakeShopDataAccess();
         dataAccess.Products.Add(new Product(1234, 45));
         dataAccess.Products.Add(new Product(2345, 15));

         Order o = new Order(9, dataAccess);
         o.Add(1234, 3); // LoD compliant !
         o.Add(2345, 2); // LoD compliant !

         // Also LoD compliant:
         Assert.AreEqual(135, o.GetTotalForProduct(1234)); 
         Assert.AreEqual(30, o.GetTotalForProduct(2345));
      }

   public class Order {
      // ...
      private OrderLineCollection orderLines_;
      // ...
      public decimal GetTotalForProduct(int productId) {
         return this.orderLines_.GetTotalForProduct(productId);
      }

   public class OrderLineCollection : Collection<OrderLine> {
      // ...
      public decimal GetTotalForProduct(int productId) {
         decimal total = 0;
         foreach (OrderLine orderLine in this) {
            if (orderLine.ProductId == productId) {
               total += orderLine.Total;
            }
         }
         return total;
      }

Now we can add some more test. The bold code below are the new code compared with the test code in code snippet 5 above.


Code snippet 6
   [Test]
   public void CalculateLineTotalsUsingFake() {
      FakeShopDataAccess dataAccess = new FakeShopDataAccess();
      dataAccess.Products.Add(new Product(1234, 45));
      dataAccess.Products.Add(new Product(2345, 15));

      Order o = new Order(9, dataAccess);
      o.Add(1234, 3);
      o.Add(2345, 2);
      o.Add(1234, 7); // altogether 10 items of "product 1234"

      Assert.AreEqual(45 * (3 + 7), o.GetTotalForProduct(1234));
      Assert.AreEqual(30, o.GetTotalForProduct(2345));
   }

How would the above test (in code snippet 6) have looked like with the original kind of test impementation (in code snippet 1)? Well, it would have looked something like the code below. The bold code are the new bad code compared with the original bad test code in code snippet 1.


Code snippet 7
   [Test]
   public void CalculateLineTotalsUsingFake() {
      FakeShopDataAccess dataAccess = new FakeShopDataAccess();
      dataAccess.Products.Add(new Product(1234, 45));
      dataAccess.Products.Add(new Product(2345, 15));

      Order o = new Order(9, dataAccess);
      o.Lines.Add(1234, 3);
      o.Lines.Add(2345, 2);
      o.Lines.Add(1234, 7);


      Assert.AreEqual(135, o.Lines[0].Total);
      Assert.AreEqual(30, o.Lines[1].Total);
      Assert.AreEqual(7 * 45, o.Lines[2].Total);
   }

Now I will repeat the problem with the above kind of code: External code is directly modifing a private field (the lines collection) and we then will (in the above assertions) continue to make assumptions about things we should not know about (the private structure of an aggregated field) since the private code should be better encapsulated.

Now, let's say that we have the better test code as in code snippet 6. Then we will now change the private implementation of the 'Order' class (as below) in such a way that we will be able to change it without breaking the client code. Below we eliminate the exposure (and usage) of the previously exposed private mutable field, while also modifying the actual implementation. Most of the bold code below are the new and changed code compared with the 'Order' class in code snippet 2 (and the method 'GetTotalForProduct' has changed implementation compared to code snippet 5)


Code snippet 8
   public class Order {

      // 'OrderLineCollection' is not used/exposed anymore:
      // private OrderLineCollection orderLines_;	 
      // public OrderLineCollection Lines {
      //     get { return this.orderLines_; }
      // }

      // ...

      // new private implementation field 
	  // (which is not exposed, but only used internally!)
      private Dictionary<int, OrderLine> _products = 
				new Dictionary<int, OrderLine>();

      public void Add(int productId, int quantity) {
         if (!_products.ContainsKey(productId))
         {
            OrderLine orderLine = new OrderLine(productId, quantity, this);
            _products[productId] = orderLine;
         }
         else
         {
            OrderLine orderLine = _products[productId];
            orderLine.AddQuantity(quantity);
         }
      }

      public decimal GetTotalForProduct(int productId) {
         if (_products.ContainsKey(productId)) {
            return _products[productId].Total;
         }
         else {
            return 0;
         }
      }

   public class OrderLine {
      // ...
      private int quantity_;
      // ...

      // new method (compared to the class as written in the MSDN article)
      public void AddQuantity(int quantity) {
         quantity_ += quantity;
      }

Another option to the above code in snippet 8, could be to let the 'OrderLine' class remain immutable (which it now is not, with the above addition of the method AddQuantity). Then, instead of modifying the referenced OrderLine instances, you can replace the instances with new instances, as illustrated with the code below. The bold code below are the changed code compared with code snippet 8.


Code snippet 9
   public class Order {
      private Dictionary<int, OrderLine> _products = 
			new Dictionary<int, OrderLine>();
      // ...
      public void Add(int productId, int quantity) {
         OrderLine totalOrderLine = null;
         OrderLine newOrderLine = new OrderLine(productId, quantity, this);
         if (!_products.ContainsKey(productId)) {
            totalOrderLine = newOrderLine;
         }
         else {
            OrderLine existingOrderLine = _products[productId];
            totalOrderLine = newOrderLine.AddOrderLine(existingOrderLine, this);
         }
         _products[productId] = totalOrderLine;
      }

   public class OrderLine {
      // ...
      private int quantity_;
      // ...

      // removed method (i.e. compared to code snippet 8) 
	  // (to let the class remain immutable)
      //public void AddQuantity(int quantity) {
      //   quantity_ += quantity;
      //}

      // new method
      public OrderLine AddOrderLine(OrderLine existingOrderLine, Order order) {
         return new OrderLine(
			this.ProductId, 
			this.quantity_ + existingOrderLine.quantity_, 
			order
         );
      }

The above method 'AddOrderLine' is almost a pure closure of operation method (see below) with the exception of the 'Order' parameter. I do not like the 'Order' parameter since it leads to a bidirectional dependency between concrete classes (Order and OrderLine) but I consider it as beyond the scope of this article to refactor it away.

By the way (a little offtopic), the above mentioned "Closure of operation" (link to MS Word document) is a concept defined in the book about Domain-Driven Design, aka DDD, which means that the types of the return values and argument(s) are the same as the type which defines the method, i.e. generally speaking a "Closure of operation" will look something like this:


// "Closure of operation" general method example: 
interface SomeType {
	SomeType method(SomeType s);
}

// Actual "Closure of operation" method 
// example from the .NET class library: 
public sealed class String : ... 
{
	...
	public String Replace(String oldValue,String newValue) {...}
	...
}

Summary

The following kind of code (copied from code snippet 1) should be avoided.


o.Lines.Add(1234, 3);
o.Lines.Add(2345, 2);
Assert.AreEqual(135, o.Lines[0].Total);
Assert.AreEqual(30, o.Lines[1].Total);

The reason is the bad non-object-oriented encapsulation which is exposing the private mutable private field (and in this case also through a 'Law Of Demeter' violation). Instead, the API for the code under test should be written in such a way that the test code can look for example like this instead (copied from code snippet 5):


o.Add(1234, 3);
o.Add(2345, 2);
Assert.AreEqual(135, o.GetTotalForProduct(1234));
Assert.AreEqual(30, o.GetTotalForProduct(2345));

When the code instead looks like above, then the private implementation of the 'Order' class can change without affecting the above client code. For example, the added order lines might be privately aggregated in some 'OrderLineCollection' class, but they also might be stored with some other kind of class, e.g. 'Dictionary<int, OrderLine>' (as in code snippet 8)


// Tomas Johansson , 2007-10-08 //

Valid XHTML 1.0! Valid CSS! Blog