let's talk about Java: The consequence of using Test Double

“With great power comes great responsibility.” This quote from Uncle Ben stayed with Peter Parker throughout his career as Spider-Man. It describes how we should handle difficult situations in our lives, and it reminds us that each choice we make can affect not only ourselves but also many other people. I think every developer should repeat this sentence to themselves just a second before they start using a Test Double (TD), because TDs give us great power: they let us control everything around the tested unit. And yes, that makes everything easier, but as Einstein said: “Everything should be made as simple as possible, but not simpler.”

Each choice you make when deciding whether or not to use a TD affects test readability, code coverage, the value of the tests, and the ease of introducing changes or refactoring. Each choice also affects your teammates, and probably even developers you don’t know yet.

That’s why we have to make our choices conscientiously and carefully.

That’s why it is so important to know as much as possible about the tools and techniques we use.

What are Test Doubles?

Test Doubles are patterns that allow us to control the dependencies of the tested unit: they let us provide the behavior we want whenever we want it and/or verify whether the expected behavior occurred.

We have the following Test Double Patterns:

Stub
Spy
Mock Object
Fake Object
Dummy Object

Today’s article is not about covering what each pattern is. However, if you want to learn about Test Double Patterns you can find more information here.

Unit – what is your definition?

I don’t want to discuss what a Unit is: a method, a class or a whole component. We can assume it can be any of these, because even when we start with a method as the unit, the project lives on, and during its lifecycle the method can grow and become a complex component. Is it still a Unit, or not anymore? I leave this question open and encourage you to keep an open mind about it.

Why is it so important?

Because your understanding of a Unit affects what your unit tests look like. The more your definition includes, the more likely you are to abuse TDs in your code.

Houston, we’ve got a problem…

While TDs are a great tool that makes our lives easier and our tests faster and focused on a particular unit, there are pitfalls that are really easy to fall into once we have been using them for a while.

Below are some situations and behaviors that I observe when developers fall into the TD traps:

Unit tests are everything we need
Let’s write unit tests for our fresh functionality. Let’s care about coverage; let’s make it 100%, or a little less if that is not possible. Great, now we know that everything is… working?
This type of thinking gives a false certainty that our code will do what it should. It’s important to remember that using a TD means making assumptions about external behaviors and reactions. What do unit tests tell us? That the unit works as expected. Yet until we test the parts as a whole, with no assumptions but with the real flow, we don’t know whether the functionality built from these pieces fulfills the requirements.
Testing unreal situations
When developers get used to TDs, they start using them automatically, and such automatism sometimes leads us to stop thinking. We create a group of tests that covers each loop and each statement, we build assumptions around them, but we forget one important thing: are our assumptions valid?
I have seen several times that coverage was high, yet we were testing a situation that was impossible in real life. TDs make this possible: nothing stops us from assuming that something may happen, even if it can’t.
We are using aggregation – everywhere!
We are not using composition (in the sense of the association type) anymore, even where it makes sense for one object to be responsible for the lifecycle of another, and where the created object shouldn’t exist in any context other than the one that creates it. To make our lives easier, and to make it possible to use TDs, we use aggregation instead.
Dependency? This is a place for a TD!
That’s also true: many of us start to use TDs everywhere we can, even if using a real object is cheaper, and even when the given object is an inherent part of the object it would be injected into (i.e. places where we used aggregation instead of composition).
More Test Doubles = more fragile code
The more Test Double Patterns you use, the more assumptions you make about your code. The more assumptions you make, the less accurate your tests are. The less accurate the tests are, the more fragile the code becomes.
Mocks return mocks return mocks return…
You know the Law of Demeter? I bet you do. Now that I’ve mentioned it, I believe it has become clear that mocks returning other mocks that we want to interact with are nothing more than a violation of it.

Testing implementation, not functionality

We should test functionality, not implementation. What matters is the result, not how we achieve it. Yet when we use TDs, sometimes we do nothing more than check whether a unit is implemented the way we thought it was. And what are we doing it for? Such a test checks nothing about functionality. Are we doing it to satisfy coverage? Or maybe we have just got used to writing unit tests? Writing tests that bring no value is nothing more than waste.

Let’s consider a situation where we have a few instructions that must be executed in a specific order:

public void process(Event event) {
    a.process(event);
    b.process(event);
    c.process(event);
}

To test this functionality, we mock all the dependencies: the objects a, b and c.

What’s next? We write a test that checks whether each object’s process() method was executed with the given parameter. That’s fine, but it gives us no value yet, because we also have to verify that the operations were executed one after another. So what do we do? We check the order of execution. Done. We did a good job, so we can move on to the next task.

But is that really true? What we just did is test the implementation, not the functionality. In such a test there’s nothing about functionality at all.
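To make the problem visible, here is a minimal sketch of such a test written with a hand-rolled spy instead of a mocking library. The names Step, Processor, RecordingStep and OrderCheck are my assumptions for illustration; the article only gives process(Event) and the objects a, b and c.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types: only process(Event) and the names a, b, c come from the article.
class Event {}

interface Step {
    void process(Event event);
}

class Processor {
    private final Step a, b, c;

    Processor(Step a, Step b, Step c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    public void process(Event event) {
        a.process(event);
        b.process(event);
        c.process(event);
    }
}

// Hand-written spy: records its own name in a shared log when it is called.
class RecordingStep implements Step {
    private final String name;
    private final List<String> log;

    RecordingStep(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override
    public void process(Event event) {
        log.add(name);
    }
}

class OrderCheck {
    // The "test": the only thing it can report is the call order.
    static List<String> recordedOrder() {
        List<String> log = new ArrayList<>();
        Processor processor = new Processor(
                new RecordingStep("a", log),
                new RecordingStep("b", log),
                new RecordingStep("c", log));
        processor.process(new Event());
        return log;
    }
}
```

Asserting that recordedOrder() equals [a, b, c] restates the three lines of Processor.process() and says nothing about what happens to the event.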

So what can we do about this? I’ve got a few ideas:

Add a return value to each method
In our example it is obvious that each executed method does something with the event object. If this is the case, we can return the event from each method and, using Mocks, verify that the parameter passed to each method was the one returned from the previous one.
Unfortunately, this does not always make sense, nor is it always possible.
Write tests without TD
We have to verify whether this functionality works anyway, whether with component tests, integration tests, system tests or any other type of test. You have to create a test that verifies whether the functionality works as a whole, not only whether each part does what you designed it to do.
So, if there are tests that cover this, shouldn’t they be enough for us? Is there any need for something more?
Use the Chain of Responsibility Pattern
Using this pattern does nothing more than protect the order, but thanks to it we don’t need to expose this order in our test. Implementation details won’t leak; they will stay where they belong: in the application’s code.
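The last idea can be sketched roughly as follows. This is only one possible shape of a Chain of Responsibility; the handler names (Validate, Enrich, Store), the Pipeline factory and the trail field on Event are all hypothetical and exist only to make the effect of each step observable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event that remembers what was done to it (illustration only).
class Event {
    final List<String> trail = new ArrayList<>();
}

// Each handler does its own work, then passes the event to its successor.
abstract class Handler {
    private final Handler next; // null marks the end of the chain

    Handler(Handler next) {
        this.next = next;
    }

    final void process(Event event) {
        handle(event);
        if (next != null) {
            next.process(event);
        }
    }

    protected abstract void handle(Event event);
}

class Validate extends Handler {
    Validate(Handler next) { super(next); }
    @Override protected void handle(Event event) { event.trail.add("validated"); }
}

class Enrich extends Handler {
    Enrich(Handler next) { super(next); }
    @Override protected void handle(Event event) { event.trail.add("enriched"); }
}

class Store extends Handler {
    Store(Handler next) { super(next); }
    @Override protected void handle(Event event) { event.trail.add("stored"); }
}

class Pipeline {
    // The order is decided once, here, in production code.
    static Handler create() {
        return new Validate(new Enrich(new Store(null)));
    }
}
```

A test of a single handler now needs only that handler and an event, and a test of the whole pipeline can look at the resulting event instead of verifying calls on three mocks.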

Sometimes we don’t fully duplicate our code in the tests, yet the tests still use knowledge that shouldn’t be known in that place.
Let’s consider a method that is responsible for building a query, executing it and returning the result. What would the test verify if we checked what SQL was passed to the method responsible for query execution? Are we even interested in what was passed there? I believe the only thing that gives us any value is the output of the tested method.
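As a rough sketch of the difference, assume a hypothetical QueryExecutor interface and a UserDirectory class that builds the SQL (neither name comes from a real library). The check below stubs the rows and looks only at what the tested method returns; it never inspects the SQL string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface that we own; it hides the real database access.
interface QueryExecutor {
    List<String[]> execute(String sql);
}

class UserDirectory {
    private final QueryExecutor executor;

    UserDirectory(QueryExecutor executor) {
        this.executor = executor;
    }

    // Builds the query, executes it and maps the rows to names.
    List<String> activeUserNames() {
        String sql = "SELECT name FROM users WHERE active = 1 ORDER BY name";
        List<String> names = new ArrayList<>();
        for (String[] row : executor.execute(sql)) {
            names.add(row[0]);
        }
        return names;
    }
}

class UserDirectoryCheck {
    // The stub returns canned rows whatever SQL it receives;
    // the check cares only about the output of activeUserNames().
    static List<String> namesFromStubbedRows() {
        QueryExecutor stub = sql -> List.of(new String[] {"Ann"}, new String[] {"Bob"});
        return new UserDirectory(stub).activeUserNames();
    }
}
```

Whether the SQL itself is correct is still an open question after this check; only a test against a real database can answer it, which is exactly the kind of test without TDs mentioned above.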

Another problem connected with testing implementation is that refactoring is not safe anymore. You can change the behavior of a class without even noticing. Your intention is to extract some part of the code, or change one structure into another, or do anything else that shouldn’t have any impact on functionality. But because of the number of assumptions in the tests, and because everything is stubbed, spied on or mocked, you can change the code and think that everything is OK, but… sooner rather than later you will find out that your refactoring affected the functionality.
Going further, refactoring can not only affect the functionality; with each change we also have to change our tests. Extracting a class? Add a Test Double. Changing an internal dependency? Change a Test Double. And so on.

Changes are not safe either. With only tests like this, you don’t have a place that verifies whether everything works together. You can change the functionality of a method in a class that is widely used elsewhere, but the assumptions in other tests will remain and those tests won’t fail, even if the assumptions are no longer valid. Of course, you can check each test to see whether its assumptions are still valid and change them if needed; those tests should then fail, if the change affects them. And if they fail, you should also look at the tests where these objects are used as a dependency. And so on.
OK, we can do this. But wait a sec, isn’t this why we write tests? To get fast feedback without analyzing each part of the code?

Boundary Objects

Boundary objects are nothing more than wrappers for external dependencies. Using these classes is great, because it is safe for us to mock them in other places. Why? Because we own them. It depends on us whether their API will change or not. We have no such certainty with external classes.

These are the classes where you shouldn’t use any TDs at all. Why? Because the only thing that matters is whether the wrapper gives us the functionality of the wrapped class. That’s why we should create integration tests that prove that the result of each method meets our expectations.
And even if the external API changes, we won’t have to change our tests. Our wrapper class protects us against this.
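A small sketch of the idea, with java.time.Clock standing in for "an external class we don’t own". Today, SystemToday and Invoice are hypothetical names invented for this example.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Our own interface: we decide when its API changes.
interface Today {
    LocalDate date();
}

// Boundary object: a thin wrapper around the external Clock.
class SystemToday implements Today {
    private final Clock clock;

    SystemToday(Clock clock) {
        this.clock = clock;
    }

    @Override
    public LocalDate date() {
        return LocalDate.now(clock);
    }
}

// Code elsewhere depends only on Today, so a stub is safe there.
class Invoice {
    private final LocalDate dueDate;

    Invoice(LocalDate dueDate) {
        this.dueDate = dueDate;
    }

    boolean isOverdue(Today today) {
        return today.date().isAfter(dueDate);
    }
}

class TodayCheck {
    // The wrapper is exercised against a real Clock implementation, not a double.
    static LocalDate wrapperAgainstRealClock() {
        Clock fixed = Clock.fixed(Instant.parse("2015-04-16T10:00:00Z"), ZoneOffset.UTC);
        return new SystemToday(fixed).date();
    }

    // Everywhere else we stub our own interface.
    static boolean overdueWithStubbedToday() {
        Today stub = () -> LocalDate.of(2015, 4, 16);
        return new Invoice(LocalDate.of(2015, 4, 1)).isOverdue(stub);
    }
}
```

If the external API ever changes, only SystemToday and its own tests are affected; Invoice and its tests keep talking to Today.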

Code extraction

It always happens. We start with a small piece of functionality, just a few lines of code. But as new requirements appear, changes are needed and our code grows. After some time it can turn out that we need to extract some part of the code from its birthplace, because with all the new lines it has become unreadable. Nothing new, and I believe all of us have experienced this many times.

OK, but where are the TDs here? There are none yet, but when we extract that part of the code we have to ask ourselves: should our test stay as it is and keep using the real code, which is now just placed in a different class? Or would we like to mock this dependency?

There are a few different outcomes of extraction, and they have a bearing on our decision:

Depend on Class
We can extract our code so that the dependency is on a class; we will expect an instance of it. Why? We may have decided to split the code to increase readability; that was pure refactoring, not preparation for upcoming changes, and depending on a class is simply enough. In such cases I think we can leave the test as it was and rely on the real code, without introducing a TD. Yet it is important to remember to add unit tests for the extracted class, and when something new appears we shouldn’t add more tests to the first class but cover the extracted class with tests instead. The tests of the original class should cover only the most typical cases.
Depend on Interface
When the refactoring was done to prepare for a change, it sometimes happens that we extract a part of the code that can be replaced. In such cases, apart from the classes with the functionality, we also create an interface. This situation is a great example of where using a TD is a really good idea.
Depend on Class from different module
Sometimes we extract our code because we noticed that we would need part of the functionality in another place, or that there is duplication in our code, or because we would like to use a class that already provides such functionality. Whatever the case, we should treat these classes like boundary objects.

An ideal solution would be to put the class that we depend on in various places into a separate module. In the places where we use it (different modules) we create interfaces (which guarantee the same functionality), and then our extracted class can implement each of …?.

Or we can create a wrapper for this class in each module.

Still, if the part of the code is not too big, I think it’s perfectly OK to do exactly what I suggested for the situation where we depend on a class. If the code is big, it would be wise to consider applying the first two solutions.
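The first two outcomes of extraction can be put side by side in a short sketch. The domain (VatCalculator, DiscountPolicy, PriceService) and the 23% rate are purely hypothetical; what matters is which dependency is created inside and which one is injected.

```java
// Hypothetical domain, for illustration only.

// Case 1: extracted purely for readability, so we depend on the class
// and the tests of PriceService keep running the real code.
class VatCalculator {
    long addVat(long netCents) {
        return netCents + netCents * 23 / 100; // assumed rate, not from the article
    }
}

// Case 2: extracted because it is meant to be replaceable, so we depend
// on an interface and a Test Double is a natural fit.
interface DiscountPolicy {
    long discountFor(long netCents);
}

class PriceService {
    private final VatCalculator vat = new VatCalculator(); // composition, nothing to inject
    private final DiscountPolicy discounts;

    PriceService(DiscountPolicy discounts) {
        this.discounts = discounts;
    }

    long finalPrice(long netCents) {
        long discounted = netCents - discounts.discountFor(netCents);
        return vat.addVat(discounted);
    }
}
```

In a test, new PriceService(net -> 1000L) stubs only the replaceable part, while the VAT calculation stays real and gets its own detailed tests on VatCalculator.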

Test Double? Yes? No?

Below I want to share a few things that can help you decide whether you should consider a TD or not.

We are the owner of the interface
I have mentioned it already, but I think it is so important that it deserves its own section. We should use TDs only on interfaces that are ours. Why? Because their API depends on us, and by doing so we create less fragile code.
In places where we use external APIs, we should either wrap them in our own class or test the code without TDs.
Creating an instance is lighter
Why should we use a TD if creating a real object is lighter and faster? It is also more readable, and it is not an assumption anymore!
We shouldn’t use a TD just because we can. We should use TDs to make our lives easier, and if it is easier to create a real object, then we should not use a TD.
Tests are more readable without TD
If tests are to fulfill the role of documentation, then they have to be as easy to understand as possible. I believe you will agree with me that ten lines of assumptions to set up the environment have nothing to do with readability.
That’s why we should use as few TDs as possible. If we need huge setup methods with many lines, maybe it’s time to split the tests into a few files? Maybe we are testing a few methods, and this is a sign that in this case our unit is a method? Maybe that would solve the problem? Maybe this is a place that needs to be covered with component or system tests? Because the unit tests start to tell you nothing, and somehow you feel that instead of fast feedback they will deliver only rapid changes to incomprehensible code. Just to make it green?
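A tiny example of the "creating an instance is lighter" case, using a hypothetical Money value object and Cart class. A double for Money would have to stub plus() to return yet another double; the real object costs one constructor call and carries no assumptions.

```java
import java.util.List;

// Hypothetical value object: cheap to create, no external dependencies.
final class Money {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    Money plus(Money other) {
        return new Money(cents + other.cents);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Money && ((Money) o).cents == cents;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(cents);
    }
}

class Cart {
    // Sums the given prices; the result is compared by value in tests.
    Money total(List<Money> prices) {
        Money sum = new Money(0);
        for (Money price : prices) {
            sum = sum.plus(price);
        }
        return sum;
    }
}
```

The whole test fits in one readable line: the total of new Money(250) and new Money(199) should equal new Money(449).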

Too many Test Doubles?

I touched on this subject in the previous paragraph. We now know what happens to tests that have too many TDs in them: they don’t bring the needed value. They are not clear and readable anymore. They cannot be used to share knowledge or explain something, or even to remind us of anything.

Of course, it would be great to agree first on how many TDs should be treated as a warning signal and how many as more than enough. There’s no single clear answer, but from my experience and from what I have managed to find, we should see a warning sign whenever the number of TDs is between 2 and 5. In these cases we need to pay attention. If the number is higher, we should definitely think about refactoring.

But before we talk about what we can do, I want to stress one thing to make it perfectly clear: if we have one mocked object and we write code that says what two of its methods should return, we are not talking about one TD (the object) but two (the methods). Why? Because our assumption is not the fact that the instance is not real, but the effect those two methods will have.

Below are some ideas that may help you improve your code:

Law of Demeter
If you see mocks that return other mocks, it means that you are increasing the dependencies of the object and, what is even worse, hiding them by passing a different object. This code also violates the Law of Demeter, so it is a good idea to consider what is wrong here. Maybe you are passing the wrong object as a parameter? Maybe you should add a method to the given object and remove this unwanted dependency?
Single Responsibility Principle
If you see tests with tons of assumptions around them, it can mean that you are passing too much or that there are too many dependencies. Both are signs that, over its lifetime, the responsibility of this object may have turned into responsibilities. It’s worth spending a minute investigating it.
Parameter Object Pattern
If you are testing a method that takes several parameters and calls only a handful of methods on the input objects, it is reasonable to consider introducing the Parameter Object Pattern. It will let you decrease the number of assumptions (one object instead of x), and because of this it will make your tests more readable.
This is often possible if your method’s tests use many input dummies to test particular cases. Introducing the Parameter Object Pattern will make those dummies unnecessary. Fewer lines mean less code to read, and that means tests that are easier to understand.
Your Unit is no longer a Unit
You created a method and wrote tests for it. Then the method started to grow, and so did your object’s initial API. New methods and new functionality were also covered by tests, but after a while you look at the number of TDs and ask yourself what went wrong. Nothing: it’s good when our code grows and we are constantly adding something new. It means that it is still alive. Still, you probably missed the moment when your Unit (one small method in one small class) became a component (a class with complex methods). Maybe it is still fine to have all those methods in one class, but it would probably be a good idea to split your test file into several files and make all of your tests more readable. If that still doesn’t help, you can always try the ideas from the previous points.
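The Law of Demeter point can be illustrated with a short sketch. Address, Customer and ShippingService are hypothetical names, and so are the prices; the idea is to replace a call chain, which in a test would need a mock returning a mock, with a single question asked of the object we were given.

```java
// Hypothetical classes, for illustration only.
class Address {
    private final String countryCode;

    Address(String countryCode) {
        this.countryCode = countryCode;
    }

    boolean isIn(String code) {
        return countryCode.equals(code);
    }
}

class Customer {
    private final Address address;

    Customer(Address address) {
        this.address = address;
    }

    // Added method: Customer answers the question itself instead of
    // handing out its Address for others to dig through.
    boolean livesIn(String countryCode) {
        return address.isIn(countryCode);
    }
}

class ShippingService {
    // Before the change this would have read something like
    // customer.getAddress().getCountryCode().equals("PL"),
    // which needs a Customer double that returns an Address double.
    long costCents(Customer customer) {
        return customer.livesIn("PL") ? 500 : 2000;
    }
}
```

With this shape the test can simply build a real Customer, and if a double is ever needed, it only has to answer one question.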

Make your life easier

Below you can find some tips that will make your life with TDs easier.

Don’t use accessors
Accessors are closely related to implementation. Well, to be honest, they’re all about implementation, and by adding them to our code we are doing nothing less than letting implementation details leak outside. I don’t think this is what we want. Functionality is about behavior, and our code should be focused on it.
I know there are cases when it is fair to use accessors, but as long as they aren’t declared in any interface, there’s a big chance that you don’t need them.
What impact do accessors (mainly getters) have on tests? They are the places where you violate the Law of Demeter. They are the places where you create hidden dependencies. In simple terms, the side effect of accessors is more TDs in your tests.
Don’t add additional behavior
You shouldn’t create, e.g., a mock that takes parameters and performs a calculation on them to decide what the output should be. Why? Our code is complicated enough, and that’s why we need tests: to make sure that the functionality works as we want it to in particular scenarios.
By adding complexity to our TDs we make our tests less accurate and our code more fragile. In case of failure we no longer know whether the code or our implementation of the TD was the reason.
Tests need to be simple to read; that is what brings value. The more complicated they are, the less information we get without careful analysis.
No logic in your constructors
Let’s imagine a situation where you pass parameters to a constructor and, inside its body, execute instructions to calculate the value of an instance attribute. And you use external objects for that.
How would you test it? How would you inject anything? What about SRP? The object is now responsible both for setting up its own state and for the functionality it provides.
We don’t want these problems. Of course we need to create objects in a valid state, but we can solve the issue by using creational patterns and moving the responsibility for building the object there. Simple, isn’t it?
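Here is one way the last tip could look, as a sketch only. RateSource, Price and PriceFactory are hypothetical; the point is that the constructor just stores an already valid state, while the calculation and the external dependency live in a factory.

```java
// Hypothetical external dependency, e.g. something slow or remote in real life.
interface RateSource {
    long centsPerUnit(String currency);
}

// The object only stores a ready-made, valid state.
class Price {
    private final long cents;

    Price(long cents) {
        if (cents < 0) {
            throw new IllegalArgumentException("price must not be negative");
        }
        this.cents = cents;
    }

    boolean isMoreExpensiveThan(Price other) {
        return cents > other.cents;
    }
}

// Creational code owns the calculation and the external dependency.
class PriceFactory {
    private final RateSource rates;

    PriceFactory(RateSource rates) {
        this.rates = rates;
    }

    Price fromForeign(long amount, String currency) {
        return new Price(amount * rates.centsPerUnit(currency));
    }
}
```

Tests of Price need no TD at all (new Price(100) is enough), and tests of PriceFactory need a single stub such as currency -> 25L.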

Summary

Test Double Patterns are a great and powerful tool, and that is why it is so important to know as much as possible about the pitfalls waiting for everyone who uses them on a daily basis. I hope this article will help you and make it easier to see the warning signals before they turn into something more problematic.

If you want to learn more about Test Double Patterns and how to use them wisely, below you can find a few really good links:

And at the end I want to share with you a quote that is good to remember whenever you are considering whether you should use a TD or not:

Mock objects can give you a deceptive sense of confidence, and that's why you should avoid them unless there is really no alternative.

Cedric Beust

Thursday, April 16, 2015
