In order to improve the build time of a recent project, we took steps to identify and split out our build based on an emergent testing strategy. I will try to outline that strategy based on things we tried and thoughts we had. In an attempt not to descend into a discussion over tools and library preferences, I will only mention our intentions and not tools. Where I supply code examples, these will be in pseudo code in the Given/When/Then style.
We commonly use the ‘outside-in’ testing pattern, and that is how I will describe the steps we have taken when thinking about this strategy.
End to end testing
I believe there are two broad types of end to end testing which fit nicely into white box and black box testing methods. In both cases we drive the application from the outside interfaces, often a web UI.
On this recent project we had an overwhelming number of tests that drove a web UI. Anyone who has done web UI testing will know the time it often takes to run these tests. We frequently repeated a series of steps to get the application into a known state before getting to the actual test. This led to an ever-increasing build time. In order to get the build time under control, we started to programmatically get the application into the right state before visiting the part of the application under inspection. This was not to replace the previous tests, but to complement them.
These thoughts were recently solidified by Badri Janakiraman in his talk on Creating maintainable automated acceptance tests. When talking about curation of tests, Badri mentions identifying the core user journeys in an application. He goes on to encourage extracting those journeys from the acceptance test suite. The remaining tests are of a more functional nature, in that they test functions of a system rather than the system holistically.
For example, a journey test may look like the following:
Given an anonymous user
When I select items for my shopping cart and checkout
Then I receive an email with confirmation of my purchase
A functional test may look more like the following:
Given a user at the checkout
When that user applies a discount code
Then a discount is applied to the final cost
These may not look all that different, but the journey is testing an end to end core path of the system. The code underneath should only drive the web UI in trying to accomplish this task. The functional test, however, gets the system into the required state programmatically and then drives the web UI to complete the test and run assertions.
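To make the distinction concrete, here is a minimal sketch of the functional test in Python, used purely for illustration. The `Shop` class, its `seed_cart` helper and the `CheckoutPage` page object are all hypothetical stand-ins: a real page object would drive a browser, and a real seeding helper would call into the application or its database.

```python
from dataclasses import dataclass, field


@dataclass
class Shop:
    """Hypothetical stand-in for the application under test."""
    carts: dict = field(default_factory=dict)
    discounts: dict = field(default_factory=lambda: {"SAVE10": 10})

    def seed_cart(self, user, prices):
        # Programmatic setup: puts the app in a known state without the UI.
        self.carts[user] = {"prices": list(prices), "discount": 0}


class CheckoutPage:
    """Hypothetical page object; a real one would drive a web browser."""

    def __init__(self, shop, user):
        self.shop = shop
        self.user = user

    def apply_discount_code(self, code):
        cart = self.shop.carts[self.user]
        cart["discount"] = self.shop.discounts.get(code, 0)

    def total(self):
        cart = self.shop.carts[self.user]
        subtotal = sum(cart["prices"])
        return subtotal * (100 - cart["discount"]) // 100


def test_discount_is_applied_at_checkout():
    # Given a user at the checkout (state is seeded, not clicked through)
    shop = Shop()
    shop.seed_cart("alice", [2000, 500])  # prices in pence
    page = CheckoutPage(shop, "alice")
    # When that user applies a discount code
    page.apply_discount_code("SAVE10")
    # Then a discount is applied to the final cost
    assert page.total() == 2250
```

A journey test would instead reach the checkout by driving every step through the UI, which is exactly the repeated setup cost the functional test avoids.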
With end to end testing, we inspected our application from its very edges. We controlled a web browser so that the tests behaved as if they were a user of the application. With integration testing, we want to inspect a slice of our application, but we will do so by calling application code directly rather than driving a UI.
Our tests will look very similar to our functional end to end test.
Given a shopping cart
When a discount code is added
Then that code cannot be redeemed in another shopping cart
We are testing functions of the application and we want to ensure all parts of the application are integrated correctly. We will test at certain layers but the execution paths will visit many areas of the code base, dependent libraries and persistence mechanisms.
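As a rough sketch of the shopping cart example, the test below calls application code directly and lets it reach a real SQL engine. The `CartService` class, its table layout and the `DiscountAlreadyRedeemed` exception are assumptions made for illustration, and SQLite stands in for whatever persistence the application actually uses.

```python
import sqlite3


class DiscountAlreadyRedeemed(Exception):
    """Raised when a code has already been used by a cart."""


class CartService:
    """Hypothetical application service backed by a SQL database."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS redemptions ("
            "code TEXT PRIMARY KEY, cart_id INTEGER NOT NULL)"
        )

    def add_discount_code(self, cart_id, code):
        try:
            with self.conn:  # commits on success, rolls back on error
                self.conn.execute(
                    "INSERT INTO redemptions (code, cart_id) VALUES (?, ?)",
                    (code, cart_id),
                )
        except sqlite3.IntegrityError:
            # The primary key on `code` rejects a second redemption.
            raise DiscountAlreadyRedeemed(code) from None


def test_code_cannot_be_redeemed_in_another_cart():
    # Given a shopping cart
    service = CartService(sqlite3.connect(":memory:"))
    # When a discount code is added
    service.add_discount_code(cart_id=1, code="SAVE10")
    # Then that code cannot be redeemed in another shopping cart
    redeemed_twice = True
    try:
        service.add_discount_code(cart_id=2, code="SAVE10")
    except DiscountAlreadyRedeemed:
        redeemed_twice = False
    assert not redeemed_twice
```

No browser is involved, yet the test still exercises the service, the database driver and the constraint in the schema together.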
Testing can occur at various layers and it really depends on where you want the test coverage. Testing database integration is a good example of this sort of test. The application has started, perhaps there are classes which talk to a database directly through a driver using a query language, or maybe the code goes through a library that abstracts the intricacies of the underlying persistence system. It can sometimes seem as if the library is being tested rather than the logic of the application, but focusing on complex queries, validations and scopes can reduce the likelihood of that happening.
What to do with third party services?
These examples may include interacting with a third party service, perhaps some payment service. For this, the test may choose to stub out the interaction with the external service, or run a fully integrated test. Integration with the service should absolutely be tested, but I leave it to the reader to decide where that test runs.
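If you choose to stub the service, one possible shape is sketched below using Python's standard `unittest.mock`. The `Checkout` class, the gateway's `charge` method and the shape of its response are hypothetical; a real payment provider will have its own client and response format.

```python
from unittest.mock import Mock


class Checkout:
    """Hypothetical application code that depends on a payment gateway."""

    def __init__(self, gateway):
        self.gateway = gateway

    def pay(self, amount_pence, card_token):
        response = self.gateway.charge(amount_pence, card_token)
        return "confirmed" if response["status"] == "ok" else "declined"


def test_checkout_with_stubbed_gateway():
    # The stub replaces the network call with a canned response.
    gateway = Mock()
    gateway.charge.return_value = {"status": "ok"}
    checkout = Checkout(gateway)
    assert checkout.pay(2250, "tok_test") == "confirmed"
    # Verify the application called the service as expected.
    gateway.charge.assert_called_once_with(2250, "tok_test")
```

Because the gateway is passed in, the same `Checkout` code can be given the real client in a fully integrated test that runs elsewhere.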
I like to have the possibility of running offline so I would consider integration with 3rd party services as being a separate part of any testing strategy. As with testing 3rd party libraries, you don’t want or probably need to test these as much, especially if there are limits or costs involved. However with the prevalence of software as a service offerings out there it is wise to have tests that run something in the order of once a day to make sure no breaking changes have occurred in the service and that the application continues to integrate as expected. These tests may not be part of the main build pipeline but you’d also want these to run if your application went into maintenance mode to ensure the service remains compatible.
Dealing with non-deterministic tests
Non-deterministic tests are the bane of any project. When the build time increases, a flakey test can really suck the reliability out of any build. One way to deal with these tests is to ‘quarantine’ them. This typically means moving them into a new build in your CI server which is triggered by the core build. If that build fails, you only have to rerun this one build instead of all of the rest of the tests you know are ok. In many CI servers, triggers can be configured to fire on many things, including a failed build. If the flakey build keeps retrying when it fails then it will pass soon enough. If this build is included in any pipeline to staging and production, you can make sure all the tests have passed, even if it is just once.
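How tests are moved into quarantine depends on your test runner and CI server. As one possible sketch using Python's standard `unittest` module, a marker decorator can tag the flakey tests and a small helper can split a test class into a core suite and a quarantine suite, each of which a separate build can run. The `quarantined` decorator, the `split` helper and the example tests are all hypothetical.

```python
import unittest


def quarantined(test_func):
    """Mark a test as flakey so that only the quarantine build runs it."""
    test_func.quarantined = True
    return test_func


class CheckoutTests(unittest.TestCase):
    def test_total(self):
        self.assertEqual(sum([2000, 500]), 2500)

    @quarantined
    def test_confirmation_email_arrives(self):
        # Placeholder for a test that depends on timing.
        self.assertTrue(True)


def split(case_class):
    """Return (core, quarantine) suites for one TestCase class."""
    core, quarantine = unittest.TestSuite(), unittest.TestSuite()
    loader = unittest.TestLoader()
    for name in loader.getTestCaseNames(case_class):
        flaky = getattr(getattr(case_class, name), "quarantined", False)
        (quarantine if flaky else core).addTest(case_class(name))
    return core, quarantine
```

The core build would run the first suite and, on success, trigger a second build that runs the other, for example with `unittest.TextTestRunner().run(quarantine)`.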
This is a way of dealing with flakey tests, but I’m not endorsing keeping them there. Efforts should be made to fix those and get them back into the core build.
One of the anti-patterns that comes up is hitting the database in unit tests. I’ve heard arguments in favour of this, but what you’re really trying to test is the business logic of your application, not the persistence library you’re using. I’m not saying don’t have tests that hit the database at all, but they should be reserved for elsewhere in your testing strategy, notably in the integration and end to end stages.
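A minimal sketch of what this can look like: the business rule receives its data through a small interface, and the unit test supplies an in-process fake rather than a database. The `FakeOrderHistory` class and the loyalty rule itself are invented for illustration.

```python
class FakeOrderHistory:
    """Hypothetical test double for a database-backed order repository."""

    def __init__(self, totals_by_customer):
        self._totals = totals_by_customer

    def totals_for(self, customer_id):
        return self._totals.get(customer_id, [])


def loyalty_discount_percent(history, customer_id):
    """Hypothetical rule: 5% off once lifetime spend reaches 10000 pence."""
    spent = sum(history.totals_for(customer_id))
    return 5 if spent >= 10000 else 0


def test_loyal_customer_gets_discount():
    history = FakeOrderHistory({"alice": [6000, 4000], "bob": [500]})
    assert loyalty_discount_percent(history, "alice") == 5
    assert loyalty_discount_percent(history, "bob") == 0
```

The real repository, with its queries, is then covered by the integration tests described earlier.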
If you really must have a database in your unit tests, consider using an in-memory database so you avoid all the disk reading and writing. I know some will say you should be using the same setup as much as possible as your production environment, but those environments will be tested in a later phase of the strategy.
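As an illustration, Python's built-in `sqlite3` module can open a database that lives entirely in memory. The `fresh_database` helper and the `carts` table are assumptions for the sketch, and SQLite may well behave differently from your production database, which is the trade-off mentioned above.

```python
import sqlite3


def fresh_database():
    """Return a brand-new in-memory database with the schema loaded."""
    conn = sqlite3.connect(":memory:")  # held in RAM, discarded on close
    conn.execute("CREATE TABLE carts (id INTEGER PRIMARY KEY, total INTEGER)")
    return conn


def test_cart_total_is_stored():
    conn = fresh_database()
    conn.execute("INSERT INTO carts (total) VALUES (?)", (2500,))
    assert conn.execute("SELECT total FROM carts").fetchone() == (2500,)
    conn.close()
```

Each call to `fresh_database` produces an independent database, so tests cannot leak data into one another.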
Another possibility is wrapping your tests in a transaction that can be rolled back. Many tests I’ve seen rebuild datasets for each test or group of tests and this can mean a lot of time spent doing setup rather than test execution. If you roll back after each test, nothing will be committed to the database, ensuring the data is consistent without rebuilding each time.
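The rollback approach can be sketched with Python's `unittest` and `sqlite3` modules. The dataset is built and committed once per test class, and each test's writes are discarded in `tearDown`. The `products` table and the test names are hypothetical, and the sketch relies on `sqlite3`'s default behaviour of opening a transaction implicitly before an `INSERT`.

```python
import sqlite3
import unittest


class RollbackTestCase(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Build the dataset once for the whole class and commit it.
        cls.conn = sqlite3.connect(":memory:")
        cls.conn.execute("CREATE TABLE products (name TEXT, price INTEGER)")
        cls.conn.execute("INSERT INTO products VALUES ('book', 2000)")
        cls.conn.commit()

    @classmethod
    def tearDownClass(cls):
        cls.conn.close()

    def tearDown(self):
        # Discard whatever the test wrote; the seeded row survives.
        self.conn.rollback()

    def count(self):
        cursor = self.conn.execute("SELECT COUNT(*) FROM products")
        return cursor.fetchone()[0]

    def test_adding_a_product(self):
        self.conn.execute("INSERT INTO products VALUES ('pen', 500)")
        self.assertEqual(self.count(), 2)

    def test_starts_from_seeded_data(self):
        # Runs after test_adding_a_product, whose insert was rolled back.
        self.assertEqual(self.count(), 1)
```

Application code that commits on its own would defeat this pattern, so it suits code paths where the test controls the transaction.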
Tests are a way to increase confidence in an application but that confidence can be shattered if you do not trust your tests, or they take so long to run that you bypass them in various ways. Fundamentally, tests and build scripts should be considered in the same way that production code is. This small change of thought process can lead to better, more reliable and quicker tests.