One thing I’m particularly proud of on our team is the number of unit and integration tests we have. Our code coverage isn’t 100% by any means (not that code coverage matters), but we’ve got a solid suite of tests that lets us quickly check whether new code has broken any existing processes. On top of that, it’s helped us find a few bugs before they hit production, which is always nice. I’d like to document a few of the hurdles we had to jump and the patterns we use for our test suites.
Culture of Testing
First off, it’s been a bit of a bumpy road to get our unit and integration tests to the level of importance they need to be. It’s not that our team didn’t want tests to be important; everyone saw the value in unit and integration testing. It just takes a conscious effort to make tests as important as new features, or bug fixes, or anything else. Recently our code base was in a state where about 1/3 of the integration and unit tests weren’t working for various reasons, and as a team we decided we either had to fix all those tests or straight up abandon them if we wanted to get any value from the suite.
So, as a team, we made a conscious effort to set aside 3 days of our work week and do a “code retreat” of sorts to get all our tests passing. As a result, all of our tests now work on all our dev machines. This is important for many reasons, but the main one is confidence in the tests. If integration tests work at random or don’t work at all for a developer, they’re not going to bother running them because they’re always going to fail. In addition, if you never run the integration tests because they always fail, you’re sure as hell not going to write new ones. It’s amazingly easy to fall into this trap as a team, so it’s important to adopt a culture of test maintenance.
Some of the most common problems we ran into with our broken tests were configuration issues, such as developer databases not being up to date, local IIS and other services not configured properly, and in a few rare cases some path-specific stuff. One neat trick that Matt found recently is the file attribute you can put on the appSettings element in your app/web.configs in .NET, which pulls in extra settings from a separate file.
The issue we were trying to address was that we had our web and app configs checked into source control, but they also contained developer-specific values for database connections and the like. By putting those developer-specific app settings into a separate file that is not maintained in source control, we now don’t have to worry about checking our web.configs (and app.configs in integration test assemblies) every time we pull from git.
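As a rough sketch of what that looks like (the file name and keys below are made up for illustration, they are not our actual settings), the checked-in config points at a local file through the file attribute, and the local file holds only the values that differ per developer:

```xml
<!-- web.config or app.config: checked into source control -->
<configuration>
  <appSettings file="developer.config">
    <!-- shared default; a matching key in developer.config overrides it -->
    <add key="InvoiceDbServer" value="shared-dev-server" />
  </appSettings>
</configuration>

<!-- developer.config: hypothetical name, listed in .gitignore, one per machine -->
<appSettings>
  <add key="InvoiceDbServer" value="localhost" />
</appSettings>
```

If the referenced file doesn’t exist, the attribute is ignored and the values in the main config are used, so a fresh clone still loads. As far as I know the file attribute only applies to appSettings; other sections such as connectionStrings need configSource instead, which behaves a little differently.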
One thing I’d also like to introduce to our team is some sort of “setup script” to help address the IIS and other config issues. Most of these issues are easily solved by hand, but only if you know where to look. By creating a PowerShell or equivalent script, we could automate setting up the apps, virtual folders, and application pools we need in IIS without requiring each team member to have in-depth knowledge of our configuration. This would have saved us many times when onboarding new developers, or even when existing team members got new hardware.
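We haven’t written this script yet, so what follows is only a sketch of the idea, and in Python rather than PowerShell. It doesn’t touch IIS at all; it just builds the appcmd command lines for one app so they can be reviewed or pasted into an elevated prompt. The site, pool, and folder names are placeholders.

```python
from typing import Dict, List

# appcmd.exe normally lives in %windir%\system32\inetsrv; assumed to be on PATH here.
APPCMD = "appcmd"


def iis_setup_commands(site: str, pool: str, app_path: str, app_dir: str,
                       vdirs: Dict[str, str]) -> List[str]:
    """Return the appcmd lines for one application, in the order to run them."""
    app_name = f"{site}{app_path}"  # appcmd identifies an app as "<site name><path>"
    commands = [
        # 1. application pool
        f'{APPCMD} add apppool /name:"{pool}"',
        # 2. the application itself, under an existing site
        f'{APPCMD} add app /site.name:"{site}" /path:{app_path} /physicalPath:"{app_dir}"',
        # 3. move the application into the new pool
        f'{APPCMD} set app "{app_name}" /applicationPool:"{pool}"',
    ]
    # 4. any virtual folders that hang off the application
    for vdir_path, vdir_dir in vdirs.items():
        commands.append(
            f'{APPCMD} add vdir /app.name:"{app_name}" /path:{vdir_path} /physicalPath:"{vdir_dir}"'
        )
    return commands
```

Calling it with something like `iis_setup_commands("Default Web Site", "BillingPool", "/billing", r"C:\src\billing\web", {"/uploads": r"C:\data\uploads"})` returns four command strings. The point is less the tool and more that the knowledge of “what needs to exist in IIS” lives in one checked-in place instead of in someone’s head.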
For database-related issues, we use Database Projects in Visual Studio. Personally I’m not a fan of how these are set up, but we’ve done a bit of devops work to make them more manageable. In our team (and company) we use an XML file and a custom dev tool to auto-generate tables, keys, and stored procedures (as well as typed data rows, tables, and sets) for working with SQL databases. The problem we were facing was that we had to migrate all those generated files from the generated C# project path to the database project path one at a time by hand, and also rename a few of them. Doing that manually for dozens of sprocs every time you create a new table definition became a pain in the ass, so we created a command line utility that just does a blanket copy & replace action from one path to the other. We also changed the dbproj file to use wildcards instead of having a specific entry for each individual SQL file, which saves us from having to manually update the project file (either by modifying the XML directly or using Visual Studio) every time we add something.
Once your database project is set up properly, it’s a simple matter of using a Deploy action in Visual Studio to auto-migrate your database to the current version. It really takes the headache out of making sure everyone is up to date, as any developer can just update their own database. It also handles changescripts via the Post-Deployment step, so you can do manual data migrations as well (keep in mind you’ll want to make “defensive” changescripts to ensure you don’t re-apply the same changes every time you deploy). And, when we do deploy to test or live, we can generate a changescript and give that to our deployment team with the confidence that it’ll update to the same version as everyone else. Of course, all of this is in source control as well, and can be tagged / merged / branched / all that good source control stuff.
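Our actual utility is a .NET command line tool, so the following is not its code, just a small Python sketch of the same “blanket copy & replace” idea. The function name, the folder layout, and the rename map are assumptions for illustration.

```python
import shutil
from pathlib import Path


def sync_generated_sql(generated_dir, dbproj_dir, renames=None):
    """Copy every generated .sql file into the database project folder.

    Existing files are overwritten, sub-folders are preserved, and any file
    whose name appears in `renames` is written under its new name instead.
    """
    renames = renames or {}
    src_root, dst_root = Path(generated_dir), Path(dbproj_dir)
    copied = []
    for src in sorted(src_root.rglob("*.sql")):
        relative = src.relative_to(src_root)
        target_name = renames.get(src.name, src.name)
        dst = dst_root / relative.with_name(target_name)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)  # blanket overwrite, no per-file prompts
        copied.append(dst)
    return copied
```

Because the project file picks up SQL files by wildcard, a copy like this is all that’s needed after regenerating; nothing has to be registered file by file.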
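To make the “defensive” part concrete, here’s a minimal sketch using Python and SQLite as a stand-in for SQL Server. The InvoiceStatus table and the 'Refunded' row are invented for the example. The insert is guarded by a NOT EXISTS check, so running the script on every deploy neither duplicates the row nor fails.

```python
import sqlite3


def apply_changescript(conn):
    """Data migration that is safe to re-run on every deploy."""
    # Guarded insert: the SELECT yields a row only when Id 3 is not there yet.
    conn.execute(
        "INSERT INTO InvoiceStatus (Id, Name) "
        "SELECT 3, 'Refunded' "
        "WHERE NOT EXISTS (SELECT 1 FROM InvoiceStatus WHERE Id = 3)"
    )
    conn.commit()
```

In T-SQL the same guard is usually written as an `IF NOT EXISTS (SELECT ...)` wrapped around the `INSERT`.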
There are other tools that provide similar database functionality, such as Red Gate’s SQL Source Control. The key is having a database setup that is easily reproduced and migrated on new and existing developer machines.
Test Initialization & Cleanup
Specifically for integration tests, it’s important to have the data source you’re using in a state that you can assert against, and also to clean up after yourself. If you write a test that says “get all invoices from the database”, and you haven’t actually inserted an invoice into that database, you can’t be confident that it’ll work every time. What happens if someone else runs your test against a new, empty database? Failure. And it won’t be apparent why it’s failed – the person running your test on their new database just sees that one of the assertions failed, or that an exception is thrown from somewhere in the depths of the data layer.
Just as important, you need to be able to predict the results of your integration test so you can do proper asserts against them. Asserting that you got “at least one row” from a database call provides little-to-no value; if someone breaks a calculation on your invoice, your test will still pass because you only care that you got “something” back. Asserting that you got the invoice you previously inserted into the database, and it came back with the customer name and total you expect, has a lot more value. And you can only do that if you create and know about the state of the database when you set up your test.
On the other side of this is database cleanup. The idea is that you leave the test system (database, local file system, etc.) in the same state you found it in when the test started. That involves deleting any inserted or modified rows, removing files that may have been created, etc. There’s a bit of flexibility in this: as long as you’re not leaving invalid data behind, technically having an extra invoice in your database probably won’t break anything. The main reason you might want to clean up is if your test setup inserts an invoice with a very specific id; if that id already exists in your database, you’ll throw a “duplicate key” error. By removing that invoice at the end of every test, you can be confident that your test setup will work every time. Other than that, having extra rows in your database shouldn’t really break your application (or if it does, you’ve found a bug!).
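Here’s a small illustration of the difference, again in Python with an in-memory SQLite database standing in for our real stack. The table, the `get_invoice` function, and the numbers are all invented. The test inserts one known invoice in its setup and then asserts on the exact customer and total, so a broken calculation would fail the test rather than slip through.

```python
import sqlite3
import unittest


def get_invoice(conn, invoice_id):
    """Stand-in for the data layer call under test; total = quantity * unit price."""
    row = conn.execute(
        "SELECT CustomerName, Quantity * UnitPrice FROM Invoice WHERE Id = ?",
        (invoice_id,),
    ).fetchone()
    return None if row is None else {"customer": row[0], "total": row[1]}


class GetInvoiceTests(unittest.TestCase):
    def setUp(self):
        # Create the state we are going to assert against.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE Invoice (Id INTEGER PRIMARY KEY, CustomerName TEXT, "
            "Quantity INTEGER, UnitPrice REAL)"
        )
        self.conn.execute("INSERT INTO Invoice VALUES (9001, 'Acme Ltd', 3, 19.5)")

    def tearDown(self):
        self.conn.close()

    def test_returns_the_invoice_we_inserted(self):
        invoice = get_invoice(self.conn, 9001)
        # Weak version would be: self.assertIsNotNone(invoice)
        self.assertEqual(invoice, {"customer": "Acme Ltd", "total": 58.5})
```

Anyone can run this against an empty database and get the same result, because the test brings its own data.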
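The duplicate key case can be sketched like this (Python and SQLite again, with a module-level connection playing the part of a developer database that outlives any single test; the names are hypothetical). The setup inserts an invoice with a fixed id and the teardown removes it, so the suite can be run over and over.

```python
import sqlite3
import unittest

# Stand-in for a shared developer database that persists between test runs.
SHARED_DB = sqlite3.connect(":memory:")
SHARED_DB.execute("CREATE TABLE Invoice (Id INTEGER PRIMARY KEY, CustomerName TEXT)")

TEST_INVOICE_ID = 9001


class InvoiceCleanupTests(unittest.TestCase):
    def setUp(self):
        # Fixed id: a row left over from an earlier run would fail here
        # with a primary key violation.
        SHARED_DB.execute(
            "INSERT INTO Invoice VALUES (?, 'Acme Ltd')", (TEST_INVOICE_ID,)
        )
        SHARED_DB.commit()

    def tearDown(self):
        # Runs whether the test passed or failed, as long as setUp succeeded.
        SHARED_DB.execute("DELETE FROM Invoice WHERE Id = ?", (TEST_INVOICE_ID,))
        SHARED_DB.commit()

    def test_invoice_exists(self):
        count = SHARED_DB.execute(
            "SELECT COUNT(*) FROM Invoice WHERE Id = ?", (TEST_INVOICE_ID,)
        ).fetchone()[0]
        self.assertEqual(count, 1)
```

Drop the tearDown and the second run fails in setup, which is exactly the kind of failure that makes people stop trusting (and running) the tests.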
The end result of getting all these tests working consistently on everyone’s machine: test confidence. Now, anytime someone does a pull request on a new feature, they can run all the tests and ensure that they didn’t break any existing functionality. And they can add their own tests to help other developers on the team ensure that they haven’t harmed their new feature. Now that everyone can be confident that the tests normally pass, they know for certain if their code has broken anything.
All this and I haven’t even got to the actual test stuff. Oh well, this is all important, so I’ll save it for the next post.