How I sped up Beanstalk’s test suite by 5x
Over the years of Beanstalk’s evolution, our test suite has grown a lot. We started on test/unit, then moved to RSpec, and moved back to test/unit a couple of years ago. We are far from 100% test coverage and we’re not truly test-driven most of the time, but we’re striving to test more and more.
Unfortunately, this goal has one negative consequence: the more tests you have, the slower they run.
According to our Jenkins setup, at one point our complete test suite build was taking 36 minutes (!!), and that excludes the slow integration tests that run against real systems. With the optimizations I describe below we got this down to 5 minutes, which is still not ideal, but much better.
First of all, our test setup is:
We are not yet using the latest and greatest testing libraries because we’re still running on Ruby 1.8.x and Rails 2.3.x, but we plan to upgrade Beanstalk to Rails 3 and Ruby 1.9 in the near future. On my local machine our test suite took 650 seconds. Here are the steps I took to optimize it.
Testing in RAM
My first, obvious idea was to point the tests at a RAM disk for temporary storage. We do a lot of file IO during tests, so I thought that would make a difference. I was surprised to see how little it helped: it shaved only 20 seconds off the total time, not nearly enough to matter. I think the gain wasn’t dramatic because I was already running the tests on an Intel SSD. Regardless, I’ve decided to keep that optimization, and the tests now use a RAM disk when run locally. I create the RAM disk with Esperance DV, a free app for Mac, but you can easily do it from the terminal on almost any platform.
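For illustration, here is a minimal sketch of how a test helper could prefer a RAM disk for scratch files and fall back to the regular temp directory elsewhere (on Jenkins, for example). The `TEST_RAMDISK` environment variable and both helper names are hypothetical, not something from our code base; the sketch assumes the RAM disk is already mounted.

```ruby
require "tmpdir"

# Hypothetical env var holding the mount point of an already-created RAM disk.
RAMDISK_ENV = "TEST_RAMDISK"

# Return the RAM disk path when it is set and usable, else the system temp dir.
def test_scratch_root(env = ENV)
  ramdisk = env[RAMDISK_ENV]
  if ramdisk && File.directory?(ramdisk) && File.writable?(ramdisk)
    ramdisk
  else
    Dir.tmpdir
  end
end

# Yield a fresh scratch directory under the chosen root.
# Dir.mktmpdir removes the directory once the block returns.
def with_scratch_dir(env = ENV, &block)
  Dir.mktmpdir("test-repos-", test_scratch_root(env), &block)
end
```

A test that needs an on-disk repository would then wrap its file work in `with_scratch_dir { |dir| ... }`, so the same test runs with or without a RAM disk.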
Parallel-tests
The next idea was to run tests in parallel using the parallel_tests gem. While it’s a very nice tool, it broke some of our tricky Subversion/Git hooks code because of the DB configuration changes required to set it up. I imagine that for a simpler project it could be a drop-in solution that seriously improves test performance, but it didn’t work in our case. It’s quite possible that we’ll try it again in the near future, though.
For a second I thought about upgrading to a 12-core machine with a crazy SSD RAID0, but then I had to face reality: our tests have to be made fast by fixing the tests, not the supporting code or the computers that run them. So I dived in to investigate what was taking so long.
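To show what kind of DB configuration change is involved: as far as I know from the gem’s README, parallel_tests gives each process a `TEST_ENV_NUMBER` environment variable (empty for the first process, then 2, 3 and so on), and you append it to the test database name in config/database.yml. The sketch below renders such a template outside of Rails; the database name `beanstalk_test` is made up for the example.

```ruby
require "erb"
require "yaml"

# A database.yml fragment in the style suggested by parallel_tests.
# The database name is a placeholder, not our real configuration.
DATABASE_YML = <<-YAML
test:
  database: beanstalk_test<%= ENV['TEST_ENV_NUMBER'] %>
YAML

# Render the template as if running in the process with the given number
# and return the resulting database name. ENV is restored afterwards.
def test_database_name(env_number)
  previous = ENV["TEST_ENV_NUMBER"]
  ENV["TEST_ENV_NUMBER"] = env_number
  YAML.load(ERB.new(DATABASE_YML).result)["test"]["database"]
ensure
  ENV["TEST_ENV_NUMBER"] = previous
end
```

Every process ends up with its own database, which is exactly why code that assumes a single test database (like our hooks) can break.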
Testing against mocks
One of the reasons our tests were slow is that most of them run against the database, not mocks. We initialize a lot of stuff in every setup block with Factory Girl, and sometimes that even requires creating a repository on disk with commits in it. Needless to say, that’s a slow operation. To fix this properly we would have to rewrite everything with mocks, which is not really an option for a project the size of Beanstalk. However, I noticed how wastefully we were using test/unit setup blocks.
Optimizing test/unit setup blocks with transactionata
Each setup block initialized some fixtures using Factory Girl, and it was run over and over again before every single test, even if the test didn’t do anything destructive with the data or didn’t touch the data at all.
We have 1139 tests, so these setup blocks were repeated a lot. After Googling for some time I found transactionata, a gem created to solve this exact problem. It lets you define a generic “test_data” block that runs once per test suite, instead of a setup block that runs before each individual test. If you use transactional fixtures in Rails, each test runs in a transaction, so even if a test changes records created in the test_data block, those changes are not visible to other tests and the integrity of the test data is preserved.
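To make the idea concrete, here is a toy model of it in plain Ruby. This is not how transactionata is implemented; `ToyStore` and `run_suite` are invented for the illustration, with an in-memory hash standing in for the database and a snapshot/restore standing in for a rolled-back transaction.

```ruby
# Toy in-memory store; a stand-in for a real database.
class ToyStore
  attr_reader :rows, :inserts

  def initialize
    @rows = {}
    @inserts = 0
  end

  # The "expensive" operation we want to run as rarely as possible.
  def insert(key, value)
    @inserts += 1
    @rows[key] = value
  end

  def update(key, value)
    @rows[key] = value
  end

  # Run the block, then throw away whatever it changed.
  def transaction_with_rollback
    snapshot = Marshal.load(Marshal.dump(@rows))
    begin
      yield
    ensure
      @rows = snapshot
    end
  end
end

# Shared data is created once; each test runs inside its own rolled-back transaction.
def run_suite(store, tests, &test_data)
  test_data.call(store)
  tests.map { |test| store.transaction_with_rollback { test.call(store) } }
end

store = ToyStore.new
tests = [
  lambda { |s| s.update(:account, "renamed"); s.rows[:account] }, # destructive test
  lambda { |s| s.rows[:account] }                                 # read-only test
]
results = run_suite(store, tests) { |s| s.insert(:account, "admin-mailer-account") }
```

The first test sees its own change, the second test still sees the original record, and the insert ran only once no matter how many tests there are.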
Here’s how our test looked before using transactionata:
class AdminMailerTest < ActiveSupport::TestCase
  context "Admin Mailer" do
    setup do
      @account = Factory(:account)
      @account_record = AccountRecord.create_from_existing(@account)
    end

    should "deliver some notification" do
      AdminMailer.deliver_some_notification(@account)
      assert_sent_email do |email|
        email.subject =~ /Some Notification/ &&
          email.body =~ /Notification text here/
      end
    end

    should "deliver another notification" do
      AdminMailer.deliver_another_notification(@account)
      assert_sent_email do |email|
        email.subject =~ /Another Notification/ &&
          email.body =~ /Notification text here/
      end
    end
  end
end
And after:
class AdminMailerTest < ActiveSupport::TestCase
  # This block runs only once per test case
  test_data do
    acc = Factory(:account, :name => "admin-mailer-account")
    AccountRecord.create_from_existing(acc)
  end

  context "Admin Mailer" do
    setup do
      @account = Account.find_by_name("admin-mailer-account")
      @account_record = @account.account_record
    end

    # ... rest is the same ...
  end
end
You can also do other expensive initialization that your tests need in test_data, and it will work much better there than in a generic setup block or method. We do this to initialize file system fixtures with repositories.
Important: In order for transactionata to work properly you need to have empty fixture.yml files in your test/fixtures directories for the Rails models you’re going to use in test_data blocks. You also need transactional fixtures enabled in your Rails environment (which is on by default). The way transactionata works is by nicely hooking into the Rails fixtures and that eliminates a lot of potential issues that exist with other similar solutions.
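If you have many models, creating the empty fixture files by hand gets tedious. Here is a small sketch of a helper that does it; the helper name is made up, and it assumes fixture files are named after the tables (for example accounts.yml).

```ruby
require "fileutils"

# Create an empty <table>.yml file for each table name.
# Existing fixture files are left untouched. Returns the list of paths.
def ensure_empty_fixtures(fixtures_dir, table_names)
  FileUtils.mkdir_p(fixtures_dir)
  table_names.map do |table|
    path = File.join(fixtures_dir, "#{table}.yml")
    FileUtils.touch(path) unless File.exist?(path)
    path
  end
end

# Example call (table names are assumptions based on the test above):
# ensure_empty_fixtures("test/fixtures", %w[accounts account_records])
```

As for the second requirement, in a Rails 2.3 app transactional fixtures are normally switched on by the `self.use_transactional_fixtures = true` line in test/test_helper.rb, so it is worth checking that nobody has turned it off.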
Thanks to transactionata and some other internal improvements, our test suite now runs in less than 120 seconds on my machine, which is 5.5 times faster than before. Jenkins also reports the test suite to be 5-6 times faster. Kudos to Christoph Olszowka for a great tool.
Generally, in Beanstalk we’re moving towards building more decoupled systems and APIs and testing them more with mocks. However, given our already large legacy test suite, the technique presented here allowed us to quickly gain some speed and make testing a bit more fun. I hope this will be useful for other folks out there.
Update: After reading the comments I decided to give the parallel_tests gem another try, and I was not disappointed: tests on my machine now take 57 seconds when running on 4 cores. It took some time to make our test suite work in parallel, but it was totally worth it.