How we migrated Beanstalk to Rails 4 and Ruby 2
In October 2011 the groundwork began to migrate Beanstalk to Rails 3, the latest version of Rails at the time. Within a few weeks it became apparent that the migration would take much more effort than we had originally anticipated, and the work was scrapped. Since then the idea of migrating to a newer Rails came up several times during our meetings. However, each time the decision was to postpone the migration to deal with more pressing updates.
In March we scheduled the migration once again, and the plan was not only to update the Rails version from 2.3 to 4.0, but also to bump the Ruby version from 1.8 to 2.0, because 1.8 was nearing its end of life. We ended up successfully finishing the migration in just eleven weeks.
TL;DR of the migration:
- Migrated from Rails 2.3 (4 years old) to Rails 4.
- Migrated from Ruby 1.8.7 (5 years old) to Ruby 2.
- Migrated from jQuery 1.5 (2 years old) to 1.9.
- Took us 11 weeks.
- Went through almost 200 tickets in our Sprint.ly account.
- Made 1069 commits.
- Changed more than 1,100 files.
- Changed around 25k lines of code (w/o white-space changes).
- Decreased the number of generated routes by 37%.
Planning the Migration
The new migration was planned much more thoroughly than all of our previous attempts. The development of all features was stopped and all developers were assigned to work solely on Rails 4 (apart from critical production fixes). We set an optimistic deadline of two months and a pessimistic one of four. It wasn’t an easy decision for us to stop developing features for that long, but we decided that if we planned to stick with Rails we had to bring our stack up to date.
We also decided that the migration was a great opportunity for us to refactor some things in the code base that would otherwise be too expensive to touch during regular iterations. Each major refactoring requires an extensive QA stage that can be very time-consuming and expensive. But since during the migration we had to test 100% of the app no matter what, why not “break” some things on the way? The decision to upgrade jQuery from version 1.5 to 1.9 was one such thing and it ended up being a huge success.
About Our Application
Before the migration Beanstalk was using Rails 2.3 and Ruby 1.8. The application is 6 years old and has more than 60,000 lines of code. Our gem bundle contains about 150 gems. Originally Beanstalk was written on Rails 1.2 but was easily migrated to Rails 2 as soon as it came out.
The Big Rewrite
The first thing we did after installing the new Ruby and Rails versions on our dev machines was to make the Beanstalk app bootable. We had to update the environment files, built-in Rails scripts, boot.rb, routes and so forth. We also had to quarantine all the initializers, Rails patches and std lib extensions we had, and add them back one by one to make sure that they still made sense and worked under the new conditions. A lot of patches were removed that way because they were either already fixed in Ruby/Rails or were replaceable by new language or framework features.
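One way to triage quarantined std lib extensions is to ask the new Ruby whether it already provides the method before loading the patch. The following is only a sketch of that idea, not the code we ran: the `redundant_patches` helper and the `QUARANTINED` inventory (including the made-up `beanstalk_specific_helper`) are hypothetical names used for illustration.

```ruby
# Returns the [class, method] pairs that the running Ruby already provides,
# i.e. patches that are candidates for removal. This has to run before any
# of the patch files are loaded, otherwise every method looks built in.
def redundant_patches(candidates)
  candidates.select { |klass, name| klass.method_defined?(name) }
end

# Hypothetical inventory of methods that old patch files used to add.
QUARANTINED = [
  [String, :force_encoding],
  [Array,  :sample],
  [String, :beanstalk_specific_helper] # made-up name, not a real method
]

redundant_patches(QUARANTINED).each do |klass, name|
  puts "#{klass}##{name} is already defined, the patch can probably go"
end
```

A check like this only shows that a method exists, not that it behaves the way the old patch did, so each patch still needs a manual review before it is deleted.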
A thorough review of the Gemfile was conducted to find gems that we no longer needed or didn’t work on Ruby 2 and Rails 4. We replaced some outdated gems with their newer counterparts and forked other gems to fix compatibility issues in them ourselves.
Meanwhile, Eugene and I planned to update the jQuery version from 1.5 to at least 1.7 to support the new UJS functions in Rails. It wasn’t necessary for the migration, because we could easily rewrite the UJS functions to work on 1.5, but we decided to give it a shot because we had wanted to update jQuery for a long time anyway. Eugene ended up updating all of our scripts to jQuery version 1.9 single-handedly in just a few days.
It took us a few days to fix the environment and make the app bootable and available via browser. The next stop was to make the tests run. This was a bigger challenge, as we were using a severely outdated version of Shoulda which supported neither the new Rails nor the new Ruby. We made a decision to throw it away and use minitest instead. It took me a few days to convert our existing test suite to minitest and refactor some things to make it run on Ruby 2.0 and Rails 4.
After the test suite became runnable, Dima, Chris L. and I started churning through failures and fixing things. It took us weeks to decrease the number of failures/errors from hundreds to dozens. At this time we didn’t even run a web server to play with the actual app. We decided that until all tests were green it didn’t make sense to move forward with any other kind of testing, as it would be a waste of time.
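To give a feel for the conversion, here is a minimal sketch of what a test looks like once Shoulda’s nested `context` and `should` blocks are flattened into plain minitest: shared setup moves into `setup`, and each `should` becomes a `test_` method. The `Repository` class is a hypothetical stand-in rather than our real model, and the example uses the `Minitest::Test` base class from current minitest; inside a Rails app the tests would normally inherit from `ActiveSupport::TestCase` instead.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for a model, so that the example is self-contained.
class Repository
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # A repository is valid when its name is not blank.
  def valid?
    !name.to_s.strip.empty?
  end
end

class RepositoryTest < Minitest::Test
  # Replaces the setup block of a Shoulda context.
  def setup
    @repository = Repository.new("beanstalk")
  end

  # Was roughly: should "be valid with a name" { ... }
  def test_is_valid_with_a_name
    assert @repository.valid?
  end

  # Was roughly: should "be invalid with a blank name" { ... }
  def test_is_invalid_with_a_blank_name
    refute Repository.new("   ").valid?
  end
end
```

The flat structure is more verbose than nested contexts, but it depends on nothing beyond the test library that ships with Ruby.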
The biggest challenge during the rewrite was dealing with encoding issues. Ruby 2.0 is drastically different from 1.8 when it comes to handling character encoding. In 2.0 you have to be very careful when manipulating strings and reading/writing IO streams. You have to know exactly what encoding you are working with and what results you are expecting. Otherwise you will end up with millions of different exceptions. We use the official Ruby bindings for Subversion to communicate with repositories from the app. Unfortunately, version 1.7 of the bindings (which we used at that time) didn’t completely support even Ruby 1.9. There were some patches in the master trunk that fixed a few issues, but they were never released. So we had to compile edge Subversion just to use the newer bindings. And then we had to patch more problems ourselves to make it work with Ruby 2.0. Similar work was done to add Ruby 2 compatibility to the mercurial-ruby and grit gems that we use for Mercurial and Git support respectively.
We ended up changing more than 25,000 lines of code in over 1,100 files and making more than 1,060 commits. We even managed to decrease the number of generated request routes by 37%, add support for the Asset Pipeline and improve encoding handling throughout the app.
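To illustrate what “knowing your encoding” means in practice, here is a small sketch. It assumes the bytes arrive untagged (for example from a C binding or a binary read) and that the caller knows which encoding they are really in. The `to_utf8` helper is a hypothetical name for this post, not something taken from our code base.

```ruby
# Tags raw bytes with the encoding they are known to be in, verifies that
# they are valid in that encoding, and returns a UTF-8 copy.
def to_utf8(bytes, source_encoding)
  # force_encoding only relabels the bytes, it does not convert anything.
  tagged = bytes.dup.force_encoding(source_encoding)
  unless tagged.valid_encoding?
    raise ArgumentError, "bytes are not valid #{source_encoding}"
  end
  # encode performs the actual transcoding to UTF-8.
  tagged.encode(Encoding::UTF_8)
end

# The same word as raw bytes in two different encodings. String#b returns
# a copy tagged as ASCII-8BIT, which is how untagged binary data shows up.
latin1_bytes = "caf\xE9".b
utf8_bytes   = "caf\xC3\xA9".b

puts to_utf8(latin1_bytes, Encoding::ISO_8859_1).encoding
puts to_utf8(utf8_bytes, Encoding::UTF_8).encoding
```

The distinction that matters is between relabelling (`force_encoding`) and converting (`encode`). Failing loudly on invalid input at the boundary is one possible design choice; replacing bad bytes instead is another, depending on how much the data matters.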
After the main bulk of rewriting was done and all tests were fixed we moved to the next stage: dev testing. The idea behind dev testing was to save time during QA. Every feature that we ship to production has to go through the QA process. Igor, our QA pro, has to test all aspects of the feature on staging to make sure that it works the way it should.
The only problem with QA is that it’s very time-consuming. It involves quite a lot of setup, communication and back and forth between the developers and the QA team, especially considering that Igor is in a different time zone from us. We figured that if we just pushed Beanstalk to staging straight after fixing the tests we would drown in QA tickets. So instead we decided to test every feature on our local machines first and only then push to staging.
Igor created a testing plan for us that contained around 500 test cases. Our test plan was basically a spec of Beanstalk that was aimed at humans rather than computers. Each test case would describe the behavior of a specific feature: where to find it, how to trigger it and what result to expect. We split the test plan between the three of us (Chris L., Dima and me) and started verifying features on our local machines, catching issues that weren’t caught by our test suite.
We fixed a lot of issues that way and I’m glad we didn’t waste Igor’s (and everyone else’s) time by doing this on staging during QA. It was a good decision.
After completing the developer testing phase we pushed the app to staging for proper QA. We couldn’t push it to our existing staging server as our environment was completely different at this point, so Russ used this as an opportunity to refactor our Chef cookbook and build a new server from scratch. And so Igor got his hands on the Rails 4 version of the app for the first time and started testing.
Igor was basically going through the same testing plan that we used and was testing the same features that we tested. Each day he would generate a pack of bug reports for us (usually when we were asleep because of the time zone difference) and we would fix all of them while he was sleeping. This way we rarely waited for him and he rarely waited for us. It was very efficient.
In the end we processed around 200 tickets during the QA phase.
Roll Out Plan
Beanstalk uses roughly four types of servers: web machines, daemon machines, deploy machines and VCS access frontends. We planned to roll out Rails 4 first to the web machines, then to the daemon machines and then to the deploy machines. Frontends are the only type of server that doesn’t use Rails and therefore didn’t require any updates. All the other servers had to be built from scratch to support the new stack.
It’s important to note that when we were rewriting the application we made sure to preserve compatibility between the web app and the daemon/deploy workers. We wanted to gradually mix Rails 4 machines into the pool of Rails 2 machines, so we wanted to make sure that, for example, a Rails 4 web machine could communicate with Rails 2 daemon and deploy machines without any issues. This gave us amazing flexibility during the roll out and made it possible to launch Rails 4 without a big stressful launch day.
After all testing was done we built new Rails 4 web servers and started adding them to the balancer with a low weight during off-peak hours. We watched the machines and incoming customer requests very closely during that time, quickly fixed the bugs that came up and ironed out edge cases.
When we felt comfortable with the new version of the app we made the weight of the Rails 4 machines equal to that of the Rails 2 machines on the balancer. At this point we had an equal number of Rails 2 and Rails 4 web machines, while our daemon and deploy machines were still completely on Rails 2. Soon we started slowly decreasing the weight of the Rails 2 machines and then shut them down completely. After a grace period we replaced them with Rails 4 servers and killed the temporary Rails 4 machines that had helped us during the transition (virtual machines versus physical ones).
We then followed the same procedure for the daemon and deploy machines: we slowly mixed Rails 4 machines into the pool of Rails 2 machines, then phased out the Rails 2 machines completely, until there were no more Rails 2 machines left and I was able to sit down and write this blog post.
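The effect of shifting balancer weights can be approximated with a few lines of arithmetic. This is only a sketch with invented numbers: the weights in `ROLLOUT_STEPS` are hypothetical, not the values we used, and it assumes a balancer that distributes requests in proportion to the weights, which depends on the balancer and its configuration.

```ruby
# Returns each pool's expected share of requests for the given weights,
# assuming the balancer distributes requests proportionally to weight.
def traffic_share(weights)
  total = weights.values.sum.to_f
  weights.transform_values { |weight| total.zero? ? 0.0 : weight / total }
end

# Hypothetical weight schedule for the web pool during the roll out.
ROLLOUT_STEPS = [
  { rails2: 10, rails4: 1 },  # new machines added with a low weight
  { rails2: 10, rails4: 10 }, # equal weights
  { rails2: 1,  rails4: 10 }, # old machines being drained
  { rails2: 0,  rails4: 10 }  # old machines shut down
]

ROLLOUT_STEPS.each do |weights|
  puts traffic_share(weights).inspect
end
```

Moving through small steps like these means any new bug only affects a fraction of requests, and the previous step is always available as a rollback.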
I can’t believe we actually did it. For years we anticipated a Rails migration and began to feel like it was never going to happen. I couldn’t even imagine migrating straight to Rails 4 AND changing Ruby version AND jQuery version at the same time!
Now that the migration is done we are focusing entirely on shipping great new features. Our plan for the upcoming months is packed with things that will change the way you use Beanstalk. Stay tuned!