
New Servers: The Hardware

Posted on October 31, 2012 by Chris Nagele

It’s been more than a week since we migrated to the new environment. While we continue to tune the servers, overall it was a huge success. Based on our internal benchmarks and metrics, performance has improved dramatically. I want to say thank you to all of our customers for dealing with our issues before the migration and for your patience during the process. I really think it was worth it and I hope you do as well.

Now that we’re running smoothly I want to cover some of the hardware specs and decisions that were made in the process. I already covered some items in my post about moving to colocation. Now let’s get into the more interesting details.

Choosing the hardware

When we made the decision to go colo, we had to analyze every piece of our existing infrastructure to figure out where to allocate resources. There were a ton of decisions to make. Which vendor or manufacturer do we choose? Do we go with the latest and greatest or older, stable hardware? Which SSDs will give us reliability and speed without insane cost?

[Image: Beanstalk and Postmark servers]

Russ, our sysadmin, led most of this process. We decided to work with Rackmounts Etc, a company he had worked with in the past, for the server builds. This let us put many of the compatibility and reliability issues in their hands, since they have more experience. They build thousands of servers and deal with the RMA process all day, so they know this territory much better than we do.

Most of the servers are pretty straightforward. They use a Supermicro chassis and motherboard, an LSI RAID controller or HBA, and newer-generation Ivy Bridge or Sandy Bridge processors. Installing the servers went pretty smoothly, except for a few issues with BIOS firmware and the RAID controllers.

General specs across all of the servers:

  • A minimum of 24GB of RAM, with some servers going up to 192GB.
  • All servers run on 64GB, 256GB, or 512GB Crucial M4 SSDs in at least RAID 1 (70 drives in total).
  • Other than our back-end drives for storage we don’t have any spinning disks.
  • 36TB of total usable storage space for repositories on our file servers. This does not include our standby backup/recovery servers or offsite backups.
  • Dual HA firewalls and network switching.
  • Currently we’re taking up a full cabinet and beginning setup in a second, adjacent cabinet, hosted at Server Central in Chicago.
  • 10 Gigabit Ethernet networking for all storage-related traffic.

Our goal was to purchase hardware to support 2-3× growth. We want to know we can safely grow, but also have room to add more resources at any time if needed.

An Example: Storage Servers

The largest servers in the mix are definitely the storage servers. The specs we require for these servers are also a big reason why we moved away from Rackspace to colocation. We need servers that can hold a lot of disks and provide extremely high IOPS. At the time, the most we could get from Rackspace was the Dell R710, which only holds six disks. We had to custom order R510s through a long provisioning process, and even they fell short of our needs. When we realized the performance and growth potential of building our own servers, we dove right into the colocation process.

[Image: Storage servers]

Before I get into the server hardware, I should explain the software side. We are huge fans of ZFS over here. Features like compression, snapshots, hybrid storage, and send/receive make every other file system seem primitive. We used to use Nexenta, a software appliance built on top of OpenSolaris/Illumos. Since we know enough to manage the servers on our own, we decided to move away from Nexenta and use OmniOS, which is also based on Illumos. This gives us a very minimal server that we can dedicate to ZFS.
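To give a flavor of those features, here is a rough sketch of the kind of commands involved. The pool and dataset names are made up for illustration, not our actual layout:

    # Turn on transparent compression for a dataset
    zfs set compression=lzjb tank/repos

    # Take an instant, space-efficient snapshot
    zfs snapshot tank/repos@before-migration

    # Replicate that snapshot to a standby machine over SSH
    zfs send tank/repos@before-migration | ssh standby1 zfs receive backup/repos

    # Share a dataset over NFS with a single property
    zfs set sharenfs=on tank/repos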

The specs for each storage server include:

  • 4U Supermicro chassis
  • Supermicro X9DRi-F motherboard with dual quad-core Intel Xeon E5 processors
  • LSI 9211-8i HBA
  • 192GB DDR3 RAM, expandable to 512GB
  • 2 × Intel 320 SSD (OS drives)
  • 1 × ZeusRAM drive (log device; I’ll get into this in a minute)
  • 3 × 256GB Crucial M4 SSD (read cache)
  • 20 × 2TB Western Digital drives (our only spinning disks)
  • 4 × Intel X520 10Gbit adapters

These servers are beasts, but they have to be. They get the bulk of the workload on our service, since our customers are constantly pushing and pulling data on their repositories. Our internal processes also put them to work for things like imports, deployments, checking storage size, and caching items for our web app.

The ZFS pool is set up the following way on each server (a rough sketch of the commands follows the list):

  • 2TB drives used for primary data in mirror config
  • Crucial SSD for read cache
  • ZeusRAM for log (write acceleration)
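In practice, creating a pool along those lines looks roughly like this on OmniOS. The pool name and device names are illustrative, not our actual ones:

    # Mirrored pairs of 2TB drives hold the primary data
    zpool create tank \
      mirror c1t0d0 c1t1d0 \
      mirror c1t2d0 c1t3d0 \
      mirror c1t4d0 c1t5d0

    # Crucial M4 SSDs serve as the L2ARC read cache
    zpool add tank cache c2t0d0 c2t1d0 c2t2d0

    # The ZeusRAM serves as the dedicated log (SLOG) device
    zpool add tank log c2t3d0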

If you are not familiar with ZFS, its huge strength is hybrid storage and caching. There are three aspects to this:

  • ARC, or Adaptive Replacement Cache. ZFS will place the most recently and most frequently used data in RAM for very fast read performance.
  • L2ARC, or Second Level ARC. By adding SSDs as cache devices, ZFS spills recently used data that no longer fits in the ARC onto them.
  • Log, or ZIL (ZFS Intent Log). By adding fast SSDs as dedicated log devices, ZFS accelerates synchronous write performance.

In our servers, we packed them full of RAM (192GB!) to take advantage of the ARC. We use about 512GB of L2ARC for anything that does not fit in RAM. And we purchased the very specialized ZeusRAM device ($2,250 each) to speed up writes. This device is benchmarked at about 100,000 4k write IOPS! The rest of the data gets stored and accessed on the 2TB drives, with a couple of spares just in case.

The storage is accessed over 10Gbit Ethernet using NFS. We did a lot of testing and tuning to reduce latency and work around the network overhead of NFS. After a little more than a week in production, the cache is building nicely and working extremely well. For instance:

  • We have a 93% average hit rate for the ARC cache, at up to 40k IOPS according to arcstat.
  • The L2ARC is still warming up, but the ARC is working so well that it is hardly being used so far.
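If you want to watch this on a ZFS box of your own, arcstat is the easy way, and Illumos also exposes the raw counters through kstat. A minimal example (the path to arcstat.pl may differ on your system):

    # Print ARC activity every 5 seconds: reads, misses, hit percentage, ARC size
    arcstat.pl 5

    # Or pull the raw ARC counters straight from the kernel
    kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses zfs:0:arcstats:l2_hits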

Hopefully you’ve noticed the speed improvements as well. If you have (or have not) please let me know!

How much did all of this cost?

As I stated in the colo post, a big part of this move was having the choice to go with whatever hardware we wanted. Another big reason was cost. The average monthly cost is significantly lower with colo than with leased hosting, even considering that we tripled resources in some places. The goal was not to spend less, because we could easily afford the managed hosting. The goal was to invest more in the future of our environment, allowing us to constantly improve performance and keep everything running reliably.

All said and done, it cost us in the six-figure range to get the hardware we wanted. However, considering what we were spending on leased servers each month, it was a no-brainer. Take our storage server as an example again:

  • Total one-time cost for a single storage server: About $15,000
  • Monthly cost with leased hosting: Around $7,000/month

So after about two months in production, a server like this has pretty much paid for itself. And that is not even an equal comparison, since the new servers have a lot more space and options to use things like ZeusRAM, which are not readily available if you are leasing servers.

This process was a huge education for our entire team. It was also costly and a little scary, but we took our time and constantly questioned ourselves and our decisions. We tested the servers extensively ourselves and, thanks to our customers, had real accounts testing them before migrating. We’re extremely happy with the outcome and really excited for the future of Beanstalk in this new environment. The end result is a faster, more reliable service for our customers.