As you may have read in our previous posts (here, here and here), one of the largest pieces of our data infrastructure at Localytics is the petabyte-scale Vertica analytics database that we host on Amazon Web Services. We've been relying on Amazon's Elastic Block Store (EBS) as the storage solution for this database for more than a year now. EBS lets you provision virtual block devices and attach them to your compute instances as regular drives, which effectively decouples the size of your computing resources from their storage.
EBS has been a good choice for us because of the flexibility it allows in the operation of our warehouse. One feature that we really take advantage of is the ability to quickly snapshot and restore the contents of these virtual drives to and from S3. In addition to being an excellent backup/recovery solution, it also allows us to replicate the entire contents of our data warehouse in order to scale or benchmark new features.
Recently, we had the opportunity to spin up a warehouse with Amazon's shiny new SC1 magnetic-backed EBS volume type. Amazon already has three different storage types: the "standard" magnetic-backed storage, a general-purpose SSD-backed storage solution called GP2 and a provisioned-IOPS SSD solution called io1. The SC1 type is a member of their new SC/ST family of magnetic-backed volume types that are designed specifically for data warehouse use cases such as ours. They offer a lower price point, stronger performance consistency guarantees and "burst" optimization for the long sequential reads that column-oriented databases such as Vertica really love. SC1, specifically, is optimized for colder storage.
EBS standard magnetic has worked pretty well for us in the past. Over the summer, we flirted with GP2s and found that we really liked the consistency the newer generation of SSD-backed volumes offered, but we couldn't justify the price as our data volume has grown exponentially.
For a quick background, here's our current cluster setup and requirements:
- r3.4xlarge memory optimized instances
- 1 TB EBS standard magnetic volumes in a RAID-0 arrangement (1 TB is the max size of a standard volume)
- Data striped across all nodes in the cluster
- Continuous trickle load of real time data
Based on a month of performance testing, SC1/ST1 hits the sweet spot for us because it's designed with our use case in mind and offers an excellent price-to-performance tradeoff.
Here's a side-by-side comparison of the performance distribution of the two volume types (standard and SC1) serving a 24-hour sample of read/write queries that Vertica serves in a median time of about 1 second on standard EBS:
| Type | Data Size | 50th pct (ms) | 75th pct (ms) | 90th pct (ms) | 95th pct (ms) | 99th pct (ms) |
"pct" refers to the percentiles of query times, i.e., the 50th pct is the time within which 50% of our queries finished. The small / large designation here is a combination of client size (active data points) and the date range of time-series data traversed. Our data is optimized to take advantage of client and date order.
As you can see, SC1 adds a penalty of about 500 ms to this set of queries, but the top end of query time ends up in the same bands as the standard volume type, which is well within our performance expectations. Other benchmarks we've run show the standard deviation between runtimes of similar queries to be significantly smaller on SC1 than on standard volumes.
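For readers who want to build the same kind of comparison from their own query logs, here is a minimal sketch of the two statistics discussed above: percentiles of query time and the standard deviation between runtimes. The function names and every number in it are made up for illustration (they are not our benchmark results), and the nearest-rank percentile definition is just one common choice.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(runtimes_ms, pcts=(50, 75, 90, 95, 99)):
    """Return the requested percentiles plus the sample standard deviation."""
    report = {f"p{p}": percentile(runtimes_ms, p) for p in pcts}
    report["stdev"] = statistics.stdev(runtimes_ms)
    return report


# Hypothetical query runtimes in milliseconds, NOT measured data.
standard_ms = [700, 900, 1000, 1000, 1100, 1600]
sc1_ms = [1450, 1480, 1500, 1500, 1520, 1550]

for label, runtimes in (("standard", standard_ms), ("sc1", sc1_ms)):
    print(label, summarize(runtimes))
```

With real data you would feed in one list of runtimes per volume type (and per small / large bucket) and compare the resulting rows side by side.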
We were so impressed with the performance that we were comfortable powering our analytics dashboard from SC1-backed volumes for a day. Here's our New Relic view of three days of performance.
The difference in peak performance was so small that it's hard to tell which of the three days above was backed by SC1 volumes.
Performance characteristics aside, running this configuration is expected to shave 10% off our platform operating costs, which is a huge win.
There are a few things we plan on trying now that these drives are generally available:
- ST1 - We spent so much time playing with the SC1s that we didn't have a chance to run these same tests against the ST1 volume type. Our dashboard & public API use case can be incredibly volatile, so a little extra "oomph" over the peaks would be nice.
- Larger volumes - Since we had to restore this data set from standard volumes, we were stuck with striping across many 1 TB volumes. SC1/ST1 allow much larger volumes. We might be able to shave down that half second by allowing Vertica to slurp up much larger sequential stripes on those big queries.
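To make the larger-volumes idea concrete, here is a back-of-the-envelope sketch of how many volumes a node would need at each size limit. The 1 TB limit for standard volumes comes from our setup above; the 16 TB limit for SC1/ST1 is the maximum AWS documents and should be checked against current documentation; the 12 TB of data per node is a hypothetical figure, not our actual footprint.

```python
import math

STANDARD_MAX_TB = 1  # max size of a standard magnetic volume (from the post)
SC1_MAX_TB = 16      # assumption: AWS-documented max size for SC1/ST1


def volumes_needed(data_tb, max_volume_tb):
    """Smallest number of volumes of the given max size that can hold data_tb."""
    if data_tb <= 0 or max_volume_tb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(data_tb / max_volume_tb)


node_data_tb = 12  # hypothetical per-node data size

print("standard volumes:", volumes_needed(node_data_tb, STANDARD_MAX_TB))
print("sc1 volumes:", volumes_needed(node_data_tb, SC1_MAX_TB))
```

Volume count is only one side of the tradeoff; whether fewer, larger volumes actually shorten the big queries is exactly what the follow-up test would need to show.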
As we grow our marketing platform capabilities, EBS products like this will allow us to maintain the high level of data granularity that we pride ourselves on. For us, shifting workloads to this new volume type is a no-brainer.
I hope this is as interesting and exciting for you as it is for me. If so, you might want to do this kind of stuff as your day job and we'd love to hear from you. Hit me up (email@example.com) or visit our jobs page directly.