Amazon Web Services (AWS) offers two flavors of networked block storage as a service through Elastic Block Store (EBS): Standard and Provisioned IOPS. The two offerings differ primarily in performance, so the choice affects the speed of any application hosted on AWS that uses EBS for storage. Although EBS Provisioned IOPS is the higher-performance option, you get the best performance if and only if certain conditions (described below) are met.
To optimize your EBS volumes for Provisioned IOPS, it is important to understand how it differs from Standard. Performance for EBS is primarily measured in input/output operations per second (IOPS).
In the EBS case, IOPS refer to operations on blocks that are up to 16 KB in size. Standard volumes deliver 100 IOPS on average. This is roughly the number of IOPS that a single desktop-class 7200 rpm SATA hard drive can deliver. In comparison, a similar desktop-class SSD can deliver anywhere between 5,000 and 100,000 IOPS. Server-class SSDs can go much higher.
EBS Provisioned IOPS can deliver a maximum of 4,000 IOPS, if and only if the conditions described below are met.
Meeting these conditions should lead to disk volumes using Provisioned IOPS that deliver between 90% and 100% of their expected performance 99.9% of the time in a given year:
- Your application sends enough requests to the volume as measured by the average queue length (i.e. the number of pending operations) of that volume.
- The read and write operations apply to blocks of 256 KB or less. For example, if your block size is 1024 KB, you should expect only 1/4 of the IOPS you provisioned.
- The blocks on the volume have been read at least once. The first time a block is accessed, there is a 50 percent reduction in IOPS. Performance is restored after first access.
- No EBS snapshot is pending.
- Total read/write operations do not exceed 128 MB/s per EBS volume.
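To make the arithmetic behind these conditions concrete, here is a minimal sketch. The constants come from the list above; the function names are hypothetical, and the model is a back-of-the-envelope estimate rather than how AWS actually accounts for IOPS.

```python
# Back-of-the-envelope model of the block-size, first-access, and throughput
# conditions. Constants are taken from the list above; the function names are
# illustrative only.

FULL_RATE_BLOCK_KB = 256     # largest block size served at the full provisioned rate
MAX_THROUGHPUT_MB_S = 128    # per-volume throughput ceiling
FIRST_ACCESS_FACTOR = 0.5    # 50 percent reduction the first time a block is accessed


def block_size_factor(block_kb: float) -> float:
    """Fraction of the provisioned IOPS to expect for a given block size."""
    if block_kb <= 0:
        raise ValueError("block size must be positive")
    return min(1.0, FULL_RATE_BLOCK_KB / block_kb)


def expected_iops(provisioned_iops: float, block_kb: float,
                  first_access: bool = False) -> float:
    """Estimated IOPS after the block-size and first-access reductions."""
    iops = provisioned_iops * block_size_factor(block_kb)
    if first_access:
        iops *= FIRST_ACCESS_FACTOR
    return iops


def throughput_mb_s(iops: float, block_kb: float) -> float:
    """Throughput implied by an IOPS rate at a given block size."""
    return iops * block_kb / 1024


def within_throughput_limit(iops: float, block_kb: float) -> bool:
    """True if the workload stays under the per-volume throughput ceiling."""
    return throughput_mb_s(iops, block_kb) <= MAX_THROUGHPUT_MB_S
```

For example, `expected_iops(4000, 1024)` returns `1000.0`, the 1/4 figure from the list, and `throughput_mb_s(4000, 16)` returns `62.5`, which is under the 128 MB/s ceiling.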
While these extensive conditions are normal for a networked storage service, they’re easy to overlook if you’re not aware that they’re necessary to optimize performance.
The one metric that matters is VolumeQueueLength from AWS CloudWatch. Let’s see what happens when an application is pushing too many IOPS to an EBS volume. The graph below shows an EBS volume that has maxed out. For a Provisioned IOPS volume, the rule of thumb is to keep the VolumeQueueLength at 1 for every 100 provisioned IOPS.
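The rule of thumb translates into a simple check. The sketch below is an illustration only: the function names, the status labels, and the 25% tolerance band are assumptions, not values from AWS or from this article.

```python
# Compare an observed average VolumeQueueLength with the rule-of-thumb target
# of 1 pending operation per 100 provisioned IOPS. The tolerance band and the
# status labels are arbitrary choices made for this example.

def target_queue_length(provisioned_iops: float) -> float:
    """Rule-of-thumb queue length for a Provisioned IOPS volume."""
    return provisioned_iops / 100


def queue_status(avg_queue_length: float, provisioned_iops: float,
                 tolerance: float = 0.25) -> str:
    """Classify a volume as 'underdriven', 'on target', or 'saturated'."""
    target = target_queue_length(provisioned_iops)
    if avg_queue_length > target * (1 + tolerance):
        return "saturated"      # operations are piling up in the queue
    if avg_queue_length < target * (1 - tolerance):
        return "underdriven"    # the application is not sending enough requests
    return "on target"
```

With these assumptions, a 4,000 IOPS volume has a target queue length of 40: an average of 80 is classified as saturated, and an average of 5 as underdriven.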
The graph below shows an example of the VolumeQueueLength over 24 hours for an EBS volume.
When the volume is maxed out, I/O operations queue up and cause a delay that is directly visible to the operating system through various I/O-related metrics (e.g. % of CPU spent in “I/O wait”). At that point your application is likely to go only as fast as that EBS volume goes.
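On a Linux instance, the I/O wait percentage can be derived from two samples of the aggregate `cpu` line in `/proc/stat`, where the fifth counter is time spent in iowait. The sketch below assumes a Linux host; the helper names are hypothetical, and a monitoring agent would normally collect this for you.

```python
# Estimate the share of CPU time spent in I/O wait between two samples of the
# aggregate "cpu" line from /proc/stat (Linux only).
# Counter order: user nice system idle iowait irq softirq steal guest guest_nice

def read_cpu_line(path: str = "/proc/stat") -> str:
    """Return the first line of /proc/stat, which is the aggregate cpu line."""
    with open(path) as f:
        return f.readline()


def parse_cpu_line(line: str) -> list:
    """Parse the aggregate cpu line into a list of integer tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(value) for value in fields[1:]]


def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # Sum user..steal only; guest time is already included in user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total
```

To use it, call `read_cpu_line()` twice a few seconds apart and pass both samples to `iowait_percent`.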
Below is an example of the CPU IOWait metric for the server over the same timeframe. Notice that as much as 27% of the CPU cycles are diverted away from the application while the instance waits for the EBS volume to process all the I/O requests.
You can maintain the conditions above, and with them optimal Provisioned IOPS performance, through careful monitoring. To ensure that our EBS volumes at Datadog always run at optimal performance, we have set up a dashboard that continuously tracks VolumeQueueLength and other key EBS performance metrics.
For more on how to avoid these and other EBS performance issues, check out our free eBook The Top 5 Ways to Improve Your AWS EC2 Performance.