Sourcing the Cloud

Comparing Public Cloud Pricing – Part 3

In Parts 1 and 2 we discussed several ways that public cloud pricing can vary dramatically based on what you actually order – configuration, options, support, discounts, and so on – and how those situational needs can make a big difference in which service from which provider will save you the most money.  In this post I'll focus specifically on the availability of per-minute pricing from Google and Microsoft, and identify exactly when it makes a difference and how big that difference is.

The first thing to understand is that compute instance pricing from these four leading IaaS providers is not really usage-based.  If you spin up an AWS compute instance and don’t release it for an hour, but your workload only uses 5% of the CPU’s horsepower for that hour, you will still pay the same price as if it used 100%.  Even though cloud has been around for a while, that still comes as a shock to a lot of people.  The pricing is closer to being usage-based than it was with traditional IT outsourcing because we pay for 1-hour increments rather than 1-month increments with multi-year commitments, but we’re still paying for access to capacity, just in smaller chunks.
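Here's a minimal sketch of that billing model, using a made-up rate rather than any published price:

```python
# A minimal illustration with a made-up rate: hourly instance billing charges
# for allocated hours, not for how hard the CPU worked during them.
import math

def on_demand_cost(hours_allocated, hourly_rate, avg_cpu_utilization):
    # avg_cpu_utilization is deliberately ignored -- the bill is identical
    # whether the CPU averaged 5% or 100% while the instance was running.
    return math.ceil(hours_allocated) * hourly_rate

print(on_demand_cost(1, 0.10, 0.05))   # 0.1
print(on_demand_cost(1, 0.10, 1.00))   # 0.1 -- same bill
```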

So how does that relate to our pricing comparison?  Have a look back at Figure 1 from Part 1 of this series.  The label on the X-axis is "Percentage of Hours Used Per Month," which is my attempt to indicate that, to a cloud service, hours used and minutes used (or even seconds used) are not the same thing.  There are 730 hours in an average month, so the 50% mark on the chart signifies 365 hours of the month with at least some usage in each hour.  If your application runs 12 hours a day, 7 days a week, and releases the instances for the remaining hours, that puts you right at 50% on the chart.  And I'm sure there are plenty of applications out there that fit that profile or something close to it.  For our sample configuration, apps that run on GCE, Azure, AWS On Demand or SoftLayer Hourly compute instances generate much less cost for cloud infrastructure than they would with reserved instances or monthly billing options, because they pay nothing for compute 50% of the time.  But what if the workload isn't all concentrated within those 365 hours each month?  What if it's more dispersed over time?
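If you want to check that arithmetic yourself, it's just this:

```python
# Quick check of the chart's arithmetic (nothing provider-specific here).
HOURS_PER_MONTH = 730                 # average month used in the charts
print(0.50 * HOURS_PER_MONTH)         # 365.0 hours at the 50% mark
print(12 * HOURS_PER_MONTH / 24)      # 365.0 -- 12 h/day, every day, lands right at 50%
```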

For illustration, let's take the most extreme example and see what happens to Figure 1 when we assume that the workload for the compute instances is as dispersed as possible over an average month.  That would mean the 50% usage is spread over every hour of the month, 30 minutes in each hour.  You're still using your compute instances 50% of the time, but now you're touching 100% of the hours in the month instead of 50% of them.  In that scenario, AWS On Demand, SoftLayer Hourly Billing, and any other cloud service that charges for each hour in which your instance runs will cost the same as if you were using it 100% of the time.  Here's what the cost looks like graphically, using the same configuration we used for Figure 1:

[Figure 4: the Figure 1 configuration with the workload dispersed over every hour of the month]

If you are like a lot of people, you may be having one of those “holy crap” moments right about now.  Sorry about that.  Let’s talk about what happens to each curve:

  • AWS reserved instances and SoftLayer Monthly stay where they were because they are unaffected by usage – you pay the rate you signed up for regardless.
  • AWS On Demand and SoftLayer Hourly are now horizontal lines (the AWS On Demand line may actually look dotted here because it's hidden under the dotted line for SoftLayer Monthly). Since those two services charge for any hour in which the instance runs at all, and you're using at least a little of every hour, you pay the same as you would at 100% usage until you get all the way down to zero, where you're only paying a small amount for 1 TB of storage and nothing for compute.
  • Azure’s curve stays exactly where it was, since the per-minute pricing with no minimum is the most granular of all these services (we haven’t looked at services with true usage-based pricing, like AWS Lambda). If you only use 30 minutes each hour, you only pay for 30 minutes each hour.
  • The curve for Google Compute Engine stays where it was until you get down to 16.7% usage. Why?  Because GCE has per-minute pricing, but with a 10-minute minimum charge each time an instance runs, and 10 minutes is 16.7% of an hour.  So you only pay for what you use, but never for less than 10 minutes per run.  Once you release the instance, the 10-minute minimum applies again the next time you start it, so the chart is actually a little kind to Google since it assumes one continuous run each hour.  (A small billing-model sketch follows this list.)
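To make the differences between those billing rules concrete, here's a minimal sketch. The $0.10/hour rate is a placeholder, not any provider's published price, and the two scenarios mirror the charts: 365 full hours of use vs. 30 minutes in every one of the month's 730 hours.

```python
# A minimal sketch of the billing rules discussed above, not a price quote.
# The rate is a placeholder; the provider names are just labels for the styles.
import math

RATE_PER_HOUR = 0.10      # assumed illustrative rate

def hourly_billing(runs_minutes):
    """AWS On Demand / SoftLayer Hourly style: any use of an hour bills the full hour."""
    return sum(math.ceil(m / 60) for m in runs_minutes) * RATE_PER_HOUR

def per_minute_billing(runs_minutes, minimum=0):
    """Azure style with minimum=0; GCE style with a 10-minute minimum per run."""
    return sum(max(m, minimum) for m in runs_minutes) / 60 * RATE_PER_HOUR

# Both scenarios use the same 21,900 instance-minutes per month.
concentrated = [60] * 365   # 365 full hours, idle the rest of the month
dispersed = [30] * 730      # 30 minutes in every one of the 730 hours

for name, runs in (("concentrated", concentrated), ("dispersed", dispersed)):
    print(f"{name:12s} hourly: {hourly_billing(runs):6.2f}  "
          f"per-minute: {per_minute_billing(runs):6.2f}  "
          f"10-min floor: {per_minute_billing(runs, 10):6.2f}")
# The dispersed workload costs twice as much under hourly billing (730 billed
# hours instead of 365), while both per-minute schemes charge the same 365
# hours' worth either way.
```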

Conclusion:  If you are using an IaaS service that charges by the hour for compute instances, and your workload is very dispersed over time, you could be leaving A LOT of money on the table vs. a service that charges by the minute or even reserved instances.

What about other configurations?  Just for fun, let’s look at the “More RAM” Configuration that we showed in Figure 3 (from Part 2), and adjust it for a time-dispersed workload:

[Figure 5: the "More RAM" configuration from Figure 3 with a time-dispersed workload]

That’s a huge difference, because the extra memory makes the compute instances more expensive for this configuration.  The explanation of what happens to each curve is the same as for our first configuration, but Google looks even better in this scenario, just as they did in Figure 3, because the AWS and Azure configurations required some over-provisioning to meet the requirements.  The important thing is to look at the size of the gaps in this chart.  When you get down to lower average usage levels, say around 30%, you’re looking at differences in actual cost of almost 4 times.  Don’t let anyone tell you that all the big IaaS services are priced the same.  Each of them has strengths in different situations, but they can be very far apart on actual cost.  And we’ve only looked at a few, small sample configurations.  We’ve also looked at workloads that are as concentrated over time as possible vs. workloads that are as dispersed as possible.  There’s also every scenario in between, and there’s a very good chance that your workload is one of those!

Conclusion:  If you don’t thoroughly understand your application’s workload patterns, you could be leaving a large amount of money on the table with hourly pricing, especially if your workload is intermittent and dispersed over time.

Comparing Public Cloud Pricing – Part 2

In Part 1 we talked about how public cloud pricing from different providers compares for a very simple sample configuration, and then we saw how that comparison changes when the storage requirement is increased.  We also talked about the importance of pressuring public IaaS providers to continue lowering prices and pass on the benefits of falling hardware costs to customers.  And we noted that reserved instances may be cheap enough to be economical for large Windows shops, even when applications can't automatically release instances to bring billed time below 100%.

This time I'm going to go into more detail on the providers' offerings so that you get a real sense of what's involved in doing a comparison.  If you look at the configuration table provided last time, you'll see some significant differences, including the fact that the compute instances aren't the same across all four providers. For example, for Google to meet the requirements we had to use instances with significantly more RAM.  So what would happen if, say, we increased the memory requirement to 8 GB on the small instances and 12 GB on the bigger ones?  Have a look:

[Figure 3: pricing comparison with the increased memory requirement]

That’s a huge difference compared to Figure 1 (the first chart from Part 1)!

  • The Google price curve hasn’t moved because the configuration we priced the first time already had enough RAM to meet the new requirement.
  • The Amazon price curve shifted upward because we had to use larger compute instances (t2.large and c3.2xlarge instead of t2.medium and c3.xlarge).
  • Azure pricing shifted up for the same reason, but not as far as Amazon’s did, since going to D2 and D3 from A2 and A3 instances had a smaller impact.
  • Now look at SoftLayer. When you configure virtual servers from them you don't pick a preset instance type.  You just tell them how many cores and how much RAM you want, so you can get exactly the amount you need.  That resulted in a much smaller price shift than for Amazon or Azure, and since we're still using only 1,000 GB of storage here, the higher storage pricing isn't having much of an impact.

Just making a simple change to the memory requirement, often the critical resource when sizing compute instances, completely changed the pricing comparison!  Before, the lowest price for most usage levels was AWS.  With this new configuration it's Google or SoftLayer, but keep in mind that we had to over-provision slightly on AWS and Azure to meet the memory requirements.  And there's more to consider – way more.
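But first, to make the over-provisioning point concrete, here's a minimal sketch of the instance-selection step. The catalog reuses a few of the instance names mentioned above, but the vCPU, RAM and price figures are placeholders, not the providers' published prices:

```python
# Hypothetical catalog of preset shapes: (name, vCPUs, RAM in GB, $/hour).
# All figures are placeholders for illustration only.
FIXED_SHAPES = [
    ("t2.medium",  2, 4.0,  0.05),
    ("t2.large",   2, 8.0,  0.10),
    ("c3.xlarge",  4, 7.5,  0.21),
    ("c3.2xlarge", 8, 15.0, 0.42),
]

def cheapest_fixed_shape(vcpus_needed, ram_needed_gb):
    """Pick the cheapest preset instance that meets both requirements,
    over-provisioning whatever it has to."""
    candidates = [s for s in FIXED_SHAPES
                  if s[1] >= vcpus_needed and s[2] >= ram_needed_gb]
    return min(candidates, key=lambda s: s[3]) if candidates else None

def build_to_order_price(vcpus_needed, ram_needed_gb, per_vcpu=0.02, per_gb=0.005):
    """SoftLayer-style build-to-order: pay for exactly the cores and RAM you
    ask for (again, placeholder unit prices)."""
    return vcpus_needed * per_vcpu + ram_needed_gb * per_gb

# Raising the RAM requirement from 4 GB to 8 GB forces a jump to the next
# preset shape, while the build-to-order price only grows by the extra RAM.
print(cheapest_fixed_shape(2, 4))    # ('t2.medium', 2, 4.0, 0.05)
print(cheapest_fixed_shape(2, 8))    # ('t2.large', 2, 8.0, 0.1)
print(round(build_to_order_price(2, 4), 3), round(build_to_order_price(2, 8), 3))  # 0.06 0.08
```

Now, on to that longer list of considerations: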
  1. SoftLayer’s monthly billing option gives you 250 GB per month of free outbound data transfer per virtual server. The other three are going to charge extra for that.
  2. Performance can and will vary from one provider to the next and possibly even from one data center to the next. You may need to spend more money on an underperforming solution to get it to meet your performance requirements.
  3. You’re going to want support, and the cost of that isn’t factored into our charts. Amazon and Google have similar pricing schedules for “Business” and “Gold” level support, depending on what your total monthly spend is, but Microsoft’s Professional Direct is a simple, flat $1000 per month (great if you’re paying for a lot of usage, not so hot if you only a couple instances) and SoftLayer’s support is “free” with your purchase.  For large accounts those may only represent differences of a few percent, but they do matter, and if you need higher levels of support you generally have to negotiate the price.
  4. Sometimes, as with AWS reserved instances, the provider publishes the volume discounts that apply, but often it’s negotiated, so those discounts aren’t factored into these results.
  5. Microsoft’s discounts on Azure for existing Windows customers can be substantial, and they aren’t factored in either. If you’re a big Microsoft customer you must find out what discounts you qualify for and, importantly, which prices they apply to and how long they last.  Microsoft can essentially forgive you for the cost of some of your Windows investment when you move from locally installed Windows to Azure, similar to how you pay less for a Windows upgrade than for a new purchase.  No other provider is in a position to do that, and the cost of Windows can be a significant piece of the compute instance price.
  6. Other unique options are available, such as Amazon’s spot instances and Lambda, Google’s pre-emptible instances and custom configurations, Microsoft’s “basic” instances for low-traffic applications and SoftLayer’s bare-metal server hosting. Any of these could potentially save you money if they are suited to your needs.
  7. You will very likely need a lot more than just the simplistic configurations we’ve looked at here. What about database services, VPN, load balancing, advanced monitoring, directory services, etc., not to mention specialized services like High Performance Computing?  Any one of those can change the price comparison significantly.  You may even run into some providers that simply can’t give you what you need for a specific application at all.
  8. Last, but not least, two of these providers give you per-minute pricing, which is up to 60 times more granular than per-hour pricing. That can have a major impact on what you actually pay, but I’m going to have to leave the details of that until next time.
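On item 3, the flat-fee vs. spend-based trade-off is easy to see with made-up numbers. The flat $1,000 per month comes from the post above; the 10%-of-spend figure is an assumed stand-in for a spend-based support schedule, not any provider's actual rate:

```python
# Illustrative only: flat support fee vs. a fee proportional to monthly spend.
FLAT_FEE = 1000.0     # flat $/month, per the post
SPEND_PCT = 0.10      # assumed percentage-of-spend rate (placeholder)

for monthly_spend in (2_000, 10_000, 50_000):
    spend_based = monthly_spend * SPEND_PCT
    cheaper_option = "flat" if FLAT_FEE < spend_based else "spend-based"
    print(monthly_spend, FLAT_FEE, spend_based, cheaper_option)
# Below $10,000/month of spend the flat fee is the more expensive option; above
# it, the flat fee wins -- the "great if you're paying for a lot of usage" point.
```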

Conclusion:  If you haven’t modeled the resource requirements thoroughly, you don’t know what it will cost to put your application in the cloud, and you certainly don’t know which service option from which provider will save you the most money.

Next time – per-minute pricing!  Read Part 3.

Comparing Public Cloud Pricing – Part 1

In June of last year, ISG published the first report benchmarking the average price of four leading public IaaS providers against the corresponding average internal IT costs for large Windows data centers. One of the key findings was:

“At a usage level of approximately 55 percent, public cloud prices are at parity with the prices in our internal IT benchmark.”  – ISG

Though the report was freely available for a few months, access now requires a research subscription. That finding meant public IaaS, using "on-demand" or "pay-as-you-go" compute instances, was more expensive than large-scale internal IT unless those instances were running during less than 55% of the hours in a month. The finding made perfect sense, since the promise of cloud is to save money by making you pay only for what you use. It follows that if you use it 100% of the time, you shouldn't expect to save money with an on-demand pricing model.

Now that well over a year has passed since that report, I thought I’d re-price the same configuration that ISG used just to see how things have changed. I also included AWS reserved instances and SoftLayer monthly billing in my analysis just to see if any new conclusions could be drawn. What I found was pretty interesting.

[Figure 1: pricing comparison for the ISG sample configuration, including AWS reserved instances and SoftLayer monthly billing]

The first thing I noticed was that, for the priced configuration, three of the four providers (Amazon, Google and Microsoft) had left their prices virtually unchanged. The fourth, IBM SoftLayer, had lowered its pricing. Google had actually lowered prices just before the ISG report came out, but it has been even longer since Amazon and Microsoft last moved theirs. I wasn't expecting to see that kind of pricing stability, for two reasons:

  1. During the early days of public IaaS, price drops were coming fast and furious. Many people expected them to continue for the foreseeable future just due to market pressure, though there was certainly no guarantee that this would happen.
  2. Public IaaS providers own the server hardware, and you pay for access to it. That means that natural market reductions in the cost of CPU power (e.g., the famous Moore's Law) go 100% to the cloud provider's bottom line unless they make a conscious decision to pass those savings on to customers via a price drop. We've already seen that public providers have begun making some serious money, and their profits are growing. Is it possible that the leading market players are currently happy with their positions and will simply let the cloud market's rising tide raise all ships? I'd be surprised, especially given Amazon's usual preference for crushing competitors over generating profits, but if customers aren't complaining, I guess it's possible.

Conclusion:  Customers must pressure providers to drop prices to pass on hardware savings, or they will not realize the full financial benefits of public IaaS.

The second thing I noticed was the price for the AWS configuration with 3-year reserved instances. When you reserve your instances you don't get any of the "pay only for what you use" benefits of cloud, which is why the curves representing reserved instances and SoftLayer's monthly pricing (essentially a 1-month reserved instance) are horizontal lines. But look where the AWS 3-year Reserved line crosses the On Demand line: right about 58% of hours used per month. Sound familiar?  That's very, very close to the ISG benchmark for internal IT costs from last year. The financial disadvantage of internal IT vs. cloud is that you pay up front and then depreciate over 3 years whether you continue using the hardware or not, very similar to the way you pay Amazon for a reserved instance, so I don't think this is entirely a coincidence. AWS reserved instances appear to be intentionally priced to compete with traditional in-house infrastructure.
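To see why the crossover lands where it does, here's the arithmetic as a sketch. The rates are placeholders chosen only to reproduce a ~58% break-even, not actual AWS prices:

```python
# Break-even between on-demand and a reserved instance, with placeholder rates.
HOURS_PER_MONTH = 730

on_demand_rate = 0.10      # assumed $/hour (placeholder)
reserved_monthly = 42.34   # assumed effective monthly cost of a 3-year RI
                           # (upfront amortized over 36 months plus any hourly fee)

def on_demand_monthly(usage_fraction):
    return on_demand_rate * HOURS_PER_MONTH * usage_fraction

break_even = reserved_monthly / (on_demand_rate * HOURS_PER_MONTH)
print(round(break_even, 2))               # 0.58 -> on demand only wins below ~58% usage
print(round(on_demand_monthly(0.58), 2))  # 42.34 -> equal to the reserved cost at that point
```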

Conclusion:  AWS 3-year reserved instances may be price-competitive with large, internal Windows shops, simply by giving up certain advantages of cloud that internal IT also cannot provide (e.g., pay only for what you use). That indicates that smaller Windows shops may be able to save money by using reserved instances over in-house infrastructure, even if their applications can’t release instances during intervals of lower demand for compute power.

How can Amazon do this and still make money?  I'm not sure what fraction of their compute business is Reserved instances vs. On Demand, but I know they use very inexpensive, white-box machines, relying on redundancy to ensure quality of service. They also don't have to dedicate any of that hardware to you unless you specify it (and pay more for it), so CPU cycles you aren't using during the hours you're paying for could, at least in theory, be used to serve other customers. And of course the data centers are enormous, highly automated and often located to exploit cheap hydroelectric power, so the economies of scale should be superior.

Now, let’s talk about some of the limitations of this chart, because there are some BIG ones. In fact, I would strongly caution you not to use it to draw any conclusion regarding which provider offers the lowest cost service. For one thing, pricing a single configuration doesn’t tell you much. If I make a very simple change and just multiply the amount of storage by 5, look what happens:

[Figure 2: pricing comparison with five times the storage]

SoftLayer’s curves shift upwards, while Azure shifts down. In fact, Azure is now the cheapest service when instances are in use during about 34% of the hours in the month or less. Why the change?  Simple. SoftLayer’s storage costs more than Amazon’s and Azure’s costs less, for this configuration.

That’s just the tip of the iceberg folks. This post has already gone long, so I’m going to leave you with the details of the sample configurations used in the charts. In Part 2 we’ll go over the differences in the services and why they must be taken into account before you can do a valid comparison. That includes things like discount opportunities and the availability of per-minute pricing for GCE and Azure that can have a dramatic effect on what you actually pay.  Read Part 2!

[Table 1: details of the sample configurations used in the charts]