Sourcing the Cloud

Lambda: What We Thought Cloud Computing Was All Along?

It’s right there in Amazon’s tag line for Lambda, what I would argue is one of the most interesting cloud services they’ve released to date:  “Run code without thinking about servers – pay only for the compute time you consume.”  Let’s have a look at those claims, and especially let’s look at them in relation to EC2, their older and more traditional IaaS cloud offering, because, hey, hasn’t Amazon made those same claims before?  If Lambda just offers the same thing as EC2, why is it a separate service?  Or did EC2 not really offer those things, but Lambda really does?

“Serverless” Computing?

Let’s start with the idea that you don’t need to think about servers.  In fact, Amazon (and Microsoft and Google with similar offerings) uses the new word “serverless” to describe Lambda.  That language has been rightly criticized because Lambda does still run on servers, of course.  You just don’t need to specify them to use the service the way you did with EC2, since Lambda is just about running functions.  No more choosing instance types from the dizzying array in the AWS catalog.  You simply tell the service how much memory your function needs to run, as well as a “timeout period” after which they will terminate an execution if it is taking too long.  That’s important, since it protects you from getting a huge bill if your function doesn’t work as expected.  Once you select your memory requirement, Lambda allocates “proportional” CPU power, network bandwidth, and disk I/O to a Linux container for your function, and the CPU power allocated is in the same proportion as a general purpose EC2 instance.  Your function also receives 500MB of non-persistent disk space in a temporary directory.  If you need persistent storage, then you go outside Lambda and use another service like S3.

In short, Lambda is simply about running code, and so it removes as many of the infrastructure planning and management tasks as it possibly can, to a much greater degree than EC2.  That includes handling redundancy across multiple availability zones, so you don’t need to manage disaster recovery, as well as automated backups of your code.  It also includes continuous horizontal scaling, so that whatever server instances are needed to run your code as your functions are called more and more frequently are simply added to the service automatically.  All of this is great for developers, because running code is all you really want to do while you’re developing – the fewer infrastructure hassles you have the better.  Some have begun referring to Lambda and its competitors as “Function as a Service,” and I certainly find that to be much more accurate and descriptive than “serverless.”

I should mention, and this is very important, that while it’s true that Lambda removes the need to “worry” about servers, there is still one IT management task that you will need to worry about:  security.  Security is handled via AWS Identity and Access Management.  You need to specify the IAM role required to run each Lambda function, and of course you still need to make sure your users (or other event sources) are properly authorized to have that role, and that your Lambda functions are authorized to access any required resources outside of the Lambda containers.  This isn’t all that different from the security requirements around any application that you design, but you may need to think about them a little earlier in the process than you would if your development platform was all on your own premises.

Pay Only for What You Use?

EC2 charges for on-demand server instances on a per-hour basis.  As we’ve discussed in previous posts, that’s not true usage-based pricing because you may not need every instance for more than an hour.  If your application runs for 1 minute each hour of the day, you’ll pay for 24 hours just as if it ran continuously.  Lambda pricing, in contrast, is based on sub-second metering.  Amazon says “you don’t pay anything when your code isn’t running.”  Unfortunately, that’s not quite true.  As with EC2, Lambda has a minimum interval that you will be charged for, and it’s 100 milliseconds (ms).  For context, 100 ms is a tenth of a second.  I’m sure Amazon’s marketing department prefers saying 100 ms because, well, it sounds smaller, right?  “But 100 ms is nothing” you say!  In fact, it’s the most granular pricing we’ve seen in the cloud thus far… but it’s not nothing.
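To make the hourly arithmetic concrete, here is a minimal Python sketch.  It assumes each run is a separate instance launch and that every launch is rounded up to a whole billed hour; the function name is my own illustration, not anything from AWS.

```python
import math

def billed_hours(run_minutes_per_launch, launches):
    """Hours billed when every launch is rounded up to a whole hour.

    Assumption: per-hour billing with no partial-hour credit.
    """
    return launches * math.ceil(run_minutes_per_launch / 60)

# One minute of work in each of the 24 hours of a day.
used_hours = 24 * 1 / 60
print(f"hours actually used: {used_hours:.1f}")
print(f"hours billed:        {billed_hours(1, 24)}")
```

Under those assumptions, less than half an hour of real work is billed as a full day.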

Remember that each Lambda charge is for a single execution of your function.  Some functions take less time to execute than 100 ms.  Some take more.  If your function takes 101 ms to execute you will be charged for 200 ms.  It’s just a fraction of a penny you say?  Well true, but when your function executes millions of times the charges add up.  This situation, where usage under the minimum interval generates profit for the provider, exists for both Lambda and EC2, and Amazon is quite aware of it.  Add to this the fact that Lambda can take several times longer to execute your function the first time it’s called than it does for subsequent executions.  This is called a “cold start,” and it takes longer because Lambda is getting all your resources “ready” to execute your function quickly.  It’s not completely clear how long Lambda will wait after an execution before you need another cold start, but while one execution per minute may be enough to avoid cold starts completely, one every 10 minutes may not be (see this post for an example).
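The rounding described above can be sketched in a few lines of Python.  This assumes durations are whole milliseconds and that every execution is rounded up to the next 100 ms increment; `billed_ms` is a hypothetical helper name, and the invocation count is just an illustrative number.

```python
def billed_ms(duration_ms, increment_ms=100):
    """Round an execution time up to the next billing increment.

    Assumption: duration_ms is a positive whole number of milliseconds.
    """
    # Ceiling division using integers only, then scale back up.
    return -(-duration_ms // increment_ms) * increment_ms

invocations = 1_000_000          # illustrative volume
actual = 101 * invocations       # milliseconds of real execution
billed = billed_ms(101) * invocations
print(f"billed {billed:,} ms for {actual:,} ms of execution")
print(f"share of the bill that is rounding: {(billed - actual) / billed:.1%}")
```

A function that runs for 101 ms is billed as 200 ms, so nearly half of that bill is rounding rather than execution.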

So, my first law of cloud cost optimization holds true for Lambda every bit as much as it did for EC2:

“Know Thy Application!”

In this case that means knowing your function.  If your function’s performance is CPU or memory-bound, increasing the memory might decrease the execution time, thus giving you much better performance for the same or even a lower price than you got with less memory.  If the performance bottleneck is somewhere else, then you may want to decrease the memory allocated to the function to save money if you can do it without losing performance.  Optimizing your Lambda pricing is always going to be a game of choosing just enough memory for your function but not too much.
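Here is a rough sketch of that trade-off.  It assumes the charge for one execution is proportional to the memory setting multiplied by the billed duration, with billing in 100 ms increments.  The price is expressed in relative units and the durations are made-up examples, not measurements of any real function.

```python
def billed_ms(duration_ms, increment_ms=100):
    """Round a whole-millisecond duration up to the billing increment."""
    return -(-duration_ms // increment_ms) * increment_ms

def cost_per_execution(memory_mb, duration_ms, price_per_gb_second=1.0):
    """Relative cost of one execution: GB of memory x billed seconds x price.

    price_per_gb_second is a placeholder; substitute the published rate.
    """
    return (memory_mb / 1024) * (billed_ms(duration_ms) / 1000) * price_per_gb_second

# Hypothetical CPU-bound function: doubling memory roughly halves the run time.
print("CPU-bound, 512 MB: ", cost_per_execution(512, 400))
print("CPU-bound, 1024 MB:", cost_per_execution(1024, 190))

# Hypothetical I/O-bound function: extra memory does not make it faster.
print("I/O-bound, 512 MB: ", cost_per_execution(512, 400))
print("I/O-bound, 1024 MB:", cost_per_execution(1024, 400))
```

In the first pair the larger setting costs about the same and finishes in half the time; in the second pair it simply doubles the cost.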

My conclusion is that Lambda, along with similar FaaS offerings from other providers, does require you to think less about servers than EC2 does.  It also has more granular pricing, but we’re still not quite paying only for “what we use.”  Now that we’ve covered the basics, next time we’ll look at Lambda pricing in detail (complete with pretty charts), how it compares to EC2, and what that means for when you should use Lambda… and when you shouldn’t!

Why We Need Public Cloud Price Wars to Continue… Forever

In response to a few nominal price drops from public IaaS providers of late, as well as a few pundits noticing that those drops have become less frequent than they once were, I’ve seen more than one piece questioning whether the providers can keep dropping their prices.  They say things like “margins are already thin” and “the cost of the labor to support public cloud isn’t going down.”  There appears to be a narrative under construction here that the cost to provide you with, say, an hour of access to a compute instance or a GB of disk space, is fixed for the provider unless they can find some innovative way to reduce it.

That narrative is false.  Here’s why.

Remember the sample configuration we used to compare IaaS pricing in my earlier post?  The internal IT costs that would be replaced by public cloud service for that configuration break down as follows:

[Figure: pie chart of the internal IT cost breakdown]

Notice that almost half of the cost is the infrastructure hardware itself – servers and storage.  This is using large internal IT shops as a model, and since the leading public cloud providers tend to use their own infrastructure software, cheap hydroelectric power and labor that gets more efficient with scale, it’s reasonable to assume that the hardware is actually more than half of what you are getting with public cloud.

Now remember that unit costs for IT hardware tend to go down significantly over time.  By “unit” costs I mean the cost for each unit of capacity, such as a GB of disk or an EC2 Compute Unit of processor capacity.  Moore’s Law has decelerated a bit since its inception, but we’re still seeing a doubling of processor power around every two and a half years at an equivalent price point.  Disk storage gets cheaper for equivalent capacity even faster than that.  The end result is that the 100 AWS compute instances and 1000 TB of disk you signed up for a couple years back now cost Amazon significantly less than they did back when you started using them.  And that means that they have a choice:  they can either let their service get more and more profitable, or they can drop their prices.
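A back-of-the-envelope sketch of that decline, in Python.  It assumes unit hardware cost halves smoothly every two and a half years, that hardware is about half of the provider's cost, and that every other cost stays flat.  Those are simplifying assumptions of mine for illustration, not the providers' actual figures.

```python
def hardware_cost_factor(years, doubling_years=2.5):
    """Fraction of the original unit hardware cost remaining after `years`."""
    return 0.5 ** (years / doubling_years)

def provider_cost_factor(years, hardware_share=0.5, doubling_years=2.5):
    """Fraction of the original total cost, if only hardware gets cheaper."""
    non_hardware = 1 - hardware_share
    return non_hardware + hardware_share * hardware_cost_factor(years, doubling_years)

for years in (1, 2, 2.5, 5):
    print(f"after {years} years: hardware x{hardware_cost_factor(years):.2f}, "
          f"total x{provider_cost_factor(years):.2f}")
```

Even with everything else held constant, the same capacity costs the provider about a quarter less after one doubling period.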

The providers generally don’t like to talk about things like Moore’s Law, for the same reason that IBM Global Services didn’t like to talk about it back in the early days of traditional IT outsourcing.  Too many customers were looking at cost savings on day one of the deal only and didn’t understand that they’d be overpaying a few years later if they were still being charged the same amount.  It wasn’t until the advent of benchmarking clauses in outsourcing contracts that this began to change.  Nobody is benchmarking cloud deals since barriers to switching providers are so low, but those barriers are pretty meaningless if the dominant providers price services in lock step with one another.

The one exception among the cloud leaders right now is Google, sort of, since they do mention Moore’s Law on their site as being a part of their pricing philosophy, but it seems to get little attention, and I’m not convinced they are still walking that talk.  If they are then they’ll drop prices very soon.  In any case, all of the providers are very, very aware of this situation, and they track it very closely.  Many of the customers, not so much.  They are historically awful at preparing business cases, and the pundits that make a living writing about what the providers do tend to be focused on easy things like the next great provider innovation rather than hard things like cloud service financials.  It’s now been over a year and a half since I priced the configurations we’ve been discussing in this blog, and the pricing from Amazon, Google and Microsoft still hasn’t changed in all that time.  But business is good, and the money is just rolling in, so why would they change a thing?

Are Public Cloud Services Commodities?

I’ve seen this question being asked more often lately.  I’ve even seen a few pundits, peddlers and consultants referring to public cloud services as if they are commodities, as if that’s something that everyone knows and simply accepts, so perhaps it’s time to give the question a close examination.  There are many definitions of the word “commodity,” but in the context that I’m writing about here a commodity tends to have the following attributes:

  1. It’s usually an economic good that has some value.
  2. One unit of that good has about the same value regardless of who sells it to you.
  3. It’s usually mass produced.
  4. Markets for commodities are usually driven primarily by price, with brand names being of far less value than in markets for non-commodity goods.
  5. Because of these factors commodities can usually be easily bought and sold on market exchanges.

Common examples of commodity goods are things like gold, copper, crude oil, soy beans and wheat.  An ounce of gold, a barrel of oil or a bushel of ordinary wheat have more or less universal values no matter where they come from.  They are mass produced and have a single unit price that you can look up on a commodity exchange at any point in time.

Certainly, public cloud services are not very similar to this.  Right off the bat we see that they are not goods; they are services.  How can a service be a commodity, since the quality of a service almost undoubtedly varies from one provider to the next?  Well, it can’t, but to be fair there are some services that are at least “commodity-like.”  Two that come to mind are the notary services and vehicle inspections that we all need from time to time.  Those aren’t traded on exchanges like true commodity goods would be, but they are at least similarly priced and something you can get from pretty much any provider and receive more or less the same value.  So, for the sake of argument, let’s give cloud services a pass on our first criterion above and stretch the meaning of “commodity” to include services.

On to points 2 and 3.  One could argue that public cloud services are sort of mass produced, since they are built on huge “farms” of equipment, large groups of which are identically configured, and the same menu of services is offered to every customer.  Unfortunately, however, that’s only true if you focus on one provider.  If you look across providers, you find that very different services are provided on top of equipment configurations that are very different from one another.  The prices, and what’s included in the prices, vary dramatically from one provider to the next, as we’ve discussed in-depth in my previous posts.  Support, SLAs, security, functionality, performance and other factors are all quite different from one provider to the next, and anyone who’s read The Cloud Service Evaluation Handbook appreciates the critical importance of these differences.

If that’s not enough to end any debate, it’s on point 4 that cloud services spectacularly fail the commodity test.  I would argue that the market for public cloud is in no way driven by price.  Some may be shocked to hear me say that, but a brief review of my posts on cloud pricing below shows that to be true.  A market that was truly price-driven would have its major competitors much closer together on what they charge, just as you often see from two gas stations situated across the street from each other.  Brand names are, in fact, almost everything in the cloud service market, at least so far.  Amazon is seen as the inventor of this space, and they are clearly reaping the benefits of that.  Microsoft has a large mind-share among developers and Google is perceived as a pioneer for almost any type of new, automated technology.  This is why IBM isn’t crushing the rest of the market, since their brand is more associated with their leadership of the “old,” not-much-loved traditional outsourcing space.  Add the relative strengths of these important brands to the actual features and functionality of the services they sell, which are all different from each other, and you can fully explain their relative market success.

Finally, to our last criterion, there is no commodity market exchange for cloud services.  “Hold it!” you say.  What about Amazon’s “spot instances?”  Those are traded on an exchange!  Well sure, but not the kind of exchange that the meaning of the word “commodity” alludes to, and I actually think this might be where a lot of misuse of the term is coming from.  You’ve got to remember that Amazon’s “market” is only for AWS EC2 instances, traded only with Amazon customers on an exchange run by Amazon.  Microsoft can’t sell one of their instances on that exchange.  You could argue that the spot instances themselves are commodities because anyone can buy and sell them, but that’s only one type of instance from one provider.  There is no reasonable way to generalize that and infer that all public cloud services are commodities.  Nor will that happen in the future unless Amazon becomes the only public CSP and all AWS infrastructure becomes “spot” infrastructure.  Today what you have is more analogous to an equity stock market like NASDAQ… if it only sold multiple types of stock from a single company and if NASDAQ were owned by that company.  NASDAQ doesn’t trade commodities, and there’s a reason for that – commodities and equities really are quite different things.

So, hopefully that puts the question to rest.  Amazon did not look at IT infrastructure and decide “we can turn this into a commodity.”  If they did they would have made all of their automation open source and their APIs non-proprietary.  What they did do was look at IT infrastructure and decide “we can sell this like shrink-wrapped software or like books, and there’s nobody in the world better at doing that than us; all we need is the right automation.”  Software packages are not commodities; books are not commodities and neither is IaaS.

Comparing Public Cloud Pricing – Part 3

In Parts 1 and 2 we discussed several ways that public cloud pricing can vary dramatically based on what you actually order – what configuration, options, support, discounts, etc., and how those very situational needs can make a very big difference regarding which service from which provider will save you the most money.  In this post I’ll focus specifically on the availability of per-minute pricing from Google and Microsoft, and identify exactly when it makes a difference and how big a difference it makes.

The first thing to understand is that compute instance pricing from these four leading IaaS providers is not really usage-based.  If you spin up an AWS compute instance and don’t release it for an hour, but your workload only uses 5% of the CPU’s horsepower for that hour, you will still pay the same price as if it used 100%.  Even though cloud has been around for a while, that still comes as a shock to a lot of people.  The pricing is closer to being usage-based than it was with traditional IT outsourcing because we pay for 1-hour increments rather than 1-month increments with multi-year commitments, but we’re still paying for access to capacity, just in smaller chunks.

So how does that relate to our pricing comparison?  Have a look back at Figure 1 from Part 1 of this series.  The label on the X-axis is “Percentage of Hours Used Per Month,” which is my attempt to indicate that, to a cloud service, hours used and minutes used (or even seconds used) are not the same thing.  For example, since there are 730 hours in an average month, that means that the 50% mark on the chart signifies 365 hours of the month with at least some usage in each hour.  If your application runs 12 hours a day, 7 days a week, and releases the instances for the remaining hours, that would just about get you to 50% on the chart.  And I’m sure there are plenty of applications out there that fit that profile or something close to it.  For our sample configuration, apps that run on GCE, Azure, AWS On Demand or SoftLayer Hourly compute instances are generating much less cost for cloud infrastructure than they would with reserved instances or monthly billing options because they pay nothing for compute 50% of the time.  But what if the workload isn’t all concentrated within those 365 hours each month?  What if it’s more dispersed over time?
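As a quick check on that arithmetic, here is a small Python sketch that converts a run schedule into the percentage on the X-axis.  It assumes the workload is concentrated into whole hours and that the instances are released the rest of the time; the second schedule is a hypothetical business-hours example of my own.

```python
HOURS_PER_MONTH = 730  # average month, as used in the charts

def pct_hours_used(hours_per_day, days_per_week=7):
    """Percentage of the hours in an average month with any usage."""
    return 100 * (hours_per_day / 24) * (days_per_week / 7)

def hours_used_per_month(hours_per_day, days_per_week=7):
    """The same figure expressed as hours in an average month."""
    return HOURS_PER_MONTH * pct_hours_used(hours_per_day, days_per_week) / 100

# The example from the text: 12 hours a day, 7 days a week.
print(pct_hours_used(12), hours_used_per_month(12))

# A hypothetical business-hours schedule: 8 hours a day, 5 days a week.
print(round(pct_hours_used(8, 5), 1))
```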

For illustration let’s take the most extreme example and see what happens to Figure 1 when we assume that the workload for the compute instances is as dispersed as possible over an average month.  That would mean that the 50% usage is spread over every hour of the month, 30 minutes each hour.  You’re still using your compute instances 50% of the time, but now you’re using 100% of the hours in the month instead of 50% of them.  The number of minutes you’re using each month, however, is still 50%.  In that scenario, AWS On Demand, SoftLayer Hourly Billing, and any other cloud service that charges for each hour that your instance runs, will cost the same as if you were using it 100% of the time.  Here’s what the cost looks like graphically, using the same configuration we used for Figure 1:

Figure 4

If you are like a lot of people, you may be having one of those “holy crap” moments right about now.  Sorry about that.  Let’s talk about what happens to each curve:

  • AWS reserved instances and SoftLayer Monthly stay where they were because they are unaffected by usage – you pay the rate you signed up for regardless.
  • AWS On Demand and SoftLayer Hourly are now horizontal lines (the AWS On Demand line may actually look dotted here because it’s hidden under the dotted line for SoftLayer Monthly). Since those two services charge for every hour you use any of, and you are using at least a little of every hour, you pay the same as 100% usage until you get to zero percent, where you’re only paying a small amount for 1 TB of storage and nothing for compute.
  • Azure’s curve stays exactly where it was, since the per-minute pricing with no minimum is the most granular of all these services (we haven’t looked at services with true usage-based pricing, like AWS Lambda). If you only use 30 minutes each hour, you only pay for 30 minutes each hour.
  • The curve for Google Compute Engine stays where it was until you get to 16.7% usage. Why?  Because GCE has per-minute pricing, but with a 10-minute minimum each hour, and 10 is 16.7% of 60.  So you only pay for what you use, but only after you’ve used at least 10 continuous minutes.  Once you release the instance, the 10 minute minimum starts again, so the chart is actually a little kind to Google since it assumes continuous use each hour.
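The three billing behaviors in the list above can be sketched as a single Python function.  It models one hour with a single continuous stretch of use, which is the same generous assumption the chart makes; the model names are labels of my own, not provider terminology.

```python
MODELS = ("hourly", "per_minute", "per_minute_10_min_minimum")

def billed_minutes(used_minutes, model):
    """Minutes billed in one hour for one continuous stretch of use."""
    if used_minutes <= 0:
        return 0                      # instance released: no compute charge
    if model == "hourly":
        return 60                     # any use at all bills the whole hour
    if model == "per_minute":
        return used_minutes           # pay exactly for the minutes used
    if model == "per_minute_10_min_minimum":
        return max(used_minutes, 10)  # per-minute, but never less than 10
    raise ValueError(f"unknown billing model: {model}")

for used in (5, 10, 30, 60):
    row = {model: billed_minutes(used, model) for model in MODELS}
    print(f"{used:>2} min used -> {row}")
```

At 30 minutes per hour the hourly model bills twice what the per-minute models do, and the 10-minute minimum only matters below 10 of every 60 minutes, which is the 16.7% point on the chart.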

Conclusion:  If you are using an IaaS service that charges by the hour for compute instances, and your workload is very dispersed over time, you could be leaving A LOT of money on the table vs. a service that charges by the minute or even reserved instances.

What about other configurations?  Just for fun, let’s look at the “More RAM” Configuration that we showed in Figure 3 (from Part 2), and adjust it for a time-dispersed workload:

Figure 5

That’s a huge difference, because the extra memory makes the compute instances more expensive for this configuration.  The explanation of what happens to each curve is the same as for our first configuration, but Google looks even better in this scenario, just as they did in Figure 3, because the AWS and Azure configurations required some over-provisioning to meet the requirements.  The important thing is to look at the size of the gaps in this chart.  When you get down to lower average usage levels, say around 30%, you’re looking at differences in actual cost of almost 4 times.  Don’t let anyone tell you that all the big IaaS services are priced the same.  Each of them has strengths in different situations, but they can be very far apart on actual cost.  And we’ve only looked at a few, small sample configurations.  We’ve also looked at workloads that are as concentrated over time as possible vs. workloads that are as dispersed as possible.  There’s also every scenario in between, and there’s a very good chance that your workload is one of those!

Conclusion:  If you don’t thoroughly understand your application’s workload patterns, you could be leaving a large amount of money on the table with hourly pricing, especially if your workload is intermittent and dispersed over time.

Comparing Public Cloud Pricing – Part 2

In Part 1 we talked about how public cloud pricing from different providers compares for a very simple sample configuration, and then we saw how that comparison changes when the storage requirement is increased.  We also talked about the importance of pressuring public IaaS providers to continue lowering prices and pass on the benefits of falling hardware costs to customers.  And we noted that reserved instances may be cheap enough to make them economical for large Windows shops, even when applications can’t automatically release instances to lower the time billed for to less than 100%.

This time I’m going to go into more detail on the providers’ offerings so that you get a real sense of what’s involved in doing a comparison.  If you look at the configuration table provided last time, you’ll see some significant differences, including the fact that the compute instances aren’t the same across all four providers. For example, for Google to meet the requirements we had to use instances with significantly more RAM.  So what would happen if, say, we increased the memory requirement to 8 GBs on the small instances and 12 GBs on the bigger ones?  Have a look:

Figure 3

That’s a huge difference compared to Figure 1 (the first chart from Part 1)!

  • The Google price curve hasn’t moved because the configuration we priced the first time already had enough RAM to meet the new requirement.
  • The Amazon price curve shifted upward because we had to use larger compute instances (t2.large and c3.2xlarge instead of t2.medium and c3.xlarge).
  • Azure pricing shifted up for the same reason, but not as far as Amazon’s did, since going to D2 and D3 from A2 and A3 instances had a smaller impact.
  • Now look at SoftLayer. When you configure the virtual servers from them you don’t pick a preset instance type.  You just tell them how many cores and how much RAM you want, so you can get the exact amount you need.  That resulted in a much smaller price shift than for Amazon or Azure, and since we’re still using only 1000 GBs of storage here, the higher storage pricing isn’t having much of an impact.

Just making a simple change to the memory requirement, often the critical resource when sizing compute instances, completely changed the pricing comparison!  Before, the lowest price for most usage levels was AWS.  With this new configuration it’s Google or SoftLayer, but you have to keep in mind that we had to over-provision slightly on AWS and Azure to meet the memory requirements.  And there’s more to consider – way more.

  1. SoftLayer’s monthly billing option gives you 250 GB per month of free outbound data transfer per virtual server. The other three are going to charge extra for that.
  2. Performance can and will vary from one provider to the next and possibly even from one data center to the next. You may need to spend more money on an underperforming solution to get it to meet your performance requirements.
  3. You’re going to want support, and the cost of that isn’t factored into our charts. Amazon and Google have similar pricing schedules for “Business” and “Gold” level support, depending on what your total monthly spend is, but Microsoft’s Professional Direct is a simple, flat $1000 per month (great if you’re paying for a lot of usage, not so hot if you only have a couple of instances) and SoftLayer’s support is “free” with your purchase.  For large accounts those may only represent differences of a few percent, but they do matter, and if you need higher levels of support you generally have to negotiate the price.
  4. Sometimes, as with AWS reserved instances, the provider publishes the volume discounts that apply, but often it’s negotiated, so those discounts aren’t factored into these results.
  5. Microsoft’s discounts on Azure for existing Windows customers can be substantial, and they aren’t factored in either. If you’re a big Microsoft customer you must find out what discounts you qualify for and, importantly, which prices they apply to and how long they last.  Microsoft can essentially forgive you for the cost of some of your Windows investment when you move from locally installed Windows to Azure, similar to how you pay less for a Windows upgrade than for a new purchase.  No other provider is in a position to do that, and the cost of Windows can be a significant piece of the compute instance price.
  6. Other unique options are available, such as Amazon’s spot instances and Lambda, Google’s pre-emptible instances and custom configurations, Microsoft’s “basic” instances for low-traffic applications and SoftLayer’s bare-metal server hosting. Any of these could potentially save you money if they are suited to your needs.
  7. You will very likely need a lot more than just the simplistic configurations we’ve looked at here. What about database services, VPN, load balancing, advanced monitoring, directory services, etc., not to mention specialized services like High Performance Computing?  Any one of those can change the price comparison significantly.  You may even run into some providers that simply can’t give you what you need for a specific application at all.
  8. Last, but not least, two of these providers give you per-minute pricing, which is up to 60 times more granular than per-hour pricing. That can have a major impact on what you actually pay, but I’m going to have to leave the details of that until next time.

Conclusion:  If you haven’t modeled the resource requirements thoroughly, you don’t know what it will cost to put your application in the cloud, and you certainly don’t know which service option from which provider will save you the most money.

Next time – per minute pricing!  Read Part 3.

Comparing Public Cloud Pricing – Part 1

In June of last year, ISG published the first report benchmarking the average price of four leading public IaaS providers against the corresponding average internal IT costs for large Windows data centers. One of the key findings was:

“At a usage level of approximately 55 percent, public cloud prices are at parity with the prices in our internal IT benchmark.”  – ISG

Though the report was freely available for a few months, access now requires a research subscription. That finding meant public IaaS, using “on-demand” or “pay-as-you-go” compute instances, was more expensive than large scale internal IT unless those instances were running during less than 55% of the hours in a month. The finding made perfect sense, since the promise of cloud is to save money by making you pay only for what you use. It follows that if you use it 100% of the time, you shouldn’t expect to save money with an on-demand pricing model.

Now that well over a year has passed since that report, I thought I’d re-price the same configuration that ISG used just to see how things have changed. I also included AWS reserved instances and SoftLayer monthly billing in my analysis just to see if any new conclusions could be drawn. What I found was pretty interesting.

Figure 1

The first thing I noticed was that, for the priced configuration, three of the four providers (Amazon, Google and Microsoft) had left their prices virtually unchanged. IBM’s pricing was lower. Google had actually lowered prices just before the ISG report came out, but it’s been months longer for Amazon and Microsoft. I wasn’t expecting to see that kind of pricing stability, for two reasons:

  1. During the early days of public IaaS, price drops were coming fast and furious. Many people expected them to continue for the foreseeable future just due to market pressure, though there was certainly no guarantee that this would happen.
  2. Public IaaS providers own the server hardware, and you pay for access to it. That means that natural market reductions in the cost of CPU power (e.g., the famous Moore’s Law) go 100% to the cloud provider’s bottom line unless they make a conscious decision to pass those savings on to customers via a price drop. We’ve already seen that public providers have begun making some serious money, and their profits are growing. Is it possible that the leading market players are currently happy with their positions and will simply let the cloud market’s rising tide raise all ships? I’d be surprised, especially given Amazon’s usual preference for crushing competitors over generating profits, but if customers aren’t complaining, I guess it’s possible.

Conclusion:  Customers must pressure providers to drop prices to pass on hardware savings, or they will not realize the full financial benefits of public IaaS.

The second thing I noticed was the price for the AWS configuration with 3-year reserved instances. When you reserve your instances you don’t get any of the “pay only for what you use” benefits of cloud, which is why the curves representing reserved instances and SoftLayer’s monthly pricing (essentially a 1-month reserved instance) are horizontal lines. But look where the AWS 3-year Reserved line crosses the On Demand line. Right about 58% of hours used per month. Sound familiar?  That’s very, very close to the ISG benchmark for internal IT costs from last year. The financial disadvantage of internal IT vs. cloud is that you pay up front and then depreciate over 3 years whether you continue using the instance or not, very similar to the way you pay Amazon for a reserved instance, so I don’t think this is entirely a coincidence. AWS reserved instances appear to be intentionally priced to compete with traditional in-house infrastructure.
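The crossover point is just a break-even calculation, sketched below in Python.  The hourly rate and the fixed monthly cost are made-up round numbers chosen only to show the mechanics; they are not the prices behind Figure 1, and the function names are my own.

```python
HOURS_PER_MONTH = 730  # average month, as used in the charts

def on_demand_monthly(hourly_rate, usage_fraction):
    """Monthly on-demand cost when instances run for a fraction of the hours."""
    return hourly_rate * HOURS_PER_MONTH * usage_fraction

def break_even_usage(hourly_rate, fixed_monthly_cost):
    """Usage fraction at which on-demand cost equals a fixed monthly cost."""
    return fixed_monthly_cost / (hourly_rate * HOURS_PER_MONTH)

# Hypothetical inputs: an on-demand rate and an effective reserved monthly cost.
rate = 0.10
reserved_monthly = 40.00

usage = break_even_usage(rate, reserved_monthly)
print(f"break-even at {usage:.1%} of hours used per month")
```

Below the break-even fraction the on-demand line is cheaper; above it the flat reserved (or in-house) cost wins.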

Conclusion:  AWS 3-year reserved instances may be price-competitive with large, internal Windows shops, simply by giving up certain advantages of cloud that internal IT also cannot provide (e.g., pay only for what you use). That indicates that smaller Windows shops may be able to save money by using reserved instances over in-house infrastructure, even if their applications can’t release instances during intervals of lower demand for compute power.

How can Amazon do this and still make money?  I’m not sure what fraction of their compute business is Reserved instances vs. On Demand, but I know they use very inexpensive, white-box machines, relying on redundancy to ensure quality of service. They also don’t have to dedicate any of that hardware to you unless you specify it (and pay more for it), so CPU cycles you may not be using during each hour on the clock that you’re paying for, could, at least in theory, be used to serve other customers. And of course the data centers are enormous, highly automated and often located to exploit cheap hydroelectric power, so economies of scale should be superior.

Now, let’s talk about some of the limitations of this chart, because there are some BIG ones. In fact, I would strongly caution you not to use it to draw any conclusion regarding which provider offers the lowest cost service. For one thing, pricing a single configuration doesn’t tell you much. If I make a very simple change and just multiply the amount of storage by 5, look what happens:

Figure 2

SoftLayer’s curves shift upwards, while Azure shifts down. In fact, Azure is now the cheapest service when instances are in use during about 34% of the hours in the month or less. Why the change?  Simple. SoftLayer’s storage costs more than Amazon’s and Azure’s costs less, for this configuration.

That’s just the tip of the iceberg folks. This post has already gone long, so I’m going to leave you with the details of the sample configurations used in the charts. In Part 2 we’ll go over the differences in the services and why they must be taken into account before you can do a valid comparison. That includes things like discount opportunities and the availability of per-minute pricing for GCE and Azure that can have a dramatic effect on what you actually pay.  Read Part 2!

Table 1