When I first started using AWS EC2 instances regularly, it was to run Amazon EMR jobs. During this phase I would mix and match EC2 instance types to find the most cost-effective way of crunching our data. Given the cluster size needed to process all of our data, cost was always going to be a factor, so I ventured into using spot instances instead of on-demand instances.
What are spot instances?
One of the main benefits of using AWS is that you only pay for what you use, and EC2 instances are a great example of that. As the demand for EC2 instances fluctuates, Amazon attempts to sell off the surplus capacity in the form of spot instances.
The approach Amazon has adopted for allocating this surplus computing power is a bidding process: the highest bidder gets the surplus capacity for as long as they remain the highest bidder, or until they no longer need it.
For example, if there is surplus availability of m1.small instances, Amazon may start the bidding at 2 cents per hour. If someone bids 6 cents they will get that resource at 2 cents, and the current price stays at 2 cents until all the surplus capacity has been used, either by that bidder or by others.
If someone requires more capacity than is currently free, because it has already been allocated, the current price rises to whatever that person is willing to bid. If this is less than your maximum bid then you keep your instances; otherwise your instances are unceremoniously terminated and re-allocated to the highest bidder. The current price will then sit above your old maximum bid of 6 cents, and someone will have to outbid it to gain control of the surplus resource.
How bidding works
The only upside to losing out whilst in flight is that you don’t pay for the hour in progress. So if you are 50 minutes into the hour and you lose the instance to a higher bidder, you don’t pay for that hour; in effect you get 50 minutes free.
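To make the hourly billing rule concrete, here is a minimal sketch in Python. It is my own illustration, not Amazon's billing code: the function name `billable_hours` is made up, and it assumes that a partial hour is free when Amazon terminates the instance but is charged in full when you terminate it yourself.

```python
def billable_hours(minutes_run: int, terminated_by_amazon: bool) -> int:
    """Return the number of hours charged for one spot instance.

    Assumption (hypothetical model of the rule described in this post):
    a partial hour is free if Amazon reclaims the instance, and is
    charged as a full hour if you terminate the instance yourself.
    """
    full_hours, partial_minutes = divmod(minutes_run, 60)
    if partial_minutes and not terminated_by_amazon:
        # You ended the instance part-way through an hour: pay for all of it.
        return full_hours + 1
    # Either the run ended exactly on the hour, or Amazon interrupted it.
    return full_hours


# Losing the instance 50 minutes in costs nothing for that hour...
print(billable_hours(50, terminated_by_amazon=True))
# ...whereas stopping it yourself at 50 minutes is still a full hour.
print(billable_hours(50, terminated_by_amazon=False))
```

The same rule applies to longer runs: under this model, an instance reclaimed after 110 minutes is only charged for its one completed hour.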
Bidding is per instance type, in a particular Availability Zone, in a particular region. Amazon implies that bidding for an m1.small will not affect the bidding for an m1.large in the same Availability Zone, nor will bidding for an m1.small in Availability Zone A affect the bidding for an m1.small in Availability Zone B.
When you submit your spot request you give the details of the instance type you need, the Availability Zone, the Region and so on, along with your maximum bid. The maximum bid is sometimes misunderstood: people think it is the price they will actually pay for the instance. That is not the case. It is the maximum you are willing to pay per hour for the instance. If the current price remains at 2 cents and your maximum bid is 10 cents, you still only pay 2 cents per hour. If another bidder comes in and bids 8 cents, you keep your instances but you will now be paying 8 cents per hour: more than 2 cents, but still not 10 cents.
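The difference between your maximum bid and the price you actually pay can be sketched in a few lines of Python. This is only an illustration of the rule as I have described it: `hourly_charge` is a hypothetical helper, prices are whole cents, and I am assuming that a current price exactly equal to your maximum bid still lets you keep the instance.

```python
from typing import Optional


def hourly_charge(max_bid_cents: int, spot_price_cents: int) -> Optional[int]:
    """Return what you pay per hour in cents, or None if you are outbid.

    Assumption: you keep the instance while the current spot price is at
    or below your maximum bid, and you pay the spot price, not your bid.
    """
    if spot_price_cents > max_bid_cents:
        return None  # outbid: the instance would be terminated
    return spot_price_cents


# With a maximum bid of 10 cents, walk through a rising spot price.
for spot_price in (2, 8, 12):
    print(spot_price, hourly_charge(10, spot_price))
```

With a 10 cent maximum bid you pay 2 cents while the price is 2 cents, 8 cents once it rises to 8, and you lose the instance when it reaches 12.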
Choosing when to use spot instances
Losing instances whilst in flight can obviously be a major issue in certain situations, so you must decide carefully whether using spot instances is acceptable for you. In my case it was fine, because I could simply re-run the job manually or wait until the next hourly scheduled job kicked in. If you are supporting critical services, however, spot instances may not be suitable.
Beware the m1.small spike factor
In my experience, activity on the spot market for m1.small instances is more volatile than for the other instance types. As a result, the spot price for this instance type can vary dramatically, leaving you either unable to get a spot instance or losing it in flight.
I have a few theories on why it's so volatile. The first is that when a company starts to use AWS and EC2 instances, it normally begins with a spike or two and with little knowledge of the pricing structure. With the cost implications at the back of their minds, they normally choose the cheap (and default) option, that being m1.small spot instances. Knowing that they pay for what they use, they also often terminate their instances early, thinking that this will cost them less and not realising that they still pay for the hour regardless. If every company starts out via this route, it will cause the spot price to fluctuate and increase the activity on the spot market.
To compound this issue, whilst all of that is going on another company does the same thing but, sold on the fact that spot instances are cheaper, raises its bid price to make sure it gets its spot instances, not realising that the bid is greater than the cost of the on-demand instance (you can often see bid prices of $10 for a $0.20 instance). This, again, causes the spot price to fluctuate.
Another reason for high activity on the spot market for m1.small instances is that companies often create throwaway environments such as QA, Staging and R&D, which require less EC2 power than the Production environment. They therefore opt for the smaller, cheaper instances, such as the m1.small spot instance, which drains the available resources, raises the bid price and causes high activity on the spot market.
To overcome the m1.small spike factor you might want to try using m1.medium spot instances instead. Because demand for them is hopefully lower, they could turn out cheaper than the m1.small spot instances, and hopefully a little more reliable.
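A simple guard against this mistake is to compare your bid with the on-demand price before submitting it. The sketch below is illustrative only: `bid_warning` is a hypothetical helper, and the prices are the example figures from this post rather than real AWS pricing.

```python
from typing import Optional


def bid_warning(max_bid: float, on_demand_price: float) -> Optional[str]:
    """Return a warning if the bid exceeds the on-demand hourly price.

    Both prices are in dollars per hour. Returns None when the bid is
    at or below the on-demand price.
    """
    if max_bid > on_demand_price:
        return (
            f"Bid of ${max_bid:.2f} is above the on-demand price of "
            f"${on_demand_price:.2f}; on-demand may be the safer choice."
        )
    return None


# The $10 bid for a $0.20 instance triggers the warning.
print(bid_warning(10.00, 0.20))
# A 6 cent bid does not.
print(bid_warning(0.06, 0.20))
```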
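Before switching, it is worth checking whether the alternative really is calmer. Here is a rough sketch of how you might compare the volatility of two price histories in Python. The numbers are invented purely for illustration and are not real spot prices, and `summarise` and `less_volatile` are hypothetical helpers; in practice you would feed in the spot price history that AWS publishes for each instance type.

```python
from statistics import mean, pstdev


def summarise(prices):
    """Return min, max, mean and spread (standard deviation) of prices."""
    return {
        "min": min(prices),
        "max": max(prices),
        "mean": mean(prices),
        "spread": pstdev(prices),
    }


def less_volatile(histories):
    """Return the name of the instance type whose prices vary the least."""
    return min(histories, key=lambda name: pstdev(histories[name]))


# Made-up hourly spot prices in cents, NOT real market data.
sample = {
    "m1.small": [2, 2, 9, 3, 15, 2],
    "m1.medium": [4, 4, 5, 4, 5, 4],
}

for name, prices in sample.items():
    print(name, summarise(prices))
print("Less volatile:", less_volatile(sample))
```

In this invented sample the m1.medium prices have a higher floor but a much smaller spread, which is the kind of trade-off to look for: a slightly higher typical price in exchange for fewer instances lost in flight.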