Amazon’s persistent storage beta program for Elastic Compute Cloud (EC2), Elastic Block Store (EBS), has been unleashed on the general public. I’ve been hanging out to play with this for a while and will duplicate any tests I perform on Slicehost servers with comparable EC2 servers with EBS. Since I’m also essentially comparing two different providers, I’ll look at the costs.
The page I’ve linked to above is worth reading as it covers performance and durability quite succinctly. It’s pretty awesome to learn that you can just treat each block (EBS) as you would traditional physical storage, striping multiple mounts to achieve greater performance. Hurrah! Given the only difference in cost between having three 30 GB blocks striped and a single 30 GB block would be three times the access costs, I wonder if this would be the more common style of configuration? I also wonder how backups / snapshots work in this situation, as synchronization would be quite important, one would think.
I find that numbers quoted by a merchant often aren’t attained in the real world, as they only reflect usage under ideal conditions. Internet speeds in New Zealand, pictures of fast food and cosmetic benefits are good examples. So I wonder if the same will apply to this new service? Time will tell, and I’m sure it’ll be big news if Amazon’s numbers are considerably off the mark.
Come to think of it I don’t know how Amazon’s existing services stack up against what is advertised. I’m gonna find out though as I intend to see how much I can squeeze out of their service too. Apart from the geek factor, I’m doing this as clients are increasingly asking about Amazon EC2 after I normally recommend either Slicehost or a dedicated box somewhere. Basically at the moment I don’t know, when I need to know.
So I have four big questions around using Amazon EC2. Firstly a big cause of concern for me is their service having already experienced a number of notable down times including what appears to be reported data loss. With their latest significant event Amazon has been very open about the problem and what they are doing as a result - nice and positive. Although, as nice as that is, they should really stop having them.
Secondly is performance, obviously. I’m gonna put everything that can’t be replicated on EBS, including all logs that interest me. Of note, I’ll have a webserver, a load balancer and many, many application instances continuously logging away from multiple servers and having a good time about it. Oh and also the database. Performance better not blink. Previously a big turn off for me has been the situation of ‘what logs?’ - if an EC2 server suddenly stops, you have no way of knowing what just happened. Yeah I know about services like RightScale that attempt to minimize this, but it’s not good enough by my standards. Plus their sign-up fee is a big turn off to me; presumably they are trying to protect their IP from free access. Anyway, who does big sign-up fees anymore??? It’s so 2005.
Thirdly is unexpected restarts. What event based actions can I automate? I don’t know and this is possibly just because I’ve not dug into it deep enough yet. When they restart, EC2 servers are blank again. Can I set them to be automatically loaded with a script which reverts their state back to what it was pre-reboot? I doubt this would be as quick as I’d want so I could just symlink most of the OS to EBS - depending on its performance. It implies that either Amazon provides a harness that sits around all your servers for kicking off the scripts as needed, or you use a third party, or you roll your own. I’ve been looking out for an excuse to set up a big kickass high availability setup, then it wouldn’t matter! Hmmmm.
UPDATE: You can mount EBS as the file system on EC2 as noted here. That basically implies it’s up there in the performance stakes and obviously no need to reinstall everything after a reboot. You can also create a new EBS from a snapshot which is extremely handy.
Fourthly, permanent IP addresses - I know they have them but again that’s about the limit of my knowledge. The lack of an EBS-like service has stopped me seriously investigating AWS until now. Can I have more than one IP address per server? Can I have a floating one if I want to roll my own high availability configuration?
Backing up user content and other important data is a really big deal for a website, especially as it starts accumulating and accumulating. I have found Amazon S3 is my best mate for this - especially for doing regular rsync-like backups. If a website is entirely hosted with EC2 and EBS this becomes a whole lot easier, to the point of being stupidly easy. That in itself is a really big deal.
At the moment all server oriented virtualized services that I can recall using and reading about are essentially just replications of physical devices - normally with marginally lower performance characteristics while being more reliable due to their redundant nature. I do wonder though when someone will come up with something new that’s not available in the physical world and what that will be.
Aside from the concerns I’ve expressed above, I’ll just add what an awesome learning tool AWS and similar services are becoming. Who cares if you screw up - learn, wipe and start again.
I’ve got even more geeking out ahead of me now, sweet.

One Comment
> Firstly a big cause of concern for me is their service having already experienced a number of notable down times including what appears to be reported data loss.
Amazon has always said “expect failure” with regards to their services. S3/SQS *will* (and does) return 500/503 errors at times. And EC2 instances *will* go away at some point. If people ignore this and assume their EC2 instances will be up for ever and ever, then they will experience data loss.
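To make “expect failure” concrete, here is a minimal sketch of retrying a call that sometimes fails, with exponential backoff between attempts. It isn’t tied to any AWS library: `TransientError` and `call_with_retries` are names made up for illustration, and `operation` stands in for whatever S3/SQS request you are making.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a 500/503 response from a remote service."""


def call_with_retries(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, so let the caller see the failure
            # Wait base_delay, then 2x, 4x, ... plus a little random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
```

The `sleep` parameter is only there so the waiting can be swapped out in tests; in normal use you’d leave it alone.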
> Thirdly is unexpected restarts. What event based actions can I automate? I don’t know and this is possibly just because I’ve not dug into it deep enough yet. When they restart, EC2 servers are blank again. Can I set them to be automatically loaded with a script which reverts their state back to what it was pre-reboot?
So, one way is to use a system configuration tool like Puppet. A simpler option is to write a startup script that is passed to the instance as part of the user-data and executed. This is how the Ubuntu & Debian AMIs are set up. Such a script can install packages, download other scripts or do other more exciting and complicated things too. And if you get an “installed” image set up, it’s pretty easy to save it back to S3 as a new AMI - this leaves you only to do minimal node configuration on boot, since the OS and related packages are already installed. EBS makes some of this easier, but if you want to scale using EC2 you’ll need to deal with it at some point.
> Fourthly, permanent IP addresses - I know they have them but again that’s about the limit of my knowledge. Can I have more than one IP address per server? Can I have a floating one if I want to roll my own high availability configuration?
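As a rough sketch of the user-data approach (not how any particular AMI actually does it), a boot-time hook could fetch the user-data from the instance metadata address and run it if it looks like a script. The function names are made up for illustration, and the “starts with `#!`” check is an assumption about the convention being followed.

```python
import os
import stat
import subprocess
import tempfile
import urllib.request

# Instance metadata address, only reachable from inside a running instance.
USER_DATA_URL = "http://169.254.169.254/latest/user-data"


def fetch_user_data(url=USER_DATA_URL, timeout=5):
    """Return the raw user-data bytes passed to this instance at launch."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def is_script(user_data):
    """Treat user-data as a script only if it starts with a shebang line."""
    return user_data.startswith(b"#!")


def run_user_data(user_data):
    """Write the script to a temp file, run it, and return its exit code.

    Returns None without running anything if the data is not a script.
    """
    if not is_script(user_data):
        return None
    fd, path = tempfile.mkstemp(suffix=".sh")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(user_data)
        os.chmod(path, stat.S_IRWXU)  # owner read/write/execute only
        return subprocess.call([path])
    finally:
        os.remove(path)
```

On boot you would call `run_user_data(fetch_user_data())` from an init script; the script itself can then install packages or pull down further configuration.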
Each EC2 instance has a private IP, a public IP, and optionally one or more “elastic IPs”. These are static addresses, each of which can be mapped onto one running EC2 instance, and you can remap them via the “ec2-associate-address” command. They’re free as long as they’re assigned to instances - if you have unassigned IP addresses they’re charged at 1c/hour.
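For a sense of what parked addresses add up to, here is a tiny sketch using the 1c/hour figure quoted above. The function name and the 720-hour month are assumptions for illustration, and the rate may well change.

```python
UNASSIGNED_RATE_CENTS_PER_HOUR = 1  # the 1c/hour figure quoted above


def idle_elastic_ip_cost_cents(unassigned_ips, hours,
                               rate_cents=UNASSIGNED_RATE_CENTS_PER_HOUR):
    """Cost in cents of holding elastic IPs that aren't assigned to an instance."""
    return unassigned_ips * hours * rate_cents


# Two spare addresses held for a 30-day month (720 hours).
cents = idle_elastic_ip_cost_cents(2, 720)
print(f"${cents / 100:.2f}")  # prints $14.40
```

So keeping a couple of spare addresses around for failover isn’t free, but it’s cheap.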