Introduction to Amazon Auto Scaling
AWS has introduced Auto Scaling so that you can take advantage of cloud computing without having to incur the costs of adding more personnel or building your own software. You can use Auto Scaling to scale for high availability, to meet increasing system demand, or to control costs by eliminating unneeded capacity. You can also use Auto Scaling to quickly deploy software for massive systems, using testable, scriptable processes to minimize risk and cost of deployment.
Cloud computing is an exciting suite of technologies that has come to dominate the discussion of computing for two main reasons:
It can provide the flexibility to quickly deploy computing environments that are not only properly configured for current needs, but that can also expand or contract according to future needs.
It can help an organization save money.
Cloud computing can deliver flexibility and cost savings because it is uniquely capable of being scaled to the "right size" for a particular environment, no matter how frequently usage of the environment expands or contracts. Taking advantage of this flexibility can be difficult, however. Some companies try to do it manually, while others create a custom automated system. Either method can introduce new challenges into your environment:
Depending on manual processes to start new servers, stop unneeded systems, or change allocated storage space is expensive, slow, and worst of all, error prone. Any savings from moving environments to the cloud can easily be erased by the costs of manual intervention.
Creating an automated scaling system customized for your organization's needs takes a long time, almost certainly costs more than planned, and requires a risky test and deployment phase. Such a system also requires its own hardware, software, and support environment, which must itself scale for expansion or contraction.
As with so many of its other services and products, Amazon Web Services (AWS) created Auto Scaling to solve its own scaling issues, and now provides the service to its customers for free. On its own, Auto Scaling monitors your environment to ensure that the desired systems stay running. What is even more powerful is that you can tie the Amazon CloudWatch monitoring service into Auto Scaling, which allows your environment to automatically scale up or down based on current conditions:
As load increases, Amazon CloudWatch can call Auto Scaling to add new computing capacity.
As load decreases, Amazon CloudWatch can trigger Auto Scaling to shed computing capacity and reduce cost.
This paper describes what Auto Scaling is and when to use it, and provides an example of setting it up.
What is Amazon Auto Scaling?
Using AWS to meet your business needs is easy because of free services like Auto Scaling. Without Auto Scaling, you may have no problems configuring an initial environment in AWS, but dynamic growth can quickly exceed your existing team's ability to respond. Hiring enough operators to respond to your computing environment's conditions 24/7 is too expensive. Guessing the proper environment size means that the systems are probably over-provisioned, which wastes money. Finally, ensuring that all the systems you want running are actually operational can be overwhelming, especially if an Availability Zone (AZ) fails.
Auto Scaling allows you to specify a server group, called an Auto Scaling Group (ASG), in which you define:
A minimum number of servers to run (in this context, servers refers to EC2 instances configured to run your software)
A maximum number of servers to run
Optionally, an initial number of servers to run
The AZs in which you want the servers to run
When you complete the ASG definition, Auto Scaling starts the number of servers you specified. The system distributes the servers as evenly as possible across the designated AZs.
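The definition and the even spread across AZs can be sketched in a few lines of Python. This is a minimal illustrative model, not the Auto Scaling API: the class `GroupDefinition`, its field names, the function `distribute`, and the zone names are all hypothetical. The sketch also assumes, for illustration, that an omitted initial number of servers falls back to the minimum.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GroupDefinition:
    """Hypothetical model of the four settings in an ASG definition."""
    min_size: int
    max_size: int
    zones: List[str]
    initial_size: Optional[int] = None  # optional; assumed to default to min_size

    def __post_init__(self) -> None:
        if self.initial_size is None:
            self.initial_size = self.min_size
        if not self.zones:
            raise ValueError("at least one AZ is required")
        if not 0 <= self.min_size <= self.initial_size <= self.max_size:
            raise ValueError("need 0 <= min_size <= initial_size <= max_size")


def distribute(count: int, zones: List[str]) -> Dict[str, int]:
    """Spread `count` servers as evenly as possible across `zones`."""
    base, extra = divmod(count, len(zones))
    # The first `extra` zones each take one additional server.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


group = GroupDefinition(
    min_size=2,
    max_size=10,
    zones=["zone-a", "zone-b", "zone-c"],
    initial_size=5,
)
print(distribute(group.initial_size, group.zones))
```

With five servers and three AZs, no AZ ends up with more than one server above any other, which is what "as evenly as possible" means in this sketch.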
If any server stops working, the ASG replaces it. If an AZ goes out of service, Auto Scaling launches additional servers in the remaining AZs until the group is back at the specified number.
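The AZ failure case can be illustrated with a small simulation. The sketch below is a hypothetical model, not the service's actual algorithm: the function `replacements_after_az_failure` and the zone names are assumptions. It simply assumes that every server lost with the failed AZ is relaunched, one at a time, in whichever surviving AZ currently has the fewest servers.

```python
from typing import Dict


def replacements_after_az_failure(running: Dict[str, int],
                                  failed_zone: str) -> Dict[str, int]:
    """Return how many servers to launch in each surviving AZ so that
    the group returns to the total it had before `failed_zone` was lost."""
    target = sum(running.values())
    survivors = {z: n for z, n in running.items() if z != failed_zone}
    if not survivors:
        raise ValueError("no AZ is left to run servers")

    launches = {z: 0 for z in survivors}
    missing = target - sum(survivors.values())
    for _ in range(missing):
        # Place each replacement in the least-loaded surviving AZ.
        zone = min(survivors, key=lambda z: survivors[z] + launches[z])
        launches[zone] += 1
    return launches


# Six servers spread over three AZs; zone-c goes out of service.
running = {"zone-a": 2, "zone-b": 2, "zone-c": 2}
print(replacements_after_az_failure(running, "zone-c"))
```

In this example the two servers lost with zone-c are relaunched one each in zone-a and zone-b, so the group is back at six servers.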
The ASG definition can also include rules that adjust the number of servers running, and the minimum and maximum number of servers, based on system conditions or schedules. For example, you may want your environment to respond to changes in average CPU utilization across the whole group, or you may want to ensure that a certain number of servers are running at the start of business hours. Note that these rules exist not only to expand your environment as needed, but also to contract it: you may want to decrease the number of servers running at the end of business hours. Expansion and contraction are equally important, and taking advantage of both can lead to real cost savings.
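Both kinds of rule can be sketched as plain functions. This is an illustrative model only, not the way rules are expressed in Auto Scaling itself: the function names, the 70% and 30% CPU thresholds, the step of one server, and the 08:00 to 18:00 business hours are all assumptions chosen for the example.

```python
from datetime import time


def next_server_count(current: int, avg_cpu_percent: float,
                      min_size: int, max_size: int,
                      scale_out_above: float = 70.0,
                      scale_in_below: float = 30.0,
                      step: int = 1) -> int:
    """Condition-based rule: expand when the group is busy, contract
    when it is idle, and never leave the [min_size, max_size] range."""
    if avg_cpu_percent > scale_out_above:
        current += step
    elif avg_cpu_percent < scale_in_below:
        current -= step
    return max(min_size, min(max_size, current))


def scheduled_minimum(now: time,
                      business_min: int = 4,
                      off_hours_min: int = 1,
                      opens: time = time(8, 0),
                      closes: time = time(18, 0)) -> int:
    """Schedule-based rule: a higher minimum during business hours."""
    return business_min if opens <= now < closes else off_hours_min


print(next_server_count(3, 85.0, min_size=2, max_size=6))  # busy: expand
print(next_server_count(6, 85.0, min_size=2, max_size=6))  # busy, but already at maximum
print(next_server_count(3, 12.0, min_size=2, max_size=6))  # idle: contract
print(scheduled_minimum(time(9, 30)))   # during business hours
print(scheduled_minimum(time(22, 0)))   # after business hours
```

Clamping the result to the minimum and maximum is what keeps a runaway metric from growing the group without bound or shrinking it to nothing.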
Auto Scaling consists of the following four components, each described in its own subsection:
Launch Configuration (LC)
Auto Scaling Group (ASG)
Auto Scaling Policy (ASP)
Scheduled Action (SA)