This article explains the CloudFormation configuration for ECS service auto scaling, which differs considerably from EC2 auto scaling. Before we start, download the script so that you can follow along as we go through each section. I will focus on the most important properties: the ones without which the alarms are not created successfully.

After the alarm is created, it looks like the one in this image. Each section of the alarm has a corresponding piece of CloudFormation configuration, and the article walks through them in that order.


ECSAutoScaleUpAlarm

Below is the configuration that produces the values shown in the image above.

"ECSAutoScaleUpAlarm": {
            "Type": "AWS::CloudWatch::Alarm",
            "Properties": {
              "AlarmName": { "Fn::Join": [ "", [{ "Ref"  :  "ServiceName" }, "-ECSAutoScaleUpAlarm" ] ] },
              "AlarmDescription": "CPU alarm for ECS autoscaling group when the CPU alarm is greater than 75",
              "MetricName": "CPUUtilization",
              "EvaluationPeriods": "3",
              "Threshold": { "Ref"  : "ECSScaleUpThreshold" },
              "ComparisonOperator": "GreaterThanThreshold",
               ...
               ...
            }
}


Below is the configuration for the setting shown in the image above.

"TreatMissingData": "breaching" / "notBreaching"

“breaching”: missing data points are treated as breaching the threshold, that is, as ‘bad’

“notBreaching”: missing data points are treated as being within the threshold, that is, as ‘good’


Below is the configuration for the details shown in the image above.

"ECSAutoScaleUpAlarm": {
            "Type": "AWS::CloudWatch::Alarm",
            "Properties": {
              ...
              ...
 
              "AlarmActions": [ { "Ref": "ECSScaleUpPolicy" } ],
               
              ...
              ... 
             }
          },

The above invokes the action defined in the policy that AlarmActions refers to. AlarmActions is triggered only when the alarm is in the ALARM state. There are other states too, ‘OK’ and ‘INSUFFICIENT_DATA’, for which corresponding actions can be configured through the OKActions and InsufficientDataActions properties, as described in the documentation. In this example we trigger the action when the state is ALARM, hence AlarmActions is configured. The scale-up policy configuration is covered later in this article.

The resource type is set to ‘EC2 Container Service’. The dropdown offers two options, ‘Autoscaling’ and ‘EC2 Container Service’: ‘Autoscaling’ is for EC2 auto scaling and ‘EC2 Container Service’ is for ECS auto scaling. The dropdown is preselected based on the type of the referenced scaling policy ( ECSScaleDownPolicy or ECSScaleUpPolicy ), which is described later in the article.


Below is the configuration for the details shown in the image above.

"ECSAutoScaleUpAlarm": {
            "Type": "AWS::CloudWatch::Alarm",
            "Properties": {
              ....
              ....

              "Namespace": "AWS/ECS",
              "Statistic": "Maximum",
              "Period": "300",
              "MetricName": "CPUUtilization",
              "Dimensions": [
                { "Name": "ClusterName",  "Value": { "Ref"  :  "ClusterName" } },
                { "Name": "ServiceName",  "Value": { "Ref"  :  "ServiceName" } }
              ]
             }
}

This section of the configuration is very important: if it is incorrect, the actions will not be triggered.

Namespace : Make sure the value is ‘AWS/ECS’, so that CPUUtilization is read from the ECS metrics.

Dimensions : This property is very important; it defines which cluster and service the alarm is linked to.

For more information on dimensions, see the links below: http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CW_Support_For_AWS.html http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ecs-metricscollected.html

NOTE: If any of the above properties is missing, the link to the cluster service's auto scaling is lost and no action is triggered. Try configuring the same alarm manually through the AWS console; doing so makes it easier to understand the configuration above and to correlate it with what the console shows.
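To see how the fragments above fit together, here is a sketch of the whole alarm resource in one piece. It assembles only the properties shown in this article; the downloadable script may set others. The TreatMissingData value and the wording of AlarmDescription are my assumptions, and the template is assumed to declare the ServiceName, ClusterName and ECSScaleUpThreshold parameters that the fragments refer to. JSON has no comments, so the assumptions are also recorded in the resource's Metadata attribute.

```json
{
  "ECSAutoScaleUpAlarm": {
    "Type": "AWS::CloudWatch::Alarm",
    "Metadata": {
      "Note": "Illustrative sketch. TreatMissingData=notBreaching and the description text are assumptions, not values taken from the original script."
    },
    "Properties": {
      "AlarmName": { "Fn::Join": [ "", [ { "Ref": "ServiceName" }, "-ECSAutoScaleUpAlarm" ] ] },
      "AlarmDescription": "Scale up the ECS service when CPUUtilization stays above the threshold",
      "Namespace": "AWS/ECS",
      "MetricName": "CPUUtilization",
      "Dimensions": [
        { "Name": "ClusterName", "Value": { "Ref": "ClusterName" } },
        { "Name": "ServiceName", "Value": { "Ref": "ServiceName" } }
      ],
      "Statistic": "Maximum",
      "Period": "300",
      "EvaluationPeriods": "3",
      "Threshold": { "Ref": "ECSScaleUpThreshold" },
      "ComparisonOperator": "GreaterThanThreshold",
      "TreatMissingData": "notBreaching",
      "AlarmActions": [ { "Ref": "ECSScaleUpPolicy" } ]
    }
  }
}
```

With a Period of 300 seconds and 3 EvaluationPeriods, the metric has to stay above the threshold for roughly 15 minutes before the scale-up policy is invoked.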


ECSAutoScaleDownAlarm

The scale-down alarm is configured the same way as the scale-up alarm above, with the relevant properties changed for scaling down.
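As a rough illustration of what "the relevant properties" are, the sketch below mirrors the scale-up alarm and changes the name, the threshold, the comparison operator and the action. ECSScaleDownThreshold is a hypothetical parameter name that I introduce here; the remaining values are carried over from the scale-up alarm as an assumption and may differ in the actual script.

```json
{
  "ECSAutoScaleDownAlarm": {
    "Type": "AWS::CloudWatch::Alarm",
    "Metadata": {
      "Note": "Illustrative sketch. ECSScaleDownThreshold is a hypothetical parameter; Statistic, Period and EvaluationPeriods are copied from the scale-up alarm as an assumption."
    },
    "Properties": {
      "AlarmName": { "Fn::Join": [ "", [ { "Ref": "ServiceName" }, "-ECSAutoScaleDownAlarm" ] ] },
      "Namespace": "AWS/ECS",
      "MetricName": "CPUUtilization",
      "Dimensions": [
        { "Name": "ClusterName", "Value": { "Ref": "ClusterName" } },
        { "Name": "ServiceName", "Value": { "Ref": "ServiceName" } }
      ],
      "Statistic": "Maximum",
      "Period": "300",
      "EvaluationPeriods": "3",
      "Threshold": { "Ref": "ECSScaleDownThreshold" },
      "ComparisonOperator": "LessThanThreshold",
      "AlarmActions": [ { "Ref": "ECSScaleDownPolicy" } ]
    }
  }
}
```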


ECSScalableTarget

Now comes the section that configures the scale-up and scale-down policies referred to under AlarmActions in the alarms above. It sets up the auto scaling policy configuration for the cluster service; the path to it in the console is:

ECS console --> click cluster name --> click service name --> select autoscaling tab

In this section, the values below are required and must not be changed. They are fixed values.

"ScalableDimension" : "ecs:service:DesiredCount",
"ServiceNamespace" : "ecs"
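For context, this is roughly what the complete scalable target resource could look like around those two fixed values. The MinCapacity and MaxCapacity numbers and the ECSAutoScalingRoleArn parameter are placeholders I made up for illustration, and I assume that the ClusterName and ServiceName parameters hold plain names ( not ARNs ), since ResourceId for an ECS service takes the form service/cluster-name/service-name.

```json
{
  "ECSScalableTarget": {
    "Type": "AWS::ApplicationAutoScaling::ScalableTarget",
    "Metadata": {
      "Note": "Illustrative sketch. MinCapacity, MaxCapacity and the ECSAutoScalingRoleArn parameter are hypothetical placeholders."
    },
    "Properties": {
      "ServiceNamespace": "ecs",
      "ScalableDimension": "ecs:service:DesiredCount",
      "ResourceId": {
        "Fn::Join": [ "", [ "service/", { "Ref": "ClusterName" }, "/", { "Ref": "ServiceName" } ] ]
      },
      "MinCapacity": 1,
      "MaxCapacity": 4,
      "RoleARN": { "Ref": "ECSAutoScalingRoleArn" }
    }
  }
}
```

MinCapacity and MaxCapacity bound the desired task count, so no scaling policy can take the service below the minimum or above the maximum.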

If these values are not configured, the tab is displayed as shown below.


ECSScaleUpPolicy

This is the ECS auto scaling configuration, as indicated by the type of the policy: "Type": "AWS::ApplicationAutoScaling::ScalingPolicy".

"ScalingTargetId": { "Ref" : "ECSScalableTarget" },

StepScalingPolicyConfiguration is required only when the PolicyType is StepScaling.

"PolicyType": "StepScaling",
"StepScalingPolicyConfiguration": { }

When step scaling is configured, you can increase or decrease the task count either directly or by a percentage. Here I adjust by task count, and ScalingAdjustment decides how many tasks to scale by. A positive value increases the number of tasks and a negative value decreases it by the configured amount ( here, 1 task ).

"StepAdjustments": [
 {
   "MetricIntervalLowerBound": 0,
   "ScalingAdjustment": 1
 }
]
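Putting the pieces of this section together, a scale-up policy resource might look like the sketch below. ScalingTargetId, PolicyType and the step adjustment come from the fragments above; PolicyName, AdjustmentType, Cooldown and MetricAggregationType are values I picked for illustration and are not necessarily what the original script uses.

```json
{
  "ECSScaleUpPolicy": {
    "Type": "AWS::ApplicationAutoScaling::ScalingPolicy",
    "Metadata": {
      "Note": "Illustrative sketch. PolicyName, AdjustmentType, Cooldown and MetricAggregationType are assumptions."
    },
    "Properties": {
      "PolicyName": { "Fn::Join": [ "", [ { "Ref": "ServiceName" }, "-ECSScaleUpPolicy" ] ] },
      "PolicyType": "StepScaling",
      "ScalingTargetId": { "Ref": "ECSScalableTarget" },
      "StepScalingPolicyConfiguration": {
        "AdjustmentType": "ChangeInCapacity",
        "Cooldown": 60,
        "MetricAggregationType": "Maximum",
        "StepAdjustments": [
          {
            "MetricIntervalLowerBound": 0,
            "ScalingAdjustment": 1
          }
        ]
      }
    }
  }
}
```

AdjustmentType is what makes ScalingAdjustment a task count: with ChangeInCapacity the value is added to the current count, whereas PercentChangeInCapacity would interpret it as a percentage.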

Suppose you want to scale the number of tasks at different levels of CPUUtilization ( any other metric works too; CPU is just the one used in this example ), as shown below:

if 50 < CPUUtilization < 60 then increase by 1 task

if 60 <= CPUUtilization < 70 then increase by 2 tasks

if 70 <= CPUUtilization < 80 then increase by 4 tasks


This lets the task count increase in a stepwise manner, which is what StepScaling is for. To achieve it, two properties, MetricIntervalLowerBound and MetricIntervalUpperBound, define the boundaries of each step. Both are offsets from the alarm threshold, not absolute metric values.

So for the stepwise increments above, the required configuration is:

"StepAdjustments": [
 {
   "MetricIntervalLowerBound": 0,
   "MetricIntervalUpperBound": 10,
   "ScalingAdjustment": 1
 },
 {
   "MetricIntervalLowerBound": 10,
   "MetricIntervalUpperBound": 20,
   "ScalingAdjustment": 2
 },
 {
   "MetricIntervalLowerBound": 20,
   "MetricIntervalUpperBound": 30,
   "ScalingAdjustment": 4
 }
]

So if the threshold configured in the alarm ( i.e. ECSAutoScaleUpAlarm ) is, say, 50, the above configuration is interpreted as follows. For a scale-up policy the lower bound of each step is inclusive and the upper bound is exclusive; the first step starts strictly above 50 only because the alarm uses GreaterThanThreshold.

( Threshold + MetricIntervalLowerBound of first element ) 50 < CPUUtilization < 60 ( Threshold + MetricIntervalUpperBound of first element ) then increase by 1 task

( Threshold + MetricIntervalLowerBound of second element ) 60 <= CPUUtilization < 70 ( Threshold + MetricIntervalUpperBound of second element ) then increase by 2 tasks

( Threshold + MetricIntervalLowerBound of third element ) 70 <= CPUUtilization < 80 ( Threshold + MetricIntervalUpperBound of third element ) then increase by 4 tasks
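One thing to watch with the intervals above: a CPUUtilization reading of 80 or more falls outside every step, because the last step is capped at threshold + 30. If the intent is to add 4 tasks for anything from 70 upwards, a possible variant is to leave MetricIntervalUpperBound out of the last step, since an omitted upper bound is treated as positive infinity. This is a sketch of that variant, not the configuration from the original script:

```json
{
  "StepAdjustments": [
    { "MetricIntervalLowerBound": 0,  "MetricIntervalUpperBound": 10, "ScalingAdjustment": 1 },
    { "MetricIntervalLowerBound": 10, "MetricIntervalUpperBound": 20, "ScalingAdjustment": 2 },
    { "MetricIntervalLowerBound": 20, "ScalingAdjustment": 4 }
  ]
}
```

With a threshold of 50 this reads as: above 50 and below 60 add 1 task, from 60 up to but not including 70 add 2 tasks, and from 70 upwards add 4 tasks.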


ECSScaleDownPolicy

The scale-down policy is similar to the scale-up policy, except for two changes:

First, ScalingAdjustment holds a negative value.

"StepAdjustments": [
 {
   "MetricIntervalUpperBound" : 0,
   "ScalingAdjustment": -1
 }
]

Second, the bounding configuration is inverted. As the example above shows, the lower bound is replaced by an upper bound and vice versa, since the alarm now fires when the metric drops below the threshold and the step must cover the range under it.

NOTE: If this is not done, the ECS auto scaling policy shows an empty value, as shown in the figure below, and the ECS tasks cannot be scaled down.

ECS console --> click cluster name --> click service name --> update service button --> configure autoscaling --> And click on the scale down policy
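To illustrate the inverted bounds with more than one step, here is a hedged sketch of a scale-down policy that removes tasks in two stages. The offsets -10 and 0, together with PolicyName, AdjustmentType, Cooldown and MetricAggregationType, are values I chose for the example and are not taken from the original script. Assuming the scale-down alarm uses LessThanThreshold with a threshold of 30, the steps read as: above 20 and below 30 remove 1 task; 20 or lower remove 2 tasks.

```json
{
  "ECSScaleDownPolicy": {
    "Type": "AWS::ApplicationAutoScaling::ScalingPolicy",
    "Metadata": {
      "Note": "Illustrative sketch. Step offsets, PolicyName, AdjustmentType, Cooldown and MetricAggregationType are assumptions."
    },
    "Properties": {
      "PolicyName": { "Fn::Join": [ "", [ { "Ref": "ServiceName" }, "-ECSScaleDownPolicy" ] ] },
      "PolicyType": "StepScaling",
      "ScalingTargetId": { "Ref": "ECSScalableTarget" },
      "StepScalingPolicyConfiguration": {
        "AdjustmentType": "ChangeInCapacity",
        "Cooldown": 60,
        "MetricAggregationType": "Maximum",
        "StepAdjustments": [
          {
            "MetricIntervalLowerBound": -10,
            "MetricIntervalUpperBound": 0,
            "ScalingAdjustment": -1
          },
          {
            "MetricIntervalUpperBound": -10,
            "ScalingAdjustment": -2
          }
        ]
      }
    }
  }
}
```

Below the threshold the inclusivity flips compared with scale-up: the upper bound of each step is inclusive and the lower bound is exclusive, and an omitted lower bound stands for negative infinity.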

