Keeping your monitored node list in sync with the dynamic lifetimes of EC2 instances within an Auto Scaling Group can be cumbersome. Not only must the list of instances itself be kept in sync, but the connection and credential information that allows each instance to be scanned must also be regularly configured.
This article begins by describing how an Auto Scaling Group manages the lifecycle of EC2 instances, and what that means for scanning those nodes with UpGuard Core. It then provides a technical walkthrough of how to solve this problem with a scheduled sync job.
This process only works if the EC2 instances share the same connection method and credentials. That is, if the instances in an auto scaling group use SSH, then every EC2 instance that ever exists in the group must be accessible for a node scan via the same username and either the same password or a common SSH public key.
How does an AWS Auto Scaling Group work?
Auto Scaling Groups provide a dynamic way to scale your infrastructure to meet demand in real-time. That is, when traffic to your website is low, a small number of instances are live. When traffic is high, more instances are spun up to meet the load. For the lifetime of a particular EC2 instance, this could be anywhere from a few minutes to days depending on fluctuations in demand. Another important fact to note is that once an Auto Scaling Group EC2 instance is terminated it is terminated forever.
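To make the lifecycle concrete, here is a minimal sketch of how a sync process might identify the instances currently live in a group. The dictionary shape mirrors the AWS `DescribeAutoScalingGroups` API response (for example, as returned by boto3's autoscaling client); the group name and instance IDs are illustrative, not real values.

```python
def in_service_instance_ids(response: dict) -> list:
    """Extract the IDs of instances that are up and passing health checks."""
    ids = []
    for group in response.get("AutoScalingGroups", []):
        for instance in group.get("Instances", []):
            # Terminated instances disappear from this list forever;
            # only InService instances are candidates for a node scan.
            if instance.get("LifecycleState") == "InService":
                ids.append(instance["InstanceId"])
    return ids

# Example response, trimmed to just the fields used above.
response = {
    "AutoScalingGroups": [{
        "AutoScalingGroupName": "production-auto-scaling-group",
        "Instances": [
            {"InstanceId": "i-0abc", "LifecycleState": "InService"},
            {"InstanceId": "i-0def", "LifecycleState": "Terminating"},
        ],
    }]
}

print(in_service_instance_ids(response))  # ['i-0abc']
```

Because the set of InService instances changes continuously, any snapshot like this goes stale quickly, which is why the sync below runs on a schedule.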
In the context of monitoring each of these EC2 instances with UpGuard Core, there are two important points to consider:
- Keeping the list of monitored nodes in UpGuard in sync with those that are currently live as part of the group. This includes both adding newly detected instances and retiring terminated instances; and
- When a new node is added, making sure the node can be scanned. That is, making sure the connection method (WinRM or SSH), the hostname and port to connect to, and the account credentials are all correctly set.
Setting up a Scheduled Sync Job
This guide outlines how to add an Auto Scaling Group node and how to create a sync job based on that node that keeps monitored EC2 instances in sync with those alive in AWS. Once created, the Auto Scaling Group node will live in an auto-created node group with the same name as the node, with member nodes added and removed as instances come and go in the auto scaling group. The sync job allows you to sync the outer EC2 config nodes, the inner EC2 host instances (the Windows or Linux VMs themselves), or both.
Adding an Auto Scaling Group Node
The first step is to add an Auto Scaling Group node. In addition to being monitorable as a valid node in its own right, the settings and credentials associated with this node:

- assist the sync job in listing member EC2 instances;
- provide a basis for credentials to be inherited by EC2 config node types; and
- provide a base for the associated auto-created node group.
To create a new Auto Scaling Group node, navigate to Discover > Add Nodes, then use the search bar to locate the AWS Auto Scaling Group node type. Click the node type, then click Go Agentless.
Select the connection manager group to use - the Default group should be fine, unless you have a custom network layout. Enter the name of the Auto Scaling Group, the region and your AWS credentials. Then click Scan Node to finish the registration process and perform an initial scan of the node. For more information on the AWS security group permissions required to scan and sync from an Auto Scaling Group node, please see our guide on Security Group Permissions.
This creates a standalone Auto Scaling Group node that behaves just like any other node: it can have change reporting and policies applied to it. The next section outlines how this node can be combined with a sync job to keep your list of auto scaled EC2 instances up to date.
Creating a Sync Job
To create an AWS Auto Scaling Group Sync Job, navigate to Control > Job Schedule, then click Add Scheduled Job. Select the AWS Auto Scaling Group Sync Job Type, then select the auto scaling group node you want to base this job on. Here, we are creating a scheduled job based on our production-auto-scaling-group node and scheduling it to sync just the VM/host nodes every 15 minutes. The nodes will be synchronized using their Public DNS property as the hostname to attempt the node scan with.
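Since the Public DNS property becomes the scan hostname, it helps to see where that value comes from. The sketch below extracts it from a dictionary shaped like the EC2 `DescribeInstances` API response; the instance IDs and DNS name are illustrative.

```python
def public_dns_by_instance(response: dict) -> dict:
    """Map instance IDs to their Public DNS names, which the sync job
    uses as the hostname when attempting a node scan."""
    mapping = {}
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            dns = instance.get("PublicDnsName")
            if dns:  # instances with no public DNS name are skipped
                mapping[instance["InstanceId"]] = dns
    return mapping

# Example response, trimmed to just the fields used above.
response = {
    "Reservations": [{
        "Instances": [
            {"InstanceId": "i-0abc",
             "PublicDnsName": "ec2-203-0-113-10.compute-1.amazonaws.com"},
            {"InstanceId": "i-0def", "PublicDnsName": ""},  # no public DNS yet
        ],
    }]
}

print(public_dns_by_instance(response))
```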
Click Create Scheduled Job. The job will first run once the defined interval has elapsed from saving. When a job runs, it collects all EC2 nodes already known to be part of that Auto Scaling Group and sends them to the connection manager along with the associated Auto Scaling Group node's connection details. The connection manager downloads a list of the EC2 nodes currently in the auto scaling group and cross references it against the known list it was sent.
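The cross-referencing step amounts to a set comparison between the nodes the appliance already knows about and the instances the connection manager just found live in AWS. The following sketch illustrates the idea; the function and variable names are illustrative rather than UpGuard's actual internals.

```python
def plan_sync(known_nodes: set, live_instances: set):
    """Cross-reference the appliance's known EC2 nodes against the
    instances currently live in the Auto Scaling Group."""
    to_add = live_instances - known_nodes     # newly launched: register and scan
    to_retire = known_nodes - live_instances  # terminated: soft delete, keep history
    to_verify = known_nodes & live_instances  # still alive: re-check state and group
    return to_add, to_retire, to_verify

known = {"i-0aaa", "i-0bbb", "i-0ccc"}
live = {"i-0bbb", "i-0ccc", "i-0ddd"}
to_add, to_retire, to_verify = plan_sync(known, live)
print(to_add)     # {'i-0ddd'}
print(to_retire)  # {'i-0aaa'}
```

The three resulting sets correspond directly to the behaviours described below: new nodes are added, still-alive nodes are double-checked, and terminated nodes are retired.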
Pre-existing nodes that are still alive are double-checked against the appliance to make sure they exist in the correct state, are in the correct node group, and are still internally associated with this auto scaling group.
Any EC2 nodes already being monitored before the job first ran will remain in place but be added to the auto scaling group's node group. Any EC2 nodes sitting in a detected state from a previous AWS integration detect job will automatically be promoted to active/monitored and will have the correct login credentials inherited from another EC2 instance already in the node group.
Newly detected EC2 config nodes are added into the appliance and are added into the associated node group. The connection details for these nodes are inherited from the Auto Scaling Group node as the API only needs to be queried to scan the configuration of these node types.
Newly detected EC2 host nodes are also added into the appliance and into the associated node group. Since it is impossible to get the actual connection details (SSH or WinRM) of the VM itself from the AWS API (for good security reasons), the connection details are inherited from the most recently added EC2 host node of the same OS and connection type. That is, when adding a new Windows EC2 instance, the appliance looks to the most recently added (before this one) Windows EC2 instance for WinRM connection details such as port, username and password. As the hostname of each EC2 instance is unique, the hostname is automatically provided to the appliance as part of the registration of the node.
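The inheritance rule above can be sketched as a lookup over previously added host nodes. The `HostNode` shape and all field names here are illustrative assumptions, not UpGuard's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class HostNode:
    hostname: str
    os: str            # e.g. "windows" or "linux"
    connection: str    # e.g. "winrm" or "ssh"
    credentials: dict  # port, username, password/key reference, etc.
    added_at: datetime

def inherit_connection_details(os: str, connection: str, existing: list):
    """Borrow connection details from the most recently added host node
    of the same OS and connection type."""
    candidates = [n for n in existing if n.os == os and n.connection == connection]
    if not candidates:
        return None  # first node of this type: credentials must be seeded manually
    donor = max(candidates, key=lambda n: n.added_at)
    return donor.credentials

nodes = [
    HostNode("web-1", "windows", "winrm",
             {"port": 5986, "username": "scan"}, datetime(2023, 1, 1)),
    HostNode("web-2", "windows", "winrm",
             {"port": 5985, "username": "svc-scan"}, datetime(2023, 6, 1)),
]
print(inherit_connection_details("windows", "winrm", nodes))
```

Note that when no matching node exists the lookup returns nothing, which is exactly the first-run gap discussed next.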
Since EC2 host nodes inherit the connection details from the most recently added EC2 host node of the same OS and connection type in the same Auto Scaling Group, this process doesn't work the very first time you sync EC2 host nodes into the appliance. A good practice is either to seed the group by manually adding an EC2 host node with the correct connection details, or to wait for the first run of the job to complete, then select all detected nodes and bulk edit the connection details to be correct.
Terminated nodes will be retired from the appliance. This means they are soft deleted: the historic node scan data remains available, but they no longer have policies attached, are no longer in an active/monitored state, and therefore will not be scanned on the normal scan schedule.
For more information on creating an AWS integration, please view our guide on AWS Integration.