Managing Baseline AMIs at Scale with Systems Manager and CloudFormation

Managing baseline AMIs is critical to providing a secure, quick-to-production solution. Working from outdated images that need compliance tweaks, or applying updates that were never tested through continuous integration of the updated image, can result in unexpected behavior, frustration between well-meaning teams, and ultimately additional friction in delivering value to stakeholders and customers.

Management of the AMI and the methods used to build the images are outside the scope of this post. Assuming a well-developed and tested image is in place, how does this image get distributed across multiple accounts? How is the image communicated to its consumers at the local account level while management of the image stays at the organizational level?

(As of the creation of this post, AWS does not support sharing AMIs with all members of an Organization or Organizational Unit. When that becomes available, I will expand on the usefulness of that feature elsewhere.)

For now, assume the existence of a creation process that results in AMIs being shared securely with the needed accounts using some sort of automated process (an example). Any modification to an existing AMI must result in a newly created AMI ID. The next step is to ensure that all relevant stakeholders adopt this new image as part of their build process. Communicating a new AMI ID to multiple teams across hundreds of accounts can be addressed using an alias.

AMI aliases allow consumers of AMIs to reference the AMI ID in a logical way, as well as have a static reference in code for a dynamically changing resource on the backend. Aliases are created using SSM Parameter Store. Like the method used by AWS to provide public parameters for their managed AMIs, users can create account-specific parameters to reference in code. With the AMI being shared from a central account, the only missing component is a simple way to reference a centrally shared AMI using local parameters.

Deploying an SSM parameter to all member accounts of an Organization can be accomplished using a CloudFormation StackSet. As seen below, the CloudFormation resource for the SSM parameter identifies the parameter name being created for every account. (This example shows how naming can be logically provided in a hierarchical manner.)

"UnixRHEL7": {
  "Type": "AWS::SSM::Parameter",
  "Properties": {
    "DataType": "aws:ec2:image",
    "Description": "v0.0.2",
    "Name": "/ami/linux/rhel7/latest",
    "Tier": "Standard",
    "Type": "String",
    "Value": "ami-a1a1a1a1a1a1a1a1a"
  }
}
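
To make the StackSet side concrete, here is a minimal sketch of how the StackSet itself could be declared in the central account. It assumes service-managed permissions (trusted access between CloudFormation StackSets and AWS Organizations already enabled) and that the parameter template has been uploaded to S3. The stack set name, OU ID, region, and template URL are hypothetical placeholders, not values from this post.

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Sketch only: stack set name, OU ID, region, and template URL are placeholders",
  "Resources": {
    "AmiAliasStackSet": {
      "Type": "AWS::CloudFormation::StackSet",
      "Properties": {
        "StackSetName": "baseline-ami-aliases",
        "Description": "Deploys AMI alias parameters to member accounts",
        "PermissionModel": "SERVICE_MANAGED",
        "AutoDeployment": {
          "Enabled": true,
          "RetainStacksOnAccountRemoval": false
        },
        "StackInstancesGroup": [
          {
            "DeploymentTargets": {
              "OrganizationalUnitIds": ["ou-ab12-cdefgh34"]
            },
            "Regions": ["us-east-1"]
          }
        ],
        "TemplateURL": "https://example-bucket.s3.amazonaws.com/ami-alias.json"
      }
    }
  }
}
```

With AutoDeployment enabled, accounts that later join the targeted OU should receive the parameter stack automatically. Because AMI IDs are region-specific, the regions listed here would need to match the regions in which the AMI is actually shared.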

Once the AMI has been shared and the parameter deployed, ongoing management of the image is transparent to the consumers of the AMI. This can introduce another set of problems: continuous delivery of the image must not create an ever-changing landscape on which solutions are being built.

The overall workflow:

  1. A CloudFormation StackSet deploys stack instances to all member accounts with which the AMI is shared.
  2. The local stack creates an AMI alias in Systems Manager Parameter Store.
  3. Newly launched EC2 instances use the Parameter Store value to map back to the centrally shared AMI.
  4. Lifecycle management continues as updates are made to the base AMI, while the alias keeps the newest image available for consumption.
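
On the consumer side, step 3 might look like the following sketch of a member-account template. It relies on CloudFormation's SSM parameter types to resolve the alias to an AMI ID whenever the stack is created or updated. The logical names BaselineAmi and AppInstance and the instance type are illustrative assumptions, and the instance is assumed to launch into a default VPC.

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Sketch only: logical names and instance type are placeholders",
  "Parameters": {
    "BaselineAmi": {
      "Type": "AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>",
      "Default": "/ami/linux/rhel7/latest"
    }
  },
  "Resources": {
    "AppInstance": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "ImageId": { "Ref": "BaselineAmi" },
        "InstanceType": "t3.micro"
      }
    }
  }
}
```

The consumer's code only ever names the alias, so a new AMI ID published centrally is picked up on the next stack create or update without any template change.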

By providing a central location for the lifecycle management of AMIs, operations teams can integrate with other teams using infrastructure as code. Providing standard channels for consuming approved images reduces the need to remediate baseline configuration drift in rapid development scenarios.
