AWS Cost Optimization: Automating RDS and EC2 Instance Management

by: Ashish Sharma

April 05, 2024

Optimizing costs while maintaining operational efficiency is a top priority for businesses running on Amazon Web Services (AWS). Automating resource management is one of the most dependable ways to get there. In this blog post, we'll look at how to save money on RDS and EC2 instances by stopping and starting them on a schedule, using AWS Lambda, Amazon EventBridge, and Terraform.

Automated RDS and EC2 Instance Management

Managing non-production instances efficiently is crucial for cost optimization. An effective strategy is to automate the stopping and starting of instances based on predefined schedules. To implement this, we'll introduce a tagging mechanism where instances that need to be managed have a tag named "schedule" set to "true". This approach ensures flexibility and granularity in managing instances based on specific requirements.

Utilizing Tags for Instance Management

The "schedule" tag acts as an opt-in switch: only instances tagged "schedule" = "true" are picked up by the automation, so it can be applied selectively. Tag keys and values are case-sensitive, so "True" or "Schedule" will not match. Tags also make it easier to track and categorize instances across the AWS environment.
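
As a rough illustration of the selection rule: the EC2 and RDS describe calls both return tags as a list of Key/Value dictionaries (under `Tags` and `TagList` respectively), so one small helper can decide whether a resource has opted in. This is a minimal sketch; the name `is_scheduled` and the sample tag lists are hypothetical and not taken from the repository.

```python
def is_scheduled(tags, key="schedule", value="true"):
    """Return True if an AWS-style tag list contains the given key/value pair."""
    return any(t.get("Key") == key and t.get("Value") == value for t in tags or [])


# Tag lists shaped like the ones AWS returns for EC2 ('Tags') and RDS ('TagList')
dev_tags = [{"Key": "env", "Value": "dev"}, {"Key": "schedule", "Value": "true"}]
prod_tags = [{"Key": "env", "Value": "prod"}]

print(is_scheduled(dev_tags))   # True
print(is_scheduled(prod_tags))  # False
print(is_scheduled(None))       # False (untagged resources may have no tag list)
```

The comparison is exact, so a resource tagged "schedule" = "True" would be skipped.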

Automated Management of RDS Instances

Architecture Overview:

Lambda Function: A Lambda function will be created to stop and start RDS instances based on predefined schedules.
Amazon EventBridge: EventBridge will trigger the Lambda function at specified times using cron expressions.
Terraform Configuration: Terraform will provision and manage the necessary infrastructure components.

Implementation Steps:

Lambda Function (Python):

import boto3

def lambda_handler(event, context):
    action = event.get('action')
    if action == 'stop':
        stop_rds_instances()
    elif action == 'start':
        start_rds_instances()

def stop_rds_instances():
    rds = boto3.client('rds')
    instances = get_instances_with_tag('schedule', 'true') # Get instances with tag 'schedule' set to 'true'
    
    for instance in instances:
        rds.stop_db_instance(DBInstanceIdentifier=instance)

def start_rds_instances():
    rds = boto3.client('rds')
    instances = get_instances_with_tag('schedule', 'true') # Get instances with tag 'schedule' set to 'true'
    
    for instance in instances:
        rds.start_db_instance(DBInstanceIdentifier=instance)

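def get_instances_with_tag(tag_key, tag_value):
    # RDS instances must be listed through the RDS API;
    # ec2.describe_instances would only return EC2 instance IDs
    rds = boto3.client('rds')
    instances = []
    for page in rds.get_paginator('describe_db_instances').paginate():
        for db in page['DBInstances']:
            # Tags come back as a list of {'Key': ..., 'Value': ...} dicts
            if any(t['Key'] == tag_key and t.get('Value') == tag_value for t in db.get('TagList', [])):
                instances.append(db['DBInstanceIdentifier'])
    return instances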
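
One detail the handler above does not cover: RDS only accepts a stop request for an instance in the "available" state and a start request for one in the "stopped" state, so calling either API on every tagged instance can raise an error partway through the loop. The sketch below shows one way to filter on status first. It works on plain dictionaries shaped like the entries of `describe_db_instances()['DBInstances']`; the function name `select_rds_targets` and the sample records are hypothetical.

```python
def select_rds_targets(db_instances, action, tag_key="schedule", tag_value="true"):
    """Pick identifiers of tagged DB instances whose status allows the action."""
    # A stop needs an 'available' instance; a start needs a 'stopped' one
    required_status = {"stop": "available", "start": "stopped"}[action]
    targets = []
    for db in db_instances:
        tagged = any(
            t.get("Key") == tag_key and t.get("Value") == tag_value
            for t in db.get("TagList", [])
        )
        if tagged and db.get("DBInstanceStatus") == required_status:
            targets.append(db["DBInstanceIdentifier"])
    return targets


# Sample records (hypothetical identifiers)
scheduled = [{"Key": "schedule", "Value": "true"}]
sample = [
    {"DBInstanceIdentifier": "dev-db", "DBInstanceStatus": "available", "TagList": scheduled},
    {"DBInstanceIdentifier": "qa-db", "DBInstanceStatus": "stopped", "TagList": scheduled},
    {"DBInstanceIdentifier": "prod-db", "DBInstanceStatus": "available", "TagList": []},
]

print(select_rds_targets(sample, "stop"))   # ['dev-db']
print(select_rds_targets(sample, "start"))  # ['qa-db']
```

The returned identifiers can then be passed one by one to `stop_db_instance` or `start_db_instance`.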

Amazon EventBridge Configuration:

Create two rules in EventBridge to trigger the Lambda function at specific times, one for stopping instances and another for starting instances.

Terraform Configuration:

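resource "aws_cloudwatch_event_rule" "stop_rds_instances" {
  name                = "stop-rds-instances"
  schedule_expression = "cron(0 21 ? * MON-FRI *)" # 21:00 UTC, Monday to Friday; adjust as needed
}

resource "aws_cloudwatch_event_target" "invoke_lambda_stop" {
  rule = aws_cloudwatch_event_rule.stop_rds_instances.name
  arn  = aws_lambda_function.instance_manager.arn
  input = jsonencode({
    action = "stop"
  })
}

# EventBridge needs permission to invoke the function (resource and statement names are illustrative)
resource "aws_lambda_permission" "allow_stop_rds_rule" {
  statement_id  = "AllowStopRdsRule"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.instance_manager.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.stop_rds_instances.arn
}

resource "aws_cloudwatch_event_rule" "start_rds_instances" {
  name                = "start-rds-instances"
  schedule_expression = "cron(0 7 ? * MON-FRI *)" # 07:00 UTC, Monday to Friday; adjust as needed
}

resource "aws_cloudwatch_event_target" "invoke_lambda_start" {
  rule = aws_cloudwatch_event_rule.start_rds_instances.name
  arn  = aws_lambda_function.instance_manager.arn
  input = jsonencode({
    action = "start"
  })
}

resource "aws_lambda_permission" "allow_start_rds_rule" {
  statement_id  = "AllowStartRdsRule"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.instance_manager.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.start_rds_instances.arn
}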

Automated Management of EC2 Instances

Architecture Overview:

Lambda Function: Another Lambda function will be created to stop and start EC2 instances based on predefined schedules.
Amazon EventBridge: EventBridge will trigger the Lambda function at specified times using cron expressions.
Terraform Configuration: Terraform will provision and manage the necessary infrastructure components.

Implementation Steps:

Lambda Function (Python):

import boto3

def lambda_handler(event, context):
    action = event.get('action')
    if action == 'stop':
        stop_ec2_instances()
    elif action == 'start':
        start_ec2_instances()

def stop_ec2_instances():
    ec2 = boto3.client('ec2')
    instances = get_instances_with_tag('schedule', 'true') # Get instances with tag 'schedule' set to 'true'
    
    for instance in instances:
        ec2.stop_instances(InstanceIds=[instance])

def start_ec2_instances():
    ec2 = boto3.client('ec2')
    instances = get_instances_with_tag('schedule', 'true') # Get instances with tag 'schedule' set to 'true'
    
    for instance in instances:
        ec2.start_instances(InstanceIds=[instance])

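def get_instances_with_tag(tag_key, tag_value):
    ec2 = boto3.client('ec2')
    # Paginate so that accounts with many instances are fully covered,
    # and skip terminated instances, which can no longer be started
    paginator = ec2.get_paginator('describe_instances')
    pages = paginator.paginate(Filters=[
        {'Name': f'tag:{tag_key}', 'Values': [tag_value]},
        {'Name': 'instance-state-name', 'Values': ['running', 'stopped']},
    ])
    instances = []
    for page in pages:
        for reservation in page['Reservations']:
            for instance in reservation['Instances']:
                instances.append(instance['InstanceId'])
    return instances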
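
The handler above issues one API call per instance. Since `stop_instances` and `start_instances` both accept a list of instance IDs, the same work can be done in a few batched calls. The following is a sketch under stated assumptions: `apply_action`, the batch size of 50, and the `RecordingClient` stand-in (used here so the example runs without AWS access) are all hypothetical; in the Lambda you would pass `boto3.client('ec2')` instead.

```python
def apply_action(ec2_client, instance_ids, action, batch_size=50):
    """Stop or start instances in batches; returns the batches that were sent."""
    calls = {"stop": ec2_client.stop_instances, "start": ec2_client.start_instances}
    if action not in calls:
        raise ValueError(f"Unsupported action: {action!r}")
    batches = [
        instance_ids[i:i + batch_size]
        for i in range(0, len(instance_ids), batch_size)
    ]
    for batch in batches:
        calls[action](InstanceIds=batch)
    return batches


class RecordingClient:
    """Stand-in for an EC2 client that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def stop_instances(self, InstanceIds):
        self.calls.append(("stop", InstanceIds))

    def start_instances(self, InstanceIds):
        self.calls.append(("start", InstanceIds))


client = RecordingClient()
apply_action(client, ["i-0a", "i-0b", "i-0c"], "stop", batch_size=2)
print(client.calls)  # [('stop', ['i-0a', 'i-0b']), ('stop', ['i-0c'])]
```

Rejecting unknown actions explicitly also makes a misconfigured EventBridge payload visible in the logs instead of silently doing nothing.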

Amazon EventBridge Configuration:

Create two rules in EventBridge to trigger the Lambda function at specific times, one for stopping instances and another for starting instances.

Terraform Configuration:

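# Target names carry an "ec2" suffix so they do not clash with the RDS targets
# if both configurations live in the same module. aws_lambda_function.instance_manager
# refers to the EC2 function here; give the two functions distinct names in a shared module.
resource "aws_cloudwatch_event_rule" "stop_ec2_instances" {
  name                = "stop-ec2-instances"
  schedule_expression = "cron(0 21 ? * MON-FRI *)" # 21:00 UTC, Monday to Friday; adjust as needed
}

resource "aws_cloudwatch_event_target" "invoke_lambda_stop_ec2" {
  rule = aws_cloudwatch_event_rule.stop_ec2_instances.name
  arn  = aws_lambda_function.instance_manager.arn
  input = jsonencode({
    action = "stop"
  })
}

# EventBridge needs permission to invoke the function (resource and statement names are illustrative)
resource "aws_lambda_permission" "allow_stop_ec2_rule" {
  statement_id  = "AllowStopEc2Rule"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.instance_manager.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.stop_ec2_instances.arn
}

resource "aws_cloudwatch_event_rule" "start_ec2_instances" {
  name                = "start-ec2-instances"
  schedule_expression = "cron(0 7 ? * MON-FRI *)" # 07:00 UTC, Monday to Friday; adjust as needed
}

resource "aws_cloudwatch_event_target" "invoke_lambda_start_ec2" {
  rule = aws_cloudwatch_event_rule.start_ec2_instances.name
  arn  = aws_lambda_function.instance_manager.arn
  input = jsonencode({
    action = "start"
  })
}

resource "aws_lambda_permission" "allow_start_ec2_rule" {
  statement_id  = "AllowStartEc2Rule"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.instance_manager.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.start_ec2_instances.arn
}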

Assumptions for Cost Savings Calculation:

For RDS:

Number of non-production RDS instances: 2
Cost per RDS instance per hour: $0.10

For EC2:

Number of non-production EC2 instances: 3
Cost per EC2 instance per hour: $0.05

For both:

Hours stopped per instance per day: 12 (an illustrative figure; the example cron schedules above stop instances for 10 hours each weeknight and from Friday evening to Monday morning)
Only the hourly instance charge is counted; storage and snapshot charges continue while an instance is stopped.

Cost Savings Calculation:

For RDS:

Savings per RDS instance per day: $0.10 * 12 hours = $1.20
Total savings per day (for two instances): $1.20 * 2 = $2.40
Monthly savings (30 days): $2.40 * 30 = $72.00

For EC2:

Savings per EC2 instance per day: $0.05 * 12 hours = $0.60
Total savings per day (for three instances): $0.60 * 3 = $1.80
Monthly savings (30 days): $1.80 * 30 = $54.00
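
The arithmetic above can be reproduced with a few lines of Python, which also makes it easy to plug in your own rates and instance counts. This is a sketch only: the function names are hypothetical, the rates and the 12 stopped hours per day are the illustrative figures from this post, and `weekly_stopped_hours` assumes a schedule like the example cron expressions (start at 07:00 and stop at 21:00, Monday to Friday).

```python
from decimal import Decimal


def monthly_savings(hourly_rate, instance_count, stopped_hours_per_day, days=30):
    """Hourly instance charge avoided per month while instances are stopped."""
    return Decimal(hourly_rate) * instance_count * stopped_hours_per_day * days


def weekly_stopped_hours(start_hour=7, stop_hour=21, running_days=5):
    """Hours per week an instance is stopped when it only runs on weekdays."""
    running = (stop_hour - start_hour) * running_days
    return 7 * 24 - running


rds_savings = monthly_savings("0.10", 2, 12)
ec2_savings = monthly_savings("0.05", 3, 12)
print(rds_savings, ec2_savings, rds_savings + ec2_savings)  # 72.00 54.00 126.00
print(weekly_stopped_hours())  # 98
```

With the example schedule, instances are stopped for 98 hours a week, or 14 hours a day on average, so the 12-hour figure used in the calculation is slightly conservative.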

Complete code here: https://github.com/18-ashish-sharma/cost-optimisation-terraform

Conclusion:

Automating the stop and start of non-production RDS and EC2 instances is a straightforward way to reduce AWS spending. With AWS Lambda, Amazon EventBridge, and Terraform, the schedule is defined once as code and runs without manual intervention. In the example above, five small instances stopped for 12 hours a day save roughly $126 per month.

Because only instances tagged "schedule" = "true" are affected, the automation can be applied selectively and everything else stays untouched. The same pattern extends to more instances and different working hours by adjusting the tags and cron expressions.