Internet-Accesible ElastiCache Server Behind Twemproxy, Using NLB and ASG

0:00 / 0:00

Summary

In this post, we get into setting up an internet-accessible ElastiCache server using AWS services. Learn how to bypass the “VPC-isolated” limitation of the ElastiCache service while utilizing it for its robust capabilities. This tutorial requires a basic understanding of AWS and is ideal for those aiming to leverage Redis as more than just a cache. Redis is not only a high-performance in-memory cache but also a data structure store supporting various types: lists, sets, hashes, streams, and more. It offers features like persistence, pub/sub, and clustering, making it versatile for a variety of applications beyond caching.

‍

Overview

Redis is an open source, in-memory Key-Value database, meant for low-latency read/write activity. It is typically used as cache, but it’s not its only possible application. It can also be used as a NoSQL database.

If you came across this post, you probably might want to implement a Redis server. And you considered convenient to avoid taking care of the routine maintenance, security measures and updates that imply having such an asset. This is where cloud providers become convenient, so you probably thought AWS’s managed service ElastiCache could be a good choice.

‍

Goal

By design, ElastiCache clusters can only connect to resources inside the VPC it’s running on. In this solution, the Redis database will be accesible from anywhere in the internet.

This of course has some implications that must be taken into consideration before implementing this solution into your workload.

The goal of this blog is to illustrate how such an architecture would be, as well as to provide CloudFormation templates to test it for yourself. These templates can be found in this GitHub repository.

‍

Considerations

Typical Redis use cases imply high read and read-write activity, which means latency is a metric that should be taken into account. Network performance through the internet is not consistent and might be subject latency and bandwidth variations. So be sure that the link between your application and the database is stable.
Exposing a Redis server to the internet requires an extra layer of security, securing access either by filtering traffic by IP address range or setting up a password and using the AUTH command. In-transit encryption should be also implemented.

‍

Solution

To sort out this access limitation, an internet exposed proxy instance can be implemented. This comes with its own challenges and limitations, but the idea is to use a Redis proxy. There are multiple Redis proxy options available for free on the internet: RedisLabs/redis-cluster-proxy, Twitter/twemproxy, Nginx among others.

Twitter’s (or now 𝕏’s) twemproxy will be used, as suggested in this Trivago tech post, for being able to manage persistent connections with the Redis database. This is specially useful in contexts in which write activity is as high as read activity, and helps reducing latency by reducing the amount of connection opening and closing.

These proxy instances will not be directly exposed to the internet, but rather behind an AWS Network Load Balancer (NLB).

The load balancer plays a main security role, as it terminates TLS connections and allows to have twemproxy instances in a private subnet. Offloading encrypted packages using NLB avoids decrypting in the Redis instances, which degrades performance. AWS states that their NLB scales “infinitely”. Of course this is priced accordingly. Communication between twemproxy and ElastiCache is done through TCP. Twemproxy instances are launched by an autoscaling group (ASG), providing both fault tolerance and high availability by scaling horizontally. ASG allows to specify how many instances are going to be up for any given time, and how much it can scale in or out.

Desired Capacity: Initial capacity maintained by the group.
Minimum Capacity: Lower limit, prevents capacity reduction under the set value.
Maximum Capacity: Upper limit, prevents capacity from increasing beyond the set value.

For ElastiCache settings, this architecture allows for both Cluster Mode enabled and disabled (the comparison between them can be found here), as twemproxy configuration takes any amount of Redis Nodes and has multiple consistent sharding options. In the scope of this article, the ElastiCache cluster will be running on Cluster Mode disabled.

‍

Cloudformation Stacks

The resources will be split in three different Cloudformation (CFN) stacks, separating VPC, ElastiCache and the NLB/ASG resources.

‍

💡You should always use infrastructure as Code (laC) for your cloud computing projects

‍

Prerequisites

In order to follow this tutorial you need to have the following resources:

ACM Certificate
R53 Hosted Zone
EC2 KeyPair (you can easily create one in the AWS Management Console)
SSM Key to store the twemproxy password. Use key/value using token: password

It’s important to note that having an ACM Certificate implies that you own a domain name. You will need to specify it during the creation of the NLB/ASG stack. The SSL certificate is needed for TLS termination, otherwise you would need to use plain TCP communication through the internet which is dangerous and not recommended.

‍

Tutorial

Setting up the network

We first need to deploy the required network sources to our AWS account. Upload the VPC stack template in CloudFormation and specify the private and public subnets’ IP ranges, and the VPC IP range.

‍

This template will create a NAT Gateway, necessary to provide internet access to instances without public IPs. This resource is charged by hourly usage, so remember to shut it down after testing.

‍

Creating the ElastiCache cluster

To start the ElastiCache cluster we will be using a CFN stack that is already configured for Redis 7.0. You can change this stack to modify some of the cluster configurations. For now, we will specify a total of 2 nodes with a small instance size and use the private subnets we just created.

‍

The cluster provisioning is rather slow-ish, so be patient while AWS sets your database up. Once it’s done, we need to gather info from the resources created. We need to obtain the following info:

ElastiCache cluster’s primary endpoint: Go to ElastiCache > Redis Clusters > your_cluster_name and look for Primary endpoint
Redis port: It’s by default set to be 6379
Redis security group: Created by the template as your_stack_name-redis-sg
Certificate Arn: The ARN of your ACM Certificiate
Hosted Zone: The hosted zone id, found in R53 > Hosted Zones > your_hz_name
Subdomain name: This is the whole domain name for your endpoint, for example redis.yourdomain.com
Token Secret: You have to create a Secrets Manager key/value secret to store the password for the Redis proxy.

Once you noted this parameters you are ready to deploy the NLB/ASG stack. This values can be exported as outputs in a stack containing this resources, this was not done for the sake of simplicity.

‍

Launching the Redis proxy

Finally, we need to deploy the twemproxy stack, using the parameters gathered in the previous step. This template will launch a Network Load Balancer, whose domain name will be the one specified as subdomain name, which terminates TLS connections to the instances in the autoscaling group. This instances will be exact replicas of a certain LaunchTemplate, a sort of blueprint that defines an EC2 setup and startup. There will be as much instances as stated in the DesiredInstances parameter.

‍

‍

Once the stack is launched, we will be able to access the Redis database, using the server name we defined. The following is an example of a Redis client initialized in Ruby to connect through TLS.

‍

‍

Performance

We built a simple Ruby script to test connection performance, using an EC2 located in a different VPC in the same region. This simulates the case of the connection going through the internet. It achieved an average latency of ~0.26ms

‍

‍

Pricing

The following list of considerations per service must be taken into account to calculate the operating costs of this solution:

ElastiCache for Redis: Pricing depends on the type and size of the instance you use and the region in which your instances are run.
EC2 Instances: These are used for twemproxy. Cost will be determined by the instance type, region, and the number of instances you run.
Network Load Balancer (NLB): NLB pricing is based on the number of Load Balancer hours, plus the number of Load Balancer Capacity Units (LCU) used.
NAT Gateway: You’re billed for each NAT Gateway-hour that your NAT Gateway is provisioned and available. You can turn it off after setting instances up.
Data Transfer: There are costs associated with data going in and out of AWS services.
ACM Certificate: While the certificates themselves are free, if you use the AWS Certificate Manager’s Private Certificate Authority, there is a monthly fee.
Route 53: This has a small cost for domain registration and DNS service.
Secrets Manager: Pricing is based on the number of secrets stored and the number of API calls made.
CloudFormation: There’s no additional charge for CloudFormation. You pay for AWS resources (such as EC2 instances, EBS volumes, etc.) created using CloudFormation.

During the making of this solution, we had a daily spend of roughly ~4USD, where ~1.5USD was the ElastiCache. This accounts for about 100USD per month in fixed costs. Of course, depending on you needs and capacity, the spend will scale accordingly. A more detailed spending amount can be calculated using AWS Pricing Calculator.

‍

Conclusion

In this exploration, we’ve demonstrated how to bridge the inherent limitations of AWS’ ElastiCache service to make a Redis server accessible over the internet. While ElastiCache’s VPC-bound nature provides a robust layer of isolation, certain applications and scenarios necessitate broader accessibility. By utilizing AWS services judiciously, in conjunction with twemproxy, we’ve successfully crafted a solution that maintains the security and performance characteristics vital to Redis operations.

To those aiming to leverage the capabilities of Redis beyond its traditional use-cases, this tutorial serves as a starting point, illustrating the potential and flexibility of combining cloud-native services with open-source solutions. As always, the evolution of technology warrants continuous learning and adaptation to maximize the benefits of such integrations.

‍

Stay ahead of the curve on the latest trends and insights in big data, machine learning and artificial intelligence. Don't miss out and subscribe to our newsletter!

‍

Summary

‍

Overview

‍

Goal

By design, ElastiCache clusters can only connect to resources inside the VPC it’s running on. In this solution, the Redis database will be accesible from anywhere in the internet.

This of course has some implications that must be taken into consideration before implementing this solution into your workload.

‍

Considerations

Typical Redis use cases imply high read and read-write activity, which means latency is a metric that should be taken into account. Network performance through the internet is not consistent and might be subject latency and bandwidth variations. So be sure that the link between your application and the database is stable.
Exposing a Redis server to the internet requires an extra layer of security, securing access either by filtering traffic by IP address range or setting up a password and using the AUTH command. In-transit encryption should be also implemented.

‍

Solution

These proxy instances will not be directly exposed to the internet, but rather behind an AWS Network Load Balancer (NLB).

Desired Capacity: Initial capacity maintained by the group.
Minimum Capacity: Lower limit, prevents capacity reduction under the set value.
Maximum Capacity: Upper limit, prevents capacity from increasing beyond the set value.

‍

Cloudformation Stacks

The resources will be split in three different Cloudformation (CFN) stacks, separating VPC, ElastiCache and the NLB/ASG resources.

‍

💡You should always use infrastructure as Code (laC) for your cloud computing projects

‍

Prerequisites

In order to follow this tutorial you need to have the following resources:

ACM Certificate
R53 Hosted Zone
EC2 KeyPair (you can easily create one in the AWS Management Console)
SSM Key to store the twemproxy password. Use key/value using token: password

‍

Tutorial

Setting up the network

We first need to deploy the required network sources to our AWS account. Upload the VPC stack template in CloudFormation and specify the private and public subnets’ IP ranges, and the VPC IP range.

‍

This template will create a NAT Gateway, necessary to provide internet access to instances without public IPs. This resource is charged by hourly usage, so remember to shut it down after testing.

‍

Creating the ElastiCache cluster

‍

The cluster provisioning is rather slow-ish, so be patient while AWS sets your database up. Once it’s done, we need to gather info from the resources created. We need to obtain the following info:

ElastiCache cluster’s primary endpoint: Go to ElastiCache > Redis Clusters > your_cluster_name and look for Primary endpoint
Redis port: It’s by default set to be 6379
Redis security group: Created by the template as your_stack_name-redis-sg
Certificate Arn: The ARN of your ACM Certificiate
Hosted Zone: The hosted zone id, found in R53 > Hosted Zones > your_hz_name
Subdomain name: This is the whole domain name for your endpoint, for example redis.yourdomain.com
Token Secret: You have to create a Secrets Manager key/value secret to store the password for the Redis proxy.

Once you noted this parameters you are ready to deploy the NLB/ASG stack. This values can be exported as outputs in a stack containing this resources, this was not done for the sake of simplicity.

‍

Launching the Redis proxy

‍

‍

Once the stack is launched, we will be able to access the Redis database, using the server name we defined. The following is an example of a Redis client initialized in Ruby to connect through TLS.

‍

‍

Performance

‍

‍

Pricing

The following list of considerations per service must be taken into account to calculate the operating costs of this solution:

ElastiCache for Redis: Pricing depends on the type and size of the instance you use and the region in which your instances are run.
EC2 Instances: These are used for twemproxy. Cost will be determined by the instance type, region, and the number of instances you run.
Network Load Balancer (NLB): NLB pricing is based on the number of Load Balancer hours, plus the number of Load Balancer Capacity Units (LCU) used.
NAT Gateway: You’re billed for each NAT Gateway-hour that your NAT Gateway is provisioned and available. You can turn it off after setting instances up.
Data Transfer: There are costs associated with data going in and out of AWS services.
ACM Certificate: While the certificates themselves are free, if you use the AWS Certificate Manager’s Private Certificate Authority, there is a monthly fee.
Route 53: This has a small cost for domain registration and DNS service.
Secrets Manager: Pricing is based on the number of secrets stored and the number of API calls made.
CloudFormation: There’s no additional charge for CloudFormation. You pay for AWS resources (such as EC2 instances, EBS volumes, etc.) created using CloudFormation.

‍

Conclusion

‍

Stay ahead of the curve on the latest trends and insights in big data, machine learning and artificial intelligence. Don't miss out and subscribe to our newsletter!

‍

Thank you! The file will start to download shortly

Oops! Something went wrong while submitting the form.

Internet-Accessible ElastiCache Server Behind Twemproxy, Using NLB and ASG

Summary

Overview

Goal

Considerations

Solution

Cloudformation Stacks

💡You should always use infrastructure as Code (laC) for your cloud computing projects

Prerequisites

Tutorial

Setting up the network

Creating the ElastiCache cluster

Launching the Redis proxy

Performance

Pricing

Conclusion

Stay ahead of the curve on the latest trends and insights in big data, machine learning and artificial intelligence. Don't miss out and subscribe to our newsletter!

Download youre-book today!

Summary

Overview

Goal

Considerations

Solution

Cloudformation Stacks

💡You should always use infrastructure as Code (laC) for your cloud computing projects

Prerequisites

Tutorial

Setting up the network

Creating the ElastiCache cluster

Launching the Redis proxy

Performance

Pricing

Conclusion

Stay ahead of the curve on the latest trends and insights in big data, machine learning and artificial intelligence. Don't miss out and subscribe to our newsletter!

Related Articles

Related Articles

AI Transformation Challenge

AI Transformation Challenge

Download your
e-book today!