Do you want to learn how to build high-performance, highly available connectivity to the AWS cloud?
In this cloud networking article, we will discuss the options for connecting to the cloud: high availability connections, performance tuning, and optimizing the connectivity between your organization and the cloud.
Connecting To The Cloud (VPN vs Direct Connection)
Let’s first look at how you connect to the cloud. You can use a VPN or a direct connection. What is a VPN? A VPN connects your location to the AWS cloud by creating an IPsec tunnel and encrypting your data as it crosses a public network like the internet. Routing information can be shared across the VPN link. Essentially, the VPN is a private connection over the internet. It’s low cost and easy to set up, but internet performance is not guaranteed, so VPN performance is neither consistent nor guaranteed.
A VPN connection leaves your location and travels across the internet, which could include many internet service providers. You may have a great connection to your ISP, but that does not mean the rest of the internet service providers along the way have that same performance or availability. There are no guarantees in internet performance. This means VPNs are NOT reliable connections as bandwidth and latency are best-effort and not guaranteed.
While VPNs are low-cost and fast to set up, they are just not good enough when you have performance requirements. So, if you need guaranteed performance, you must use a direct connection, which is the equivalent of a private line in the networking world. With a direct connection, you get guaranteed performance, meaning consistent latency and bandwidth. You won’t have to worry about variations in latency, called jitter. You also won’t have to worry about the bandwidth not being there.
The direct connect gives you a performance benefit, but how does it work? It’s not exactly a private line from your organization to your AWS virtual private cloud (VPC). Unlike in your private network, you can’t just buy a wire from your data center directly to the AWS VPC. So, here’s what it really looks like.
What Does an AWS Direct Connection Look Like?
In your on-premises environment, you have a router. That router has a WAN (wide area network) connection to the direct connect location, where it reaches your router at that location. Next, your router at the direct connect location needs to be connected to the AWS switch. This is referred to as a cross-connect, where your service provider plugs your router into the AWS switch with an Ethernet cable. From there your connection is backhauled over the AWS network to your VPC. This is how a direct connection works. What if a single direct connection is not good enough because you need more bandwidth? Then use a link aggregation group, which bundles several physical direct connections into a single logical connection. This increases speed and redundancy.
There are a couple of ways to achieve redundancy with your direct connect. Some options offer greater performance, and some offer greater redundancy, and we’ll discuss both.
When a VPN backup isn’t enough, most organizations connect their on-premises environment to two separate direct connect locations. This is done through two different service providers, one for each direct connect location.
In this scenario, if a single point of presence were to fail, you’re covered. If the router sitting in one direct connect location were to fail, you’re covered through the other direct connect location. If one service provider has a failure, the other service provider would be there. This is how you typically set up a super high availability environment with redundant network connections.
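To put a rough number on that redundancy, here is a minimal sketch of the standard parallel-availability formula. It assumes the two paths fail independently, which is only approximately true in practice, and the 99.9% figure is a hypothetical value chosen for illustration, not an AWS or carrier SLA.

```python
def combined_availability(path_availabilities):
    """Availability of independent parallel paths: 1 - product of (1 - a_i)."""
    all_down = 1.0
    for availability in path_availabilities:
        all_down *= (1.0 - availability)  # probability that every path is down
    return 1.0 - all_down


# Hypothetical per-path availability, for illustration only.
single_path = 0.999
dual_path = combined_availability([single_path, single_path])

print(f"one direct connect path:  {single_path:.6f}")
print(f"two independent paths:    {dual_path:.6f}")
```

Under these assumptions, the chance of both paths being down at once is the product of the individual failure probabilities. That is why different providers and different locations matter: shared equipment or a shared provider breaks the independence assumption.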
Now let’s talk about the routing. Let’s say you have a connection in the US through AT&T to one direct connect location. You also have another connection to another direct connect location through Verizon. Now you’re connecting through different direct connect locations and through different service providers. This creates a challenge when you route your traffic. The latencies are going to be different on the different service providers. This can lead to asymmetric routing, which means your traffic may go to AWS via Verizon and come back via AT&T. This can create out-of-order packets. To learn more about how to prevent this and how to load share safely across redundant connections, please see our article on BGP.
BGP has a decision algorithm that looks at weight, local preference, shortest AS path, and so forth. By default, this decision algorithm selects a single best path for each prefix, so only one of those links carries the traffic. If you have two WAN links through different direct connect locations, you need to do something about your routing. In your VPC, there is a CIDR range, which is an aggregate route or summary route of all the subnets you potentially have. When you have multiple direct connects, you’ll advertise the same summary or aggregate route across both links, in both BGP peering sessions.
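The first few steps of that decision process can be sketched as follows. This is a simplified illustration, not a full BGP implementation: it compares only weight, local preference, and AS path length, and the `Path` class, the link names, and the private AS number 64512 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    weight: int       # higher wins (router-local, supported by some vendors)
    local_pref: int   # higher wins
    as_path: tuple    # fewer AS hops wins


def best_path(paths):
    """Pick one path: highest weight, then highest local pref, then shortest AS path."""
    return min(paths, key=lambda p: (-p.weight, -p.local_pref, len(p.as_path)))


# Two hypothetical paths to the same prefix; link-b has a prepended AS path.
paths = [
    Path("link-a", weight=0, local_pref=100, as_path=(64512,)),
    Path("link-b", weight=0, local_pref=100, as_path=(64512, 64512, 64512)),
]

print(best_path(paths).name)  # link-a
```

Only one path comes out as the winner, which is why advertising the same prefix over two links does not share load by itself.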
When it comes to routing, the more specific route will always be chosen over the less specific route. Here’s what you’ll do when you have two links. On the top link, you will advertise some specific subnets over that BGP peering session. On the bottom link, you’ll advertise some other specific subnets. Each BGP peering session then carries more specific routes to certain subnets. Now, traffic from the on-premises environment to your VPC will follow the more specific routes. By doing this, you can load share, because different subnets are reached through different BGP peering sessions. But you’re still advertising the summary or aggregate route, which is your VPC CIDR range, across both. Routes also need to be advertised back to the organization’s datacenter. For simplicity, this is not shown in the picture.
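Here is a minimal sketch of that longest-prefix-match behavior, using Python’s standard `ipaddress` module. The addressing is hypothetical: a VPC CIDR of 10.0.0.0/16 with two /24 subnets, and link names "top" and "bottom" that mirror the description above. The `pick_link` helper is an illustration, not how a router is implemented.

```python
import ipaddress


def pick_link(destination, advertisements):
    """Return the link advertising the longest prefix that covers destination."""
    dest = ipaddress.ip_address(destination)
    best = None  # (link name, matching network)
    for link, prefixes in advertisements.items():
        for prefix in prefixes:
            net = ipaddress.ip_network(prefix)
            if dest in net and (best is None or net.prefixlen > best[1].prefixlen):
                best = (link, net)
    return best[0] if best else None


# Both links carry the summary route; each also carries one specific subnet.
advertisements = {
    "top":    ["10.0.0.0/16", "10.0.1.0/24"],
    "bottom": ["10.0.0.0/16", "10.0.2.0/24"],
}

print(pick_link("10.0.1.10", advertisements))  # top
print(pick_link("10.0.2.10", advertisements))  # bottom

# Simulate losing the top link: the /16 on the bottom link still covers 10.0.1.10.
surviving = {"bottom": advertisements["bottom"]}
print(pick_link("10.0.1.10", surviving))       # bottom
```

With both links up, each subnet follows its own /24. When one link goes away, its /24 disappears with it and the shared /16 on the surviving link keeps the subnet reachable. A destination covered only by the /16 on both links would tie in this sketch, and a real router would break that tie with the BGP decision process.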
With this setup, if either one of the links fails, you still have network layer reachability for everything. That’s how you load share across multiple connections, across different direct connect locations. But what if you need higher performance? What if the 10-gigabit Ethernet interface at the AWS direct connect location isn’t enough? What if you need two connections? What if you need three? What if you need four? Now, what do you do?
Those of us who have worked in switching environments will remember port aggregation, for example Cisco’s Port Aggregation Protocol or the standards-based LACP. With port aggregation, you could take four 10 gig links and bundle them together so they would logically look like a 40 gig link. But in order to do this, some requirements have to be met. The links must have the same speed and the same latency.
You can’t do a link aggregation group across multiple service providers. It should be on the same service provider, as you need consistent latencies. With a link aggregation group, you can bundle up to four links and make them look like a single port. Because this is a single port, you’re not going to have to worry about out-of-order packet delivery. If you only use one of these connections, you also won’t have to worry about fancy BGP configuration along the way. But if you’re using 40 gigs on your primary connection, what happens when you fail over to a 1 gig or 10 gig connection? To solve this issue, you’ll set up another link aggregation group through another direct connect location.
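One common reason a bundle avoids out-of-order delivery is that devices typically hash each flow onto a single member link, so all packets of that flow take the same path. The sketch below illustrates the idea only. Real switches and routers use their own vendor-specific hash functions rather than SHA-256, and the `member_link` helper, the addresses, and the ports are hypothetical.

```python
import hashlib


def member_link(flow, link_count):
    """Deterministically map a flow 5-tuple to one member link index."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % link_count


# Hypothetical flow: (source IP, destination IP, protocol, source port, destination port)
flow = ("10.1.1.5", "10.0.1.10", 6, 49152, 443)

# Every packet of the same flow lands on the same member of a 4-link bundle.
choices = [member_link(flow, 4) for _ in range(3)]
print(choices[0] == choices[1] == choices[2])  # True
```

A consequence of this design is that a single flow is limited to the bandwidth of one member link. The 40 gig aggregate is reached only when many flows spread across the four links.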
You can also use more than one link aggregation group at the same time. If you combine multiple link aggregation groups, you’ll need to load share again. You must do the same thing we talked about with regard to routing earlier. You’ll advertise the CIDR range over both links. Then, on one BGP peering session, you’ll advertise more specific routes to certain subnets, and on the other BGP peering session you’ll advertise more specific routes to the remaining subnets.
AWS Cloud Networking Summary
We’ve talked about why you would use a direct connection versus a VPN connection. We talked about redundant connectivity using a direct connection with a VPN backup. We’ve talked about using a second direct connection to back up the first. And then we talked about load sharing over multiple direct connections through BGP. We covered link aggregation groups, and then we talked about load sharing across multiple link aggregation groups using separate BGP peering sessions. You should now have a solid understanding of direct connections. Add this information to your existing cloud computing knowledge and you are on your way to your dream cloud career.