Integrate F5 Load Balancers into VMware Cloud on AWS SDDC Environment

With the recent release of VMware Cloud on AWS SDDC version 1.18, we have introduced a range of advanced networking capabilities that open up many interesting new use cases. Customers can now utilise the NSX Manager UI (or the VMC Policy API) to configure route aggregation at the SDDC level, which provides an efficient way to work around the 100-route Direct Connect (DX) limit. Customers can also create additional Tier-1 Compute Gateways (Multi-CGWs) with static route injection capabilities to address requirements such as network multi-tenancy, overlapping IPv4 environments, and integration with 3rd-party network & security appliances. You can read more about the new features here.

In this article we will focus on the use case of integrating 3rd-party load balancers into VMware Cloud on AWS. Specifically, we will look at how to deploy and integrate an HA pair of F5 BIG-IP Local Traffic Manager (LTM) Virtual Edition (VE) appliances into an SDDC cluster.

We will utilise the Route Aggregation and Multi-CGW features to create an inline load balancing topology and integrate the F5 LTMs within the lab SDDC cluster. Traffic from external networks towards the web servers will be routed through the F5s, and the client source addresses are preserved (no SNAT is required, and there is no need to configure X-Forwarded-For at the web servers).

Lab Prerequisites

  • Deploy a VMware Cloud on AWS SDDC cluster (ver 1.18+)
  • Access to F5 BIG-IP LTM VE (I’m using v16.1.2; a 30-day trial is available here)
  • Access to an AWS account that is linked to the SDDC (so you can test connectivity via the connected VPC or VMware Transit Connect)
  • Deploy 2x web servers in the SDDC for the LTM load balancing pool
Lab Procedures

I won’t cover every detailed step but at a high level we’ll need to perform the following tasks:

  1. configure SDDC route aggregation in NSX Manager (so that Multi-CGW segment routes are advertised externally)
  2. create 3x Tier-1 CGWs as per the below lab topology (1x routed CGW-LB-F5 for the F5 Outside interfaces, 1x isolated CGW-LB-WEB for the F5 Inside interfaces and the web segment, and 1x isolated CGW-LB-HA for F5 HA communications)
  3. create the relevant network segments and attach them to the above 3x CGWs accordingly
  4. configure static routes at the CGW-LB-F5 and CGW-LB-WEB for ingress and egress transit routing
  5. deploy the F5 LTM HA pair and configure network settings
  6. configure LTM load balancing settings (Nodes, Pool, VIP) and run tests
F5 Integration Lab Topology

To begin, we will first configure SDDC route aggregation in the NSX Manager UI. This leverages an AWS-managed prefix-list to announce summarised routes externally, so the Multi-CGW segments are accessible from the connected VPC and the Intranet (Direct Connect or VMware Transit Connect).

Within the NSX Manager UI, go to Networking > Global Configurations > Route Aggregation and create an aggregation prefix-list to summarise the SDDC CIDR block.

Then create a route configuration to announce the prefix-list to the INTRANET endpoint — since I’m using VMware Transit Connect for my SDDC external connectivity, the summarised routes will be advertised to the VTGW.
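The same two steps can also be driven through the VMC Policy API. The sketch below is an illustration only: the NSX host, token, object IDs, prefix, and the exact endpoint paths are all assumptions, so verify them against the VMC API reference for your SDDC version before use.

```shell
# Hypothetical values - substitute your own reverse-proxy URL, CSP token and prefix.
NSX_HOST="nsx-mgr.example.vmc.vmware.com"
TOKEN="<csp-api-token>"

# 1. Create an aggregation prefix-list summarising the SDDC CIDR block
curl -sk -X PUT \
  -H "csp-auth-token: ${TOKEN}" -H "Content-Type: application/json" \
  "https://${NSX_HOST}/cloud-service/api/v1/infra/external/route/aggregations/agg-01" \
  -d '{ "display_name": "agg-01", "prefixes": [ "" ] }'

# 2. Announce the prefix-list to the INTRANET connectivity endpoint
curl -sk -X PUT \
  -H "csp-auth-token: ${TOKEN}" -H "Content-Type: application/json" \
  "https://${NSX_HOST}/cloud-service/api/v1/infra/external/route/configs/cfg-01" \
  -d '{ "display_name": "cfg-01",
        "aggregation_route_config_paths": [ "/infra/external/route/aggregations/agg-01" ],
        "connectivity_path": "/infra/external/connectivities/intranet" }'
```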

Back in the VMC console, we can verify the summarised route is being advertised by the SDDC under Networking & Security > Transit Connect > Advertised Routes. Note the SDDC mgmt route will not be summarised and will always be advertised explicitly.


Go to the NSX Manager again and create 3x Tier-1 CGWs as per the lab topology. Note we need to select the “routed” type for CGW-LB-F5 in order to inject a static route towards the F5 for the web server segment, and the “isolated” type is required for CGW-LB-WEB in order to inject a default route towards the F5.


Next, configure the below network segments as per the lab topology and attach them to the 3x CGWs accordingly. Note that VM-MGMT-NET01 is created at the default CGW to host the F5 LTM management interfaces, which use a separate management route table.


Additionally, configure CGW-LB-F5 with a static route (for the LB-F5-WEB01 segment) towards the F5 — the next-hop will be the Outside interface floating IP shared between the LTM HA pair.

Similarly, configure CGW-LB-WEB with a default route towards the F5 — the next-hop will be the Inside interface floating IP shared between the LTM HA pair.
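For reference, Tier-1 static routes can also be created through the NSX Policy API. This is a sketch under assumptions: the host, token, gateway IDs, route names, and the next-hop address are placeholders for this lab, not values from the article.

```shell
# Hypothetical values - substitute your own NSX endpoint, token and IDs.
NSX_HOST="nsx-mgr.example.vmc.vmware.com"
TOKEN="<csp-api-token>"

# Default route on the isolated CGW-LB-WEB, next hop = F5 Inside floating IP (placeholder)
curl -sk -X PUT \
  -H "csp-auth-token: ${TOKEN}" -H "Content-Type: application/json" \
  "https://${NSX_HOST}/policy/api/v1/infra/tier-1s/cgw-lb-web/static-routes/default-to-f5" \
  -d '{ "display_name": "default-to-f5",
        "network": "",
        "next_hops": [ { "ip_address": "", "admin_distance": 1 } ] }'
```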


We are now ready to deploy and configure the F5 LTM VE appliances. For the purpose of the demo I will only show the key network configurations of the LTM01.

Once the appliances are deployed and the systems have been initialised, go to each LTM management UI to configure the local network settings. First, create the data VLANs for each interface under Network > VLANs — note that all VLANs here are internal to the F5 and must be untagged at each interface, as VLAN trunking to a guest VM is not supported by VMware Cloud on AWS at this stage.

Next, configure the local interface IP addresses under Network > Self IPs.

Then add the static routes, including the default route, under Network > Routes.
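The same local network settings can be applied from the BIG-IP CLI with tmsh. This is a sketch only: the VLAN names, interface numbers, and addresses are lab-specific placeholders, not values from the article.

```shell
# VLANs are internal to the F5 and untagged on each interface (no guest VLAN trunking in VMC)
tmsh create net vlan OUTSIDE interfaces add { 1.1 { untagged } }
tmsh create net vlan INSIDE  interfaces add { 1.2 { untagged } }
tmsh create net vlan HA      interfaces add { 1.3 { untagged } }

# Local (non-floating) self-IPs - addresses are placeholders
tmsh create net self outside-self address vlan OUTSIDE allow-service default
tmsh create net self inside-self  address  vlan INSIDE  allow-service default
tmsh create net self ha-self      address  vlan HA      allow-service default

# Default route towards the CGW-LB-F5 downlink (placeholder next hop)
tmsh create net route default-gw network default gw
tmsh save sys config
```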

At this stage, you are ready to add the peer device and create an HA failover device group. Once the device group is created and the HA pair is in sync, you can create additional HA floating IP addresses for both the Inside and Outside interfaces.

Note here for the floating IPs you’ll need to apply a floating traffic group (I’m using the default traffic-group-1).
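In tmsh terms, the floating self-IPs differ from the local ones only in that they carry a floating traffic group. A sketch, with placeholder addresses and device-group name:

```shell
# Floating self-IPs owned by traffic-group-1, so they fail over with the active unit
tmsh create net self outside-float address vlan OUTSIDE traffic-group traffic-group-1 allow-service default
tmsh create net self inside-float  address  vlan INSIDE  traffic-group traffic-group-1 allow-service default

# Push the change to the peer (device-group name is a placeholder)
tmsh run cm config-sync to-group my-device-group
```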


Finally, we are ready to configure the load balancing settings on the F5 LTM HA pair for the workloads deployed in the SDDC. For this lab I have deployed two simple Linux VMs with Apache web servers.

First, create 2x nodes for the web servers under Local Traffic > Nodes:

Second, create an LB pool at Local Traffic > Pools with the 2x nodes, and select an appropriate Health Monitor and Load Balancing Method.

Lastly, go to Local Traffic > Virtual Servers and deploy a HTTP VIP for the web service using the LB pool we have just created.
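The three steps above map to a handful of tmsh commands. This is a sketch: the node addresses, pool/VIP names, and the VIP address are placeholders for this lab.

```shell
# Nodes for the two web servers (placeholder addresses)
tmsh create ltm node web01 address
tmsh create ltm node web02 address

# Pool with an HTTP health monitor and basic Round Robin
tmsh create ltm pool web-pool members add { web01:80 web02:80 } monitor http load-balancing-mode round-robin

# HTTP virtual server; note no source-address-translation is set, because
# the F5s run inline (routed) and must preserve client source IPs
tmsh create ltm virtual web-vip destination ip-protocol tcp pool web-pool
tmsh save sys config
```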

Assuming everything is configured correctly you should see the VIP coming online straight away, and you can also verify the service status at Local Traffic > Network Map:

Now hit the VIP address in your browser and you should see traffic is being load balanced between the two nodes (since we selected the basic Round Robin LB method).

And because the F5s are deployed in inline (routed) mode without SNAT, the web servers can see the original client source IPs.
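A quick command-line check of both behaviours, using a hypothetical VIP address (substitute your own):

```shell
# is a placeholder for the VIP - with Round Robin, repeated
# requests should alternate between the two web servers.
for i in 1 2 3 4; do
  curl -s
done

# Meanwhile, "tail -f /var/log/httpd/access_log" (or your distro's Apache
# access log) on the web servers should show the real client source IPs,
# since no SNAT is applied by the F5s.
```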

Provision and integrate iSCSI storage with VMware Cloud on AWS using Amazon FSx for NetApp ONTAP

With the recently announced Amazon FSx for NetApp ONTAP, it is very exciting that, for the first time, we have a fully managed ONTAP file system in the cloud! What’s more, we can now deliver high-performance block storage to workloads running on VMware Cloud on AWS (VMC) through a first-party Amazon managed service!

In this post I will walk you through a simple example of provisioning and integrating iSCSI-based block storage to a Windows workload running in a VMC environment using Amazon FSx for NetApp ONTAP. For this demo I’ve provisioned the FSx service in a shared service VPC, which is connected to the VMC SDDC cluster through an AWS Transit Gateway (TGW) via VPN attachment (as per the below diagram).

Depending on your environment or requirements, you can also leverage a VMware Transit Connect (or VTGW) to provide high speed VPC connections between the shared service VPC and VMC, or simply provision the FSx service in the connected VPC so no TGW/VTGW is required.

AWS Configuration

To begin, I simply go to the AWS console, select FSx from the service category, and provision an Amazon FSx for NetApp ONTAP file system in my preferred region. As a quick summary, I have used the below settings:

  • SSD storage capacity: 1024GB (min 1024GB, max 192TB)
  • sustained throughput capacity: 512MB/s
  • Multi-AZ (ONTAP cluster) deployment
  • 1x storage virtual machine (svm01) to provide the iSCSI service
  • 1x default volume (/vol01) of 200GB to host the iSCSI LUNs
  • storage efficiency (deduplication/compression etc.): enabled
  • capacity pool tiering policy: enabled
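The same settings can be sketched with the AWS CLI. All IDs below are placeholders, and the parameters simply mirror the console choices listed above:

```shell
# Multi-AZ ONTAP file system: 1024GB SSD, 512MB/s throughput (placeholder subnet IDs)
aws fsx create-file-system \
    --file-system-type ONTAP \
    --storage-capacity 1024 \
    --subnet-ids subnet-aaaa1111 subnet-bbbb2222 \
    --ontap-configuration "DeploymentType=MULTI_AZ_1,ThroughputCapacity=512,PreferredSubnetId=subnet-aaaa1111"

# 1x SVM for the iSCSI service (placeholder file-system ID)
aws fsx create-storage-virtual-machine \
    --file-system-id fs-0123456789abcdef0 --name svm01

# 1x 200GB volume with storage efficiency and capacity-pool tiering enabled
aws fsx create-volume --volume-type ONTAP --name vol01 \
    --ontap-configuration "StorageVirtualMachineId=svm-0123456789abcdef0,JunctionPath=/vol01,SizeInMegabytes=204800,StorageEfficiencyEnabled=true,TieringPolicy={Name=AUTO}"
```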

After around a 20min wait, the FSx ONTAP file system will be provisioned and ready for service. If you are using the above settings you should see a summary page similar to the one below. You can also retrieve the management endpoint IP address under the “Network & Security” tab.

Note the management addresses (for both the cluster and the SVMs) are automatically allocated from a floating range, and the same address block also provides the floating IPs for the NFS/SMB service (so customers don’t have to change the file share mount point address during an ONTAP cluster failover). Since this subnet is not natively deployed in a VPC, AWS will automatically inject the endpoint addresses (for management and NFS/SMB) into the specific VPC route tables based on your configuration.

However, you’ll need to manually inject a static route for this range (see below) on the TGW/VTGW, especially if you are planning to provide NFS/SMB services to the VMC SDDCs over peering connections — see here for more details.

Conversely, this static route is not required if you are only using iSCSI, as the iSCSI endpoints are provisioned directly onto the native subnets hosting the FSx service and do not use the floating IP range — more on this later.
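On a customer-managed TGW, that static route can be sketched with one AWS CLI call. The route-table and attachment IDs are placeholders, and the destination CIDR would be the FSx floating endpoint range chosen at provisioning time (elided here):

```shell
# Steer the FSx floating endpoint range towards the shared-service VPC attachment
aws ec2 create-transit-gateway-route \
    --transit-gateway-route-table-id tgw-rtb-0123456789abcdef0 \
    --destination-cidr-block \
    --transit-gateway-attachment-id tgw-attach-0123456789abcdef0
```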

Next, we’ll verify the SVM (svm01) and volume (vol01) status and make sure they are online and healthy before provisioning iSCSI LUNs. Note: you’ll always see a separate root volume created automatically for each SVM.

Now click “svm01” to dive into the details, and you’ll find the iSCSI endpoint IP addresses (again, they are in the native VPC subnets, not the mgmt floating IP range).


We are now ready to move on to iSCSI LUN provisioning. This can be done using either the ONTAP API or the ONTAP CLI, which is what I’m using here. First, we’ll SSH into the cluster management IP and verify the SVM and volume status.

Since this is a fully managed service, the iSCSI service has already been activated on the SVM, and the cluster is listening for iSCSI sessions on the 2x subnets across both AZs. You’ll also find the iSCSI target name here.

Now we’ll create a 20GB LUN for the Windows client running on VMC.

Next, create an igroup to include the Windows client iSCSI initiator. Notice the ALUA feature is enabled by default — this is pretty cool as we can test iSCSI MPIO as well 🙂

Finally, map the igroup to the LUN we have just created, and make sure the LUN is now in “mapped” status. We are all done here!
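Put together, the ONTAP CLI steps look roughly like this. The LUN path and igroup name are illustrative, and the initiator IQN is a placeholder for the Windows client's actual IQN:

```shell
# 20GB Windows LUN on vol01 (svm01 and vol01 are from this lab)
lun create -vserver svm01 -path /vol/vol01/win_lun01 -size 20GB -ostype windows

# igroup holding the Windows client's iSCSI initiator (IQN is a placeholder)
igroup create -vserver svm01 -igroup win01 -protocol iscsi -ostype windows \
       -initiator iqn.1991-05.com.microsoft:win-client.example.local

# Map the igroup to the LUN and confirm it shows as mapped
lun map -vserver svm01 -path /vol/vol01/win_lun01 -igroup win01
lun show -vserver svm01
```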

Windows Client Setup

On the Windows client (running on VMC), launch the iSCSI initiator configuration and enter the iSCSI IP address of one of the FSx subnets in “Quick Connect”. Windows will automatically discover the available targets on the FSx side and log into the fabric.

Optionally, you can add a secondary storage I/O path if MPIO is installed/enabled on the Windows client. In my example here, I have added a second iSCSI session using another iSCSI endpoint address in a different FSx subnet/AZ.

Now click “Auto Configure” under “Volumes and Devices” to discover and configure the iSCSI LUN device.

Next, go to “Computer Management” then “Disk Management” —> you should see that a new 20GB disk has been automatically discovered (manually refresh the hardware list if you can’t see the new disk yet). Initialise and format the disk.

The new 20GB disk is now ready to use. In the disk properties, you can verify the 2x iSCSI I/O paths as per below, and you can also change the MPIO policy based on your own requirements.

Integrating a 3rd-party firewall appliance with VMware Cloud on AWS by leveraging a Security/Transit VPC

With the latest “Transit VPC” feature in the VMware Cloud on AWS (VMC) 1.12 release, you can now inject static routes in the VMware managed Transit Gateway (or VTGW) to forward SDDC egress traffic to a 3rd-party firewall appliance for security inspection. The firewall appliance is deployed in a Security/Transit VPC to provide transit routing and policy enforcement between SDDCs and workload VPCs, on-premises data center and the Internet.

Important Notes:

  • For this lab, I’m using a Palo Alto VM-Series Next-Generation Firewall Bundle 2 AMI – refer to here and here for detailed deployment instructions
  • “Source/Destination Check” must be disabled on all ENIs attached to the firewall
  • For Internet access, SNAT must be configured on the firewall appliance to maintain route symmetry
  • Similarly, inbound access from the Internet to a server within VMC requires DNAT on the firewall appliance

Lab Topology:

SDDC Group – Adding a static (default) route

After deploying the SDDC and SDDC Group, link your AWS account here.

After a while, the VTGW will show up in the Resource Access Manager (RAM) within your account. Accept the shared VTGW and then create a VPC attachment to connect your Security/Transit VPC to the VTGW.

Once done, add a static default route at the SDDC Group pointing to the VTGW-SecVPC attachment.
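The RAM acceptance and VPC attachment steps can be sketched with the AWS CLI; the ARN and resource IDs below are placeholders (the SDDC Group static route itself is added in the VMC console):

```shell
# Accept the VTGW share offered via Resource Access Manager (placeholder ARN)
aws ram accept-resource-share-invitation \
    --resource-share-invitation-arn arn:aws:ram:us-west-2:123456789012:resource-share-invitation/abcd1234

# Attach the Security/Transit VPC to the shared VTGW (placeholder IDs)
aws ec2 create-transit-gateway-vpc-attachment \
    --transit-gateway-id tgw-0123456789abcdef0 \
    --vpc-id vpc-0123456789abcdef0 \
    --subnet-ids subnet-0123456789abcdef0
```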

The default route should appear soon under your SDDC (Network & Security —> Transit Connect). Also notice we are advertising the local SDDC segments, including the management subnets.


We also need to update the route table for each of the 3x firewall subnets.

Route Table for the AWS native side subnet-01 (Trust Zone):

Route Table for the SDDC side subnet-02 (Untrust Zone):

Route Table for the public side subnet-03 (Internet Zone):

Route Table for the customer managed TGW:
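As a rough sketch of the plumbing behind those route tables, the key entries can be expressed with `aws ec2 create-route`. All IDs and CIDRs are placeholders, and the exact entries depend on your own zones and attachments:

```shell
# Internet zone route table: default route out via the Internet Gateway
aws ec2 create-route --route-table-id rtb-0aaa1111 \
    --destination-cidr-block --gateway-id igw-0123456789abcdef0

# Trust zone route table: send SDDC-bound traffic to the firewall's trust-side ENI
# (Source/Destination Check must already be disabled on that ENI)
aws ec2 create-route --route-table-id rtb-0bbb2222 \
    --destination-cidr-block --network-interface-id eni-0123456789abcdef0
```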

Palo FW Configuration

Palo Alto firewall interface configuration

Virtual Router config:

Security Zones

NAT Config

  • Outbound SNAT to Internet
  • Inbound DNAT to Server01 in SDDC01
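For reference, both NAT rules can be sketched in the PAN-OS CLI (configure mode). The zone names, interface, and addresses are placeholders for this lab; the same rules can of course be built in the web UI instead:

```shell
# Outbound SNAT: hide trust-zone traffic behind the Internet-facing interface address
set rulebase nat rules outbound-snat from trust to internet source any destination any service any source-translation dynamic-ip-and-port interface-address interface ethernet1/3

# Inbound DNAT: translate the firewall's public-facing address (placeholder) to Server01 in SDDC01 (placeholder)
set rulebase nat rules inbound-dnat from internet to internet destination service service-http destination-translation translated-address

commit
```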

Testing FW rules

Testing Results
  • “untrust” —> “trust”: deny
  • “trust” —> “untrust”: allow
  • “untrust” —> “Internet”: allow
  • “trust” —> “Internet”: allow