Integrating a 3rd-party firewall appliance with VMware Cloud on AWS by leveraging a Security/Transit VPC

With the latest “Transit VPC” feature in the VMware Cloud on AWS (VMC) 1.12 release, you can now inject static routes into the VMware-managed Transit Gateway (VTGW) to forward SDDC egress traffic to a 3rd-party firewall appliance for security inspection. The firewall appliance is deployed in a Security/Transit VPC to provide transit routing and policy enforcement between the SDDCs, workload VPCs, the on-premises data center and the Internet.

Important Notes:

  • For this lab, I’m using a Palo Alto VM-Series Next-Generation Firewall Bundle 2 AMI – refer to here and here for detailed deployment instructions
  • “Source/Destination Check” must be disabled on all ENIs attached to the firewall (see the example CLI command after this list)
  • For Internet access, SNAT must be configured on the firewall appliance to maintain route symmetry
  • Similarly, inbound access from the Internet to a server within VMC requires DNAT on the firewall appliance
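
As a reference, the source/destination check can also be disabled via the AWS CLI. A minimal sketch, assuming the placeholder ENI ID below is replaced with each interface attached to your firewall:

# repeat for every ENI attached to the firewall appliance (placeholder ID shown)
aws ec2 modify-network-interface-attribute \
    --network-interface-id eni-0123456789abcdef0 \
    --no-source-dest-check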

Lab Topology:

SDDC Group – Adding static (default) route

After deploying the SDDC and SDDC Group, link your AWS account here.

After a while, the VTGW will show up in Resource Access Manager (RAM) within your account. Accept the shared VTGW and then create a VPC attachment to connect your Security/Transit VPC to the VTGW.
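
If you prefer the CLI over the console for this step, something along these lines should work (a sketch only; the ARN and resource IDs below are placeholders for your own environment):

# find and accept the shared VTGW invitation
aws ram get-resource-share-invitations
aws ram accept-resource-share-invitation \
    --resource-share-invitation-arn arn:aws:ram:us-west-2:123456789012:resource-share-invitation/xxxx

# attach the Security/Transit VPC to the (now visible) VTGW
aws ec2 create-transit-gateway-vpc-attachment \
    --transit-gateway-id tgw-0123456789abcdef0 \
    --vpc-id vpc-0123456789abcdef0 \
    --subnet-ids subnet-0123456789abcdef0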

Once done, add a static default route at the SDDC Group level pointing to the VTGW-SecVPC attachment.

The default route should soon appear under your SDDC (Network & Security —> Transit Connect). Also notice that we are advertising the local SDDC segments, including the management subnets.

AWS SETUP

We also need to update the route table for each of the 3x firewall subnets (an example CLI command follows the route tables below):

Route Table for the AWS native side subnet-01 (Trust Zone):

Route Table for the SDDC side subnet-02 (Untrust Zone):

Route Table for the public side subnet-03 (Internet Zone):

Route Table for the customer managed TGW:
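
For reference, each of these routes can also be added via the AWS CLI. A sketch with placeholder IDs, pointing a subnet’s default route at the corresponding firewall ENI:

# placeholder route table and ENI IDs; repeat per subnet/zone as required
aws ec2 create-route \
    --route-table-id rtb-0123456789abcdef0 \
    --destination-cidr-block 0.0.0.0/0 \
    --network-interface-id eni-0123456789abcdef0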

Palo FW Configuration

Palo Alto firewall interface configuration

Virtual Router config:

Security Zones

NAT Config

  • Outbound SNAT to Internet
  • Inbound DNAT to Server01 in SDDC01

Testing FW rules

Testing Results (see the quick NAT check after this list):
  • “untrust” —> “trust”: deny
  • “trust” —> “untrust”: allow
  • “untrust” —> “Internet”: allow
  • “trust” —> “Internet”: allow
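
A quick way to spot-check the NAT behaviour from the command line (a sketch; the public address below is a placeholder for the firewall’s EIP):

# from a VM behind the "trust" zone, the returned address should be the firewall's SNAT public IP
curl -s https://checkip.amazonaws.com

# from an external client, DNAT should forward this request to Server01 in SDDC01
curl -s http://203.0.113.10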

Create a Tiny Core Linux VM Template for vSphere Lab environment

I’ve always wanted to find a lightweight VM template for running in a nested vSphere lab environment, or sometimes for demonstrating live cloud migration such as vMotion to VMware Cloud on AWS. Recently I managed to achieve this using the Tiny Core Linux distribution, and it ticked all of my requirements:

  • ultra lightweight – the VM runs stable with only 1 vCPU, 256MB RAM and a 64MB hard disk!
  • common Linux tools installed – such as curl, wget, OpenSSH, etc.
  • open-vm-tools installed
  • a lightweight HTTP server serving a static site for running networking or load-balancing tests

In this post I will walk you through the process of creating a Tiny Core based Linux VM template that meets all of the above requirements. To begin, download the Tiny Core ISO from here. (For reference, I’m using the CorePlus-v11.1 release, as I was getting some weird issues with OpenSSH on the latest v12.0 release.)

Below are the settings I’ve used for my VM template:

  • VM hardware version 11 – compatible with ESXi 6.0 and later
  • Guest OS = Linux \ Other 3.x Linux (32-bit)
  • Memory = 256MB (this is the lowest I could go for getting a stable machine)
  • Hard Disk = 64MB – change drive type to IDE and set the virtual device node to IDE0:0
  • CDROM – change the virtual device node to IDE1:0
  • SCSI controller – remove this as it’s not required (the hard disk is on the IDE controller)

Also, use the minimal settings below for installing the Tiny Core OS. For detailed installation instructions, you can follow the step-by-step guide here:

Once the OS has been installed and you are in the shell, create the script below to configure static IP settings for eth0 (and disable the DHCP client if required).

tc@box:~$ cat /opt/interfaces.sh
#!/bin/sh
# If you are booting Tiny Core from a very fast storage such as SSD / NVMe Drive and getting 
# "ifconfig: SIOCSIFADDR: No such Device" or "route: SIOCADDRT: Network is unreachable"
# error during system boot, use this sleep statement, otherwise you can remove it -
sleep .2
# kill dhcp client for eth0
sleep 1
if [ -f /var/run/udhcpc.eth0.pid ]; then
 kill `cat /var/run/udhcpc.eth0.pid`
 sleep 0.5
fi
# configure interface eth0
ifconfig eth0 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255 up
route add default gw 192.168.0.254
echo nameserver 192.168.0.254 >> /etc/resolv.conf
tc@box:~$ sudo chmod 777 /opt/interfaces.sh
tc@box:~$ sudo /opt/interfaces.sh

You may also want to reset the password for the default user “tc” (this can be used later for SSH access), and reset the root password as well:

tc@box:~$ passwd
Changing password for tc
...
tc@box:~$ sudo su
root@box:/home/tc# passwd
Changing password for root
...

Now install all the required packages and extensions; your onboot package list should then look like the below:

tce-load -wi pcre.tcz curl.tcz wget.tcz open-vm-tools.tcz openssh.tcz busybox-httpd.tcz
tc@box:~$ cat /etc/sysconfig/tcedir/onboot.lst
pcre.tcz
curl.tcz
wget.tcz
open-vm-tools.tcz
openssh.tcz
busybox-httpd.tcz

Now configure and enable the SSH server — you can use user “tc” for a quick SSH test:

cd   /usr/local/etc/ssh
sudo cp ssh_config.orig ssh_config
sudo cp sshd_config.orig sshd_config
sudo /usr/local/etc/init.d/openssh start

Next, we’ll need to save all the settings and make them persistent across reboots. In particular, we’ll need to add the open-vm-tools and openssh services to the startup script (bootlocal.sh); otherwise none of these services will be started after a reboot.

sudo su
echo '/opt/interfaces.sh' >> /opt/.filetool.lst
echo '/usr/local/etc/ssh' >> /opt/.filetool.lst
echo '/etc/shadow' >> /opt/.filetool.lst
echo '/opt/interfaces.sh' >> /opt/bootlocal.sh
echo '/usr/local/etc/init.d/open-vm-tools start &> /dev/null'  >> /opt/bootlocal.sh
echo '/usr/local/etc/init.d/openssh start &> /dev/null' >> /opt/bootlocal.sh

And most importantly, use the below command to back up all the config!

tc@box:~$ filetool.sh -b
Backing up files to /mnt/sda1/tce/mydata.tgz

The last one is for my own specific requirement — you can use the below script to set up a lightweight HTTP server for networking or load-balancing related tests.

tc@box:~$ sudo vi /opt/httpd.sh
sudo /usr/local/httpd/bin/busybox httpd -p 80 -h /usr/local/httpd/bin/
sleep .5
sudo touch /usr/local/httpd/bin/index.html
sudo chmod 666 /usr/local/httpd/bin/index.html
echo "this page is served by" >> /usr/local/httpd/bin/index.html
ifconfig eth0 | grep -i mask | awk '{print $2}'| cut -f2 -d:  >> /usr/local/httpd/bin/index.html
tc@box:~$ sudo chmod 777 /opt/httpd.sh
tc@box:~$ sudo echo '/opt/httpd.sh' >> /opt/bootlocal.sh
tc@box:~$ filetool.sh -b

Now you can go ahead and safely reboot the VM, and once it comes back online you should be able to SSH into it. Also, the open-vm-tools service should be started automatically, and you should see the correct IP address and VM Tools version reported in vCenter.

In addition, you should be able to see a static page like the below by browsing to the VM address — the script (httpd.sh) reports back the VM’s IP address, which can be handy for load-balancer related testing.
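
For example, using the sample address configured in interfaces.sh earlier, a quick curl from another machine should return something like:

$ curl http://192.168.0.1/
this page is served by
192.168.0.1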

NSX-T Automation with Terraform

Recently I tried out the Terraform NSX-T Provider and it worked like a charm. In this post, I will demonstrate a simple example of how to leverage Terraform to provision a basic NSX tenant network environment, which includes the following:

  1. create a Tier-1 router
  2. create (linked) routed ports on the new T1 router and the existing upstream T0 router
  3. link the T1 router to the upstream T0 router
  4. create three logical switches with three logical ports
  5. create three downlink LIFs (with subnets/gateway defined) on the T1 router, and link each of them to the logical switch ports accordingly

Once the tenant environment is provisioned by Terraform, the 3x tenant subnets will be automatically published to the T0 router and propagated to the rest of the network (if BGP is enabled), and we should be able to reach the individual LIF addresses. Below is a sample topology deployed in my lab (here I’m using pre-provisioned static routes between the T0 and the upstream network for simplicity).

Software Versions Used & Verified

  • Terraform – v0.12.25
  • NSX-T Provider – v3.0.1 (auto downloaded by Terraform)
  • NSX-T Data Center – v3.0.2 (build 0.0.16887200)

Sample Terraform Script

You can find the sample Terraform script at my Git repo here — remember to update the variables based on your own environment.

nsx_manager     = "192.168.100.125"
nsx_username    = "admin"
nsx_password    = "xxxxxx"
nsxt_t1_rt_name = "dev-demo-t1-rtr"
ls1_name        = "ls-dev-demo-web"
ls2_name        = "ls-dev-demo-app"
ls3_name        = "ls-dev-demo-db"
ls1_gw          = "172.31.101.1/24"
ls2_gw          = "172.31.102.1/24"
ls3_gw          = "172.31.103.1/24"

Run the Terraform script and this should take less than a minute to complete.
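
For reference, the standard Terraform workflow applies (run from the directory containing the script and your updated variables file):

terraform init     # downloads the NSX-T provider plugin
terraform plan     # preview the resources to be created
terraform apply    # provision the T1 router, logical switches and ports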

We can review and verify that the required NSX components were built successfully via the NSX Manager UI — Note: you’ll need to switch to “Manager mode” to be able to see the newly created elements (T1 router, logical switches etc.), as Terraform interacts with the NSX management plane (via the MP-API) directly.

In addition, we can also check and confirm that the 3x tenant subnets are published from T1 to T0 by SSHing into the active edge node. Make sure you connect to the correct VRF for the T0 service router (SR) in order to see the full route table — here we can see the 3x /24 subnets are indeed advertised from T1 to T0 as directly connected (t1c) routes.

As expected, I can reach each of the three LIFs on the T1 router from the lab terminal VM.
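
For example, a quick loop from the terminal VM over the three LIF/gateway addresses defined in the tfvars file:

for ip in 172.31.101.1 172.31.102.1 172.31.103.1; do
  ping -c 2 "$ip"
done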

Enabling embedded Harbor Image Registry in vSphere 7 with Kubernetes

This will be a quick blog to demonstrate how to enable the (embedded) Harbor Image Registry in vSphere 7 with Kubernetes. Harbor was originally developed by VMware as an enterprise-grade private container registry. It was then donated to the CNCF in 2018 and recently became a CNCF graduated project.

For this demo, we’ll activate the embedded Harbor registry within the vSphere 7 Kubernetes environment, and integrate it with the Supervisor Cluster for container management and deployment.

WHAT YOU’LL NEED:

Enabling the embedded Harbor Registry in vSphere 7 with Kubernetes

To begin, go to your vSphere 7 “Workload Cluster —> Namespaces —> Image Registry”, and then click “Enable Harbor”.

Make sure to select the vSAN storage policy to provide persistent storage as required for the Harbor installation.

The process will take a few minutes, and you should see 7x vSphere Pods after Harbor is installed and enabled. Take a note of the Harbor URL — this is an external address of the K8s load balancer that is created by NSX-T.

Push Container Images to Harbor Registry

First, let’s log into the Harbor UI and take a quick look. Since this is embedded within vSphere, it supports the SSO login 🙂

Harbor will automatically create a project for every vSphere namespace we have created. In my case, there are two projects “dev01” and “guestbook” created, which are mapped to the two namespaces in my vSphere workload cluster.

Click the “dev01” project, and then “repository” — as expected it is currently empty, and we’ll be pushing container images to this repository for a quick test. However, before we can do that we’ll need to download and import the certificate to our client machine for certificate-based authentication. Click the “Registry Certificate” to download the ca.crt file.

Next, on the local client create a new directory under /etc/docker/certs.d/ using the same name as the registry FQDN (URL).

[root@pacific-ops01 ~]# cd /etc/docker/certs.d/
[root@pacific-ops01 certs.d]# mkdir 192.168.100.133
[root@pacific-ops01 certs.d]# cd 192.168.100.133/
[root@pacific-ops01 192.168.100.133]# vim ca.crt

Now, let’s get a test (nginx) image, tag it, and try to push it to the dev01 repository.

[root@pacific-ops01 ~]# docker login 192.168.100.133 --username administrator@vsphere.local
Password: 
Login Succeeded

[root@pacific-ops01 ~]# docker pull nginx
Using default tag: latest
Trying to pull repository docker.io/library/nginx ... 
latest: Pulling from docker.io/library/nginx
bf5952930446: Pull complete 
cb9a6de05e5a: Pull complete 
9513ea0afb93: Pull complete 
b49ea07d2e93: Pull complete 
a5e4a503d449: Pull complete 
Digest: sha256:b0ad43f7ee5edbc0effbc14645ae7055e21bc1973aee5150745632a24a752661
Status: Downloaded newer image for docker.io/nginx:latest
[root@pacific-ops01 ~]# 
[root@pacific-ops01 ~]# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
docker.io/nginx     latest              4bb46517cac3        3 days ago          133 MB
[root@pacific-ops01 ~]# 
[root@pacific-ops01 ~]# docker tag docker.io/nginx 192.168.100.133/dev01/nginx
[root@pacific-ops01 ~]# 
[root@pacific-ops01 ~]# docker push 192.168.100.133/dev01/nginx
The push refers to a repository [192.168.100.133/dev01/nginx]
550333325e31: Pushed 
22ea89b1a816: Pushed 
a4d893caa5c9: Pushed 
0338db614b95: Pushed 
d0f104dc0a1f: Pushed 
latest: digest: sha256:179412c42fe3336e7cdc253ad4a2e03d32f50e3037a860cf5edbeb1aaddb915c size: 1362
[root@pacific-ops01 ~]# 

It works, perfect! Now refresh the repository and we can see the new nginx image we just pushed through.

Deploy Kubernetes Pods to Supervisor Cluster from the Harbor Registry

Let’s run a quick test to deploy a Pod using the nginx image from our Harbor Registry. First, log into the Supervisor Cluster and switch to the “dev01” namespace/context.

[root@pacific-ops01 ~]# kubectl vsphere login --server=192.168.100.129 --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify
Password: 
Logged in successfully.
…
[root@pacific-ops01 ~]# kubectl config use-context dev01
Switched to context "dev01".

Create an nginx Pod spec using the image path from our Harbor repository.

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx-demo
  name: nginx-demo
  namespace: dev01
spec:
  containers:
  - image: 192.168.100.133/dev01/nginx
    name: nginx-demo
  restartPolicy: Always

Deploy the Pod.

[root@pacific-ops01 ~]# kubectl apply -f nginx-demo.yaml 
pod/nginx-demo created

Monitor the events, and soon we can see that the Pod is deployed successfully using the image fetched from the Harbor repository.

[root@pacific-ops01 ~]# kubectl get  events -n dev01
LAST SEEN   TYPE     REASON                         OBJECT                                                    MESSAGE
48s         Normal   Status                         image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   pacific-esxi-3: Image status changed to Resolving
40s         Normal   Resolve                        image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   pacific-esxi-3: Image resolved to ChainID sha256:80b21afd8140706d5fe3b7106ae6147e192e6490b402bf2dd2df5df6dac13db8
40s         Normal   Bind                           image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   Imagedisk 80b21afd8140706d5fe3b7106ae6147e192e6490b402bf2dd2df5df6dac13db8-v0 successfully bound
32s         Normal   Status                         image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   Image status changed to Fetching
14s         Normal   Status                         image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   Image status changed to Ready
7s          Normal   SuccessfulRealizeNSXResource   pod/nginx-demo                                            Successfully realized NSX resource for Pod
<unknown>   Normal   Scheduled                      pod/nginx-demo                                            Successfully assigned dev01/nginx-demo to pacific-esxi-1
50s         Normal   Image                          pod/nginx-demo                                            Image nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 bound successfully
39s         Normal   Pulling                        pod/nginx-demo                                            Waiting for Image dev01/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0
14s         Normal   Pulled                         pod/nginx-demo                                            Image dev01/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 is ready
7s          Normal   SuccessfulMountVolume          pod/nginx-demo                                            Successfully mounted volume default-token-bqxc2
7s          Normal   Created                        pod/nginx-demo                                            Created container nginx-demo
7s          Normal   Started                        pod/nginx-demo                                            Started container nginx-demo



[root@pacific-ops01 ~]# kubectl get pods -n dev01   
NAME         READY   STATUS    RESTARTS   AGE
nginx-demo   1/1     Running   0          60s

Use kubectl describe pod to confirm the nginx Pod is indeed running on the image pulled from the Harbor registry.
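
For example:

kubectl describe pod nginx-demo -n dev01 | grep -i image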

Deploying Contour Ingress Controller on Tanzu Kubernetes Grid (TKG)

This blog provides a guide to help you deploy the Contour Ingress Controller onto a Tanzu Kubernetes Grid (TKG) cluster. Contour is an open source Kubernetes ingress controller that exposes HTTP/HTTPS routes for internal services so they are reachable from outside the cluster. Like many other ingress controllers, Contour can provide advanced L7 URL/URI-based routing and load balancing, as well as SSL/TLS termination capabilities.

Contour was originally developed by Heptio (VMware) and has been recently handed over to CNCF as an incubating project. Contour consists of a control plane that is provisioned via a K8s deployment, and an Envoy-based data plane running as a Daemonset on every cluster worker node.

(Image source: https://projectcontour.io/contour-v014/)

WHAT YOU’LL NEED:

For this lab, we’ll install the Contour ingress controller onto a TKG cluster, and we’ll then deploy a sample app (supplied within the manifest) for testing the Ingress services. The overall service topology will look like this:

Install the Contour Ingress Controller

To begin, extract the TKG extensions manifest (I’m using v1.1.0).

[root@pacific-ops01 ~]# tar -xzf tkg-extensions-manifests-v1.1.0-vmware.1.tar.gz 

Log into your TKG cluster and make sure you are in the correct context.

[root@pacific-ops01 ~]# kubectl vsphere login --server=192.168.100.129 --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify --tanzu-kubernetes-cluster-name dev01-tkg-01 --tanzu-kubernetes-cluster-namespace dev01
[root@pacific-ops01 ~]# kubectl config use-context dev01-tkg-01 

Next, install the Cert-Manager (for Contour Ingress) onto the TKG cluster.
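
The extensions bundle ships the cert-manager manifests; assuming they sit in a cert-manager/ directory of the extracted v1.1.0 bundle, this is roughly:

kubectl apply -f tkg-extensions-v1.1.0/cert-manager/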

Before we can install Contour and Envoy, we’ll need to make a small change to the Envoy service config (02-service-envoy.yaml). As illustrated in the service topology, we will deploy a LoadBalancer in front of the ingress controller, so we’ll update the Envoy service type from NodePort (the default) to LoadBalancer.
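
You can edit the file by hand, or use a quick one-liner like the following (a sketch; adjust the path to wherever 02-service-envoy.yaml sits in your extracted bundle, and it assumes the service currently declares type: NodePort):

sed -i 's/type: NodePort/type: LoadBalancer/' \
    tkg-extensions-v1.1.0/ingress/contour/vsphere/02-service-envoy.yaml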

Now deploy Contour and Envoy onto the cluster.
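
Assuming the same directory layout as above, this is roughly:

kubectl apply -f tkg-extensions-v1.1.0/ingress/contour/vsphere/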

We can see that a Contour deployment and an Envoy DaemonSet of 3x Pods (we have 3 worker nodes) have been deployed under the tanzu-system-ingress namespace. Also, take a note of the external IP (192.168.100.130) of the Envoy LoadBalancer service, as this will be used by our Ingress services.
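
For example:

kubectl get deployment,daemonset,svc -n tanzu-system-ingress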

Deploy a Sample App for testing Ingress Services

Deploy the sample app from within the manifest (see the example command after this list); this will create:

  • one new namespace called “test-ingress”
  • one deployment of the “helloweb” app, with a ReplicaSet of 3x Pods
  • two separate services called “s1” & “s2” — Note: both services actually point to the same 3x Pods (as they use the same Pod selector)
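
The sample app manifests sit alongside the Contour examples in the bundle; the directory name below is an assumption based on my extracted copy, so adjust it to match yours:

kubectl apply -f tkg-extensions-v1.1.0/ingress/contour/examples/common/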

Verify the Pods are up and running

[root@pacific-ops01 ~]# kubectl get pods -n test-ingress 
NAME                        READY   STATUS    RESTARTS   AGE
helloweb-7cd97b9cb8-qjwtk   1/1     Running   0          50s
helloweb-7cd97b9cb8-r9s8g   1/1     Running   0          51s
helloweb-7cd97b9cb8-swztl   1/1     Running   0          51s

and both services (s1 & s2) are deployed as expected.

[root@pacific-ops01 ~]# kubectl get svc -n test-ingress 
NAME   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
s1     ClusterIP   10.40.183.104   <none>        80/TCP    1m
s2     ClusterIP   10.40.129.12    <none>        80/TCP    1m

We can’t get to these services yet as they are internal-only K8s services (ClusterIP). We’ll need to deploy an Ingress object so that Contour can expose these services and route traffic to them from outside the cluster. The good news is that there’s already an Ingress config template provided in the manifest. I’ve made the following changes to the template for my lab environment (my lab domain is vxlan.co). Note the hostname (URL) and the path (URI), as we’ll be using these to access the two services.

Deploy the Ingress object.

[root@pacific-ops01 ~]# cd tkg-extensions-v1.1.0/ingress/contour/examples/https-ingress 
[root@pacific-ops01 https-ingress]# kubectl apply -f .
ingress.extensions/https-ingress created
secret/https-secret created

Verify the Ingress service is running as expected

[root@pacific-ops01 https-ingress]# kubectl get ingress -n test-ingress 
NAME            HOSTS              ADDRESS   PORTS     AGE
https-ingress   ingress.vxlan.co             80, 443   2m

Create a DNS record for the ingress hostname pointing to the Envoy load balancer’s external IP.
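
If there’s no handy DNS server in the lab, a local hosts-file entry pointing at the Envoy external IP (192.168.100.130 from earlier) works for a quick test:

echo "192.168.100.130 ingress.vxlan.co" | sudo tee -a /etc/hosts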

Now test access to the s1 service by browsing to https://ingress.vxlan.co/s1

and the s2 service by browsing to https://ingress.vxlan.co/s2
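
Or from the terminal (using -k since my lab certificate isn’t publicly trusted):

curl -k https://ingress.vxlan.co/s1
curl -k https://ingress.vxlan.co/s2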

Congrats, you have successfully deployed a Contour Ingress controller on a TKG cluster!