Enabling embedded Harbor Image Registry in vSphere 7 with Kubernetes

This will be a quick blog to demonstrate how to enable the (embedded) Harbor Image Registry in vSphere 7 with Kubernetes. Harbor was originally developed by VMware as an enterprise-grade private container registry. It was donated to the CNCF in 2018 and recently became a CNCF graduated project.

For this demo, we’ll activate the embedded Harbor registry within the vSphere 7 with Kubernetes environment, and integrate it with the Supervisor Cluster for container management and deployment.


Enabling the embedded Harbor Registry in vSphere 7 with Kubernetes

To begin, go to your vSphere 7 “Workload Cluster —> Namespaces —> Image Registry”, and then click “Enable Harbor”.

Make sure to select the vSAN storage policy to provide persistent storage as required for the Harbor installation.

The process will take a few minutes, and you should see 7x vSphere Pods once Harbor is installed and enabled. Take note of the Harbor URL — this is the external address of the K8s load balancer created by NSX-T.

Push Container Images to Harbor Registry

First, let’s log into the Harbor UI and take a quick look. Since this is embedded within vSphere, it supports the SSO login 🙂

Harbor will automatically create a project for every vSphere namespace we have created. In my case, there are two projects “dev01” and “guestbook” created, which are mapped to the two namespaces in my vSphere workload cluster.

Click the “dev01” project, and then “repository” — as expected it is currently empty, and we’ll be pushing container images to this repository for a quick test. However, before we can do that we’ll need to download and import the certificate to our client machine for certificate-based authentication. Click the “Registry Certificate” to download the ca.crt file.

Next, on the local client create a new directory under /etc/docker/certs.d/ named after the registry FQDN (URL), and place the downloaded ca.crt inside it.

[root@pacific-ops01 ~]# cd /etc/docker/certs.d/
[root@pacific-ops01 certs.d]# mkdir <harbor-fqdn>
[root@pacific-ops01 certs.d]# cd <harbor-fqdn>
[root@pacific-ops01 <harbor-fqdn>]# vim ca.crt

Now, let’s get a test (nginx) image, tag it, and try to push it to the dev01 repository.

[root@pacific-ops01 ~]# docker login <harbor-fqdn> --username administrator@vsphere.local
Login Succeeded

[root@pacific-ops01 ~]# docker pull nginx
Using default tag: latest
Trying to pull repository docker.io/library/nginx ... 
latest: Pulling from docker.io/library/nginx
bf5952930446: Pull complete 
cb9a6de05e5a: Pull complete 
9513ea0afb93: Pull complete 
b49ea07d2e93: Pull complete 
a5e4a503d449: Pull complete 
Digest: sha256:b0ad43f7ee5edbc0effbc14645ae7055e21bc1973aee5150745632a24a752661
Status: Downloaded newer image for docker.io/nginx:latest
[root@pacific-ops01 ~]# 
[root@pacific-ops01 ~]# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
docker.io/nginx     latest              4bb46517cac3        3 days ago          133 MB
[root@pacific-ops01 ~]# 
[root@pacific-ops01 ~]# docker tag docker.io/nginx <harbor-fqdn>/dev01/nginx
[root@pacific-ops01 ~]# 
[root@pacific-ops01 ~]# docker push <harbor-fqdn>/dev01/nginx
The push refers to a repository [<harbor-fqdn>/dev01/nginx]
550333325e31: Pushed 
22ea89b1a816: Pushed 
a4d893caa5c9: Pushed 
0338db614b95: Pushed 
d0f104dc0a1f: Pushed 
latest: digest: sha256:179412c42fe3336e7cdc253ad4a2e03d32f50e3037a860cf5edbeb1aaddb915c size: 1362
[root@pacific-ops01 ~]# 

It works, perfect! Now refresh the repository and we can see the new nginx image we just pushed through.

Deploy Kubernetes Pods to Supervisor Cluster from the Harbor Registry

Let’s run a quick test to deploy a Pod using the nginx image from our Harbor Registry. First, log into the Supervisor Cluster and switch to the “dev01” namespace/context.

[root@pacific-ops01 ~]# kubectl vsphere login --server=<supervisor-vip> --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify
Logged in successfully.
[root@pacific-ops01 ~]# kubectl config use-context dev01
Switched to context "dev01".

Make a nginx Pod config using the image path from our Harbor repository.

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx-demo
  name: nginx-demo
  namespace: dev01
spec:
  containers:
  - image: <harbor-fqdn>/dev01/nginx
    name: nginx-demo
  restartPolicy: Always

Deploy the Pod.

[root@pacific-ops01 ~]# kubectl apply -f nginx-demo.yaml 
pod/nginx-demo created

Monitor the events and soon we can see the Pod is deployed successfully from the image fetched from the Harbor repository.

[root@pacific-ops01 ~]# kubectl get  events -n dev01
LAST SEEN   TYPE     REASON                         OBJECT                                                    MESSAGE
48s         Normal   Status                         image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   pacific-esxi-3: Image status changed to Resolving
40s         Normal   Resolve                        image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   pacific-esxi-3: Image resolved to ChainID sha256:80b21afd8140706d5fe3b7106ae6147e192e6490b402bf2dd2df5df6dac13db8
40s         Normal   Bind                           image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   Imagedisk 80b21afd8140706d5fe3b7106ae6147e192e6490b402bf2dd2df5df6dac13db8-v0 successfully bound
32s         Normal   Status                         image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   Image status changed to Fetching
14s         Normal   Status                         image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0   Image status changed to Ready
7s          Normal   SuccessfulRealizeNSXResource   pod/nginx-demo                                            Successfully realized NSX resource for Pod
<unknown>   Normal   Scheduled                      pod/nginx-demo                                            Successfully assigned dev01/nginx-demo to pacific-esxi-1
50s         Normal   Image                          pod/nginx-demo                                            Image nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 bound successfully
39s         Normal   Pulling                        pod/nginx-demo                                            Waiting for Image dev01/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0
14s         Normal   Pulled                         pod/nginx-demo                                            Image dev01/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 is ready
7s          Normal   SuccessfulMountVolume          pod/nginx-demo                                            Successfully mounted volume default-token-bqxc2
7s          Normal   Created                        pod/nginx-demo                                            Created container nginx-demo
7s          Normal   Started                        pod/nginx-demo                                            Started container nginx-demo

[root@pacific-ops01 ~]# kubectl get pods -n dev01
NAME         READY   STATUS    RESTARTS   AGE
nginx-demo   1/1     Running   0          60s

Use kubectl describe pod to confirm the nginx Pod is indeed running on the image pulled from the Harbor registry.

Deploying Contour Ingress Controller on Tanzu Kubernetes Grid (TKG)

This blog is a guide to help you deploy the Contour Ingress Controller onto a Tanzu Kubernetes Grid (TKG) cluster. Contour is an open-source Kubernetes ingress controller that exposes HTTP/HTTPS routes for internal services so they are reachable from outside the cluster. Like many other ingress controllers, Contour provides advanced L7 URL/URI-based routing and load balancing, as well as SSL/TLS termination capabilities.

Contour was originally developed by Heptio (now part of VMware) and was recently donated to the CNCF as an incubating project. Contour consists of a control plane that is provisioned via a K8s Deployment, and an Envoy-based data plane running as a DaemonSet on every cluster worker node.



For this lab, we’ll install the Contour ingress controller onto a TKG cluster, and we’ll then deploy a sample app (supplied within the manifest) for testing the Ingress services. The overall service topology will look like this:

Install the Contour Ingress Controller

To begin, unzip the TKG extension manifest (I’m using v1.1.0).

[root@pacific-ops01 ~]# tar -xzf tkg-extensions-manifests-v1.1.0-vmware.1.tar.gz 

Log into your TKG cluster and make sure you are in the correct context.

[root@pacific-ops01 ~]# kubectl vsphere login --server=<supervisor-vip> --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify --tanzu-kubernetes-cluster-name dev01-tkg-01 --tanzu-kubernetes-cluster-namespace dev01
[root@pacific-ops01 ~]# kubectl config use-context dev01-tkg-01 

Next, install the Cert-Manager (for Contour Ingress) onto the TKG cluster.

Before we can install Contour and Envoy, we’ll need to make a small change to the Envoy service config (02-service-envoy.yaml). As illustrated in the service topology, we will deploy a LoadBalancer in front of the ingress controller. So we’ll update the Envoy service type from NodePort (default) to LoadBalancer.
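For reference, a minimal sketch of the edited Envoy service (only the type field changes from the shipped default; the selector and port values shown here are illustrative, not copied from the manifest):

```yaml
# 02-service-envoy.yaml (excerpt, illustrative)
apiVersion: v1
kind: Service
metadata:
  name: envoy
  namespace: tanzu-system-ingress
spec:
  type: LoadBalancer    # changed from the default NodePort
  selector:
    app: envoy          # assumed label; check the manifest's actual selector
  ports:
    - name: http
      port: 80
    - name: https
      port: 443
```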

Now deploy Contour and Envoy onto the cluster.

We can see a Contour Deployment, and an Envoy DaemonSet of 3x Pods (one per worker node) have been deployed under the tanzu-system-ingress namespace. Also, take note of the external IP of the Envoy LoadBalancer service, as this will be used by our Ingress services.

Deploy a Sample App for testing Ingress Services

Deploy the sample app from within the manifest. This will create:

  • one new namespace called “test-ingress”
  • one deployment of the “helloweb” app, with a ReplicaSet of 3x Pods
  • two separate services called “s1” & “s2” — Note: both services are actually pointing to the same 3x Pods (as they are using the same Pod selector)
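To illustrate why s1 and s2 end up with the same endpoints, here is a sketch of the two Service objects (the app label and port values are assumptions for illustration):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: s1
  namespace: test-ingress
spec:
  selector:
    app: helloweb       # assumed Pod label shared by both services
  ports:
    - port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: s2
  namespace: test-ingress
spec:
  selector:
    app: helloweb       # same selector, so s2 targets the same 3 Pods
  ports:
    - port: 80
```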

Verify the Pods are up and running

[root@pacific-ops01 ~]# kubectl get pods -n test-ingress 
NAME                        READY   STATUS    RESTARTS   AGE
helloweb-7cd97b9cb8-qjwtk   1/1     Running   0          50s
helloweb-7cd97b9cb8-r9s8g   1/1     Running   0          51s
helloweb-7cd97b9cb8-swztl   1/1     Running   0          51s

and both services (s1 & s2) are deployed as expected.

[root@pacific-ops01 ~]# kubectl get svc -n test-ingress 
NAME   TYPE        EXTERNAL-IP   PORT(S)   AGE
s1     ClusterIP   <none>        80/TCP    1m
s2     ClusterIP   <none>        80/TCP    1m

We can’t reach these services yet as they are internal-only K8s services (ClusterIP). We’ll need to deploy an Ingress object so that Contour can expose these services and route traffic to them from outside the cluster. The good news is that there’s already an Ingress config template provided in the manifest. I’ve made a few changes to the template for my lab environment (my lab domain is vxlan.co). Note the hostname (URL) and the paths (URIs) as we’ll be using these to access the two services.
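For reference, my edited Ingress looks roughly like this (a sketch based on the hostname, paths, services and secret used in this lab; the exact fields in the manifest's template may differ slightly):

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: https-ingress
  namespace: test-ingress
spec:
  tls:
    - hosts:
        - ingress.vxlan.co
      secretName: https-secret    # TLS cert/key for the ingress hostname
  rules:
    - host: ingress.vxlan.co
      http:
        paths:
          - path: /s1
            backend:
              serviceName: s1
              servicePort: 80
          - path: /s2
            backend:
              serviceName: s2
              servicePort: 80
```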

Deploy the Ingress object.

[root@pacific-ops01 ~]# cd tkg-extensions-v1.1.0/ingress/contour/examples/https-ingress 
[root@pacific-ops01 https-ingress]# kubectl apply -f .
ingress.extensions/https-ingress created
secret/https-secret created

Verify the Ingress service is running as expected

[root@pacific-ops01 https-ingress]# kubectl get ingress -n test-ingress 
NAME            HOSTS              ADDRESS   PORTS     AGE
https-ingress   ingress.vxlan.co             80, 443   2m

Create a DNS record with the ingress hostname by pointing to the Envoy load balancer external IP.

Now test access to the s1 service by browsing https://ingress.vxlan.co/s1

and s2 service by browsing https://ingress.vxlan.co/s2

Congrats, you have successfully deployed a Contour Ingress controller on a TKG cluster!

Deploying vSphere 7 with Kubernetes and Tanzu Kubernetes Grid (TKG) Cluster

In this post we’ll explore the vSphere 7 with Kubernetes capabilities and the detailed deployment steps in order to provision a vSphere supervisor cluster and a Tanzu Kubernetes Grid (TKG) cluster.

If you are new to vSphere 7 and Tanzu Kubernetes, below is some background reading that serves as a good starting point:


I’ll be building a nested vSphere7/VCF4 environment in my home lab ESXi host, and the overall lab setup looks like below:

As you might have guessed, this lab requires a lot of resources! Specifically, you’ll need the following:

  • a physical ESXi host running vSphere 6.7 or later
  • capacity to provision VMs with up to 8x vCPUs
  • capacity to provision up to 140-180GB of RAM
  • around 1TB of spare storage
  • a flat /24 subnet with external & Internet connectivity (can be shared with the lab management network)
  • access to vSphere 7 ESXi/VCSA and NSX-T/Edge 3.0 OVA files and trial licenses

In order to save time provisioning the vSphere/VCF stack, I’m using William Lam’s vSphere 7 automation script as discussed here. You can find the PowerShell code and further details in his Git repository.

All demo apps and configuration yaml files used in this lab can be found at my Git Repo.

We’ll cover the following steps:

  • #1 – build a (nested) vSphere7/VCF4 stack
  • #2 – configure workload management and deploy supervisor cluster
  • #3 – deploy a demo app with native vSphere Pod services
  • #4 – deploy a TKG cluster
  • #5 – vSphere environment overview (post deployment)

Step-1: Deploy a vSphere7/VCF4 stack

First, you’ll need to download William’s PowerShell script and modify it based on your own lab environment. You’ll also need to download the required OVAs and place them in the paths defined in the script. Note that for the VCSA you’ll need to extract the ISO and point the path to the extracted folder!

Now let’s run the PowerShell script and you’ll see a deployment summary page like this:

Hit “Y” to kick off the deployment; for me the whole process took just over an hour.

Once the script completes you should see a vApp like this deployed under your physical ESXi host.

Step-2: Configure Workload Management and Deploy Supervisor Cluster

To activate vSphere 7 native Kubernetes capabilities, we need to enable workload management which will configure our nested ESXi cluster as a supervisor cluster. First, log into the nested VCSA, and navigate to “Menu” —> “Workload Management”, click “Enable”:

Select our nested ESXi cluster to be configured as a supervisor cluster

Select supervisor Control Plane VM size

Configure the management network settings for the supervisor cluster, note that we’ll need to reserve a 5-address block for the control plane VMs including a VIP.

Next, configure vSphere Pod network settings — for this demo we’ll reserve one /27 for the Ingress CIDR block as the NAT IPs to be consumed by Load Balancer or Ingress services; and another /27 for the Egress CIDR block as outbound SNAT IPs for provisioned K8s namespaces.

Configure storage policies by selecting the pre-provisioned pacific-gold vSAN policy, then click “Finish” to begin the deployment of supervisor cluster.

This process will take another 20~30 mins to complete, and you’ll see a cluster of 3x control plane VMs being provisioned.

Back in “Workload Management” —> “Cluster”, you should see our supervisor cluster (consisting of 3x ESXi hosts) is now up and running. Also, take note of the VIP address of the control plane VMs as we’ll be using that IP to log into the supervisor cluster.

Step-3: Deploy a demo app with Native vSphere Pods

To consume the native vSphere Kubernetes Pods capabilities, we first need to create a vSphere Namespace, which maps to a K8s namespace within the supervisor cluster. vSphere leverages the K8s namespace construct to provide resource segmentation for vSphere pods/services/deployments, and it offers a flexible way to attach authorization and network/storage policies to different environments.

Go to “Menu” —> “Workload Management”, and click “Create Namespace”.

Since we’ll be deploying a sample guestbook app, we’ll name the namespace “guestbook”.

Next, grant the vSphere admin editor permission on the namespace, and assign the vSAN storage policy “pacific-gold-storage-policy” to the namespace —> this is important as (behind the scenes) we are leveraging the vSAN CSI (Container Storage Interface) driver to provide persistent storage support for the cluster.

Now we are ready to dive into the vSphere supervisor cluster! Before we can do that, let’s get the kubectl CLI and the vSphere plugin package.
Open the CLI tools link:

Follow the onscreen instructions to download and install the vSphere Kubectl CLI toolkit onto your management host (I’m using a CentOS7 VM).

Time to log into our supervisor K8s cluster! Remember to use the control plane VIP as noted before.

[root@Pacific-Ops01]# kubectl vsphere login --server=<supervisor-vip> -u administrator@vsphere.local --insecure-skip-tls-verify

switch context to our “guestbook” namespace

[root@Pacific-Ops01]# kubectl config use-context guestbook
Switched to context "guestbook".

take a look at the cluster nodes, and you’ll see the 3x master nodes (supervisor control VMs) and 3x worker nodes (ESXi hosts)

[root@pacific-ops01 vs7-k8s]# kubectl get nodes -o wide
NAME                               STATUS   ROLES    AGE   VERSION                    INTERNAL-IP       EXTERNAL-IP   OS-IMAGE                 KERNEL-VERSION      CONTAINER-RUNTIME
420a7d079f62a8ae40fb4bffea3cee48   Ready    master   8d    v1.16.7-2+bfe512e5ddaaaa      <none>        VMware Photon OS/Linux   4.19.84-1.ph3-esx   docker://18.9.9
420acb46e78281fcfaf3f45ea3d7c577   Ready    master   8d    v1.16.7-2+bfe512e5ddaaaa      <none>        VMware Photon OS/Linux   4.19.84-1.ph3-esx   docker://18.9.9
420aef27c9f45b01e8e0ed4a7e45cf2e   Ready    master   8d    v1.16.7-2+bfe512e5ddaaaa      <none>        VMware Photon OS/Linux   4.19.84-1.ph3-esx   docker://18.9.9
pacific-esxi-1                     Ready    agent    8d    v1.16.7-sph-4d52cd1   <none>        <unknown>                <unknown>           <unknown>
pacific-esxi-2                     Ready    agent    8d    v1.16.7-sph-4d52cd1   <none>        <unknown>                <unknown>           <unknown>
pacific-esxi-3                     Ready    agent    8d    v1.16.7-sph-4d52cd1   <none>        <unknown>                <unknown>           <unknown>

Clone the git repo for this demo lab, and apply a dummy network policy (permit all ingress and all egress traffic)

[root@pacific-ops01 ~]# git clone https://github.com/sc13912/vs7-k8s.git
Cloning into 'vs7-k8s'...
remote: Enumerating objects: 15, done.
remote: Counting objects: 100% (15/15), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 15 (delta 2), reused 12 (delta 2), pack-reused 0
Unpacking objects: 100% (15/15), done.
[root@pacific-ops01 ~]# cd vs7-k8s/
[root@pacific-ops01 vs7-k8s]# kubectl apply -f network-policy-allowall.yaml
networkpolicy.networking.k8s.io/allow-all created

To deploy the guestbook app, we’ll leverage the dynamic persistent volume provisioning capability of the vSphere CSI driver by referencing the vSAN storage class “pacific-gold-storage-policy”.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  namespace: guestbook
  name: redis-master-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: pacific-gold-storage-policy
  resources:
    requests:
      storage: 2Gi

apply the PVC yamls for both the redis master and slave Pods

[root@pacific-ops01 vs7-k8s]# kubectl apply -f guestbook/guestbook-master-claim.yaml
persistentvolumeclaim/redis-master-claim created

[root@pacific-ops01 vs7-k8s]# kubectl apply -f guestbook/guestbook-slave-claim.yaml 
persistentvolumeclaim/redis-slave-claim created

verify both PVCs show “Bound” status, mapped to two dynamically provisioned persistent volumes (PVs)

[root@pacific-ops01 vs7-k8s]# kubectl get pvc
NAME                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
redis-master-claim   Bound    pvc-0102e725-41ad-440b-8a02-8af4d4768ebb   2Gi        RWO            pacific-gold-storage-policy   14m
redis-slave-claim    Bound    pvc-fb4b7bbe-9b35-40e8-b251-8f2effe85a2d   2Gi        RWO            pacific-gold-storage-policy   13m

Now deploy the guestbook app.

[root@pacific-ops01 vs7-k8s]# kubectl apply -f guestbook/guestbook-all-in-one.yaml 
service/redis-master created
deployment.apps/redis-master created
service/redis-slave created
deployment.apps/redis-slave created
service/frontend created
deployment.apps/frontend created

wait until all the Pods are up and running

[root@pacific-ops01 vs7-k8s]# kubectl get pods -o wide -n guestbook 
NAME                            READY   STATUS    RESTARTS   AGE     IP             NODE             NOMINATED NODE   READINESS GATES
frontend-6cb7f8bd65-kjgh2       1/1     Running   0          3m2s   pacific-esxi-2   <none>           <none>
frontend-6cb7f8bd65-mlv79       1/1     Running   0          3m2s   pacific-esxi-1   <none>           <none>
frontend-6cb7f8bd65-slz6b       1/1     Running   0          3m2s   pacific-esxi-2   <none>           <none>
frontend-6cb7f8bd65-vtkfz       1/1     Running   0          3m3s   pacific-esxi-1   <none>           <none>
redis-master-64fb8775bf-65sdc   1/1     Running   0          3m10s   pacific-esxi-1   <none>           <none>
redis-slave-779b6d8f79-bj9q7    1/1     Running   0          3m7s   pacific-esxi-2   <none>           <none>

retrieve the Load Balancer service IP — note NSX has allocated an IP from the /27 Ingress CIDR block

[root@pacific-ops01 vs7-k8s]# kubectl get svc -n guestbook 
NAME           TYPE           CLUSTER-IP    EXTERNAL-IP       PORT(S)        AGE
frontend       LoadBalancer   80:32610/TCP   4m15s
redis-master   ClusterIP    <none>            6379/TCP       4m22s
redis-slave    ClusterIP   <none>            6379/TCP       4m21s

Hit the load balancer IP in a browser to test the guestbook app. Enter and submit some messages, then try destroying and redeploying the app; your data will be retained by the redis PVs.

Step-4: Deploy a TKG cluster

Before we can deploy a TKG cluster, we’ll need to create a content library subscription by pointing to https://wp-content.vmware.com/v2/latest/lib.json, which contains the VMware Tanzu Kubernetes images:

Wait about 5~10 mins for the library to fully sync. At this point I can see two versions of Tanzu K8s images:

Next, create a new namespace called “dev01” which will be hosting our new TKG cluster.

Back to the CLI, we’ll switch context from “guestbook” to the new “dev01” namespace:

[root@pacific-ops01 vs7-k8s]# kubectl config get-contexts 
CURRENT   NAME              CLUSTER           AUTHINFO                                          NAMESPACE
          dev01      wcp:   dev01
*         guestbook   wcp:   guestbook
[root@pacific-ops01 vs7-k8s]# 
[root@pacific-ops01 vs7-k8s]# kubectl config use-context dev01 
Switched to context "dev01".

let’s examine the two TKG K8s versions available from the library:

[root@pacific-ops01 vs7-k8s]# kubectl get virtualmachineimages
NAME                                                        AGE
ob-15957779-photon-3-k8s-v1.16.8---vmware.1-tkg.3.60d2ffd   9m44s
ob-16466772-photon-3-k8s-v1.17.7---vmware.1-tkg.1.154236c   9m44s

and there are also different classes for the TKG VM templates:

[root@pacific-ops01 vs7-k8s]# kubectl get  virtualmachineclasses
NAME                 AGE
best-effort-large    4h48m
best-effort-medium   4h48m
best-effort-small    4h48m
best-effort-xlarge   4h48m
best-effort-xsmall   4h48m
guaranteed-large     4h48m
guaranteed-medium    4h48m
guaranteed-small     4h48m
guaranteed-xlarge    4h48m
guaranteed-xsmall    4h48m

I have prepared the following yaml config for my TKG cluster — I’m using 1x master node and 3x worker nodes, all using the “guaranteed-small” machine class.

[root@pacific-ops01 vs7-k8s]# cat tkg-cluster01.yaml 
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: dev01-tkg-01
  namespace: dev01
spec:
  distribution:
    version: v1.16
  topology:
    controlPlane:
      class: guaranteed-small
      count: 1
      storageClass: pacific-gold-storage-policy
    workers:
      class: guaranteed-small
      count: 3
      storageClass: pacific-gold-storage-policy
  settings:
    network:
      cni:
        name: calico
      services:
        cidrBlocks: [""]
      pods:
        cidrBlocks: [""]

apply the config to create the TKG cluster

[root@pacific-ops01 vs7-k8s]# kubectl apply -f tkg-cluster01.yaml 
tanzukubernetescluster.run.tanzu.vmware.com/dev01-tkg-01 created

monitor the cluster creation process, and eventually you’ll see all 4x TKG VMs are up and running:

[root@pacific-ops01 vs7-k8s]# kubectl get tanzukubernetesclusters.run.tanzu.vmware.com 
NAME           CONTROL PLANE   WORKER   DISTRIBUTION                     AGE   PHASE
dev01-tkg-01   1               3        v1.16.8+vmware.1-tkg.3.60d2ffd   13m   creating

[root@pacific-ops01 vs7-k8s]# kubectl get machines 
NAME                                         PROVIDERID                                       PHASE
dev01-tkg-01-control-plane-n9hqx             vsphere://420aff74-1367-9654-b2ba-59f8a64c3b52   running
dev01-tkg-01-workers-nwmhh-c766c8f77-nnbsj   vsphere://420aca94-26f3-f1c6-e112-607c28c439a4   provisioned
dev01-tkg-01-workers-nwmhh-c766c8f77-pcv65   vsphere://420a2c44-f4e3-f698-b173-86a6b4b3fa27   provisioned
dev01-tkg-01-workers-nwmhh-c766c8f77-zqfwj   vsphere://420a2c16-3002-b2c2-ef5d-d4e3d7a08bf8   provisioned

[root@pacific-ops01 vs7-k8s]# kubectl get machines            
NAME                                         PROVIDERID                                       PHASE
dev01-tkg-01-control-plane-n9hqx             vsphere://420aff74-1367-9654-b2ba-59f8a64c3b52   running
dev01-tkg-01-workers-nwmhh-c766c8f77-nnbsj   vsphere://420aca94-26f3-f1c6-e112-607c28c439a4   running
dev01-tkg-01-workers-nwmhh-c766c8f77-pcv65   vsphere://420a2c44-f4e3-f698-b173-86a6b4b3fa27   running
dev01-tkg-01-workers-nwmhh-c766c8f77-zqfwj   vsphere://420a2c16-3002-b2c2-ef5d-d4e3d7a08bf8   running

Time to log into our new cluster!

[root@pacific-ops01 vs7-k8s]# kubectl vsphere login --server=<supervisor-vip> --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify --tanzu-kubernetes-cluster-name dev01-tkg-01 --tanzu-kubernetes-cluster-namespace dev01

[root@pacific-ops01 vs7-k8s]# kubectl config use-context dev01-tkg-01 
Switched to context "dev01-tkg-01".

Once you are logged in and switched to the cluster “dev01-tkg-01” namespace, verify that all 4x TKG nodes are in “Ready” status

[root@pacific-ops01 ~]# kubectl get nodes 
NAME                                         STATUS   ROLES    AGE   VERSION
dev01-tkg-01-control-plane-n9hqx             Ready    master   22m   v1.16.8+vmware.1
dev01-tkg-01-workers-nwmhh-c766c8f77-nnbsj   Ready    <none>   56s   v1.16.8+vmware.1
dev01-tkg-01-workers-nwmhh-c766c8f77-pcv65   Ready    <none>   61s   v1.16.8+vmware.1
dev01-tkg-01-workers-nwmhh-c766c8f77-zqfwj   Ready    <none>   85s   v1.16.8+vmware.1

We are now ready to deploy demo apps into the TKG cluster. First, update the cluster RBAC and Pod Security Policies by applying the supplied yaml config.
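The supplied config is roughly along these lines (a sketch matching the object names in the command output; the PSP name and the subject group are assumptions based on common TKG setups):

```yaml
# allow-nonroot-clusterrole.yaml (sketch; PSP name and subjects are assumptions)
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: psp:privileged
rules:
  - apiGroups: ['policy']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames: ['vmware-system-privileged']   # assumed built-in PSP name
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: all:psp:privileged
roleRef:
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
  name: psp:privileged
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: system:serviceaccounts     # grant PSP use to all service accounts
```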

[root@pacific-ops01 vs7-k8s]# kubectl apply -f allow-nonroot-clusterrole.yaml 
clusterrole.rbac.authorization.k8s.io/psp:privileged created
clusterrolebinding.rbac.authorization.k8s.io/all:psp:privileged created

Next, deploy the yelb demo app:

[root@pacific-ops01 vs7-k8s]# kubectl apply -f yelb/yelb-lb.yaml
service/redis-server created
service/yelb-db created
service/yelb-appserver created
service/yelb-ui created
deployment.apps/yelb-ui created
deployment.apps/redis-server created
deployment.apps/yelb-db created
deployment.apps/yelb-appserver created

wait for all the Pods to come up, then retrieve the external IP of the yelb-ui Load Balancer (assigned by NSX from the pre-provisioned /27 Ingress CIDR block)

[root@pacific-ops01 vs7-k8s]# kubectl get svc yelb-ui -n yelb-app 
NAME      TYPE           CLUSTER-IP    EXTERNAL-IP       PORT(S)        AGE
yelb-ui   LoadBalancer   80:30116/TCP   9d

Go to the LB IP and you’ll see the app is running successfully.

vSphere Environment Overview

Below is a quick overview of the vSphere lab environment after you have completed all the steps. You should see a supervisor cluster (consisting of 3x ESXi worker nodes and 3x control VMs), a TKG cluster with its own namespace, and a guestbook microservice app deployed with native vSphere Pod services leveraging the vSAN CSI driver.

And here is the network topology overview captured from the NSX-T UI. Note that NSX automatically deploys a dedicated Tier-1 gateway for every TKG cluster created. The Tier-1 gateway also provides egress SNAT and ingress LB capabilities for the TKG cluster.

Cloud Native DevOps on GCP Series Ep2 – Create a CI/CD pipeline with GKE, GCR and Cloud Build

This is the second episode of our Cloud Native DevOps on GCP series. In the previous chapter, we built a multi-AZ GKE cluster with Terraform. This time, we’ll create a cloud-native CI/CD pipeline leveraging our GKE cluster and Google DevOps tools such as Cloud Build and Google Container Registry (GCR). We’ll create a Cloud Build trigger connected to a GitHub repository to perform automatic build, test and deployment of a sample micro-service app onto the GKE cluster.

For this demo, I have provided a simple NodeJS app which is already containerized and packaged as a Helm Chart for fast K8s deployment. You can find all the artifacts at my GitHub Repo, including the demo app, Helm template/chart, as well as the Cloud Build pipeline code.


  • Access to a GCP testing environment
  • Install Git, kubectl and Terraform on your client
  • Install Docker on your client
  • Install GCloud SDK
  • Check the NTP clock & sync status on your client —> important!
  • Clone or download the demo app repo here

Step-1: Prepare the GCloud Environment

To begin, configure the GCloud environment variables and authentications.

gcloud init
gcloud config set accessibility/screen_reader true
gcloud auth application-default login

Register gcloud as a Docker credential helper — this is important so our Docker client can authenticate to GCR. (Later we’ll need to build and push a Helm client image to GCR as required by the pipeline deployment process.)

gcloud auth configure-docker

Enable required GCP API services.

gcloud services enable compute.googleapis.com
gcloud services enable servicenetworking.googleapis.com
gcloud services enable cloudresourcemanager.googleapis.com
gcloud services enable container.googleapis.com
gcloud services enable cloudbuild.googleapis.com

Update the Cloud Build service account with an editor role so it has the required permissions to access GKE and GCR within the project.

PROJECT_ID=`gcloud config get-value project`
CLOUDBUILD_SA="$(gcloud projects describe $PROJECT_ID --format 'value(projectNumber)')@cloudbuild.gserviceaccount.com"
gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/editor

Step-2: Launch a GKE Cluster using Terraform

If you have been following the series and have already deployed a GKE cluster, you can skip this step and move on to the next. Otherwise you can follow this post to build a GKE cluster with Terraform.

Make sure to deploy an Ingress Controller as there is an Ingress service defined in our Helm Chart!

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-0.32.0/deploy/static/provider/cloud/deploy.yaml  

Step-3: Initialize Helm for Application Deployment on GKE

As mentioned above, for this demo we have encapsulated our demo app into a Helm Chart. Helm is a package management system designed for simplifying and accelerating application deployment on the Kubernetes platform.

As of version 2, Helm consists of a local client and a Tiller server pod (deployed in K8s cluster) to interact with the Kube-apiserver for app deployment. In our example, we’ll first build a customised Helm client docker image and push it to GCR. This image will then be used by Cloud Build to interact with the Tiller server (deployed on GKE) for deploying the pre-packaged Helm chart — as illustrated in the below diagram.

First let’s configure a service account for Tiller and initialize Helm (server component) on our GKE cluster.

kubectl apply -f ./k8s-helm/tiller.yaml
helm init --history-max 200 --service-account tiller
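For reference, tiller.yaml is essentially a ServiceAccount plus a cluster-admin role binding, which is the canonical Helm v2 RBAC setup. A sketch below (the repo's copy may differ in names or scope):

```yaml
# Typical Helm v2 Tiller RBAC setup (sketch; the repo's tiller.yaml may differ)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system
```

Granting cluster-admin to Tiller is convenient for a demo, though in production you would scope the role down.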

We’ll then build and push a customised Helm client image to GCR. This might take a few minutes.

cd ./k8s-helm/cloud-builders-community/helm
docker build -t gcr.io/$PROJECT_ID/helm .
docker push gcr.io/$PROJECT_ID/helm

On GCR (or via gcloud container images list), confirm that the new Helm (client) image has been pushed.

Step-4: Review the (Cloud Build) Pipeline Code

Before we move forward, let's take a moment to review the pipeline code (as defined in cloudbuild.yaml). There are a total of four stages in our Cloud Build pipeline:

  1. Build a docker image with our demo app
  2. Push the new image to GCR
  3. Deploy Helm chart (for our demo app) to GKE via GCR
  4. Integration Testing

The first two stages are straightforward: we'll use the Google-published Cloud Builder docker image to build the node app image and push it to the GCR repository.

  # Build demo app image
  - name: gcr.io/cloud-builders/docker
    args:
      - build
      - -t
      - gcr.io/$PROJECT_ID/node-app:$COMMIT_SHA
      - .
  # Push demo app image to GCR
  - name: gcr.io/cloud-builders/docker
    args:
      - push
      - gcr.io/$PROJECT_ID/node-app:$COMMIT_SHA

Next, we'll leverage the previously built Helm client image to interact with our GKE cluster and deploy the Helm chart for our node app, with the image repository pointing to the GCR path from the previous pipeline stage.

  # Deploy with Helm Chart
  - name: gcr.io/$PROJECT_ID/helm
    args:
      - upgrade
      - -i
      - node-app
      - ./k8s-helm/node-app
      - --set
      - image.repository=gcr.io/$PROJECT_ID/node-app,image.tag=$COMMIT_SHA
      - -f
      - ./k8s-helm/node-app/values.yaml
    env:
      - KUBECONFIG=/workspace/.kube/config
      - TILLERLESS=false
      - TILLER_NAMESPACE=kube-system

Lastly, we'll run an integration test to verify the demo app status on our GKE cluster. Our node app has a built-in health-check URL at "/health", and we'll leverage another Cloud Builder curl image to poll this URL path, expecting a return of {"status": "ok"}. Note: here we poll the internal DNS address of the K8s service (for the demo app) so there is no dependency on IP allocations.

  # Integration Testing
  - name: gcr.io/cloud-builders/kubectl
    entrypoint: 'bash'
    args:
      - '-c'
      - |
        kubectl delete --wait=true pod curl
        kubectl run curl --restart=Never --image=gcr.io/cloud-builders/curl --generator=run-pod/v1 -- http://node-app.default.svc.cluster.local/health
        sleep 15
        kubectl logs curl
        kubectl logs curl | grep OK
    env:
      - KUBECONFIG=/workspace/.kube/config
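The pass/fail decision in that last stage is just a grep over the curl output. As a tiny standalone illustration of the same check (with a simulated response body, not a real request):

```shell
# Simulate the health endpoint's JSON body and apply a grep-based check,
# mirroring the pipeline's "kubectl logs curl | grep" step
RESPONSE='{"status": "ok"}'
if echo "$RESPONSE" | grep -q '"status": "ok"'; then
  echo "health check passed"
else
  echo "health check failed"
fi
```

If the grep finds no match, the non-zero exit code is what fails the Cloud Build step.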

Step-5: Create a Cloud Build Trigger by Connecting to the GitHub Repository

Now that we have our GKE cluster ready and the Helm image pushed to GCR, the next step is to connect Cloud Build to the GitHub repository and create a CI trigger. On the GCP console, go to Cloud Build —> Triggers and select the GitHub repo as below.

If this is the first time you are connecting Cloud Build to GitHub, you will be redirected to an authorization page like the one below; accept it so Cloud Build can access your repositories.

Select the demo app repository, which also includes the pipeline config (cloudbuild.yaml) file.

Create a push trigger on the next page and you should see a summary like this.

You can manually run the trigger now to kick off the CI build process. However, we'll run more thorough testing to verify the end-to-end pipeline automation in the next section.

Step-6: Test the CI/CD Pipeline

It’s time to test our CI/CD pipeline! First we’ll make a “cosmetic” version change (1.0.0 to 1.0.1) to the Helm chart for our demo app.
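In the chart this is a one-line edit to Chart.yaml (fields sketched from a typical Helm v2 chart; the repo's file may carry extra fields):

```yaml
# k8s-helm/node-app/Chart.yaml
apiVersion: v1
name: node-app
description: A Helm chart for the node demo app
version: 1.0.1   # bumped from 1.0.0; committing this change triggers the pipeline
```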

Commit the change and push to the Git repository.

This push event should trigger our Cloud Build pipeline. You can jump onto the GCP console to monitor the fully automated four-stage process. The pipeline is complete once the integration test returns a status of OK.

On the GKE cluster, we can see that our Helm chart v1.0.1 has been deployed successfully.

The deployment and node app are running as expected.

Retrieve the Ingress public IP and update the local hosts file for quick testing. (Note the Ingress URL is defined as “node-app.local”.)

[root@cloud-ops01 nodejs-cloudbuild-demo]# kubectl get ingresses 
NAME       HOSTS            ADDRESS         PORTS   AGE
node-app   node-app.local   80      15m
[root@cloud-ops01 nodejs-cloudbuild-demo]# 
[root@cloud-ops01 nodejs-cloudbuild-demo]# echo "  node-app.local" >> /etc/hosts   

Now point your browser to “node-app.local” and you should see the demo app page as below. Congratulations, you have just successfully deployed a cloud-native CI/CD pipeline on GCP!

Cloud Native DevOps on GCP Series Ep1 – Build a GKE Cluster with Terraform

This is the first episode of our Cloud Native DevOps on GCP series. Here we'll build a Google Kubernetes Engine (GKE) cluster using Terraform. In my experience, GKE has been one of the most scalable and reliable managed Kubernetes solutions, and it's also 100% upstream compliant and certified by the CNCF.

For this demo I have provided a sample Terraform script here. The target state will look like this:

Specifically, we'll be launching the following GCP/GKE resources:

  • 1x new VPC for hosting the demo GKE cluster
  • 1x /17 CIDR block as the primary address space for the VPC
  • 2x /18 CIDR blocks for the GKE Pod and Service address spaces
  • 1x highly available GKE cluster across 2x Availability Zones (AZs)
  • 2x GKE worker instance groups (2x nodes each)
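As a shape reference, the networking part of those bullet points boils down to a VPC plus a subnet carrying two secondary ranges. A minimal sketch (resource names and exact CIDRs are illustrative, not the repo's actual code):

```hcl
resource "google_compute_network" "gke_vpc" {
  name                    = "gke-demo-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "gke_subnet" {
  name          = "gke-demo-subnet"
  region        = "australia-southeast1"
  network       = google_compute_network.gke_vpc.id
  ip_cidr_range = "10.0.0.0/17" # primary address space (nodes)

  secondary_ip_range {
    range_name    = "gke-pods"
    ip_cidr_range = "10.0.128.0/18" # Pod address space
  }
  secondary_ip_range {
    range_name    = "gke-services"
    ip_cidr_range = "10.0.192.0/18" # Service address space
  }
}
```

The two secondary ranges are what make the cluster VPC-native: GKE allocates Pod and Service IPs directly from them.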

Prerequisites:

  • Access to a GCP testing environment
  • Git, kubectl and Terraform installed on your client
  • GCloud SDK installed
  • Check the NTP clock & sync status on your client —> important!
  • Clone the Terraform repo here

Step-1: Set Up the GCloud Environment and Run the Terraform Script

To begin, run the interactive gcloud commands below to prepare the GCP environment:

gcloud init  
gcloud config set accessibility/screen_reader true  
gcloud auth application-default login  

Remember to update terraform.tfvars with your own GCP project_id:

project_id = "xxxxxxxx"

Make sure to enable the GKE API if it isn't already:

gcloud services enable container.googleapis.com

Now run the Terraform script:

terraform init
terraform apply

The whole process should take about 7 to 10 minutes, and you should get an output like this:

Now register the cluster and update the kubeconfig file:

[root@cloud-ops01 tf-gcp-gke]# gcloud container clusters get-credentials node-pool-cluster-demo --region australia-southeast1
Fetching cluster endpoint and auth data.
kubeconfig entry generated for node-pool-cluster-demo.

Step-2: Verify the GKE Cluster Status

Check that we can access the GKE cluster; there should be 4x worker nodes provisioned.

[root@cloud-ops01 ~]# kubectl get nodes
NAME                                               STATUS   ROLES    AGE     VERSION
gke-node-pool-cluster-demo-pool-01-03a2c598-34lh   Ready    <none>   8m59s   v1.16.9-gke.2
gke-node-pool-cluster-demo-pool-01-03a2c598-tpwq   Ready    <none>   9m      v1.16.9-gke.2
gke-node-pool-cluster-demo-pool-01-e903c7a8-04cf   Ready    <none>   9m5s    v1.16.9-gke.2
gke-node-pool-cluster-demo-pool-01-e903c7a8-0lt8   Ready    <none>   9m5s    v1.16.9-gke.2

This can also be verified on the GKE console.

The 4x worker nodes are provisioned across 2x managed instance groups in two different AZs.

Run kubectl describe nodes and we can see that each node has been tagged with a few customised labels based on its unique properties. These are important metadata that can be used for selective Pod/node placement and other use cases such as affinity or anti-affinity rules.
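For instance, the GKE-populated label cloud.google.com/gke-nodepool can pin a Pod to a specific node pool via a nodeSelector. A sketch (the Pod and image are illustrative; the pool name matches this demo's nodes):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pool-pinned-demo
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: pool-01   # GKE-populated node pool label
  containers:
    - name: web
      image: nginx:1.17
```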

Step-3: Deploy GKE Add-on Services

  • Install Metrics-Server to provide cluster-wide resource metrics collection and to support use cases such as Horizontal Pod Autoscaling (HPA)
[root@cloud-ops01 tf-gcp-gke]# kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml

Wait a few seconds and we should have resource stats:

[root@cloud-ops01 tf-gcp-gke]# kubectl top nodes
NAME                                               CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
gke-node-pool-cluster-demo-pool-01-03a2c598-34lh   85m          4%     798Mi           14%       
gke-node-pool-cluster-demo-pool-01-03a2c598-tpwq   300m         15%    816Mi           14%       
gke-node-pool-cluster-demo-pool-01-e903c7a8-04cf   191m         9%     958Mi           16%       
gke-node-pool-cluster-demo-pool-01-e903c7a8-0lt8   102m         5%     795Mi           14%    
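With Metrics Server in place, an HPA can consume those stats. A minimal example (the target Deployment name "frontend" is a placeholder for any existing deployment):

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend              # placeholder: an existing Deployment to scale
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 60
```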
  • Next, deploy an NGINX Ingress Controller so we can use L7 URL load balancing and save costs by reducing the number of external load balancers required
[root@cloud-ops01 tf-gcp-gke]# kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-0.32.0/deploy/static/provider/cloud/deploy.yaml  

On the GCP console we can see that an external load balancer has been provisioned in front of the Ingress Controller. Take note of the LB address below; this is the public IP that will be consumed by our ingress services.

In addition, we’ll deploy 2x storage classes to provide dynamic persistent storage support for stateful pods and services. Note the different persistent disk (PD) specs (standard & SSD) for different I/O requirements.

 [root@cloud-ops01 tf-gcp-gke]# kubectl create -f ./storage/storageclass/  
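As a sketch of what such a class looks like on GKE, here is an SSD-backed example (the class name is illustrative; the repo's manifests define the actual two classes):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pd-ssd-class           # illustrative name
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd                 # use pd-standard for the default-performance class
```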

Step-4: Deploy Sample Apps onto the GKE Cluster for Testing

  • We’ll first deploy the famous Hipster Shop app, which is a cloud-native microservice application developed by Google.
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/microservices-demo/master/release/kubernetes-manifests.yaml  

Wait for all the Pods to be up and running:

[root@cloud-ops01 tf-gcp-gke]# kubectl get pods 
NAME                                     READY   STATUS    RESTARTS   AGE
adservice-687b58699c-fq9x4               1/1     Running   0          2m16s
cartservice-778cffc8f6-dnxmr             1/1     Running   0          2m20s
checkoutservice-98cf4f4c-69fqg           1/1     Running   0          2m26s
currencyservice-c69c86b7c-mz5zv          1/1     Running   0          2m19s
emailservice-5db6c8b59f-jftv7            1/1     Running   0          2m27s
frontend-8d8958c77-s9665                 1/1     Running   0          2m24s
loadgenerator-6bf9fd5bc9-5lsrn           1/1     Running   3          2m19s
paymentservice-698f684cf9-7xbjc          1/1     Running   0          2m22s
productcatalogservice-789c77b8dc-4tk4w   1/1     Running   0          2m21s
recommendationservice-75d7cd8d5c-4x9kl   1/1     Running   0          2m25s
redis-cart-5f59546cdd-8tj8f              1/1     Running   0          2m17s
shippingservice-7d87945947-nhb5x         1/1     Running   0          2m18s

Check the external frontend service; you should see an LB deployed by GKE with a public IP assigned:

[root@cloud-ops01 ~]# kubectl get svc frontend-external 
NAME                TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
frontend-external   LoadBalancer   80:32408/TCP   5m32s

You should be able to access the app via the LB public IP.

  • Next, we’ll deploy the sample Guestbook app to verify the persistent storage setup.
[root@cloud-ops01 tf-gcp-gke]# kubectl create ns guestbook-app  
[root@cloud-ops01 tf-gcp-gke]# kubectl apply -f ./demo-apps/guestbook/  

The application requests 2x persistent volumes (PVs) for the redis-master and redis-slave pods. Both PVs should be automatically provisioned through persistent volume claims (PVCs) against the 2x storage classes we deployed earlier. You should see the STATUS reported as “Bound” for each PV/PVC mapping.
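For reference, such a claim looks roughly like this (names and size are illustrative; the guestbook manifests in the repo define the real ones):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-master-claim         # illustrative name
  namespace: guestbook-app
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: pd-ssd-class   # illustrative; use one of the classes deployed earlier
  resources:
    requests:
      storage: 10Gi
```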

Retrieve the external IP/DNS for the frontend service of the Guestbook app.

[root@cloud-ops01 tf-gcp-gke]# kubectl get svc frontend -n guestbook-app 
NAME       TYPE           CLUSTER-IP        EXTERNAL-IP    PORT(S)        AGE
frontend   LoadBalancer   80:31006/TCP   23m

You should be able to access the Guestbook app now. Enter and submit some messages, then try destroying and redeploying the app; your data will be retained by the redis PVs.

  • Lastly, we'll deploy a modified version of the yelb app to test the NGINX Ingress Controller
[root@cloud-ops01 tf-gcp-gke]# kubectl create ns yelb  
[root@cloud-ops01 tf-gcp-gke]# kubectl apply -f ./demo-apps/yelb/

You should see an ingress service deployed as per below.

Retrieve the external IP for the ingress service within the yelb namespace. As mentioned before, this should be the same address as the external LB deployed for the Ingress Controller.

[root@cloud-ops01 tf-gcp-gke]# kubectl get ingresses -n yelb 
NAME           HOSTS        ADDRESS       PORTS   AGE
yelb-ingress   yelb.local   80      6m47s

Also, notice the ingress URL host is defined as “yelb.local”. This is the DNS entry that the HTTP ingress service routes on, so we'll update the local hosts file (with the ingress public IP) for quick testing.
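The host-based rule behind this behaviour looks roughly like the following (networking.k8s.io/v1beta1 matches the K8s 1.16 / ingress-nginx 0.32 era of this post; the backend service name is an assumption about the modified yelb manifests):

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: yelb-ingress
  namespace: yelb
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: yelb.local               # requests for this host are routed to the backend
      http:
        paths:
          - path: /
            backend:
              serviceName: yelb-ui   # assumption: the yelb frontend service
              servicePort: 80
```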

[root@cloud-ops01 tf-gcp-gke]# echo "  yelb.local" >> /etc/hosts  

And that's it: incoming requests to “yelb.local” are now routed via the ingress service to the yelb frontend pod running on our GKE cluster.