[{"content":"This is the 5th episode of our NKE lab series. In this episode, I will demonstrate how you can easily build a fully-automated GitOps continuous delivery (CD) pipeline, by using GitHub, NKE and Argo CD.\nGitOps is an operational framework that takes DevOps best practices (such as version control, Infra-as-Code, CI/CD etc), and applies them to modern and cloud native infrastructure such as Kubernetes-based clusters.\nThere are two GitOps approaches: Push-based and Pull-based, and you can read more about each model here. This post will focus on the Pull-based approach as it provides many benefits such as better version control and governance, more automation and self-service capabilities, and easier rollback and auditing/compliance, which makes it suitable for large and stable production environments.\nBelow is an overview of the lab architecture (boxed environment) – specifically we’ll build a pull-based GitOps pipeline using GitHub, Argo CD and NKE. Argo CD will automatically pull the k8s application config from GitHub and deploy a guestbook demo app (as seen in Ep2) onto our NKE cluster.\nI will walk through the following steps:\nPART-1: Prepare a NKE cluster PART-2: Install Argo CD onto your NKE cluster PART-3: Deploy the GitOps CD pipeline using Argo CD pre-requisites a 1-node or 3-node Nutanix CE 2.0 cluster deployed in nested virtualization depending on your lab compute capacity, as documented here and here a NKE-enabled K8s cluster deployed in Nutanix CE (see Ep1) a lab network environment that supports VLAN tagging and provides basic infra services such as AD, DNS, NTP etc (these are required when installing the CE cluster) a Linux/Mac workstation for managing the Kubernetes cluster, with kubectl installed. 
fork the Git repo which includes the Argo CD k8s app config file prepare a Nutanix File Server (see part-2 at here), which is required to provide multi-access persistent storage for the Redis cluster within the demo app PART-1: Prepare a NKE cluster If you haven’t got a NKE cluster yet, follow the guide here to build one first. Once you have the cluster ready, navigate to Storage \u0026gt; Storage Classes to deploy a file-based storage class using the Nutanix Files (as listed in the prerequisites). This is required to provide multi-access persistent volumes (PVs) for the Redis followers within the demo app.\nYou should see the new (file-based) storage class popping up in the NKE cluster immediately.\nsc@vx-ops03:~$ kubectl get storageclasses.storage.k8s.io NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE default-storageclass (default) csi.nutanix.com Delete Immediate true 23h files-storageclass csi.nutanix.com Delete Immediate false 58s sc@vx-ops03:~$ kubectl describe storageclasses.storage.k8s.io files-storageclass Name: files-storageclass IsDefaultClass: No Annotations: \u0026lt;none\u0026gt; Provisioner: csi.nutanix.com Parameters: nfsPath=/k8s,nfsServer=fs.vxlan.co,storageType=NutanixFiles AllowVolumeExpansion: \u0026lt;unset\u0026gt; MountOptions: \u0026lt;none\u0026gt; ReclaimPolicy: Delete VolumeBindingMode: Immediate Events: \u0026lt;none\u0026gt; Next, we’ll prepare a Load Balancer service for our NKE cluster. You can skip this step if you have already deployed a LB controller within your cluster.\nFor this, I’ll simply install MetalLB in my cluster using L2 mode. Follow the installation guide to deploy MetalLB.\nkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.8/config/manifests/metallb-native.yaml ... 
sc@vx-ops03:~$ kubectl get all -n metallb-system NAME READY STATUS RESTARTS AGE pod/controller-77676c78d9-f5552 1/1 Running 0 60s pod/speaker-6nhh5 1/1 Running 0 60s pod/speaker-nqj9p 1/1 Running 0 60s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/metallb-webhook-service ClusterIP 10.21.19.162 \u0026lt;none\u0026gt; 443/TCP 60s NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE daemonset.apps/speaker 2 2 2 2 2 kubernetes.io/os=linux 60s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/controller 1/1 1 1 60s NAME DESIRED CURRENT READY AGE replicaset.apps/controller-77676c78d9 1 1 1 60s Once all MetalLB components are up and running, apply the following config file to enable the LB service in L2 mode. Note that you need to change the LB IP address pool to match your lab environment – it should be in the same subnet as your NKE cluster.\nsc@vx-ops03:~/nke-guestbook-demo/MetalLB-config$ cat metallb_config.yaml --- apiVersion: metallb.io/v1beta1 kind: IPAddressPool metadata: name: default namespace: metallb-system spec: addresses: - 192.168.102.20-192.168.102.29 autoAssign: true --- apiVersion: metallb.io/v1beta1 kind: L2Advertisement metadata: name: default namespace: metallb-system spec: ipAddressPools: - default sc@vx-ops03:~/nke-guestbook-demo/MetalLB-config$ sc@vx-ops03:~/nke-guestbook-demo/MetalLB-config$ kubectl apply -f metallb_config.yaml ipaddresspool.metallb.io/default created l2advertisement.metallb.io/default created We are now ready to install Argo CD into our NKE cluster.\nPART-2: Install Argo CD onto the NKE cluster To install Argo CD, simply follow the official guide here.\nkubectl create namespace argocd kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml You’ll see a bunch of resources being created within the NKE cluster (argocd namespace). 
Specifically, the Argo CD application controller is a Kubernetes controller that continuously monitors running applications and compares the current, live state against the desired target state (as specified in the repo). When it detects a mismatch (called out-of-sync), and depending on your configuration, Argo CD can automatically pull the latest app config from the repository and deploy it to the designated K8s cluster.\nsc@vx-ops03:~$ kubectl get all -n argocd NAME READY STATUS RESTARTS AGE pod/argocd-application-controller-0 1/1 Running 0 104s pod/argocd-applicationset-controller-587b5c864b-gdnsn 1/1 Running 0 105s pod/argocd-dex-server-6958d7dcf4-ff589 1/1 Running 0 105s pod/argocd-notifications-controller-6847bd5c98-9h8nf 1/1 Running 0 105s pod/argocd-redis-bfcfd667f-nxx8x 1/1 Running 0 105s pod/argocd-repo-server-9646985c8-sfshk 1/1 Running 0 105s pod/argocd-server-9d6c97757-88mhk 1/1 Running 0 105s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/argocd-applicationset-controller ClusterIP 10.21.137.108 \u0026lt;none\u0026gt; 7000/TCP,8080/TCP 105s service/argocd-dex-server ClusterIP 10.21.244.74 \u0026lt;none\u0026gt; 5556/TCP,5557/TCP,5558/TCP 105s service/argocd-metrics ClusterIP 10.21.133.37 \u0026lt;none\u0026gt; 8082/TCP 105s service/argocd-notifications-controller-metrics ClusterIP 10.21.195.247 \u0026lt;none\u0026gt; 9001/TCP 105s service/argocd-redis ClusterIP 10.21.250.115 \u0026lt;none\u0026gt; 6379/TCP 105s service/argocd-repo-server ClusterIP 10.21.191.71 \u0026lt;none\u0026gt; 8081/TCP,8084/TCP 105s service/argocd-server ClusterIP 10.21.213.76 \u0026lt;none\u0026gt; 80/TCP,443/TCP 105s service/argocd-server-metrics ClusterIP 10.21.55.77 \u0026lt;none\u0026gt; 8083/TCP 105s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/argocd-applicationset-controller 1/1 1 1 105s deployment.apps/argocd-dex-server 1/1 1 1 105s deployment.apps/argocd-notifications-controller 1/1 1 1 105s deployment.apps/argocd-redis 1/1 1 1 105s 
deployment.apps/argocd-repo-server 1/1 1 1 105s deployment.apps/argocd-server 1/1 1 1 105s NAME DESIRED CURRENT READY AGE replicaset.apps/argocd-applicationset-controller-587b5c864b 1 1 1 105s replicaset.apps/argocd-dex-server-6958d7dcf4 1 1 1 105s replicaset.apps/argocd-notifications-controller-6847bd5c98 1 1 1 105s replicaset.apps/argocd-redis-bfcfd667f 1 1 1 105s replicaset.apps/argocd-repo-server-9646985c8 1 1 1 105s replicaset.apps/argocd-server-9d6c97757 1 1 1 105s NAME READY AGE statefulset.apps/argocd-application-controller 1/1 105 Next, we’ll need to expose the Argo CD server externally so we can access the Web UI. This can be easily achieved using the MetalLB service we deployed before.\nWe’ll patch the existing argocd-server service and change the service type from ClusterIP to LoadBalancer.\nsc@vx-ops03:~$ kubectl patch svc argocd-server -n argocd -p \u0026#39;{\u0026#34;spec\u0026#34;: {\u0026#34;type\u0026#34;: \u0026#34;LoadBalancer\u0026#34;}}\u0026#39; service/argocd-server patched and you should see its service type now changed to LoadBalancer, with an external IP automatically assigned from the LB address pool as configured earlier.\nsc@vx-ops03:~$ kubectl get svc -n argocd argocd-server NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE argocd-server LoadBalancer 10.21.213.76 192.168.102.20 80:31375/TCP,443:30619/TCP 113m You should now be able to access the Web UI. But first let’s grab the initial admin password.\nsc@vx-ops03:~$ kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath=\u0026#34;{.data.password}\u0026#34; | base64 -d Now open a browser and go to the LB address for argocd-server (192.168.102.20 in my case); you should be able to log in as admin with the initial password. Once logged in, you can update the password under User Info \u0026gt; Update password.\nPART-3: Deploy the GitOps CD pipeline using Argo CD It’s time to deploy the GitOps CD pipeline. 
For this, I have prepared an Argo CD application config file at the Git repo here.\nsc@vx-ops03:~/nke-guestbook-demo/argocd$ cat application.yml apiVersion: argoproj.io/v1alpha1 kind: Application metadata: name: guestbook-argo-demo namespace: argocd spec: project: default source: repoURL: https://github.com/sc13912/nke-guestbook-demo.git targetRevision: HEAD path: argocd/dev destination: server: https://kubernetes.default.svc namespace: myguestbook syncPolicy: syncOptions: - CreateNamespace=true automated: selfHeal: true prune: true The above Argo CD config file defines the following:\nIt tells Argo CD to monitor my nke-guestbook-demo (path: argocd/dev) as the source repo for the demo app configuration; It tells Argo CD to deploy the demo app to the target K8s cluster (ie. local NKE cluster), within the myguestbook namespace; It tells Argo CD to auto create the namespace if it doesn’t exist in the target cluster; It also enables auto sync (disabled by default), with automatic self-healing and pruning — you can read more about these at here. Now let’s go ahead and apply the Argo CD config:\nsc@vx-ops03:~/nke-guestbook-demo/argocd$ kubectl apply -f application.yml application.argoproj.io/guestbook-argo-demo created Since currently there is no app managed by Argo CD, it will immediately detect a mismatch (out-of-sync) and automatically pull the demo app config from the Git repo and deploy it onto the local NKE cluster.\nWe’ll see a bunch of resources being created within the NKE cluster. 
Take a note of the frontend LoadBalancer external address (192.168.102.21).\nsc@vx-ops03:~/nke-guestbook-demo/argocd$ kubectl get all -n myguestbook NAME READY STATUS RESTARTS AGE pod/frontend-795b566649-cpp7x 1/1 Running 0 5m8s pod/frontend-795b566649-k4z5r 1/1 Running 0 5m8s pod/frontend-795b566649-zcv7x 1/1 Running 0 5m8s pod/redis-follower-5ffdf87b7d-f8dzg 1/1 Running 0 5m8s pod/redis-follower-5ffdf87b7d-v8tcc 1/1 Running 0 5m8s pod/redis-leader-c767d6dbb-h8hwn 1/1 Running 0 5m8s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/frontend LoadBalancer 10.21.130.37 192.168.102.21 80:31512/TCP 5m8s service/redis-follower ClusterIP 10.21.130.41 \u0026lt;none\u0026gt; 6379/TCP 5m8s service/redis-leader ClusterIP 10.21.227.65 \u0026lt;none\u0026gt; 6379/TCP 5m8s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/frontend 3/3 3 3 5m8s deployment.apps/redis-follower 2/2 2 2 5m8s deployment.apps/redis-leader 1/1 1 1 5m8s NAME DESIRED CURRENT READY AGE replicaset.apps/frontend-795b566649 3 3 3 5m9s replicaset.apps/redis-follower-5ffdf87b7d 2 2 2 5m9s replicaset.apps/redis-leader-c767d6dbb 1 1 1 5m9s Open a browser and go to the LB address; you should see the Guestbook page come up.\nBack at the Argo CD console, we can see the app status is now fully synced, and you can easily visualize which resources (Pods, PVCs, Services etc) have been deployed. You can also click into each individual resource to examine details such as the live manifest and logs.\nTo test the automatic sync mechanism, we can make a simple change at the source repo: let’s update the Redis follower replica count (from 2 to 3) and commit the change.\nBy default, Argo CD checks the source repo at a 3-minute interval. 
So after about 3 minutes you should see Argo CD detect the config update from the source repo and automatically deploy an additional Pod for the Redis follower ReplicaSet.\n","permalink":"https://route179.dev/2024/09/13/nke-lab-series-ep5-build-a-gitops-cd-pipeline-using-github-nke-and-argo-cd/","summary":"\u003cp\u003eThis is the 5th episode of our \u003ca href=\"/tags/nke/\"\u003eNKE lab series\u003c/a\u003e. In this episode, I will demonstrate how you can easily build a fully-automated GitOps continuous delivery (CD) pipeline, by using GitHub, NKE and \u003ca href=\"https://argoproj.github.io/cd/\"\u003eArgo CD\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://about.gitlab.com/topics/gitops/\"\u003eGitOps\u003c/a\u003e is an operational framework that takes DevOps best practices (such as version control, Infra-as-Code, CI/CD etc), and applies them to modern and cloud native infrastructure such as Kubernetes-based clusters.\u003c/p\u003e\n\u003cp\u003eThere are two GitOps approaches: Push-based and Pull-based, and you can read more about each model \u003ca href=\"https://www.harness.io/blog/gitops-the-push-and-pull-approach\"\u003ehere\u003c/a\u003e. This post will focus on the Pull-based approach as it provides many benefits such as better version control and governance, more automation and self-service capabilities, and easier rollback and auditing/compliance, which makes it suitable for large and stable production environments.\u003c/p\u003e","title":"NKE Lab series – Ep5: Build a GitOps CD pipeline using GitHub, NKE and Argo CD"},{"content":"This is the 4th episode of our NKE lab series. Previously, I have demonstrated how you can easily deploy a NKE cluster in a Nutanix CE lab environment, and I have explored some NKE platform features including out-of-the-box CSI and CNI support. 
In this episode, we’ll take a look at how you can accelerate Kubernetes application development by integrating NKE with Nutanix Database Service (NDB).\nNDB is a Database-as-a-Service designed to help developers speed up application development and simplify database administration across on-prem and public clouds. It simplifies database operations such as test DB provisioning/cloning and integrated snapshots/backup etc. It also provides a consistent “Database-as-Code” experience using REST API and K8s integrations. NDB supports most popular database engines, and you can read more about it here.\nFor this demo, we’ll deploy a containerized WordPress app onto a NKE cluster and connect it to a MySQL test database instance managed by NDB.\nSpecifically, I will walk through the following steps:\nPART-1: Deploy the NDB service via Nutanix Marketplace PART-2: Provision a MySQL test database instance in NDB PART-3: Deploy the demo app onto NKE by connecting to the test database in NDB pre-requisites a 1-node or 3-node Nutanix CE 2.0 cluster deployed in nested virtualization depending on your lab compute capacity, as documented here and here a NKE-enabled K8s cluster deployed in Nutanix CE (see Ep1) a lab network environment that supports VLAN tagging and provides basic infra services such as AD, DNS, NTP etc (these are required when installing the CE cluster) a Linux/Mac workstation for managing the Kubernetes cluster, with kubectl installed. PART-1: Deploy the NDB service To begin, log into your Prism Central and navigate to Admin Center \u0026gt; Marketplace, and deploy the “Database Service”.\nIn the deployment blueprint, provide a name for the NDB VM and select your (management) network, leave everything else as default and click “Deploy”.\nAfter the NDB service VM is deployed, you should see a new Database Service category under the navigation bar. 
It will take you to the NDB management console for initial service setup.\nTo register NDB, you’ll first need to supply your CE cluster virtual IP and login credentials.\nSelect subnet profile(s) to access PC management and iSCSI services etc.\nNext, configure DNS, NTP and SMTP based on your own environment.\nFor the Network Profile, it is recommended that you have a separate and dedicated VLAN for database provisioning, and the address pool should be managed by NDB (instead of IPAM).\nThat’s pretty much all you need to deploy and register the NDB service. You’ll see NDB creating a bunch of database profiles to help streamline the DB provisioning process (more on this in a moment). The whole process will take approximately 10 to 20 minutes.\nPART-2: Provision a test MySQL DB instance Once the NDB service is deployed and registered to your CE cluster, go to Databases \u0026gt; Source to provision a MySQL instance for our demo app.\nHere we’ll fast track the database provisioning process by simply selecting out-of-the-box software, compute and network profiles.\nYou’ll also have the option to upload a public key for passwordless SSH access to the DB VM.\nNext, you’ll define important properties such as DB name, TCP port, size etc. You can optionally attach customized pre-post scripts as well.\nFinally, you can select SLA profiles and customize snapshot settings etc. Click Provision to kick off the deployment.\nWithin around 10 to 20 minutes, you should see that the test MySQL instance is online and ready.\nIf you click into the test DB instance, you’ll find its IP address allocated by NDB.\nYou can also verify that an initial database of “wordpress” has been created, perfect!\nFrom your workstation (with the supplied public key) you should be able to SSH into the DB VM without a password.\nWhile we are here, we can log into the DB and create a test user for our demo app.\n[era@mysql-dev01 ~]$ mysql -u root -p ... 
mysql\u0026gt; mysql\u0026gt; CREATE USER \u0026#39;wordpress\u0026#39;@\u0026#39;%\u0026#39; IDENTIFIED BY \u0026#39;your_password\u0026#39;; Query OK, 0 rows affected (0.12 sec) mysql\u0026gt; GRANT ALL ON wordpress.* TO \u0026#39;wordpress\u0026#39;@\u0026#39;%\u0026#39;; Query OK, 0 rows affected (0.06 sec) mysql\u0026gt; FLUSH PRIVILEGES; Query OK, 0 rows affected (0.02 sec) mysql\u0026gt; exit PART-3: Deploy the demo app We are now ready to rollout our demo app onto the NKE cluster, and connecting to the test DB instance via NDB! I’m using a k8s sample WordPress app, and specifically we’ll only need the app component yaml file.\nBefore deploying it into my NKE cluster, I’ll just make the following changes:\nChange all resources to the k8s namespace of “wordpress” Update the database environment variables to our NDB test DB instance NOTE: I’m only putting password here for simple illustration, obviously you should never do this in production and it is always recommended to use K8s Secrets instead.\nAfter you update the yaml file, go ahead and deploy it.\nsc@vx-ops03:~/nke-labs/ndb-wp-demo$ kubectl apply -f ./ service/wordpress created persistentvolumeclaim/wp-pv-claim created deployment.apps/wordpress created This should create a bunch of k8s resources, including a WordPress application Pod, a PVC for the Pod to mount the website data at /var/www/html. 
The PV is automatically provisioned by using the Nutanix CSI driver, as we discussed in Ep2.\nsc@vx-ops03:~/nke-labs/ndb-wp-demo$ kubectl get pvc -n wp-demo NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE wp-pv-claim Bound pvc-e58b02ff-37f9-4f91-957f-5872b9728e33 20Gi RWO default-storageclass 54s sc@vx-ops03:~/nke-labs/ndb-wp-demo$ sc@vx-ops03:~/nke-labs/ndb-wp-demo$ kubectl get pv | grep wp-demo pvc-e58b02ff-37f9-4f91-957f-5872b9728e33 20Gi RWO Delete Bound wp-demo/wp-pv-claim default-storageclass 59s sc@vx-ops03:~/nke-labs/ndb-wp-demo$ sc@vx-ops03:~/nke-labs/ndb-wp-demo$ kubectl get pods -n wp-demo NAME READY STATUS RESTARTS AGE wordpress-69866957dd-89q8m 1/1 Running 0 72s Also, the deployment yaml will create a LoadBalancer service using the MetalLB controller. Take a note of the external IP address (192.168.102.12).\nsc@vx-ops03:~/nke-labs/ndb-wp-demo$ kubectl get svc -n wp-demo NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE wordpress LoadBalancer 10.20.105.22 192.168.102.12 80:31003/TCP 77s Now launch a browser and navigate to the LB IP address; you should see the WordPress setup page come online, perfect!\n(Note: if you are getting a database connection error, double check the DB address and login credentials within the deployment yaml file).\nIn this lab, we have successfully deployed a Kubernetes application stack without managing any VM or underlying infrastructure resources. By utilizing the power of the NKE and NDB platforms, we can significantly improve the efficiency of the application development process by reducing the heavy lifting of managing the underlying infrastructure.\n","permalink":"https://route179.dev/2024/08/29/nke-lab-series-ep4-accelerate-k8s-application-development-using-nke-with-nutanix-database-ndb/","summary":"\u003cp\u003eThis is the 4th episode of our \u003ca href=\"/tags/nke/\"\u003eNKE lab series\u003c/a\u003e. 
Previously, I have demonstrated how you can easily deploy a NKE cluster in a Nutanix CE lab environment, and I have explored some NKE platform features including out-of-the-box CSI and CNI support. In this episode, we’ll take a look at how you can accelerate Kubernetes application development by integrating NKE with Nutanix Database Service (NDB).\u003c/p\u003e\n\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/2024/08/29/nke-lab-series-ep4-accelerate-k8s-application-development-using-nke-with-nutanix-database-ndb/screenshot-2024-08-29-100955.png\"\u003e\u003c/p\u003e\n\u003cp\u003eNDB is a Database-as-a-Service designed to help developers speed up application development and simplify database administration across on-prem and public clouds. It simplifies database operations such as test DB provisioning/cloning and integrated snapshots/backup etc. It also provides a consistent “Database-as-Code” experience using REST API and K8s integrations. NDB supports most popular database engines, and you can read more about it \u003ca href=\"https://www.nutanix.com/au/products/database-service\"\u003ehere\u003c/a\u003e.\u003c/p\u003e","title":"NKE Lab series – Ep4: Accelerate K8s application development using NKE with Nutanix Database (NDB)"},{"content":"This is the 3rd episode of our NKE lab series. 
Previously, I have walked through:\nHow to deploy a NKE-enabled Kubernetes cluster in a nested Nutanix CE environment How to provide persistent storage to your NKE clusters using 2x Nutanix CSI options In this episode, we’ll deep dive into the NKE networking space by exploring the following:\nPART-1: Exploring Calico CNI deployment models within a NKE cluster PART-2: Applying standard Kubernetes network policy in a NKE cluster PART-3: Leveraging Calico specific policies in a NKE cluster pre-requisites a 1-node or 3-node Nutanix CE 2.0 cluster deployed in nested virtualization depending on your lab compute capacity, as documented here and here a NKE-enabled K8s cluster deployed in Nutanix CE (see Ep1) a Guestbook demo app deployed onto the NKE cluster (see Ep2) a lab network environment that supports VLAN tagging and provides basic infra services such as AD, DNS, NTP etc (these are required when installing the CE cluster) a Linux/Mac workstation for managing the Kubernetes cluster, with kubectl installed PART-1: Exploring Calico CNI models in NKE Calico is recognized as one of the most popular CNI plugins within the Kubernetes community, and it has been widely deployed in production thanks to its reliable performance and comprehensive networking and security features.\nCalico supports a variety of flexible Kubernetes networking deployment options, including:\nNon-Overlay network (most performant model with no encapsulation involved) Flat L2 mode (full-mesh BGP peering between k8s nodes to route Pod IPs) BGP (uses iBGP route reflectors to reduce peering pressure in a large cluster, can use ToR L3 switch, or just nodes as software reflectors) Overlay Network (for cross-subnet cluster connectivity, or where there is no BGP support) IP-in-IP encapsulation VXLAN encapsulation In a NKE cluster, Calico is pre-configured to use the (default) Flat L2 mode, where all K8s nodes are deployed within the same L2 subnet and establish a BGP full mesh to route Pod IP prefixes. 
This is the most simple yet performant solution and does not introduce additional dataplane encapsulation overhead.\nLet’s check our NKE cluster to find out more details. First let’s see what are the Calico-based CNI API resources made available to us.\nsc@vx-ops02:~$ kubectl api-resources | grep calico bgpconfigurations crd.projectcalico.org/v1 false BGPConfiguration bgpfilters crd.projectcalico.org/v1 false BGPFilter bgppeers crd.projectcalico.org/v1 false BGPPeer blockaffinities crd.projectcalico.org/v1 false BlockAffinity caliconodestatuses crd.projectcalico.org/v1 false CalicoNodeStatus clusterinformations crd.projectcalico.org/v1 false ClusterInformation felixconfigurations crd.projectcalico.org/v1 false FelixConfiguration globalnetworkpolicies crd.projectcalico.org/v1 false GlobalNetworkPolicy globalnetworksets crd.projectcalico.org/v1 false GlobalNetworkSet hostendpoints crd.projectcalico.org/v1 false HostEndpoint ipamblocks crd.projectcalico.org/v1 false IPAMBlock ipamconfigs crd.projectcalico.org/v1 false IPAMConfig ipamhandles crd.projectcalico.org/v1 false IPAMHandle ippools crd.projectcalico.org/v1 false IPPool ipreservations crd.projectcalico.org/v1 false IPReservation kubecontrollersconfigurations crd.projectcalico.org/v1 false KubeControllersConfiguration networkpolicies crd.projectcalico.org/v1 true NetworkPolicy networksets crd.projectcalico.org/v1 true NetworkSet If we query the ippools.crd.projectcalico.org API we can see the Calico deployment details.\nsc@vx-ops02:~$ kubectl get ippools.crd.projectcalico.org -o yaml \u0026gt; nke-dev01-calico-ippool.yaml sc@vx-ops02:~$ cat nke-dev01-calico-ippool.yaml ... spec: allowedUses: - Workload - Tunnel blockSize: 26 cidr: 172.20.0.0/16 ipipMode: Never natOutgoing: true nodeSelector: all() vxlanMode: Never kind: List As shown above, we are not using any overlay encapsulation (both IPinIP or VXLAN mode are off). 
Each K8s node will get a /26 block from the pre-allocated Pod CIDR (172.20.0.0/16) — this will be used to assign IP addresses to the local Pods.\nWe can also query the caliconodestatuses.crd.projectcalico.org to get further details for node networking. However, to use this function we’ll need to create a CalicoNodeStatus resource and specify which node and what information we want to collect.\nsc@vx-ops02:~$ cat calico-node-status.yaml --- apiVersion: crd.projectcalico.org/v1 kind: CalicoNodeStatus metadata: name: caliconodestatus-master-0 spec: classes: - Agent - BGP - Routes node: nke-dev01-89e792-master-0 updatePeriodSeconds: 10 --- apiVersion: crd.projectcalico.org/v1 kind: CalicoNodeStatus metadata: name: caliconodestatus-worker-0 spec: classes: - Agent - BGP - Routes node: nke-dev01-89e792-worker-0 updatePeriodSeconds: 10 --- apiVersion: crd.projectcalico.org/v1 kind: CalicoNodeStatus metadata: name: caliconodestatus-worker-1 spec: classes: - Agent - BGP - Routes node: nke-dev01-89e792-worker-1 updatePeriodSeconds: 10 sc@vx-ops02:~$ kubectl apply -f calico-node-status.yaml caliconodestatus.crd.projectcalico.org/caliconodestatus-master-0 created caliconodestatus.crd.projectcalico.org/caliconodestatus-worker-0 created caliconodestatus.crd.projectcalico.org/caliconodestatus-worker-1 created Once deployed, let it run for 30 seconds to collect the information and we can then query the caliconodestatuses.crd.projectcalico.org API:\nsc@vx-ops02:~$ kubectl get caliconodestatuses.crd.projectcalico.org -o yaml \u0026gt; nke-dev01-caliconodestatus.yaml sc@vx-ops02:~$ cat nke-dev01-caliconodestatus.yaml There’s a lot of information here, but we’ll focus only on this section under the master node:\n... 
bgp: numberEstablishedV4: 2 numberEstablishedV6: 0 numberNotEstablishedV4: 0 numberNotEstablishedV6: 0 peersV4: - peerIP: 192.168.102.104 since: \u0026#34;14:19:16\u0026#34; state: Established type: NodeMesh - peerIP: 192.168.102.103 since: \u0026#34;14:19:26\u0026#34; state: Established type: NodeMesh routes: routesV4: ... - destination: 172.20.188.64/26 gateway: 192.168.102.103 interface: eth0 learnedFrom: peerIP: 192.168.102.103 sourceType: NodeMesh type: FIB - destination: 172.20.116.128/26 gateway: 192.168.102.104 interface: eth0 learnedFrom: peerIP: 192.168.102.104 sourceType: NodeMesh type: FIB From above we can confirm the master node (192.168.102.102) has established full mesh with the other 2x worker nodes\nWorker-0: 192.168.102.103 (learned Pod routes 172.20.188.64/26 via BGP) Worker-1: 192.168.102.104 (learned Pod routes 172.20.116.128/26 via BGP) and we can further confirm the Pod routes by looking at Pod IP addresses on each worker node:\nsc@vx-ops02:~$ kubectl get pods --all-namespaces -o wide | grep nke-dev01-89e792-worker-1 | grep 172.20 guestbook frontend-795b566649-4zkqc 1/1 Running 0 82m 172.20.116.140 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; guestbook frontend-795b566649-8tf76 1/1 Running 0 82m 172.20.116.139 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; guestbook redis-follower-5ffdf87b7d-4lqlr 1/1 Running 0 82m 172.20.116.141 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; guestbook redis-leader-c767d6dbb-t5t8j 1/1 Running 0 82m 172.20.116.142 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; kube-system calico-kube-controllers-5cd67d7657-52km8 1/1 Running 0 11h 172.20.116.128 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; kube-system coredns-6fb596b5df-p7mhq 1/1 Running 0 11h 172.20.116.129 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; metallb-system controller-77676c78d9-5gvd6 1/1 Running 
0 93m 172.20.116.136 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system alertmanager-main-0 2/2 Running 1 (11h ago) 11h 172.20.116.133 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system blackbox-exporter-5458d77cfb-d62kc 3/3 Running 0 11h 172.20.116.132 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system csi-snapshot-webhook-756b45fb5c-t9k8k 1/1 Running 0 11h 172.20.116.131 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system fluent-bit-v8gdt 1/1 Running 0 11h 172.20.116.130 nke-dev01-89e792-worker-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; sc@vx-ops02:~$ sc@vx-ops02:~$ kubectl get pods --all-namespaces -o wide | grep nke-dev01-89e792-worker-0 | grep 172.20 guestbook frontend-795b566649-j4r75 1/1 Running 0 82m 172.20.188.73 nke-dev01-89e792-worker-0 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; kube-system coredns-6fb596b5df-4kkrc 1/1 Running 0 11h 172.20.188.64 nke-dev01-89e792-worker-0 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system csi-snapshot-controller-7d68bf5bd7-6fg9c 1/1 Running 0 11h 172.20.188.65 nke-dev01-89e792-worker-0 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system fluent-bit-t9d7c 1/1 Running 0 11h 172.20.188.66 nke-dev01-89e792-worker-0 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system kube-state-metrics-54c97cdfdd-rkm2m 3/3 Running 0 11h 172.20.188.69 nke-dev01-89e792-worker-0 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system kubernetes-events-printer-6f44868d47-5sg98 1/1 Running 0 11h 172.20.188.67 nke-dev01-89e792-worker-0 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system prometheus-adapter-678c647d87-rbkqc 1/1 Running 0 11h 172.20.188.70 nke-dev01-89e792-worker-0 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system prometheus-k8s-0 2/2 Running 0 11h 172.20.188.71 nke-dev01-89e792-worker-0 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; ntnx-system 
prometheus-operator-f57b8d9cb-kpwp9 2/2 Running 0 11h 172.20.188.68 nke-dev01-89e792-worker-0 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; PART-2: USING standard K8s network policy With Calico deployed in our NKE cluster, we can straight away use standard Kubernetes network policies to enhance cluster-wide security.\nFor example, by default we can access the Guestbook service from anywhere within the cluster. We can test this by launching a testpod within the default namespace and curling the frontend service (using the K8s DNS format service.namespace)\nsc@vx-ops02:~$ kubectl run testpod -it --rm --image=yauritux/busybox-curl -- sh /home # /home # curl frontend.guestbook \u0026lt;html ng-app=\u0026#34;redis\u0026#34;\u0026gt; \u0026lt;head\u0026gt; \u0026lt;title\u0026gt;Guestbook\u0026lt;/title\u0026gt; \u0026lt;link rel=\u0026#34;stylesheet\u0026#34; href=\u0026#34;//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css\u0026#34;\u0026gt; https://ajax.googleapis.com/ajax/libs/angularjs/1.2.12/angular.min.js http://controllers.js https://cdnjs.cloudflare.com/ajax/libs/angular-ui-bootstrap/2.5.6/ui-bootstrap-tpls.js \u0026lt;/head\u0026gt; \u0026lt;body ng-controller=\u0026#34;RedisCtrl\u0026#34;\u0026gt; \u0026lt;div style=\u0026#34;width: 50%; margin-left: 20px\u0026#34;\u0026gt; \u0026lt;h2\u0026gt;Guestbook\u0026lt;/h2\u0026gt; \u0026lt;form\u0026gt; \u0026lt;fieldset\u0026gt; \u0026lt;input ng-model=\u0026#34;msg\u0026#34; placeholder=\u0026#34;Messages\u0026#34; class=\u0026#34;form-control\u0026#34; type=\u0026#34;text\u0026#34; name=\u0026#34;input\u0026#34;\u0026gt;\u0026lt;br\u0026gt; \u0026lt;button type=\u0026#34;button\u0026#34; class=\u0026#34;btn btn-primary\u0026#34; ng-click=\u0026#34;controller.onRedis()\u0026#34;\u0026gt;Submit\u0026lt;/button\u0026gt; \u0026lt;/fieldset\u0026gt; \u0026lt;/form\u0026gt; \u0026lt;div\u0026gt; \u0026lt;div ng-repeat=\u0026#34;msg in messages track by $index\u0026#34;\u0026gt; {{msg}} \u0026lt;/div\u0026gt; 
\u0026lt;/div\u0026gt; \u0026lt;/div\u0026gt; \u0026lt;/body\u0026gt; \u0026lt;/html\u0026gt; So what if I want to allow only a certain namespace to access and consume the guestbook service? For example, we can use the K8s ingress network policy below to allow access to our guestbook service only from the “testns” namespace.\nsc@vx-ops02:~$ kubectl create ns testns sc@vx-ops02:~$ kubectl get ns --show-labels | grep testns testns Active 40m kubernetes.io/metadata.name=testns sc@vx-ops02:~$ cat net-policy.yaml apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: mypolicy namespace: guestbook spec: policyTypes: - Ingress ingress: - from: - namespaceSelector: matchLabels: kubernetes.io/metadata.name: testns sc@vx-ops02:~$ kubectl apply -f net-policy.yaml networkpolicy.networking.k8s.io/mypolicy created After applying the ingress policy, we should only be able to access the frontend service from within the testns namespace. Open another terminal and launch a new testpod in the testns namespace to verify this.\nsc@vx-ops02:~$ kubectl run testpod -n testns -it --rm --image=yauritux/busybox-curl -- sh /home # curl frontend.guestbook \u0026lt;html ng-app=\u0026#34;redis\u0026#34;\u0026gt; \u0026lt;head\u0026gt; \u0026lt;title\u0026gt;Guestbook\u0026lt;/title\u0026gt; \u0026lt;link rel=\u0026#34;stylesheet\u0026#34; href=\u0026#34;//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css\u0026#34;\u0026gt; https://ajax.googleapis.com/ajax/libs/angularjs/1.2.12/angular.min.js http://controllers.js https://cdnjs.cloudflare.com/ajax/libs/angular-ui-bootstrap/2.5.6/ui-bootstrap-tpls.js \u0026lt;/head\u0026gt; \u0026lt;body ng-controller=\u0026#34;RedisCtrl\u0026#34;\u0026gt; \u0026lt;div style=\u0026#34;width: 50%; margin-left: 20px\u0026#34;\u0026gt; \u0026lt;h2\u0026gt;Guestbook\u0026lt;/h2\u0026gt; \u0026lt;form\u0026gt; \u0026lt;fieldset\u0026gt; \u0026lt;input ng-model=\u0026#34;msg\u0026#34; placeholder=\u0026#34;Messages\u0026#34; 
class=\u0026#34;form-control\u0026#34; type=\u0026#34;text\u0026#34; name=\u0026#34;input\u0026#34;\u0026gt;\u0026lt;br\u0026gt; \u0026lt;button type=\u0026#34;button\u0026#34; class=\u0026#34;btn btn-primary\u0026#34; ng-click=\u0026#34;controller.onRedis()\u0026#34;\u0026gt;Submit\u0026lt;/button\u0026gt; \u0026lt;/fieldset\u0026gt; \u0026lt;/form\u0026gt; \u0026lt;div\u0026gt; \u0026lt;div ng-repeat=\u0026#34;msg in messages track by $index\u0026#34;\u0026gt; {{msg}} \u0026lt;/div\u0026gt; \u0026lt;/div\u0026gt; \u0026lt;/div\u0026gt; \u0026lt;/body\u0026gt; \u0026lt;/html\u0026gt; /home # Back at the previous terminal, we are now unable to connect to the frontend service from the first testpod within the default namespace, cool!\n/home # curl frontend.guestbook curl: (28) Failed to connect to frontend.guestbook port 80 after 127280 ms: Operation timed out PART-3: leveraging Calico specific policies One of the limitations with standard Kubernetes network policy is that you can only apply permit rules but deny rules are not supported. 
This makes it difficult to implement a “blacklist” scenario where you want to block only certain destinations but allow the rest.\nThe good news is that we can use Calico-based network policies, which support deny rules.\nIn the example below, we leverage the Calico CNI to create 2x egress rules for the guestbook namespace to deny outbound access to 8.0.0.0/8 but allow access to all other networks.\nsc@vx-ops02:~$ cat net-policy1.yaml --- apiVersion: crd.projectcalico.org/v1 kind: NetworkPolicy metadata: name: custom-deny-egress namespace: guestbook spec: order: 10 types: - Egress egress: - action: Deny destination: nets: - 8.0.0.0/8 --- apiVersion: crd.projectcalico.org/v1 kind: NetworkPolicy metadata: name: default-allow-egress namespace: guestbook spec: order: 20 types: - Egress egress: - action: Allow destination: nets: - 0.0.0.0/0 Now let’s apply the Calico policy.\nsc@vx-ops02:~$ kubectl apply -f net-policy1.yaml networkpolicy.crd.projectcalico.org/custom-deny-egress created networkpolicy.crd.projectcalico.org/default-allow-egress created To test this, we can simply find a pod in the guestbook namespace, attach to it and run some ping tests.\nsc@vx-ops02:~$ kubectl get pods -n guestbook NAME READY STATUS RESTARTS AGE frontend-795b566649-4zkqc 1/1 Running 0 7h3m frontend-795b566649-8tf76 1/1 Running 0 7h3m frontend-795b566649-j4r75 1/1 Running 0 7h3m redis-follower-5ffdf87b7d-4lqlr 1/1 Running 0 7h3m redis-leader-c767d6dbb-t5t8j 1/1 Running 0 7h3m sc@vx-ops02:~$ kubectl exec -ti -n guestbook frontend-795b566649-j4r75 -- sh # apt-get install iputils-ping # ping 8.8.8.8 PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data. ^C --- 8.8.8.8 ping statistics --- 7 packets transmitted, 0 received, 100% packet loss, time 6ms # ping 1.1.1.1 PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data. 
64 bytes from 1.1.1.1: icmp_seq=1 ttl=55 time=10.4 ms 64 bytes from 1.1.1.1: icmp_seq=2 ttl=55 time=9.33 ms ^C --- 1.1.1.1 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 2ms rtt min/avg/max/mdev = 9.330/9.856/10.382/0.526 ms # # ping 192.168.100.5 PING 192.168.100.5 (192.168.100.5) 56(84) bytes of data. 64 bytes from 192.168.100.5: icmp_seq=1 ttl=126 time=0.877 ms 64 bytes from 192.168.100.5: icmp_seq=2 ttl=126 time=0.792 ms ^C As we can see, from the frontend pod we are unable to ping Google (8.8.8.8) but can still reach the other networks – exactly what we expected!\n","permalink":"https://route179.dev/2024/08/08/nke-lab-series-ep3-deep-dive-into-nke-networking-with-calico-cni/","summary":"\u003cp\u003eThis is the 3rd episode of our NKE lab series. Previously, I have walked through:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003ca href=\"/2024/08/08/nutanix-kubernetes-engine-nke-lab-series-ep1-create-a-nke-enabled-kubernetes-cluster-on-nutanix-community-edition-ce/\"\u003eHow to deploy a NKE-enabled Kubernetes cluster in a nested Nutanix CE environment\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"/2024/08/08/nke-lab-series-ep2-deploy-a-multi-tier-web-application-on-a-nke-enabled-kubernetes-cluster-using-nutanix-csi/\"\u003eHow to provide persistent storage to your NKE clusters using 2x Nutanix CSI options \u003c/a\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eIn this episode, we’ll deep dive into the NKE networking spaces by exploring the following:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003ePART-1: Exploring Calico CNI deployment models within a NKE cluster\u003c/li\u003e\n\u003cli\u003ePART-2: Applying standard Kubernetes network policy in a NKE cluster\u003c/li\u003e\n\u003cli\u003ePART-3: Leveraging Calico specific policies in a NKE cluster\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"pre-requisites\"\u003e\u003cstrong\u003epre-requisites\u003c/strong\u003e\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003ea 
1-node or 3-node \u003ca href=\"https://next.nutanix.com/discussion-forum-14/download-community-edition-38417\"\u003eNutanix CE 2.0\u003c/a\u003e cluster deployed in nested virtualization depending on your lab compute capacity, as documented \u003ca href=\"https://www.jeroentielen.nl/installing-nutanix-community-edition-ce-on-vmware-esxi-vsphere/\"\u003ehere\u003c/a\u003e and \u003ca href=\"https://polarclouds.co.uk/nested-nutanix-ce-deployment/\"\u003ehere\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003ea NKE-enabled K8s cluster deployed in Nutanix CE (\u003ca href=\"/2024/08/08/nutanix-kubernetes-engine-nke-lab-series-ep1-create-a-nke-enabled-kubernetes-cluster-on-nutanix-community-edition-ce/\"\u003esee Ep1\u003c/a\u003e)\u003c/li\u003e\n\u003cli\u003ea Guestbook demo app deployed onto the NKE cluster (\u003ca href=\"/2024/08/08/nke-lab-series-ep2-deploy-a-multi-tier-web-application-on-a-nke-enabled-kubernetes-cluster-using-nutanix-csi/\"\u003esee Ep2\u003c/a\u003e)\u003c/li\u003e\n\u003cli\u003ea lab network environment that supports VLAN tagging and provides basic infra services such as AD, DNS, NTP etc (these are required when installing the CE cluster)\u003c/li\u003e\n\u003cli\u003ea Linux/Mac workstation for managing the Kubernetes cluster, with \u003cstrong\u003eKubectl installed\u003c/strong\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"part-1-exploring-calico-cni-models-in-nke\"\u003ePART-1: Exploring Calico CNI models in NKE\u003c/h2\u003e\n\u003cp\u003eCalico is recognized as one of the \u003ca href=\"https://www.tigera.io/blog/calico-in-2020-the-worlds-most-popular-kubernetes-cni/\"\u003emost popular CNI plugins\u003c/a\u003e within the Kubernetes community, and it has been widely deployed in production thanks to its reliable performance and comprehensive networking and security features.\u003c/p\u003e","title":"NKE lab series – Ep3: Deep dive into NKE networking with Calico CNI"},{"content":"This is the 2nd episode of our NKE lab series. 
In the 1st episode, I demonstrated how you can easily deploy an enterprise-grade NKE cluster in a Nutanix CE lab environment with nested virtualization.\nIn this episode, we’ll deploy a containerized multi-tier web application onto our NKE cluster, by leveraging the built-in Nutanix CSI driver to provide persistent storage for the demo app.\nSpecifically, we’ll explore 2x Nutanix CSI options:\nPART-1: default storage class – via Nutanix Volumes PART-2: files storage class – via Nutanix Files Manager pre-requisites a 1-node or 3-node Nutanix CE 2.0 cluster deployed in nested virtualization depending on your lab compute capacity, as documented here and here a NKE-enabled K8s cluster deployed in Nutanix CE (see Ep1) a lab network environment that supports VLAN tagging and provides basic infra services such as AD, DNS, NTP etc (these are required when installing the CE cluster) a Linux/Mac workstation for managing the Kubernetes cluster, with Kubectl installed clone the demo app Git repository to the workstation PART-1: deploy demo app using the default Storage Class (Nutanix Volumes) Before we start, let’s take a close look at the demo app, which is a simple Guestbook message board. It is a containerized web application using PHP \u0026amp; Redis, originally developed by Google as a GKE demo app. (Also see the generic K8s (non-GKE) user guide)\nThe k8s deployment files here are customized to demonstrate seamless integration with NKE platform capabilities such as out-of-the-box Container Storage Interface (CSI) and Container Network Interface (CNI) support.\nBy default, all Kubernetes Pods are deployed with ephemeral storage. This means the stored data will be gone when the Pod finishes or is restarted. 
In order to preserve the data (i.e. Guestbook messages), we can leverage the built-in Nutanix CSI driver to provide persistent storage for our Redis Leader and Redis Follower Pods.\nSpecifically, I have added the following sections to the Redis Leader and Redis Follower deployment files, instructing the Pods to store data in a Persistent Volume (PV) by requesting a Persistent Volume Claim (PVC).\nsc@vx-ops02:~/nke-labs/nke-guestbook-demo$ cat redis-leader-deployment.yaml ... spec: ... volumeMounts: - name: redis-leader-data mountPath: /data volumes: - name: redis-leader-data persistentVolumeClaim: claimName: redis-leader-claim sc@vx-ops02:~/nke-labs/nke-guestbook-demo$ cat redis-follower-deployment.yaml ... spec: ... volumeMounts: - name: redis-follower-data mountPath: /data volumes: - name: redis-follower-data persistentVolumeClaim: claimName: redis-follower-claim To do this, we’ll need to deploy the PVCs, which will trigger automatic PV provisioning by calling the default storage class API.\nFirst let’s verify the storage class status – notice it is powered by the Nutanix CSI driver (using the Nutanix Volume in the backend).\nsc@vx-ops02:~$ kubectl get storageclasses.storage.k8s.io NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE default-storageclass (default) csi.nutanix.com Retain Immediate true 9h Take a look at the PVC YAML config files\nsc@vx-ops02:~$ cd nke-labs/k8s-guestbook/pvc/ sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ cat guestbook-follower-claim.yaml kind: PersistentVolumeClaim apiVersion: v1 metadata: namespace: guestbook name: redis-follower-claim spec: accessModes: - ReadWriteOnce storageClassName: default-storageclass resources: requests: storage: 2Gi sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ cat guestbook-leader-claim.yaml kind: PersistentVolumeClaim apiVersion: v1 metadata: namespace: guestbook name: redis-leader-claim spec: accessModes: - ReadWriteOnce storageClassName: 
default-storageclass resources: requests: storage: 2Gi Go ahead and deploy the PVCs\nsc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ kubectl create ns guestbook namespace/guestbook created sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ kubectl apply -f ./ persistentvolumeclaim/redis-follower-claim created persistentvolumeclaim/redis-leader-claim created sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ In a few seconds, you should see 2x PVCs (Redis Leader \u0026amp; Follower) created, and both are showing “Bound” status to the corresponding PVs with defined capacity and access modes.\nsc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ kubectl get pvc -n guestbook NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE redis-follower-claim Bound pvc-66b235ed-4d90-47bc-b99b-982e9d562336 2Gi RWO default-storageclass 25s redis-leader-claim Bound pvc-042544d9-f621-48d0-a11f-c2ec57cacbf5 2Gi RWO default-storageclass 25s We can also see the 2x PVs automatically provisioned by the default storage class.\nsc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ kubectl get pv | grep guestbook pvc-042544d9-f621-48d0-a11f-c2ec57cacbf5 2Gi RWO Retain Bound guestbook/redis-leader-claim default-storageclass 114s pvc-66b235ed-4d90-47bc-b99b-982e9d562336 2Gi RWO Retain Bound guestbook/redis-follower-claim default-storageclass 114s and we can see the same under Prism Central:\nWith both PVCs created and PVs provisioned, we are ready to deploy the demo app.\nFirst, deploy the Redis Leader Pods and Services.\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl apply -f redis-leader-deployment.yaml -f redis-leader-service.yaml deployment.apps/redis-leader created service/redis-leader created sc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl get pod -n guestbook NAME READY STATUS RESTARTS AGE redis-leader-c767d6dbb-bbcxf 1/1 Running 0 9m44s sc@vx-ops02:~/nke-labs/k8s-guestbook$ sc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl get svc -n guestbook NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE 
redis-leader ClusterIP 10.20.150.67 \u0026lt;none\u0026gt; 6379/TCP 9m53s Next, deploy the Redis Follower Pods and Services.\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl apply -f redis-follower-deployment.yaml -f redis-follower-service.yaml deployment.apps/redis-follower created service/redis-follower created sc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl get pods -n guestbook NAME READY STATUS RESTARTS AGE redis-follower-5ffdf87b7d-9t9v5 1/1 Running 0 15s redis-leader-c767d6dbb-bbcxf 1/1 Running 0 10m sc@vx-ops02:~/nke-labs/k8s-guestbook$ sc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl get svc -n guestbook NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE redis-follower ClusterIP 10.20.23.235 \u0026lt;none\u0026gt; 6379/TCP 20s redis-leader ClusterIP 10.20.150.67 \u0026lt;none\u0026gt; 6379/TCP 10m Before we can deploy the Frontend service, we’ll need to install a K8s LoadBalancer Controller so we can expose the web frontend to the external network. For this, I’m using MetalLB which is a popular open-source LoadBalancer controller for baremetal or hybrid K8s environment.\nThe installation is easily done by using a single manifest.\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.8/config/manifests/metallb-native.yaml namespace/metallb-system created customresourcedefinition.apiextensions.k8s.io/bfdprofiles.metallb.io created customresourcedefinition.apiextensions.k8s.io/bgpadvertisements.metallb.io created customresourcedefinition.apiextensions.k8s.io/bgppeers.metallb.io created customresourcedefinition.apiextensions.k8s.io/communities.metallb.io created customresourcedefinition.apiextensions.k8s.io/ipaddresspools.metallb.io created customresourcedefinition.apiextensions.k8s.io/l2advertisements.metallb.io created customresourcedefinition.apiextensions.k8s.io/servicel2statuses.metallb.io created serviceaccount/controller created serviceaccount/speaker created role.rbac.authorization.k8s.io/controller created 
role.rbac.authorization.k8s.io/pod-lister created clusterrole.rbac.authorization.k8s.io/metallb-system:controller created clusterrole.rbac.authorization.k8s.io/metallb-system:speaker created rolebinding.rbac.authorization.k8s.io/controller created rolebinding.rbac.authorization.k8s.io/pod-lister created clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller created clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker created configmap/metallb-excludel2 created secret/metallb-webhook-cert created service/metallb-webhook-service created deployment.apps/controller created daemonset.apps/speaker created validatingwebhookconfiguration.admissionregistration.k8s.io/metallb-webhook-configuration created sc@vx-ops02:~/nke-labs/k8s-guestbook$ MetalLB can be deployed in either L2 or L3 (BGP) mode. For simplicity, we’ll choose a flat L2 mode, and that means we’ll need to reserve a IP pool from the same NKE subnet for the LoadBalancer services to consume. We’ll also configure MetalLB to advertise this IP pool in L2 mode so it will respond to external ARP requests for these addresses.\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ cd ~/nke-labs/k8s-guestbook/MetalLB-config/ sc@vx-ops02:~/nke-labs/k8s-guestbook/MetalLB-config$ cat metallb_config.yaml --- apiVersion: metallb.io/v1beta1 kind: IPAddressPool metadata: name: default namespace: metallb-system spec: addresses: - 192.168.102.10-192.168.102.19 autoAssign: true --- apiVersion: metallb.io/v1beta1 kind: L2Advertisement metadata: name: default namespace: metallb-system spec: ipAddressPools: - default sc@vx-ops02:~/nke-labs/k8s-guestbook/MetalLB-config$ kubectl apply -f metallb_config.yaml ipaddresspool.metallb.io/default created l2advertisement.metallb.io/default created Verify the MetalLB status and check the IP address pool advertised – make sure this range is outside the NKE DHCP range, and is not used by any other services on the subnet.\nsc@vx-ops02:~/nke-labs/k8s-guestbook/MetalLB-config$ kubectl 
get ipaddresspools.metallb.io -n metallb-system NAME AUTO ASSIGN AVOID BUGGY IPS ADDRESSES default true false [\u0026#34;192.168.102.10-192.168.102.19\u0026#34;] sc@vx-ops02:~/nke-labs/k8s-guestbook/MetalLB-config$ kubectl get l2advertisements.metallb.io -n metallb-system -o wide NAME IPADDRESSPOOLS IPADDRESSPOOL SELECTORS INTERFACES NODE SELECTORS default [\u0026#34;default\u0026#34;] Finally, we can deploy the Guestbook Frontend Pods and the (LoadBalancer) Service.\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl apply -f frontend-deployment.yaml -f frontend-service.yaml deployment.apps/frontend created service/frontend created You should now have all 3x tiers of K8s services running, with the Frontend service exposed to outside via MetalLB. In my case, I can see the Frontend LoadBalancer service is successfully deployed and has automatically obtained an external routable IP of 192.168.102.10.\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl get pods -n guestbook NAME READY STATUS RESTARTS AGE frontend-795b566649-kndwz 1/1 Running 0 3m4s frontend-795b566649-v96qh 1/1 Running 0 3m4s frontend-795b566649-z99gz 1/1 Running 0 3m4s redis-follower-5ffdf87b7d-9t9v5 1/1 Running 0 6m55s redis-leader-c767d6dbb-bbcxf 1/1 Running 0 17m sc@vx-ops02:~/nke-labs/k8s-guestbook$ sc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl get svc -n guestbook NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE frontend LoadBalancer 10.20.65.94 192.168.102.10 80:31807/TCP 3m10s redis-follower ClusterIP 10.20.23.235 \u0026lt;none\u0026gt; 6379/TCP 7m1s redis-leader ClusterIP 10.20.150.67 \u0026lt;none\u0026gt; 6379/TCP 17m Open a browser page and hit that LB IP address, and you should see the Guestbook application delivered from our NKE cluster! Leave some messages there and click submit, and the data will be stored in the Redis database.\nTo test the persistent storage, we can simply delete the demo app and re-create it. 
When the new Redis Pods are re-deployed they should mount the same PVs which preserve the data.\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl delete -f ./ deployment.apps \u0026#34;frontend\u0026#34; deleted service \u0026#34;frontend\u0026#34; deleted deployment.apps \u0026#34;redis-follower\u0026#34; deleted service \u0026#34;redis-follower\u0026#34; deleted deployment.apps \u0026#34;redis-leader\u0026#34; deleted service \u0026#34;redis-leader\u0026#34; deleted sc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl get all -n guestbook No resources found in guestbook namespace. sc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl apply -f ./ deployment.apps/frontend created service/frontend created deployment.apps/redis-follower created service/redis-follower created deployment.apps/redis-leader created service/redis-leader created When the K8s services are back online, open another browser page (use Incognito mode so there’s no cache) and hit the LB address again – the original message is still there, perfect!\nPART-2: deploy demo app using Multi-Access Storage Class (NUTANIX FILES) In the last example, we deployed the demo app using the NKE default storage class, supported by Nutanix CSI driver and using Nutanix Volume in the backend. However, we only deployed 1x Redis Follower Pod – but what if we want to expand to multiple Redis Follower instances to provide better scalability?\nIn that case, we would need multiple Pods to access and read/write data to the same PV. This particular PV access mode is referred to as ReadWriteMany, which is not supported by the default NKE storage class (Nutanix Volume). Instead, we’ll need to create a new NFS-type storage class, which can be provided by the Nutanix Files manager.\nTo do so, first we’ll need to deploy a Nutanix Filer Server. In Prism Central, go to Unified Storage \u0026gt; Files \u0026gt; Filer Server, then click New Filer Server. 
I will be using ver 4.4.0.3 in this case.\nProvide file server domain name, and follow the wizard to configure Internal/External subnets with IP addresses for the filer server fleet.\nBelow is a snapshot of the configuration of my filer servers. I have also configured the corresponding records on the AD/DNS including all 3x server IP addresses.\nOnce the filer server is provisioned, go to Configuration \u0026gt; Authentication to enable the NFS protocol. Since this will be consumed by NKE, for now we just leave it as “Unmanaged”.\nNow go to Shares \u0026amp; Exports to create an NFS share (for the NKE storage class). Provide a share name, specify size and select Protocol (NFS is pre-selected since we only enabled it). In General Settings page, leave the default (Enable Compression) and select next.\nHere we’ll configure NFS protocol access\nDefault Access – No Access, with Exception for the NKE subnet: Client – 192.168.102.0/24 Access – Read/Write Squashing – All Squashing Here’s a summary of the NFS share.\nOnce the NFS share is deployed, we are ready to create the storage class. Navigate to Cloud Infrastructure \u0026gt; Kubernetes Management \u0026gt; Clusters, and click into our NKE cluster.\nGo to Storage \u0026gt; Storage Classes to create a new storage class with the following settings:\nVolume Type – nutanix_files NFS Export Endpoint – fs.vxlan.co (Filer Server DNS name) NFS Export Path – /k8s (the NFS share provided by Filer Server) Now let’s check the storage class under our NKE cluster. 
You should see the new files-storageclass is automatically deployed into our cluster and is ready to be consumed!\nsc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ kubectl get storageclasses.storage.k8s.io NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE default-storageclass (default) csi.nutanix.com Retain Immediate true 3h45m files-storageclass csi.nutanix.com Delete Immediate false 37s sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ kubectl describe storageclasses.storage.k8s.io files-storageclass Name: files-storageclass IsDefaultClass: No Annotations: \u0026lt;none\u0026gt; Provisioner: csi.nutanix.com Parameters: nfsPath=/k8s,nfsServer=fs.vxlan.co,storageType=NutanixFiles AllowVolumeExpansion: \u0026lt;unset\u0026gt; MountOptions: \u0026lt;none\u0026gt; ReclaimPolicy: Delete VolumeBindingMode: Immediate Events: \u0026lt;none\u0026gt; To test this, we’ll first clean up our existing deployment, including the previous PVCs.\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl delete -f ./ deployment.apps \u0026#34;frontend\u0026#34; deleted service \u0026#34;frontend\u0026#34; deleted deployment.apps \u0026#34;redis-follower\u0026#34; deleted service \u0026#34;redis-follower\u0026#34; deleted deployment.apps \u0026#34;redis-leader\u0026#34; deleted service \u0026#34;redis-leader\u0026#34; deleted sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ kubectl delete -f ./ persistentvolumeclaim \u0026#34;redis-follower-claim\u0026#34; deleted persistentvolumeclaim \u0026#34;redis-leader-claim\u0026#34; deleted sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ Next, we’ll make the following changes to fully utilize the ReadWriteMany capabilities provided by the NFS-based file storage class.\nUpdate the Redis follower PVC yaml file, and change the storage class to “files-storageclass” and access mode to “ReadWriteMany“ Update the Redis follower deployment yaml file, and change the replicas to 3 sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ vim guestbook-follower-claim-nfs.yaml kind: 
PersistentVolumeClaim apiVersion: v1 metadata: namespace: guestbook name: redis-follower-claim spec: accessModes: - ReadWriteMany storageClassName: files-storageclass resources: requests: storage: 5Gi sc@vx-ops02:~/nke-labs/k8s-guestbook$ vim redis-follower-deployment.yaml # SOURCE: https://cloud.google.com/kubernetes-engine/docs/tutorials/guestbook apiVersion: apps/v1 kind: Deployment ... spec: replicas: 3 selector: matchLabels: app: redis ... Let’s create the PVCs\nsc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ kubectl apply -f guestbook-leader-claim.yaml -f guestbook-follower-claim-nfs.yaml persistentvolumeclaim/redis-leader-claim created persistentvolumeclaim/redis-follower-claim created sc@vx-ops02:~/nke-labs/k8s-guestbook/pvc$ kubectl get pvc -n guestbook NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE redis-follower-claim Bound pvc-9a99599b-bfb1-4851-9a21-8140e34b2528 5Gi RWX files-storageclass 91s redis-leader-claim Bound pvc-2665d13e-26ba-4df4-8ecb-2d4cc292305f 2Gi RWO default-storageclass 5s Now we have a new PVC created using the files-storageclass with the access mode of “RWX”\nWe are ready to deploy the demo app again\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl apply -f ./ deployment.apps/frontend created service/frontend created deployment.apps/redis-follower created service/redis-follower created deployment.apps/redis-leader created service/redis-leader created and notice this time there are 3x Redis Follower Pods deployed – all mounting the same PV which was provisioned by the files-storageclass powered by the Nutanix Files Manager!\nsc@vx-ops02:~/nke-labs/k8s-guestbook$ kubectl get pods -n guestbook NAME READY STATUS RESTARTS AGE frontend-795b566649-42b8d 1/1 Running 0 44s frontend-795b566649-m94nb 1/1 Running 0 44s frontend-795b566649-xlkn9 1/1 Running 0 44s redis-follower-5ffdf87b7d-d7gtc 1/1 Running 0 43s redis-follower-5ffdf87b7d-ldpqr 1/1 Running 0 43s redis-follower-5ffdf87b7d-thsld 1/1 Running 0 43s redis-leader-c767d6dbb-dmjpl 1/1 
Running 0 43s and our favorite Guestbook service is back online with improved database scalability!\nIn the next episode, we’ll dive deep into the NKE networking space, and explore how you can leverage the Calico CNI to deploy flexible K8s network \u0026amp; security policies. Stay tuned!\n","permalink":"https://route179.dev/2024/08/08/nke-lab-series-ep2-deploy-a-multi-tier-web-application-on-a-nke-enabled-kubernetes-cluster-using-nutanix-csi/","summary":"\u003cp\u003eThis is the 2nd episode of our NKE lab series. In the \u003ca href=\"/2024/08/08/nutanix-kubernetes-engine-nke-lab-series-ep1-create-a-nke-enabled-kubernetes-cluster-on-nutanix-community-edition-ce/\"\u003e1st episode\u003c/a\u003e, I have demonstrated how you can easily deploy an enterprise-grade NKE cluster in a Nutanix CE lab environment with nested virtualization.\u003c/p\u003e\n\u003cp\u003eIn this episode, we’ll deploy a containerized multi-tier web application onto our NKE cluster, by leveraging the built-in Nutanix CSI driver to provide persistent storage for the demo app.\u003c/p\u003e\n\u003cp\u003eSpecifically, we’ll explore 2x Nutanix CSI options:\u003c/p\u003e","title":"NKE lab series – Ep2: Deploy a multi-tier web application on a NKE cluster using persistent storage with Nutanix CSI"},{"content":"This blog is the 1st episode of a Nutanix Kubernetes Engine (NKE) home lab series. In this post, I will walk through the detailed process of deploying an enterprise-ready NKE-enabled Kubernetes cluster within a Nutanix CE environment.\nNutanix CE is a free version of Nutanix AOS, which powers the Nutanix Enterprise Cloud Platform. It is designed for people interested in test driving Nutanix platform features and capabilities in a non-production or PoC environment. Even better, Nutanix CE also works in a nested virtualization deployment on top of ESXi/vSphere. 
This makes it perfect for hands-on testing or exploring in a safe environment such as home-lab, which is exactly what I’m running here!\npre-requisites a vSphere 7/8 environment with at least 64GB RAM and 1TB storage (preferably SSD) a 1-node or 3-node Nutanix CE 2.0 cluster deployed in nested virtualization depending on your lab compute capacity, as documented here and here Prism Central installed and connected to your CE cluster (I’m running pc.2023.4.0.2) a lab network environment that supports VLAN tagging and provides basic infra services such as AD, DNS, NTP etc (these are required when installing the CE cluster) a Linux/Mac workstation for managing the Kubernetes cluster, with Kubectl installed Since this is in nested virtualization and the NKE cluster will be running on a separate \u0026amp; dedicated VLAN, you need to apply the following vDS/vSS port-group configuration to the CE VM vNICs. VLAN ID: All (4095), Security – Promiscuous mode (Accept) Security – Mac address changes (Accept) Security – Forged transmits (Accept) Lab Steps Step-1: Prepare the CE cluster environment Before we start, we’ll need to prepare our lab CE cluster and apply all the software/firmware patches and updates. To do so, simply go to Life Cycle Manager (LCM) in Prism Element, or Platform Services \u0026gt; Admin Center \u0026gt; LCM in Prism Central to perform an inventory discovery, and then follow the Update wizard to complete all software updates. Below is a snapshot of all software versions I’m running after the update process completes.\nNext, log back into Prism Central and navigate to Platform Services \u0026gt; Apps \u0026amp; Marketplace to Enable Marketplace.\nThis process will take a few minutes to complete, after which you’ll be able to select and install the NKE package. 
Below is what I have enabled on my CE environment, including some additional packages such as Foundation Central and Nutanix Files Manager, which will be used for CSI integrations in Episode 2 (stay tuned!).\nNote: By default the NKE version installed is only v2.2.3, which is badly outdated, so you’ll need to run another LCM inventory check and update it to the latest version available (v2.10.1 in my case as shown above)\nNow go to the new NKE menu under Cloud Infrastructure \u0026gt; Kubernetes Management and download the latest NKE node OS and the Kubernetes version images you need.\nThese are all the NKE node OS with Kubernetes images available to me at this point in time.\nBefore we can deploy the NKE cluster, the last thing is to prepare a dedicated VLAN/subnet. Go to Infrastructure \u0026gt; Network \u0026amp; Security \u0026gt; Subnet to create a subnet (VLAN 102 in my case).\nNOTE: It is very important that this subnet has DHCP or IPAM configured, as NKE needs it to automatically assign IP addresses to K8s nodes during cluster deployment. Since I’m using a Juniper switch to provide the DHCP service, I have left IPAM unchecked here.\nSTEP-2: deploy a NKE-enabled K8s cluster Time to deploy the NKE cluster. Navigate to Cloud Infrastructure \u0026gt; Kubernetes Management \u0026gt; Clusters and click Create Kubernetes Cluster.\nFor the purpose of this demo (and to save precious lab resources), we’ll deploy a Development Cluster.\nProvide a cluster name, and select preferred Node OS image and Kubernetes versions.\nSpecify the NKE cluster subnet we prepared earlier, and the number of worker nodes.\nNext, select a CNI provider and specify the Kubernetes Pod and Service CIDR ranges. At the moment there are only 2x CNI options: Calico and Flannel. 
We’ll go with Calico as it is widely deployed in production.\nNext, configure a default storage class, backed by the platform’s built-in CSI driver, to provide persistent storage for the NKE cluster. We’ll dive deeper on this in Ep2 to explore more features and capabilities provided by the NKE platform. You can also enable Flash Mode to ensure the Persistent Volumes (PVs) consumed by Pods are placed on SSDs.\nNow go ahead and deploy the NKE cluster. Depending on your CE cluster capacity, the process might take 10~20 min to complete. (Note: if you get an etcd deployment error, double-check the NKE subnet DHCP/IPAM configuration.)\nSTEP-3: ACCESS the NKE cluster With a bit of luck you should see a Kubernetes cluster up and running in about 20 min. In my case, NKE has deployed a total of 4x K8s VM nodes, including:\n1x Etcd node (192.168.102.101) – a key-value database that manages and holds the configuration data, state data, and metadata for the Kubernetes cluster 1x K8s master node (192.168.102.102) – provides the K8s API endpoint for cluster management \u0026amp; orchestration 2x K8s worker nodes (192.168.102.103 \u0026amp; 104) To access our K8s cluster, click “Download Kubeconfig” and save it to your workstation (Mac/Linux) where the kubectl tool is installed.\nCopy the kubeconfig file to path .kube/config, and you should now have access to the cluster.\nsc@vx-ops02:~$ cp nke-dev01-kubectl.cfg .kube/config sc@vx-ops02:~$ sc@vx-ops02:~$ kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME nke-dev01-89e792-master-0 Ready control-plane,master 9h v1.28.5 192.168.102.102 \u0026lt;none\u0026gt; CentOS Linux 7 (Core) 3.10.0-1160.108.1.el7.x86_64 containerd://1.6.16 nke-dev01-89e792-worker-0 Ready node 9h v1.28.5 192.168.102.103 \u0026lt;none\u0026gt; CentOS Linux 7 (Core) 3.10.0-1160.108.1.el7.x86_64 containerd://1.6.16 nke-dev01-89e792-worker-1 Ready node 9h v1.28.5 192.168.102.104 
\u0026lt;none\u0026gt; CentOS Linux 7 (Core) 3.10.0-1160.108.1.el7.x86_64 containerd://1.6.16 Take a look at the kube-system namespace, as expected the Calico CNI provider is installed and ready to provide Kubernetes networking services. We’ll dive deeper into CNI in a later episode.\nsc@vx-ops02:~$ kubectl get pods -n kube-system NAME READY STATUS RESTARTS AGE calico-kube-controllers-5cd67d7657-52km8 1/1 Running 0 9h calico-node-4z7kd 1/1 Running 0 9h calico-node-vd77x 1/1 Running 0 9h calico-node-z7sc4 1/1 Running 0 9h calico-typha-6cfbdf945c-ll5p2 1/1 Running 0 9h coredns-6fb596b5df-4kkrc 1/1 Running 0 9h coredns-6fb596b5df-p7mhq 1/1 Running 0 9h kube-apiserver-nke-dev01-89e792-master-0 3/3 Running 0 9h kube-proxy-ds-8k4x5 1/1 Running 0 9h kube-proxy-ds-b7wq7 1/1 Running 0 9h kube-proxy-ds-ljf2p 1/1 Running 0 9h sc@vx-ops02:~$ and in the ntnx-system namespace, we have a bunch of other plugins pre-installed such as FluentBit and Prometheus adapter to provide out-of-the-box logging, monitoring and observability services.\nsc@vx-ops02:~$ kubectl get pods -n ntnx-system NAME READY STATUS RESTARTS AGE alertmanager-main-0 2/2 Running 1 (9h ago) 9h blackbox-exporter-5458d77cfb-d62kc 3/3 Running 0 9h csi-snapshot-controller-7d68bf5bd7-6fg9c 1/1 Running 0 9h csi-snapshot-webhook-756b45fb5c-t9k8k 1/1 Running 0 9h fluent-bit-2qwsg 1/1 Running 0 9h fluent-bit-t9d7c 1/1 Running 0 9h fluent-bit-v8gdt 1/1 Running 0 9h kube-state-metrics-54c97cdfdd-rkm2m 3/3 Running 0 9h kubernetes-events-printer-6f44868d47-5sg98 1/1 Running 0 9h node-exporter-gw2ms 2/2 Running 0 9h node-exporter-pvw25 2/2 Running 0 9h node-exporter-vmdvk 2/2 Running 0 9h nutanix-csi-controller-768695cfcf-xpgx9 5/5 Running 1 (9h ago) 9h nutanix-csi-node-2mrnm 3/3 Running 1 (9h ago) 9h nutanix-csi-node-q6plw 3/3 Running 1 (9h ago) 9h prometheus-adapter-678c647d87-rbkqc 1/1 Running 0 9h prometheus-k8s-0 2/2 Running 0 9h prometheus-operator-f57b8d9cb-kpwp9 2/2 Running 0 9h More importantly, we can see the Nutanix 
CSI controller is also deployed for us. If we check the storage classes, we can see the default storage class is now ready to be consumed and the persistent storage provider is csi.nutanix.com, perfect!\nsc@vx-ops02:~$ kubectl get storageclasses.storage.k8s.io NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE default-storageclass (default) csi.nutanix.com Retain Immediate true 9h In the next episode, we’ll deploy a demo app onto our NKE cluster, by using the built-in CSI driver to provide persistent storage for the data tier. We’ll also explore other NKE CSI capabilities and options including native integration with the Nutanix Files server. Stay tuned!\n","permalink":"https://route179.dev/2024/08/08/nutanix-kubernetes-engine-nke-lab-series-ep1-create-a-nke-enabled-kubernetes-cluster-on-nutanix-community-edition-ce/","summary":"\u003cp\u003eThis blog is the 1st episode of a Nutanix Kubernetes Engine (NKE) home lab series. In this post, I will walk through the detailed process of deploying an enterprise-ready NKE-enabled Kubernetes cluster within a \u003ca href=\"https://www.nutanix.com/au/products/community-edition\"\u003eNutanix CE\u003c/a\u003e environment.\u003c/p\u003e\n\u003cp\u003eNutanix CE is a free version of Nutanix AOS, which powers the Nutanix Enterprise Cloud Platform. It is designed for people interested in test driving Nutanix platform features and capabilities in a non-production or PoC environment. Even better, Nutanix CE also works in a nested virtualization deployment on top of ESXi/vSphere. 
This makes it perfect for hands-on testing or exploring in a safe environment such as home-lab, which is exactly what I’m running here!\u003c/p\u003e","title":"Nutanix Kubernetes Engine (NKE) lab series – Ep1: Create a NKE-enabled Kubernetes Cluster on Nutanix Community Edition (CE)"},{"content":"With the recent release of VMware Cloud on AWS SDDC version 1.18, we have introduced a ton of advanced networking capabilities which opened up possibilities for many new interesting use cases. Customers can now utilise the NSX Manager UI (or VMC Policy API) to configure route aggregation at each SDDC level, and this provides an efficient way to solve the 100 DX route limit. Customers can also create additional Tier-1 Compute Gateways (Multi-CGWs) with static route injection capabilities to address different requirements such as network multi-tenancy, overlapping IPv4 environments and integrating with 3rd-party network \u0026amp; security appliances etc. You can read more details about the new features here.\nFor this article we will focus on the use case of integrating 3rd-party load balancers into VMware Cloud on AWS. Specifically we will look at how to deploy and integrate a HA pair of F5 BIG-IP Local Traffic Manager (LTM) Virtual Edition (VE) into a SDDC cluster.\nWe will utilise the Route Aggregation and Multi-CGW features to create an inline load balancing topology and integrate with F5 LTMs within the lab SDDC cluster. 
Traffic from external clients towards the web servers will be routed through the F5 and the client source addresses are preserved (no SNAT is required, and there is no need to configure XFF on the web servers)\nprerequisites Deploy a VMware Cloud on AWS SDDC cluster (ver 1.18+) Access to F5 BIG-IP LTM VE (I’m using v16.1.2, a 30-day trial available here) Access to an AWS account that is linked to the SDDC (so you can test connectivity via the connected VPC or VMware Transit Connect) Deploy 2x web servers in SDDC for the LTM load balancing pool Lab Procedures I won’t cover every detailed step but at a high level we’ll need to perform the following tasks:\nconfigure SDDC route aggregation in NSX manager (so that Multi-CGW segment routes are advertised externally) create 3x Tier-1 CGWs as per the below lab topology (1x routed CGW-LB-F5 for F5 Outside interfaces, 1x isolated CGW-LB-WEB for F5 Inside interfaces and the web segment, and 1x isolated CGW-LB-HA for F5 HA communications) create relevant network segments and attach them to the above 3x CGWs accordingly configure static routes at the CGW-LB-F5 and CGW-LB-WEB for ingress and egress transit routing deploy the F5 LTM HA pair and configure network settings configure LTM load balancing settings (Nodes, Pool, VIP) and run tests F5 Integration Lab Topology\nSTEP-1 To begin, we will first configure SDDC route aggregation in the NSX Manager UI. 
This will leverage an AWS managed prefix-list to announce summarised routes externally, so the Multi-CGW segments are accessible from the connected VPC and Intranet (Direct Connect or VMware Transit Connect).\nWithin the NSX Manager UI, locate Networking \u0026gt; Global Configurations \u0026gt; Route Aggregation and create an aggregation prefix-list to summarise the SDDC CIDR block (172.30.0.0/16 in my case)\nThen create a route configuration to announce the prefix-list to the INTRANET endpoint — since I’m using the VMware Transit Connect for my SDDC external connectivity, the summarised routes will be advertised to the VTGW.\nBack at the VMC console we can verify the summarised route (172.30.0.0/16) is being advertised at the SDDC under Networking \u0026amp; Security \u0026gt; Transit Connect \u0026gt; Advertised Routes. Note the SDDC mgmt route (173.30.0.0/23) will not be summarised and will always be advertised explicitly.\nSTEP-2 Go to the NSX Manager again and create 3x Tier-1 CGWs as per the lab topology. Note we will need to select “routed” type for the CGW-LB-F5 in order to inject a static route towards F5 for the web server segment, and “isolated” type is required for the CGW-LB-WEB in order to inject a default route (0.0.0.0/0) towards the F5.\nSTEP-3 Next, configure the below network segments as per the lab topology and attach them to the 3x CGWs accordingly. Note the VM-MGMT-NET01 is created at the default CGW and this is to host the F5 LTM management interfaces, which use a separate management route table.\nSTEP-4 Additionally, configure the CGW-LB-F5 to add a static route (for LB-F5-WEB01 segment) towards the F5 — the next-hop will be the Outside interface floating IP (172.30.100.10) between the LTM HA pair.\nSimilarly, configure the CGW-LB-WEB to add a default route towards the F5 — the next-hop will be the Inside interface floating IP (172.30.100.100) between the LTM HA pair.\nSTEP-5 We are now ready to deploy and configure the F5 LTM VE appliances. 
For the purpose of the demo I will only show the key network configurations of the LTM01.\nOnce the appliances are deployed and system has been initialised, go to each LTM management UI to configure the local network settings. First, create the data VLANs for each interface under Network \u0026gt; VLANs — notice here all VLANs are internal to F5 only and must be untagged at each interface, as VLAN trunking to a guest VM is not supported by VMware Cloud on AWS at this stage.\nNext, configure the local interface IP addresses under Network \u0026gt; Self-IPs\nAlso add the static routes including default route under Network \u0026gt; Routes\nAt this stage, you are ready to add the peer device and create a HA failover device group. Once the device group is created and the HA pair is in sync, you can now create additional HA floating IP addresses for both the Inside and Outside interfaces.\nNote here for the floating IPs you’ll need to apply a floating traffic group (I’m using the default traffic-group-1).\nSTEP-6 Finally we are ready to configure the load balancing settings at the F5 LTM HA pair for the workloads deployed in SDDC. 
For this lab I have deployed two simple Linux VMs with Apache web servers (172.30.101.11 \u0026amp; 172.30.101.12)\nFirst, create 2x nodes for the web servers under Local Traffic \u0026gt; Nodes:\nSecond, create a LB pool at Local Traffic \u0026gt; Pools with the 2x nodes and select appropriate Health Monitor and Load Balancing Method.\nLastly, go to Local Traffic \u0026gt; Virtual Servers and deploy a HTTP VIP for the web service using the LB pool we have just created.\nAssuming everything is configured correctly you should see the VIP coming online straight away, and you can also verify the service status at Local Traffic \u0026gt; Network Map:\nNow hit the VIP address in your browser and you should see traffic is being load balanced between the two nodes (since we selected the basic Round Robin LB method).\nand because the F5s are deployed in inline (routed) mode without SNAT, the web servers are able to see the original source IPs from the clients.\n","permalink":"https://route179.dev/2022/05/02/integrate-f5-load-balancers-into-vmware-cloud-on-aws-sddc-environment/","summary":"\u003cp\u003eWith the recent release of \u003ca href=\"https://docs.vmware.com/en/VMware-Cloud-on-AWS/0/rn/vmc-on-aws-relnotes.html#wn04052022\"\u003e\u003cstrong\u003eVMware Cloud on AWS SDDC version 1.18\u003c/strong\u003e\u003c/a\u003e, we have introduced a ton of advanced networking capabilities which opened up possibilities for many new interesting use cases. Customers can now utilise the NSX Manager UI (or VMC Policy API) to configure route aggregation at each SDDC level, and this provides an efficient way to solve the \u003ca href=\"https://kb.vmware.com/s/article/78931\"\u003e100 DX route limit\u003c/a\u003e. 
Customers can also create additional Tier-1 Compute Gateways (Multi-CGWs) with static route injection capabilities to address different requirements such as network multi-tenancy, overlapping IPv4 environments and integrating with 3rd-party network \u0026amp; security appliances etc. You can read more details about the new features \u003ca href=\"https://blogs.vmware.com/cloud/2022/04/06/vmware-cloud-on-aws-advanced-networking-and-routing-features/\"\u003ehere\u003c/a\u003e.\u003c/p\u003e","title":"Integrate F5 Load Balancers into VMware Cloud on AWS SDDC Environment"},{"content":"With the recently announced Amazon FSx for NetApp ONTAP, it is very exciting that for the first time we have a fully managed ONTAP file system in the cloud! What’s more interesting about this service is that we can now deliver high-performance block storage to the workloads running on VMware Cloud on AWS (VMC) through a first-party Amazon managed service!\nIn this post I will walk you through a simple example for provisioning and integrating iSCSI-based block storage to a Windows workload running in a VMC environment using Amazon FSx for NetApp ONTAP. For this demo I’ve provisioned the FSx service in a shared service VPC, which is connected to the VMC SDDC cluster through an AWS Transit Gateway (TGW) via VPN attachment (as per below diagram).\nDepending on your environment or requirements, you can also leverage a VMware Transit Connect (or VTGW) to provide high speed VPC connections between the shared service VPC and VMC, or simply provision the FSx service in the connected VPC so no TGW/VTGW is required.\nAWS Configuration To begin, I simply go to the AWS console, select FSx in the service category and provision an Amazon FSx for NetApp ONTAP service in my preferred region. 
As a quick summary I have used the below settings:\nSSD storage capacity 1024GB (min 1024GB, max 192TB) sustained throughput capacity 512MB/s Multi-AZ (ontap cluster) deployment 1x storage virtual machine (svm01) to provide iSCSI service 1x default volume (/vol01) of 200GB to host the iSCSI LUNs storage efficiency (deduplication/compression etc): enabled capacity pool tiering policy: enabled After a wait of around 20 min, the FSx ONTAP file system will be provisioned and ready for service. If you are using the above settings you should see a summary page similar to the one below. You can also retrieve the management endpoint IP address under the “Network \u0026amp; Security” tab.\nNote the management addresses (for both the cluster and SVMs) are automatically allocated from within a 198.19.0.0/16 range, and the same address block is going to provide the floating IP for NFS/SMB service (so customers don’t have to change the file share mount point address during an ONTAP cluster failover). Since this subnet is not natively deployed in a VPC, AWS will automatically inject the endpoint addresses (for management and NFS/SMB) into the specific VPC route tables based on your configurations.\nHowever, you’ll need to specifically inject a static route for this (see below) on TGW/VTGW, especially if you are planning to provide NFS/SMB services to the VMC SDDCs over peering connections — see here for more details.\nConversely, this static route is not required if you are only using iSCSI services as the iSCSI endpoints are provisioned directly onto the native subnets hosting the FSx service and are not using the floating IP range — more on this later.\nNext, we’ll verify the SVM (svm01) and Volume (vol01) status and make sure they are all online and healthy before we can provision iSCSI LUNs. 
Note: you’ll always see a separate root volume (automatically) created for each SVM.\nNow click the “svm01” to dive into the details, and you’ll find the iSCSI endpoint IP addresses (again they are in the native VPC subnets, not the mgmt floating IP range)\nONTAP CLI CONFIGURATION We are now ready to move onto the iSCSI LUN provisioning. This can be done by using either the ONTAP API or the ONTAP CLI, which is what I’m using here. First, we’ll SSH into the cluster management IP and verify the SVM and volume status.\nSince this is a fully managed service, the iSCSI service has already been activated on the SVM and the cluster is listening for iSCSI sessions on the 2x subnets across both AZs. You’ll also find the iSCSI target name here.\nNow we’ll create a 20GB LUN for the Windows client running on VMC.\nNext, create an igroup to include the Windows client iSCSI initiator. Notice the ALUA feature is enabled by default — this is pretty cool as we can test iSCSI MPIO as well 🙂\nFinally, map the igroup to the LUN we have just created, make sure the LUN is now in “mapped” status and we are all done here!\nWindows Client Setup On the Windows client (running on the VMC), launch the iSCSI initiator configuration and put the iSCSI IP address of one of the FSx subnets in “Quick Connect”; Windows will automatically discover the available targets on the FSx side and log into the fabric.\nOptionally, you could add a secondary storage I/O path if MPIO is installed/enabled on the Windows client. In my example here, I have added a second iSCSI session by using another iSCSI endpoint address in a different FSx subnet/AZ.\nNow click “Auto Configure” under “Volumes and Devices” to discover and configure the iSCSI LUN device.\nNext, go to “Computer Management” then “Disk Management” —\u0026gt; you should see a new 20GB disk has been automatically discovered (or manually refresh the hardware list if you can’t see the new disk yet). 
Initialise and format the disk.\nThe new 20GB disk is now ready to use. In the disk properties, you can verify the 2x iSCSI I/O paths as per below, and you can also change the MPIO policy based on your own requirements.\n","permalink":"https://route179.dev/2021/09/28/provision-and-integrate-iscsi-storage-with-vmware-cloud-on-aws-using-amazon-fsx-for-netapp-ontap/","summary":"\u003cp\u003eWith the \u003ca href=\"https://aws.amazon.com/blogs/aws/new-amazon-fsx-for-netapp-ontap/\"\u003erecently announced\u003c/a\u003e \u003cstrong\u003eAmazon FSx for NetApp ONTAP\u003c/strong\u003e, it is very exciting that for the first time we have a fully managed ONTAP file system in the cloud! What’s more interesting about this service is that we can now deliver high-performance block storage to the workloads running on VMware Cloud on AWS (VMC) through a first-party Amazon managed service!\u003c/p\u003e\n\u003cp\u003eIn this post I will walk you through a simple example for provisioning and integrating iSCSI-based block storage to a Windows workload running in a VMC environment using Amazon FSx for NetApp ONTAP. For this demo I’ve provisioned the FSx service in a shared service VPC, which is connected to the VMC SDDC cluster through an AWS Transit Gateway (TGW) via VPN attachment (as per below diagram).\u003c/p\u003e","title":"Provision and integrate iSCSI storage with VMware Cloud on AWS using Amazon FSx for NetApp ONTAP"},{"content":"With the latest “Transit VPC” feature in the VMware Cloud on AWS (VMC) 1.12 release, you can now inject static routes in the VMware managed Transit Gateway (or VTGW) to forward SDDC egress traffic to a 3rd-party firewall appliance for security inspection. 
The firewall appliance is deployed in a Security/Transit VPC to provide transit routing and policy enforcement between SDDCs and workload VPCs, on-premises data center and the Internet.\nImportant Notes:\nFor this lab, I’m using a Palo Alto VM-Series Next-Generation Firewall Bundle 2 AMI – refer to here and here for detailed deployment instructions “Source/Destination Check” must be disabled on all ENIs attached to the firewall For Internet access, SNAT must be configured on the firewall appliance to maintain route symmetry Similarly, inbound access from the Internet to a server within VMC requires DNAT on the firewall appliance Lab Topology:\nSDDC Group – Adding static (default) route After deploying the SDDC and SDDC Group, link your AWS account here\nAfter a while, the VTGW will show up in the Resource Access Manager (RAM) within your account. Accept the shared VTGW and then create a VPC attachment to connect your Security/Transit VPC to the VTGW.\nOnce done, add a static default route at SDDC Group to point to the VTGW-SecVPC attachment.\nThe default route should appear soon under your SDDC (Network \u0026amp; Security \u0026gt; Transit Connect). Also notice we are advertising the local SDDC segments including the management subnets\nAWS SETUP Also we need to update the route table for each of the 3x firewall subnets\nRoute Table for the AWS native side subnet-01 (Trust Zone):\nRoute Table for the SDDC side subnet-02 (Untrust Zone):\nRoute Table for the public side subnet-03 (Internet Zone):\nRoute Table for the customer managed TGW:\nPalo FW Configuration Palo Alto firewall interface configuration\nVirtual Router config:\nSecurity Zones\nNAT Config\nOutbound SNAT to Internet Inbound DNAT to Server01 in SDDC01 Testing FW rules\nTesting Results “untrust” -\u0026gt; “trust” deny “trust” -\u0026gt; “untrust” allow “untrust” -\u0026gt; “Internet” allow “trust” -\u0026gt; “Internet” allow 
","permalink":"https://route179.dev/2021/07/15/integrating-a-3rd-party-firewall-appliance-with-vmware-cloud-on-aws-by-leveraging-a-security-transit-vpc/","summary":"\u003cp\u003eWith the latest \u003ca href=\"https://docs.vmware.com/en/VMware-Cloud-on-AWS/0/rn/vmc-on-aws-relnotes.html#wn09615020\"\u003e“Transit VPC” feature\u003c/a\u003e in the VMware Cloud on AWS (VMC) 1.12 release, you can now inject static routes in the VMware managed Transit Gateway (or VTGW) to forward SDDC egress traffic to a 3rd-party firewall appliance for security inspection. The firewall appliance is deployed in a Security/Transit VPC to provide transit routing and policy enforcement between SDDCs and workload VPCs, on-premises data center and the Internet.\u003c/p\u003e\n\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/2021/07/15/integrating-a-3rd-party-firewall-appliance-with-vmware-cloud-on-aws-by-leveraging-a-security-transit-vpc/screen-shot-2021-07-15-at-3.17.38-pm.png\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eImportant Notes:\u003c/strong\u003e\u003c/p\u003e","title":"Integrating a 3rd-party firewall appliance with VMware Cloud on AWS by leveraging a Security/Transit VPC"},{"content":"I’ve always wanted to find a lightweight VM template for running on nested vSphere lab environment, or sometimes for demonstrating live cloud migration such as vMotion to the VMware Cloud on AWS. Recently I have managed to achieve this by using the Tiny Core Linux distribution and it ticked all of my requirements:\nultra lightweight – the VM runs stable with only 1 vCPU, 256MB RAM and 64MB hard disk! common linux tools installed – such as curl, wget, openssh etc open-vm-tools installed a lightweight http server serving a static site for running networking or load-balancing tests In this post I will walk you through the process for creating a Tiny Core based Linux VM template including all of the above requirements. To begin, download the Tiny Core ISO from here. 
(For reference, I’m using the CorePlus-v11.1 release as I was getting some weird issues with OpenSSH on the latest v12.0 release)\nBelow are the settings I’ve used for my VM template:\nVM hardware version 11 – compatible with ESXi 6.0 and later Guest OS = Linux \\ Other 3.x Linux (32-bit) Memory = 256MB (this is the lowest I could go for getting a stable machine) Hard Disk = 64MB – change drive type to IDE and set the virtual device node to IDE0:0 CDROM – change the virtual device node to IDE1:0 SCSI controller – remove this as it’s not required Also, you should use the below minimal settings for installing the Tiny Core OS. For detailed installation instructions, you can follow the step-by-step guide here:\nOnce the OS has been installed and you are in the shell, create the script below to configure static IP settings for eth0 (and disable DHCP if required).\ntc@box:~$ cat /opt/interfaces.sh #!/bin/sh # If you are booting Tiny Core from a very fast storage such as SSD / NVMe Drive and getting # \u0026#34;ifconfig: SIOCSIFADDR: No such Device\u0026#34; or \u0026#34;route: SIOCADDRT: Network is unreachable\u0026#34; # error during system boot, use this sleep statement, otherwise you can remove it - sleep .2 # kill dhcp client for eth0 sleep 1 if [ -f /var/run/udhcpc.eth0.pid ]; then kill `cat /var/run/udhcpc.eth0.pid` sleep 0.5 fi # configure interface eth0 ifconfig eth0 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255 up route add default gw 192.168.0.254 echo nameserver 192.168.0.254 \u0026gt;\u0026gt; /etc/resolv.conf tc@box:~$ sudo chmod 777 /opt/interfaces.sh tc@box:~$ sudo /opt/interfaces.sh You may also want to reset the password for the default user “tc” (this can be used later for SSH access), and reset the root password as well:\ntc@box:~$ passwd Changing password for tc ... tc@box:~$ sudo su root@box:/home/tc# passwd Changing password for root ... 
Now install all the required packages and extensions, and your onboot package list should look like below:\ntce-load -wi pcre.tcz curl.tcz wget.tcz open-vm-tools.tcz openssh.tcz busybox-httpd.tcz tc@box:~$ cat /etc/sysconfig/tcedir/onboot.lst pcre.tcz curl.tcz wget.tcz open-vm-tools.tcz openssh.tcz busybox-httpd.tcz Now configure and enable the SSH server — you can use user “tc” for a quick SSH test:\ncd /usr/local/etc/ssh sudo cp ssh_config.orig ssh_config sudo cp sshd_config.orig sshd_config sudo /usr/local/etc/init.d/openssh start Next, we’ll need to save all the settings and make them persistent across reboots, especially we’ll need to add the open-vm-tools and openssh onto the startup script (bootlocal.sh) — otherwise none of these services would be started after a reboot.\nsudo su echo \u0026#39;/opt/interfaces.sh\u0026#39; \u0026gt;\u0026gt; /opt/.filetool.lst echo \u0026#39;/usr/local/etc/ssh\u0026#39; \u0026gt;\u0026gt; /opt/.filetool.lst echo \u0026#39;/etc/shadow\u0026#39; \u0026gt;\u0026gt; /opt/.filetool.lst echo \u0026#39;/opt/interfaces.sh\u0026#39; \u0026gt;\u0026gt; /opt/bootlocal.sh echo \u0026#39;/usr/local/etc/init.d/open-vm-tools start \u0026amp;\u0026gt; /dev/null\u0026#39; \u0026gt;\u0026gt; /opt/bootlocal.sh echo \u0026#39;/usr/local/etc/init.d/openssh start \u0026amp;\u0026gt; /dev/null\u0026#39; \u0026gt;\u0026gt; /opt/bootlocal.sh and most importantly, use the below command to backup all the config!\ntc@box:~$ filetool.sh -b Backing up files to /mnt/sda1/tce/mydata.tgz The last one is for my own specific requirement — you can use the below script to setup a lightweight http server so it can be used for networking or load-balancing related tests.\ntc@box:~$ sudo vi /opt/httpd.sh sudo /usr/local/httpd/bin/busybox httpd -p 80 -h /usr/local/httpd/bin/ sleep .5 sudo touch /usr/local/httpd/bin/index.html sudo chmod 666 /usr/local/httpd/bin/index.html echo \u0026#34;this page is served by\u0026#34; \u0026gt;\u0026gt; 
/usr/local/httpd/bin/index.html ifconfig eth0 | grep -i mask | awk \u0026#39;{print $2}\u0026#39;| cut -f2 -d: \u0026gt;\u0026gt; /usr/local/httpd/bin/index.html tc@box:~$ sudo chmod 777 /opt/httpd.sh tc@box:~$ sudo echo \u0026#39;/opt/httpd.sh\u0026#39; \u0026gt;\u0026gt; /opt/bootlocal.sh tc@box:~$ filetool.sh -b Now you can go ahead and safely reboot the VM, and once it comes back online you should be able to SSH into it. Also, the open-vm-tools service should be automatically started and you can see the correct IP address and VMware Tools version reported in vCenter.\nIn addition, you should be able to see a static page like below by browsing to the VM address — the script (httpd.sh) should report back the VM’s IP address, which could be handy for running LB-related tests.\n","permalink":"https://route179.dev/2021/02/21/create-a-tiny-core-linux-vm-template-for-vsphere-lab-environment/","summary":"\u003cp\u003eI’ve always wanted to find a lightweight VM template for running on nested vSphere lab environment, or sometimes for demonstrating live cloud migration such as vMotion to the VMware Cloud on AWS. Recently I have managed to achieve this by using the \u003ca href=\"http://tinycorelinux.net/\"\u003eTiny Core Linux distribution\u003c/a\u003e and it ticked all of my requirements:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eultra lightweight – the VM runs stable with only 1 vCPU, 256MB RAM and 64MB hard disk!\u003c/li\u003e\n\u003cli\u003ecommon linux tools installed – such as curl, wget, openssh etc\u003c/li\u003e\n\u003cli\u003eopen-vm-tools installed\u003c/li\u003e\n\u003cli\u003ea lightweight http server serving a static site for running networking or load-balancing tests\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eIn this post I will walk you through the process for creating a Tiny Core based Linux VM template including all of the above requirements. 
To begin, download the Tiny Core ISO from \u003ca href=\"http://tinycorelinux.net/downloads.html\"\u003ehere\u003c/a\u003e. (For reference, I’m using the \u003cstrong\u003e\u003ca href=\"http://tinycorelinux.net/11.x/x86/archive/11.1/CorePlus-11.1.iso\"\u003eCorePlus-v11.1\u003c/a\u003e\u003c/strong\u003e release as I was getting some weird issues with OpenSSH on the latest v12.0 release)\u003c/p\u003e","title":"Create a Tiny Core Linux VM Template for vSphere Lab environment"},{"content":"Recently I have tried out the Terraform NSX-T Provider and it worked like a charm. In this post, I will demonstrate a simple example on how to leverage Terraform to provision a basic NSX tenant network environment, which includes the following:\ncreate a Tier-1 router create (linked) routed ports on the new T1 router and the existing upstream T0 router link the T1 router to the upstream T0 router create three logical switches with three logical ports create three downlink LIFs (with subnets/gateway defined) on the T1 router, and link each of them to the logical switch ports accordingly Once the tenant environment is provisioned by Terraform, the 3x tenant subnets will be automatically published to the T0 router and propagated to the rest of the network (if BGP is enabled), and we should be able to reach the individual LIF addresses. 
Below is a sample topology deployed in my lab (here I’m using pre-provisioned static routes between the T0 and upstream network for simplicity).\nSoftware Versions Used \u0026amp; Verified\nTerraform – v0.12.25 NSX-T Provider – v3.0.1 (auto downloaded by Terraform) NSX-T Data Center – v3.0.2 (build 0.0.16887200) Sample Terraform Script\nYou can find the sample Terraform script at my Git repo here — remember to update the variables based on your own environment.\nnsx_manager = \u0026#34;192.168.100.125\u0026#34; nsx_username = \u0026#34;admin\u0026#34; nsx_password = \u0026#34;xxxxxx\u0026#34; nsxt_t1_rt_name = \u0026#34;dev-demo-t1-rtr\u0026#34; ls1_name = \u0026#34;ls-dev-demo-web\u0026#34; ls2_name = \u0026#34;ls-dev-demo-app\u0026#34; ls3_name = \u0026#34;ls-dev-demo-db\u0026#34; ls1_gw = \u0026#34;172.31.101.1/24\u0026#34; ls2_gw = \u0026#34;172.31.102.1/24\u0026#34; ls3_gw = \u0026#34;172.31.103.1/24\u0026#34; Run the Terraform script and this should take less than a minute to complete.\nWe can review and verify that the required NSX components were built successfully via the NSX manager UI. Note: you’ll need to switch to “Manager mode” to see the newly created elements (T1 router, logical switches etc), as Terraform was interacting with the NSX management plane (via MP-API) directly.\nIn addition, we can also check and confirm the 3x tenant subnets are published via T1 to T0 by SSHing into the active edge node. Make sure you connect to the correct VRF table for the T0 service router (SR) in order to see the full route table — here we can see the 3x /24 subnets are indeed advertised from T1 to T0 as directly connected (t1c) routes.\nAs expected, I can reach each of the three LIFs on the T1 router from the lab terminal VM.\n","permalink":"https://route179.dev/2020/10/02/nsx-t-automation-with-terraform/","summary":"\u003cp\u003eRecently I have tried out the Terraform NSX-T Provider and it worked like a charm. 
In this post, I will demonstrate a simple example on how to leverage Terraform to provision a basic NSX tenant network environment, which includes the following:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003ecreate a Tier-1 router\u003c/li\u003e\n\u003cli\u003ecreate (linked) routed ports on the new T1 router and the existing upstream T0 router\u003c/li\u003e\n\u003cli\u003elink the T1 router to the upstream T0 router\u003c/li\u003e\n\u003cli\u003ecreate three logical switches with three logical ports\u003c/li\u003e\n\u003cli\u003ecreate three downlink LIFs (with subnets/gateway defined) on the T1 router, and link each of them to the logical switch ports accordingly\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eOnce the tenant environment is provisioned by Terraform, the 3x tenant subnets will be automatically published to the T0 router and propagated to the rest of the network (if BGP is enabled), and we should be able to reach the individual LIF addresses. Below is a sample topology deployed in my lab (here I’m using pre-provisioned static routes between the T0 and upstream network for simplicity).\u003c/p\u003e","title":"NSX-T Automation with Terraform"},{"content":"This will be a quick blog to demonstrate how to enable the (embedded) Harbor Image Registry in vSphere 7 with Kubernetes. Harbor was originally developed by VMware as an enterprise-grade private container registry. 
It was then donated to the CNCF in 2018 and recently became a CNCF graduated project.\nFor this demo, we’ll activate the embedded Harbor registry within the vSphere 7 Kubernetes environment, and integrate it with the Supervisor Cluster for container management and deployment.\nWHAT YOU’LL NEED:\n1x vSphere 7 with Kubernetes cluster (you can build one following this post) Kubernetes CLI tool for vSphere installed Docker installed (on your client) Enabling the embedded Harbor Registry in vSphere 7 with Kubernetes\nTo begin, go to your vSphere 7 “Workload Cluster —\u0026gt; Namespaces —\u0026gt; Image Registry”, and then click “Enable Harbor”.\nMake sure to select the vSAN storage policy to provide persistent storage as required for the Harbor installation.\nThe process will take a few minutes, and you should see 7x vSphere Pods after Harbor is installed and enabled. Take note of the Harbor URL; this is the external address of the K8s load balancer created by NSX-T.\nPush Container Images to Harbor Registry\nFirst, let’s log into the Harbor UI and take a quick look. Since this is embedded within vSphere, it supports SSO login 🙂\nHarbor will automatically create a project for every vSphere namespace we have created. In my case, there are two projects “dev01” and “guestbook” created, which are mapped to the two namespaces in my vSphere workload cluster.\nClick the “dev01” project, and then “repository” — as expected it is currently empty, and we’ll be pushing container images to this repository for a quick test. However, before we can do that we’ll need to download and import the certificate to our client machine for certificate-based authentication. 
Click the “Registry Certificate” to download the ca.crt file.\nNext, on the local client create a new directory under /etc/docker/certs.d/ named after the registry address (here the Harbor IP), and save the downloaded ca.crt in it.\n[root@pacific-ops01 ~]# cd /etc/docker/certs.d/ [root@pacific-ops01 certs.d]# mkdir 192.168.100.133 [root@pacific-ops01 certs.d]# cd 192.168.100.133/ [root@pacific-ops01 192.168.100.133]# vim ca.crt Now, let’s get a test (nginx) image, tag it, and try to push it to the dev01 repository.\n[root@pacific-ops01 ~]# docker login 192.168.100.133 --username administrator@vsphere.local Password: Login Succeeded [root@pacific-ops01 ~]# docker pull nginx Using default tag: latest Trying to pull repository docker.io/library/nginx ... latest: Pulling from docker.io/library/nginx bf5952930446: Pull complete cb9a6de05e5a: Pull complete 9513ea0afb93: Pull complete b49ea07d2e93: Pull complete a5e4a503d449: Pull complete Digest: sha256:b0ad43f7ee5edbc0effbc14645ae7055e21bc1973aee5150745632a24a752661 Status: Downloaded newer image for docker.io/nginx:latest [root@pacific-ops01 ~]# [root@pacific-ops01 ~]# docker images REPOSITORY TAG IMAGE ID CREATED SIZE docker.io/nginx latest 4bb46517cac3 3 days ago 133 MB [root@pacific-ops01 ~]# [root@pacific-ops01 ~]# docker tag docker.io/nginx 192.168.100.133/dev01/nginx [root@pacific-ops01 ~]# [root@pacific-ops01 ~]# docker push 192.168.100.133/dev01/nginx The push refers to a repository [192.168.100.133/dev01/nginx] 550333325e31: Pushed 22ea89b1a816: Pushed a4d893caa5c9: Pushed 0338db614b95: Pushed d0f104dc0a1f: Pushed latest: digest: sha256:179412c42fe3336e7cdc253ad4a2e03d32f50e3037a860cf5edbeb1aaddb915c size: 1362 [root@pacific-ops01 ~]# It works, perfect! Now refresh the repository and we can see the new nginx image we just pushed through.\nDeploy Kubernetes Pods to Supervisor Cluster from the Harbor Registry\nLet’s run a quick test to deploy a Pod using the nginx image from our Harbor Registry. 
First, log into the Supervisor Cluster and switch to the “dev01” namespace/context.\n[root@pacific-ops01 ~]# kubectl vsphere login --server=192.168.100.129 --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify Password: Logged in successfully. … [root@pacific-ops01 ~]# kubectl config use-context dev01 Switched to context \u0026#34;dev01\u0026#34;. Make a nginx Pod config using the image path from our Harbor repository.\napiVersion: v1 kind: Pod metadata: labels: run: nginx-demo name: nginx-demo namespace: dev01 spec: containers: - image: 192.168.100.133/dev01/nginx name: nginx-demo restartPolicy: Always Deploy the Pod.\n[root@pacific-ops01 ~]# kubectl apply -f nginx-demo.yaml pod/nginx-demo created Monitor the events and soon we can see the Pod is deployed successfully from the image fetched from the Harbor repository.\n[root@pacific-ops01 ~]# kubectl get events -n dev01 LAST SEEN TYPE REASON OBJECT MESSAGE 48s Normal Status image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 pacific-esxi-3: Image status changed to Resolving 40s Normal Resolve image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 pacific-esxi-3: Image resolved to ChainID sha256:80b21afd8140706d5fe3b7106ae6147e192e6490b402bf2dd2df5df6dac13db8 40s Normal Bind image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 Imagedisk 80b21afd8140706d5fe3b7106ae6147e192e6490b402bf2dd2df5df6dac13db8-v0 successfully bound 32s Normal Status image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 Image status changed to Fetching 14s Normal Status image/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 Image status changed to Ready 7s Normal SuccessfulRealizeNSXResource pod/nginx-demo Successfully realized NSX resource for Pod \u0026lt;unknown\u0026gt; Normal Scheduled pod/nginx-demo Successfully assigned dev01/nginx-demo to pacific-esxi-1 50s Normal Image pod/nginx-demo Image nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 bound successfully 39s Normal Pulling pod/nginx-demo Waiting for 
Image dev01/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 14s Normal Pulled pod/nginx-demo Image dev01/nginx-4f70b77c704ff28acdf14ce0405bc1811e8ee077-v0 is ready 7s Normal SuccessfulMountVolume pod/nginx-demo Successfully mounted volume default-token-bqxc2 7s Normal Created pod/nginx-demo Created container nginx-demo 7s Normal Started pod/nginx-demo Started container nginx-demo [root@pacific-ops01 ~]# kubectl get pods -n dev01 NAME READY STATUS RESTARTS AGE nginx-demo 1/1 Running 0 60s Use kubectl describe pod to confirm the nginx Pod is indeed running on the image pulled from the Harbor registry.\n","permalink":"https://route179.dev/2020/08/18/enabling-embedded-harbor-image-registry-in-vsphere-7-with-kubernetes/","summary":"\u003cp\u003eThis will be a quick blog to demonstrate how to enable the (embedded) Harbor Image Registry in vSphere 7 with Kubernetes. \u003ca href=\"https://goharbor.io/\"\u003eHarbor\u003c/a\u003e was originally developed by VMware as an enterprise-grade private container registry. It was then donated to the CNCF in 2018 and recently became a \u003ca href=\"https://www.cncf.io/projects/\"\u003eCNCF graduated project\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003eFor this demo, we’ll activate the embedded Harbor registry within the vSphere 7 Kubernetes environment, and integrate it with the Supervisor Cluster for container management and deployment.\u003c/p\u003e","title":"Enabling embedded Harbor Image Registry in vSphere 7 with Kubernetes"},{"content":"This blog provides a guide to help you deploy the Contour Ingress Controller onto a Tanzu Kubernetes Grid (TKG) cluster. Contour is an open source Kubernetes ingress controller that exposes HTTP/HTTPS routes for internal services so they are reachable from outside the cluster. 
Like many other ingress controllers, Contour can provide advanced L7 URL/URI based routing and load balancing, as well as SSL/TLS termination capabilities.\nContour was originally developed by Heptio (VMware) and has been recently handed over to CNCF as an incubating project. Contour consists of a control plane that is provisioned via a K8s deployment, and an Envoy-based data plane running as a Daemonset on every cluster worker node.\nhttps://projectcontour.io/contour-v014/\nWHAT YOU’LL NEED:\nA TKG cluster (you can create one following this post) Download the Tanzu Kubernetes Grid 1.1 Extension manifests at here For this lab, we’ll install the Contour ingress controller onto a TKG cluster, and we’ll then deploy a sample app (supplied within the manifest) for testing the Ingress services. The overall service topology will look like this:\nInstall the Contour Ingress Controller\nTo begin, unzip the TKG extension manifest (I’m using v1.1.0).\n[root@pacific-ops01 ~]# tar -xzf tkg-extensions-manifests-v1.1.0-vmware.1.tar.gz Log into your TKG cluster and make sure you are in the correct context.\n[root@pacific-ops01 ~]# kubectl vsphere login --server=192.168.100.129 --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify --tanzu-kubernetes-cluster-name dev01-tkg-01 --tanzu-kubernetes-cluster-namespace dev01 [root@pacific-ops01 ~]# kubectl config use-context dev01-tkg-01 Next, install the Cert-Manager (for Contour Ingress) onto the TKG cluster.\nBefore we can install Contour and Envoy, we’ll need to make a small change to the Envoy service config (02-service-envoy.yaml). As illustrated in the service topology, we will deploy a LoadBalancer in front of the ingress controller. So we’ll update the Envoy service type from NodePort (default) to LoadBalancer.\nNow deploy Contour and Envoy onto the cluster.\nWe can see a Contour deployment, and an Envoy daemonset of 3x (we have 3 worker nodes) have been deployed under the namespace of tanzu-system-ingress. 
Also, take a note of the external IP (192.168.100.130) of the Envoy LoadBalancer service as this will be used by our Ingress services.\nDeploy a Sample App for testing Ingress Services\nDeploy the sample app from within the manifest, this will create:\none new namespace called “test-ingress” one deployment of the “helloweb” app, with a Replicaset of 3x Pods two separate services called “s1” \u0026amp; “s2” — Note: both services are actually pointing to the same 3x Pods (as they are using the same Pod selector) Verify the Pods are up and running\n[root@pacific-ops01 ~]# kubectl get pods -n test-ingress NAME READY STATUS RESTARTS AGE helloweb-7cd97b9cb8-qjwtk 1/1 Running 0 50s helloweb-7cd97b9cb8-r9s8g 1/1 Running 0 51s helloweb-7cd97b9cb8-swztl 1/1 Running 0 51s and both services (s1 \u0026amp; s2) are deployed as expected.\n[root@pacific-ops01 ~]# kubectl get svc -n test-ingress NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE s1 ClusterIP 10.40.183.104 \u0026lt;none\u0026gt; 80/TCP 1m s2 ClusterIP 10.40.129.12 \u0026lt;none\u0026gt; 80/TCP 1m We can’t get to these services yet as they are internal K8s services (ClusterIP) only. We’ll need to deploy an Ingress object so that Contour can expose these services and route traffic to them from external. The good news is that there’s already an Ingress config template provided in the manifest. I’ve made the following changes to the template as per my lab environment (my lab domain is vxlan.co). Note the hostname (URL) and the path (URI) as we’ll be using these to access the two services.\nDeploy the Ingress object.\n[root@pacific-ops01 ~]# cd tkg-extensions-v1.1.0/ingress/contour/examples/https-ingress [root@pacific-ops01 https-ingress]# kubectl apply -f . 
ingress.extensions/https-ingress created secret/https-secret created Verify the Ingress service is running as expected\n[root@pacific-ops01 https-ingress]# kubectl get ingress -n test-ingress NAME HOSTS ADDRESS PORTS AGE https-ingress ingress.vxlan.co 80, 443 2m Create a DNS record for the ingress hostname, pointing to the external IP of the Envoy load balancer.\nNow test access to the s1 service by browsing to https://ingress.vxlan.co/s1\nand the s2 service by browsing to https://ingress.vxlan.co/s2\nCongrats, you have successfully deployed a Contour Ingress controller on a TKG cluster!\n","permalink":"https://route179.dev/2020/08/01/deploying-contour-ingress-controller-on-tanzu-kubernetes-grid-tkg/","summary":"\u003cp\u003eThis blog provides a guide to help you deploy the Contour Ingress Controller onto a Tanzu Kubernetes Grid (TKG) cluster. \u003ca href=\"https://projectcontour.io/\"\u003eContour\u003c/a\u003e is an open source Kubernetes ingress controller that exposes HTTP/HTTPS routes for internal services so they are reachable from outside the cluster. Like many other ingress controllers, Contour can provide advanced L7 URL/URI based routing and load balancing, as well as SSL/TLS termination capabilities.\u003c/p\u003e\n\u003cp\u003eContour was originally developed by Heptio (VMware) and has been recently handed over to CNCF as \u003ca href=\"https://www.cncf.io/projects/\"\u003ean incubating project\u003c/a\u003e. 
Contour consists of a control plane that is provisioned via a K8s deployment, and an \u003ca href=\"https://www.envoyproxy.io/\"\u003eEnvoy\u003c/a\u003e-based data plane running as a Daemonset on every cluster worker node.\u003c/p\u003e","title":"Deploying Contour Ingress Controller on Tanzu Kubernetes Grid (TKG)"},{"content":"In this post we’ll explore the vSphere 7 with Kubernetes capabilities and the detailed deployment steps in order to provision a vSphere supervisor cluster and a Tanzu Kubernetes Grid (TKG) cluster.\nIf you are new to vSphere 7 and Tanzu Kubernetes, below are some background readings that can be used as a good starting point:\nProject Pacific – Technical Overview vSphere 7 – Introduction to the vSphere Pod Service vSphere 7 – Introduction to Kubernetes Namespaces vSphere 7 – Introduction to Tanzu Kubernetes Grid Clusters Requirements\nI’ll be building a nested vSphere7/VCF4 environment on my home lab ESXi host, and the overall lab setup looks like below:\nAs you might have guessed, this lab requires a lot of resources! Specifically, you’ll need the following:\nphysical ESXi host running vSphere 6.7 or later capacity to provision VM with up to 8x vCPU capacity to provision up to 140-180GB of RAM around 1TB of spare storage a flat /24 subnet connected to external \u0026amp; Internet (can be shared with lab management network) access to vSphere 7 ESXi/VCSA and NSX-T/Edge 3.0 OVA files and trial licenses In order to save time on provisioning the vSphere/VCF stack, I’m using William Lam’s vSphere 7 automation script as discussed here. 
You can find the PowerShell code and further details at his Git repository.\nAll demo apps and configuration yaml files used in this lab can be found at my Git Repo.\nWe’ll cover the following steps:\n#1 – build a (nested) vSphere7/VCF4 stack #2 – configure workload management and deploy supervisor cluster #3 – deploy a demo app with native vSphere Pod services #4 – deploy a TKG cluster #5 – vSphere environment overview (post deployment) Step-1: Deploy a vSphere7/VCF4 stack\nFirst, you’ll need to download William’s PowerShell script and modify it based on your own lab environment. You’ll also need to download the required OVAs and place them in the same path as defined in the script. Note that for the VCSA you’ll need to unzip the ISO and point the path to the unzipped folder!\nNow let’s run the PowerShell script and you’ll see a deployment summary page like this:\nHit “Y” to kick off the deployment; for me the whole process took just a little over 1 hour.\nOnce the script completes you should see a vApp like this deployed under your physical ESXi host.\nStep-2: Configure Workload Management and Deploy Supervisor Cluster\nTo activate vSphere 7 native Kubernetes capabilities, we need to enable workload management, which will configure our nested ESXi cluster as a supervisor cluster. 
First, log into the nested VCSA, and navigate to “Menu” —\u0026gt; “Workload Management”, click “Enable”:\nSelect our nested ESXi cluster to be configured as a supervisor cluster\nSelect supervisor Control Plane VM size\nConfigure the management network settings for the supervisor cluster. Note that we’ll need to reserve a 5-address block for the control plane VMs including a VIP.\nNext, configure vSphere Pod network settings — for this demo we’ll reserve one /27 for the Ingress CIDR block as the NAT IPs to be consumed by Load Balancer or Ingress services; and another /27 for the Egress CIDR block as outbound SNAT IPs for provisioned K8s namespaces.\nConfigure storage policies by selecting the pre-provisioned pacific-gold vSAN policy, then click “Finish” to begin the deployment of the supervisor cluster.\nThis process will take another 20~30 mins to complete, and you’ll see a cluster of 3x control plane VMs being provisioned.\nBack in “Workload Management” —\u0026gt; “Cluster”, you should see our supervisor cluster (consisting of 3x ESXi hosts) is now up and running. Also, take note of the VIP address of the control plane VMs as we’ll be using that IP to log into the supervisor cluster.\nStep-3: Deploy a demo app with Native vSphere Pods\nTo consume the native vSphere Kubernetes Pods capabilities, we first need to create a vSphere Namespace, which is mapped to a K8s namespace within the supervisor cluster. 
vSphere leverages the K8s namespace logical construct to provide resource segmentation for the vSphere pods/services/deployments, and it offers a flexible way to attach authorization and network/storage policies for different environments.\nGo to “Menu” —\u0026gt; “Workload Management”, and click “Create Namespace”.\nSince we’ll be deploying a sample guestbook app, we’ll name the namespace “guestbook”.\nNext, grant the vSphere admin edit permission on the namespace, and assign the vSAN storage policy “pacific-gold-storage-policy” to the namespace —\u0026gt; this is important as (behind the scenes) we are leveraging the vSAN CSI (container storage interface) driver to provide persistent storage support for the cluster.\nNow we are ready to dive into the vSphere supervisor cluster! Before we can do that, let’s get the Kubectl CLI and the vSphere plugin package.\nOpen the CLI tools link here:\nFollow the onscreen instructions to download and install the vSphere Kubectl CLI toolkit onto your management host (I’m using a CentOS7 VM).\nTime to log into our supervisor K8s cluster! Remember to use the control plane VIP (192.168.100.129) as noted before.\n[root@Pacific-Ops01]# kubectl vsphere login --server=192.168.100.129 -u administrator@vsphere.local --insecure-skip-tls-verify switch context to our “guestbook” namespace\n[root@Pacific-Ops01]# kubectl config use-context guestbook Switched to context \u0026#34;guestbook\u0026#34;. 
take a look of the cluster nodes, you’ll see the 3x master nodes (supervisor control VMs) and 3x worker nodes (ESXi hosts)\n[root@pacific-ops01 vs7-k8s]# kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME 420a7d079f62a8ae40fb4bffea3cee48 Ready master 8d v1.16.7-2+bfe512e5ddaaaa 10.244.0.196 \u0026lt;none\u0026gt; VMware Photon OS/Linux 4.19.84-1.ph3-esx docker://18.9.9 420acb46e78281fcfaf3f45ea3d7c577 Ready master 8d v1.16.7-2+bfe512e5ddaaaa 10.244.0.194 \u0026lt;none\u0026gt; VMware Photon OS/Linux 4.19.84-1.ph3-esx docker://18.9.9 420aef27c9f45b01e8e0ed4a7e45cf2e Ready master 8d v1.16.7-2+bfe512e5ddaaaa 10.244.0.195 \u0026lt;none\u0026gt; VMware Photon OS/Linux 4.19.84-1.ph3-esx docker://18.9.9 pacific-esxi-1 Ready agent 8d v1.16.7-sph-4d52cd1 192.168.100.121 \u0026lt;none\u0026gt; \u0026lt;unknown\u0026gt; \u0026lt;unknown\u0026gt; \u0026lt;unknown\u0026gt; pacific-esxi-2 Ready agent 8d v1.16.7-sph-4d52cd1 192.168.100.122 \u0026lt;none\u0026gt; \u0026lt;unknown\u0026gt; \u0026lt;unknown\u0026gt; \u0026lt;unknown\u0026gt; pacific-esxi-3 Ready agent 8d v1.16.7-sph-4d52cd1 192.168.100.123 \u0026lt;none\u0026gt; \u0026lt;unknown\u0026gt; \u0026lt;unknown\u0026gt; \u0026lt;unknown\u0026gt; Clone the git repo for this demo lab, and apply a dummy network policy (permit all ingress and all egress traffic)\n[root@pacific-ops01 ~]# git clone https://github.com/sc13912/vs7-k8s.git Cloning into \u0026#39;vs7-k8s\u0026#39;... remote: Enumerating objects: 15, done. remote: Counting objects: 100% (15/15), done. remote: Compressing objects: 100% (10/10), done. remote: Total 15 (delta 2), reused 12 (delta 2), pack-reused 0 Unpacking objects: 100% (15/15), done. 
[root@pacific-ops01 ~]# cd vs7-k8s/ [root@pacific-ops01 vs7-k8s]# kubectl apply -f network-policy-allowall.yaml networkpolicy.networking.k8s.io/allow-all created To deploy the guestbook app, we’ll leverage the dynamic persistent volume provisioning capability of the vSphere CSI driver by calling the vSAN storage class “pacific-gold-storage-policy”\nkind: PersistentVolumeClaim apiVersion: v1 metadata: namespace: guestbook name: redis-master-claim spec: accessModes: - ReadWriteOnce storageClassName: pacific-gold-storage-policy resources: requests: storage: 2Gi apply the PVCs yamls for both the redis master and slave Pods\n[root@pacific-ops01 vs7-k8s]# kubectl apply -f guestbook/guestbook-master-claim.yaml persistentvolumeclaim/redis-master-claim created [root@pacific-ops01 vs7-k8s]# kubectl apply -f guestbook/guestbook-slave-claim.yaml persistentvolumeclaim/redis-slave-claim created verify both PVCs are showing “Bound” status mapped to two dynamically provisioned persistent volumes (PVs)\n[root@pacific-ops01 vs7-k8s]# kubectl get pvc NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE redis-master-claim Bound pvc-0102e725-41ad-440b-8a02-8af4d4768ebb 2Gi RWO pacific-gold-storage-policy 14m redis-slave-claim Bound pvc-fb4b7bbe-9b35-40e8-b251-8f2effe85a2d 2Gi RWO pacific-gold-storage-policy 13m Now deploy the guestbook app.\n[root@pacific-ops01 vs7-k8s]# kubectl apply -f guestbook/guestbook-all-in-one.yaml service/redis-master created deployment.apps/redis-master created service/redis-slave created deployment.apps/redis-slave created service/frontend created deployment.apps/frontend created wait until all the pods up and running\n[root@pacific-ops01 vs7-k8s]# kubectl get pods -o wide -n guestbook NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES frontend-6cb7f8bd65-kjgh2 1/1 Running 0 3m2s 10.244.0.214 pacific-esxi-2 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; frontend-6cb7f8bd65-mlv79 1/1 Running 0 3m2s 10.244.0.213 pacific-esxi-1 
\u0026lt;none\u0026gt; \u0026lt;none\u0026gt; frontend-6cb7f8bd65-slz6b 1/1 Running 0 3m2s 10.244.0.215 pacific-esxi-2 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; frontend-6cb7f8bd65-vtkfz 1/1 Running 0 3m3s 10.244.0.212 pacific-esxi-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; redis-master-64fb8775bf-65sdc 1/1 Running 0 3m10s 10.244.0.210 pacific-esxi-1 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; redis-slave-779b6d8f79-bj9q7 1/1 Running 0 3m7s 10.244.0.211 pacific-esxi-2 \u0026lt;none\u0026gt; \u0026lt;none\u0026gt; retrieve the Load Balancer service IP — note NSX has allocated an IP from the /27 Ingress CIDR block\n[root@pacific-ops01 vs7-k8s]# kubectl get svc -n guestbook NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE frontend LoadBalancer 10.32.0.209 192.168.100.130 80:32610/TCP 4m15s redis-master ClusterIP 10.32.0.34 \u0026lt;none\u0026gt; 6379/TCP 4m22s redis-slave ClusterIP 10.32.0.197 \u0026lt;none\u0026gt; 6379/TCP 4m21s Hit the load balancer IP in browser to test the guestbook app. 
Enter and submit some messages, and try to destroy and redeploy the app, your data will be kept by the redis PVs.\nStep-4: Deploy a TKG cluster\nBefore we can deploy a TKG cluster, we’ll need to create a content library subscription by pointing to https://wp-content.vmware.com/v2/latest/lib.json, which contains the VMware Tanzu Kubernetes images:\nwait for about 5~10 mins for the library to fully sync, at this point of time I can see two versions of Tanzu K8s images:\nNext, create a new namespace called “dev01” which will be hosting our new TKG cluster.\nBack to the CLI, we’ll switch context from “guestbook” to the new “dev01” namespace:\n[root@pacific-ops01 vs7-k8s]# kubectl config get-contexts CURRENT NAME CLUSTER AUTHINFO NAMESPACE 192.168.100.129 192.168.100.129 wcp:192.168.100.129:administrator@vsphere.local dev01 192.168.100.129 wcp:192.168.100.129:administrator@vsphere.local dev01 * guestbook 192.168.100.129 wcp:192.168.100.129:administrator@vsphere.local guestbook [root@pacific-ops01 vs7-k8s]# [root@pacific-ops01 vs7-k8s]# kubectl config use-context dev01 Switched to context \u0026#34;dev01\u0026#34;. 
let’s examine the two TKG K8s versions available from the library:\n[root@pacific-ops01 vs7-k8s]# kubectl get virtualmachineimages NAME AGE ob-15957779-photon-3-k8s-v1.16.8---vmware.1-tkg.3.60d2ffd 9m44s ob-16466772-photon-3-k8s-v1.17.7---vmware.1-tkg.1.154236c 9m44s and there are also different classes for the TKG VM templates:\n[root@pacific-ops01 vs7-k8s]# kubectl get virtualmachineclasses NAME AGE best-effort-large 4h48m best-effort-medium 4h48m best-effort-small 4h48m best-effort-xlarge 4h48m best-effort-xsmall 4h48m guaranteed-large 4h48m guaranteed-medium 4h48m guaranteed-small 4h48m guaranteed-xlarge 4h48m guaranteed-xsmall 4h48m so I have prepared the following yaml config for my TKG cluster — I’m using 1x master node and 3x worker nodes, all within the “guaranteed-small” machine classes.\n[root@pacific-ops01 vs7-k8s]# cat tkg-cluster01.yaml apiVersion: run.tanzu.vmware.com/v1alpha1 kind: TanzuKubernetesCluster metadata: name: dev01-tkg-01 namespace: dev01 spec: distribution: version: v1.16 topology: controlPlane: class: guaranteed-small count: 1 storageClass: pacific-gold-storage-policy workers: class: guaranteed-small count: 3 storageClass: pacific-gold-storage-policy settings: network: cni: name: calico services: cidrBlocks: [\u0026#34;10.36.0.0/16\u0026#34;] pods: cidrBlocks: [\u0026#34;10.242.0.0/16\u0026#34;] apply the config to create the TKG cluster\n[root@pacific-ops01 vs7-k8s]# kubectl apply -f tkg-cluster01.yaml tanzukubernetescluster.run.tanzu.vmware.com/dev01-tkg-01 created monitor the cluster creation process, and eventually you’ll see all 4x TKG VMs are up and running:\n[root@pacific-ops01 vs7-k8s]# kubectl get tanzukubernetesclusters.run.tanzu.vmware.com NAME CONTROL PLANE WORKER DISTRIBUTION AGE PHASE dev01-tkg-01 1 3 v1.16.8+vmware.1-tkg.3.60d2ffd 13m creating [root@pacific-ops01 vs7-k8s]# kubectl get machines NAME PROVIDERID PHASE dev01-tkg-01-control-plane-n9hqx vsphere://420aff74-1367-9654-b2ba-59f8a64c3b52 running 
dev01-tkg-01-workers-nwmhh-c766c8f77-nnbsj vsphere://420aca94-26f3-f1c6-e112-607c28c439a4 provisioned dev01-tkg-01-workers-nwmhh-c766c8f77-pcv65 vsphere://420a2c44-f4e3-f698-b173-86a6b4b3fa27 provisioned dev01-tkg-01-workers-nwmhh-c766c8f77-zqfwj vsphere://420a2c16-3002-b2c2-ef5d-d4e3d7a08bf8 provisioned [root@pacific-ops01 vs7-k8s]# kubectl get machines NAME PROVIDERID PHASE dev01-tkg-01-control-plane-n9hqx vsphere://420aff74-1367-9654-b2ba-59f8a64c3b52 running dev01-tkg-01-workers-nwmhh-c766c8f77-nnbsj vsphere://420aca94-26f3-f1c6-e112-607c28c439a4 running dev01-tkg-01-workers-nwmhh-c766c8f77-pcv65 vsphere://420a2c44-f4e3-f698-b173-86a6b4b3fa27 running dev01-tkg-01-workers-nwmhh-c766c8f77-zqfwj vsphere://420a2c16-3002-b2c2-ef5d-d4e3d7a08bf8 running Time to log into our new cluster!\n[root@pacific-ops01 vs7-k8s]# kubectl vsphere login --server=192.168.100.129 --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify --tanzu-kubernetes-cluster-name dev01-tkg-01 --tanzu-kubernetes-cluster-namespace dev01 [root@pacific-ops01 vs7-k8s]# kubectl config use-context dev01-tkg-01 Switched to context \u0026#34;dev01-tkg-01\u0026#34;. Once you are logged in and switched to the cluster “dev01-tkg-01” namespace, verify that you can see all 4x TKG nodes are in “Ready” status\n[root@pacific-ops01 ~]# kubectl get nodes NAME STATUS ROLES AGE VERSION dev01-tkg-01-control-plane-n9hqx Ready master 22m v1.16.8+vmware.1 dev01-tkg-01-workers-nwmhh-c766c8f77-nnbsj Ready \u0026lt;none\u0026gt; 56s v1.16.8+vmware.1 dev01-tkg-01-workers-nwmhh-c766c8f77-pcv65 Ready \u0026lt;none\u0026gt; 61s v1.16.8+vmware.1 dev01-tkg-01-workers-nwmhh-c766c8f77-zqfwj Ready \u0026lt;none\u0026gt; 85s v1.16.8+vmware.1 We are now ready to deploy demo apps into the TKG cluster. 
First, update the cluster RBAC and Pod Security Policies by applying the supplied yaml config.\n[root@pacific-ops01 vs7-k8s]# kubectl apply -f allow-nonroot-clusterrole.yaml clusterrole.rbac.authorization.k8s.io/psp:privileged created clusterrolebinding.rbac.authorization.k8s.io/all:psp:privileged created Next, deploy the yelb demo app :\n[root@pacific-ops01 vs7-k8s]# kubectl apply -f yelb/yelb-lb.yaml service/redis-server created service/yelb-db created service/yelb-appserver created service/yelb-ui created deployment.apps/yelb-ui created deployment.apps/redis-server created deployment.apps/yelb-db created deployment.apps/yelb-appserver created wait for all the Pods up and running, then retrieve the external IP of the yelb-ui Load Balancer (assigned by NSX from the pre-provisioned /27 Ingress CIDR block)\n[root@pacific-ops01 vs7-k8s]# kubectl get svc yelb-ui -n yelb-app NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE yelb-ui LoadBalancer 10.40.19.40 192.168.100.132 80:30116/TCP 9d Go to the LB IP and you’ll see the app is running successfully.\nvSphere Environment Overview\nBelow is a quick overview of the vSphere Lab environment after you have completed all the steps. You should see a supervisor cluster (consists of 3x ESXi worker nodes and the 3x control VMs), a TKG cluster with its own namespace, and a guestbook microservice app deployed with native vSphere Pod services by leveraging vSAN CSI.\nand here is the network topology overview captured from NSX-T UI. Note NSX automatically deploys a dedicated Tier-1 gateway for every TKG cluster created. 
The tier-1 gateway also provides egress SNAT and Ingress LB capabilities for the TKG cluster.\n","permalink":"https://route179.dev/2020/07/17/deploying-vsphere-7-with-kubernetes-and-tanzu-kubernetes-grid-tkg-cluster/","summary":"\u003cp\u003eIn this post we’ll explore the vSphere 7 with Kubernetes capabilities and the detailed deployment steps in order to provision a vSphere supervisor cluster and a Tanzu Kubernetes Grid (TKG) cluster.\u003c/p\u003e\n\u003cp\u003eIf you are new to vSphere 7 and Tanzu Kubernetes, below are some background readings that can be used as a good start point:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003ca href=\"https://blogs.vmware.com/vsphere/2019/08/project-pacific-technical-overview.html\"\u003eProject Pacific – Technical Overview\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://blogs.vmware.com/vsphere/2019/08/project-pacific-technical-overview.html\"\u003evSphere 7 – Introduction to the vSphere Pod Service\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://blogs.vmware.com/vsphere/2019/08/project-pacific-technical-overview.html\"\u003evSphere 7 – Introduction to Kubernetes Namespaces\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://blogs.vmware.com/vsphere/2019/08/project-pacific-technical-overview.html\"\u003evSphere 7 – Introduction to Tanzu Kubernetes Grid Clusters\u003c/a\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003e\u003cstrong\u003eRequirements\u003c/strong\u003e\u003c/p\u003e","title":"Deploying vSphere 7 with Kubernetes and Tanzu Kubernetes Grid (TKG) Cluster"},{"content":"This blog provides an example for deploying a CI/CD pipeline on AWS utilising the serverless container platform Fargate and the fully managed CodePipeline service. 
We’ll also use Terraform to automate the process for building the entire AWS environment, as shown in the diagram below.\nSpecifically, we’ll be creating the following AWS resources:\n1x demo VPC including public/private subnets, NAT gateway and security groups etc 1x ALB for providing LB services to a target group of 2x Fargate container tasks 1x ECS cluster with a Fargate service definition (running our demo app) 1x CodePipeline definition, which builds the demo app from GitHub Repo (with a webhook trigger) and deploys it to the same Fargate service 1x ECR repository for hosting pipeline build images 2x S3 Buckets as build \u0026amp; artifact cache References – for this demo, I’m using these Terraform modules found on GitHub:\nTmknom Terraform Module for creating AWS ECS Fargate CloudPosse Terraform Module for creating AWS CodePipeline on ECS Infrastructure PREREQUISITES\nAccess to an AWS testing environment Install Git \u0026amp; Terraform on your client Install AWS toolkits including AWS CLI, AWS-IAM-Authenticator Check the NTP clock \u0026amp; sync status on your client —\u0026gt; important! Clone or download the Terraform code here. Clone or fork the demo app (including CodePipeline buildspec) here. Step-1: Review the Terraform Script\nLet’s take a closer look at the Terraform code. 
I’ll skip the VPC and ALB sections and focus on the ECS/Fargate service and CodePipeline definition.\nThis section creates an ECS cluster with the Fargate service definition. Note that I have used a bitnami node image as a placeholder for testing purposes; it will be replaced automatically by our demo app via the CodePipeline execution.\n############################# Create ECS Cluster and Fargate Service ################################## resource \u0026#34;aws_ecs_cluster\u0026#34; \u0026#34;ecs_cluster\u0026#34; { name = \u0026#34;default\u0026#34; } module \u0026#34;ecs_fargate\u0026#34; { source = \u0026#34;git::https://github.com/tmknom/terraform-aws-ecs-fargate.git?ref=tags/2.0.0\u0026#34; name = var.ecs_service_name container_name = var.container_name container_port = var.container_port cluster = aws_ecs_cluster.ecs_cluster.arn subnets = module.vpc.public_subnets target_group_arn = join(\u0026#34;\u0026#34;, module.alb.target_group_arns) vpc_id = module.vpc.vpc_id container_definitions = jsonencode([ { name = var.container_name image = \u0026#34;bitnami/node:latest\u0026#34; essential = true portMappings = [ { containerPort = var.container_port protocol = \u0026#34;tcp\u0026#34; } ] } ]) desired_count = 2 deployment_maximum_percent = 200 deployment_minimum_healthy_percent = 100 deployment_controller_type = \u0026#34;ECS\u0026#34; assign_public_ip = true health_check_grace_period_seconds = 10 platform_version = \u0026#34;LATEST\u0026#34; source_cidr_blocks = [\u0026#34;0.0.0.0/0\u0026#34;] cpu = 256 memory = 512 requires_compatibilities = [\u0026#34;FARGATE\u0026#34;] iam_path = \u0026#34;/service_role/\u0026#34; description = \u0026#34;Fargate demo example\u0026#34; enabled = true tags = { Environment = \u0026#34;Dev\u0026#34; } } This section creates an ECR repository (for hosting the build image) and defines the pipeline, which builds the demo app from GitHub repo, pushes the new image to ECR and deploys it to the same ECS cluster and Fargate service as created from the 
above.\n################################### Create ECR Repo and Code Pipeline ################################### resource \u0026#34;aws_ecr_repository\u0026#34; \u0026#34;fargate-repo\u0026#34; { name = var.ecr_repo image_scanning_configuration { scan_on_push = true } } module \u0026#34;ecs_codepipeline\u0026#34; { source = \u0026#34;git::https://github.com/cloudposse/terraform-aws-ecs-codepipeline.git?ref=master\u0026#34; name = var.app_name namespace = var.namespace region = var.region image_repo_name = var.ecr_repo stage = var.stage github_oauth_token = var.github_oath_token github_webhooks_token = var.github_webhooks_token webhook_enabled = \u0026#34;true\u0026#34; repo_owner = var.github_repo_owner repo_name = var.github_repo_name branch = \u0026#34;master\u0026#34; service_name = module.ecs_fargate.ecs_service_name ecs_cluster_name = aws_ecs_cluster.ecs_cluster.arn privileged_mode = \u0026#34;true\u0026#34; } Note the pipeline is synced to GitHub with a webhook trigger enabled, and you’ll need to supply a GitHub personal token for this. 
So go create one if you haven’t already done so.\nStep-2: Create the Serverless Pipeline with Terraform\nConfigure AWS environment variables\n[root@cloud-ops01 tf-aws-eks]# aws configure AWS Access Key ID [*****]: AWS Secret Access Key [***]: Default region name [us-east-1]: Default output format [json]: update terraform.tfvars based on your own environment\nregion = \u0026#34;us-east-1\u0026#34; ecs_service_name = \u0026#34;ecs-svc-example\u0026#34; container_port = 3000 container_name = \u0026#34;demo-app\u0026#34; namespace = \u0026#34;xxx\u0026#34; stage = \u0026#34;dev\u0026#34; app_name = \u0026#34;demo-app-xxxx\u0026#34; ecr_repo = \u0026#34;fargate-demo-repo\u0026#34; github_oath_token = \u0026#34;xxxx\u0026#34; github_webhooks_token = \u0026#34;xxxx\u0026#34; github_repo_owner = \u0026#34;xxxx\u0026#34; github_repo_name = \u0026#34;fargate-demo-app\u0026#34; Now run the Terraform script\nterraform init terraform apply The process will take about 5 mins and you should see an output like this. 
Note the public URL of the ALB, which is providing LB services to the 2x Fargate container tasks.\nStep-3: Review the Fargate Service\nOn the AWS Console, go to “Elastic Container Service (ECS) —\u0026gt; Cluster” and we can see an ECS cluster “default” has been created, with 1x Fargate service defined and 2x container tasks/pods running.\nand here are the two running container tasks/pods:\nClick any of the tasks to confirm it is running our demo app image deployed from the ECR repository.\nNext, search for AWS service “Developer Tools —\u0026gt; CodePipeline”, you’ll see our Pipeline has been deployed with a (1st) successful execution.\nNow search for “EC2 —\u0026gt; Load Balancer”, confirm that an ALB has been created and it should be deployed on two different subnets across two AZs.\nThis is because we are spreading the 2x ECS container tasks onto two AZs for high availability.\nGo to the ALB public DNS/URL and you should see the default page of our demo app running on AWS Fargate, cool!\nStep-4: Test the Pipeline Run\nIt’s testing time now! As discussed, the pipeline is synced to the GitHub repository and will be triggered by a push to master event. The actual build task is defined within the buildspec.yaml which contains a simple 3-stage process as per below. Note the output of the build process includes a json artifact (imagedefinitions.json) which includes the ECR path for the latest build image.\nversion: 0.2 phases: pre_build: commands: - echo Logging in to Amazon ECR... - aws --version - eval $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email) - REPOSITORY_URI=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$IMAGE_REPO_NAME - IMAGE_TAG=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7) build: commands: - echo Build started on `date` - echo Building the Docker image... 
- REPO_URI=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$IMAGE_REPO_NAME - docker pull $REPO_URI:latest || true - docker build --cache-from $REPO_URI:latest --tag $REPO_URI:latest --tag $REPO_URI:$IMAGE_TAG . post_build: commands: - echo Build completed on `date` - echo Pushing the Docker images... - REPO_URI=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$IMAGE_REPO_NAME - docker push $REPO_URI:latest - docker push $REPO_URI:$IMAGE_TAG - echo Writing image definitions file... - printf \u0026#39;[{\u0026#34;name\u0026#34;:\u0026#34;demo-app\u0026#34;,\u0026#34;imageUri\u0026#34;:\u0026#34;%s\u0026#34;}]\u0026#39; \u0026#34;$REPO_URI:$IMAGE_TAG\u0026#34; | tee imagedefinitions.json artifacts: files: imagedefinitions.json To test the pipeline run, we’ll make a “cosmetic change” to the app revision (v1.0 —\u0026gt; v1.1)\nCommit and push to master.\nAs expected, this has triggered a new pipeline run\nSoon you’ll see two additional pods are launching with a new revision number of “3” — this is because by default Fargate implements a rolling update deployment strategy with a default minimum healthy percent of 100%. 
So it will not remove the previous container pods (revision 2) until the new ones are running and ready.\nOnce the v3 pods are running, we can see the v2 pods being terminated and de-registered from the service.\nEventually the v2 pods are removed and the Fargate service is now updated with revision 3, which consists of the new pods running our demo app “v1.1”.\nIn the CodePipeline history, verify the new build \u0026amp; deployment process has completed successfully.\nAlso, verify the new image (tag “99cc610”) of the demo app is pushed to ECR as expected.\nGo to the Fargate tasks (revision 3) again and verify the container pods are indeed running on the new image “99cc610”.\nRefresh the ALB address to see the v1.1 app page loading — Magic!\n","permalink":"https://route179.dev/2020/06/20/build-a-serverless-ci-cd-pipeline-on-aws-with-fargate-codepipeline-and-terraform/","summary":"\u003cp\u003eThis blog provides an example for deploying a CI/CD pipeline on AWS utilising the serverless container platform Fargate and the fully managed CodePipeline service. 
We’ll also use Terraform to automate the process for building the entire AWS environment, as shown in the below diagram.\u003c/p\u003e\n\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/2020/06/20/build-a-serverless-ci-cd-pipeline-on-aws-with-fargate-codepipeline-and-terraform/cicd-fargate.png\"\u003e\u003c/p\u003e\n\u003cp\u003eSpecifically, we’ll be creating the following AWS resources:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e1x demo VPC including public/private subnets, NAT gateway and security groups etc\u003c/li\u003e\n\u003cli\u003e1x ALB for providing LB services to a target group of 2x Fargate container tasks\u003c/li\u003e\n\u003cli\u003e1x ECS cluster with a Fargate service definition (running our demo app)\u003c/li\u003e\n\u003cli\u003e1x CodePipeline definition, which builds the demo app from \u003ca href=\"https://github.com/sc13912/fargate-demo-app.git\"\u003eGitHub Repo\u003c/a\u003e (with a webhook trigger) and deploys it to the same Fargate service\u003c/li\u003e\n\u003cli\u003e1x ECR repository for hosting pipeline build images\u003c/li\u003e\n\u003cli\u003e2x S3 Buckets as build \u0026amp; artifact cache\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003e\u003cstrong\u003eReferences\u003c/strong\u003e – for this demo, I’m using these Terraform modules found on GitHub:\u003c/p\u003e","title":"Build a Serverless CI/CD pipeline on AWS with Fargate, CodePipeline and Terraform"},{"content":"This is the third episode of our **Cloud Native DevOps on GCP **series. In the previous chapters, we have achieved the following:\nBuilt a GKE Cluster with Terraform **Created a CI/CD pipeline with GKE, GCR and Cloud Build ** This time, we will take a step further and go completely serverless by deploying the same Node app onto the Google Cloud Run platform. Cloud Run is built from an open source project named Knative, which is a serverless framework developed based on the industry proven Kubernetes architecture. 
Whilst Knative is developed with the same event-driven concept (like other serverless solutions), it also offers great flexibility and multi-cloud portability at a container level.\nFor this demo, we will first launch a Cloud Run Service with an initial image using the cloudrun-hello app provided by Google. We will also create a Cloud Build Pipeline to automatically build and push our Node app onto GCR, and then deploy it to the same Cloud Run Service (as a new revision). As before, the pipeline will be synced to the GitHub repository and automatically triggered by a Git push event.\nBest of all, all GCP resources in this environment, including the Cloud Run Service and the Cloud Build Pipeline, will be provisioned via Terraform, as illustrated below.\nWHAT YOU’LL NEED:\nAccess to a GCP testing environment Install Git and Terraform on your client Install GCloud SDK Check the NTP clock \u0026amp; sync status on your client —\u0026gt; important! Clone or download the Terraform script here Clone or fork the NodeJS demo app here Step-1: Prepare the GCloud Environment\nTo start, configure the GCloud environment variables and authentication.\ngcloud init gcloud config set accessibility/screen_reader true gcloud auth application-default login Enable required GCP API services\ngcloud services enable servicenetworking.googleapis.com gcloud services enable cloudresourcemanager.googleapis.com gcloud services enable cloudbuild.googleapis.com gcloud services enable containerregistry.googleapis.com gcloud services enable run.googleapis.com gcloud services enable sourcerepo.googleapis.com Update Cloud Build service account with all the necessary roles so it will have required permissions to access Cloud Run and GCR within the project.\nPROJECT_ID=`gcloud config get-value project` CLOUDBUILD_SA=\u0026#34;$(gcloud projects describe $PROJECT_ID --format \u0026#39;value(projectNumber)\u0026#39;)@cloudbuild.gserviceaccount.com\u0026#34; gcloud projects add-iam-policy-binding 
$PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/editor gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/run.admin gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/container.developer Step-2: Connect Cloud Build to GitHub Repository\nNext, let’s connect Cloud Build to the demo app Git Repository. On GCP console, go to “Cloud Build —\u0026gt; Triggers —\u0026gt; Connect Repository” and then select “GitHub” as below. (You will be redirected to GitHub for authentication.)\nSelect the demo app repository which contains the sample NodeJs application.\nIn the next page, make sure to click “Skip for now” and we are done. We’ll leave it to Terraform to create the trigger later.\nStep-3: Run Terraform Script to launch a Serverless CI/CD Pipeline\nBefore executing the script, make sure to update the variables (as defined in terraform.tfvars) as per your own GCP environment.\nproject_id = \u0026#34;xxxxxxxx\u0026#34; location = \u0026#34;asia-northeast1\u0026#34; gcr_region = \u0026#34;asia\u0026#34; github_owner = \u0026#34;xxxxxx\u0026#34; github_repository = \u0026#34;xxxxxx\u0026#34; Run the Terraform script.\nterraform init terraform apply Since we are not provisioning any Infrastructure resources (it’s Serverless!), the process should take no more than 2 to 3 minutes. Take a note of the URL provided in the output — this is the public URL of our Cloud Run Service.\nOn GCP console verify the Cloud Run Service has been deployed successfully.\nNow go to the above URL and you should see the default page of the cloudrun-hello app.\nBefore we move forward, confirm there is now a Cloud Build trigger provisioned by Terraform with the pipeline config defined as “cloudbuild.yaml”.\nStep-4: Test the Pipeline\nNow let’s take a closer look at the pipeline code. 
This is a basic 3-stage pipeline:\nBuild the demo Node app Push the image to GCR Deploy the image from GCR to the existing Cloud Run Service steps: # Build Node app docker image - name: \u0026#34;gcr.io/cloud-builders/docker\u0026#34; args: - build - -t - ${_GCR_REGION}.gcr.io/$PROJECT_ID/${_SERVICE_NAME}:$COMMIT_SHA - . # Push Node app image to GCR - name: \u0026#34;gcr.io/cloud-builders/docker\u0026#34; args: - push - ${_GCR_REGION}.gcr.io/$PROJECT_ID/${_SERVICE_NAME}:$COMMIT_SHA # Deploy the docker image to Cloud Run Service - name: \u0026#34;gcr.io/cloud-builders/gcloud\u0026#34; args: - run - deploy - ${_SERVICE_NAME} - --image=${_GCR_REGION}.gcr.io/$PROJECT_ID/${_SERVICE_NAME}:$COMMIT_SHA - --region=${_LOCATION} - --platform=managed images: - \u0026#34;${_GCR_REGION}.gcr.io/$PROJECT_ID/${_SERVICE_NAME}:$COMMIT_SHA\u0026#34; timeout: 1200s substitutions: _LOCATION: asia-northeast1 _GCR_REGION: asia _SERVICE_NAME: cloudrun-demo Time to test the pipeline! We’ll add a note into the README file.\nCommit and push to Git.\nThis should automatically trigger the pipeline, and the 3-stage process should be completed around a minute 🙂\nNow go back to our Cloud Run Service, you should see a new revision has been deployed by Cloud Build, with the container image now pointing to the GCR path (which contains our demo app).\nRefresh the browser and Boom — you now have access to the demo app running on Google Cloud Run!\nThis concludes our **Cloud Native DevOps on GCP **series. I hope this has been informative and thanks very much for reading!\n","permalink":"https://route179.dev/2020/06/13/use-terraform-to-launch-a-serverless-ci-cd-pipeline-with-cloud-run-gcr-and-cloud-build/","summary":"\u003cp\u003eThis is the third episode of our **Cloud Native DevOps on GCP **series. 
In the previous chapters, we have achieved the following:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003ca href=\"https://route179.wordpress.com/2020/06/09/build-a-gke-cluster-with-terraform/\"\u003e\u003cstrong\u003eBuilt a GKE Cluster with Terraform\u003c/strong\u003e\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://route179.wordpress.com/2020/06/09/create-a-ci-cd-pipeline-with-gke-gcr-and-cloud-build/\"\u003e**Created a CI/CD pipeline with GKE, GCR and Cloud Build **\u003c/a\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThis time, we will take a step further and go completely serverless by deploying the same Node app onto the Google Cloud Run platform. Cloud Run is built from an open source project named \u003ca href=\"https://knative.dev/\"\u003eKnative\u003c/a\u003e, which is a serverless framework developed based on the industry proven Kubernetes architecture. Whilst Knative is developed with the same event-driven concept (like other serverless solutions), it also offers great flexibility and multi-cloud portability at a container level.\u003c/p\u003e","title":"Cloud Native DevOps on GCP Series Ep3 – Use Terraform to launch a Serverless CI/CD pipeline with Cloud Run, GCR and Cloud Build"},{"content":"This is the second episode of our **Cloud Native DevOps on GCP **series. In the previous chapter, we have built a multi-AZ GKE cluster with Terraform. This time, we’ll create a cloud native CI/CD pipeline leveraging our GKE cluster and Google DevOps tools such as Cloud Build and Google Container Registry (GCR). We’ll create a Cloud Build trigger by connecting to GitHub repository to perform automatic build, test and deployment of a sample micro-service app onto the GKE cluster.\nFor this demo, I have provided a simple NodeJS app which is already containerized and packaged as a Helm Chart for fast K8s deployment. 
You can find all the artifacts at my GitHub Repo, including the demo app, Helm template/chart, as well as the Cloud Build pipeline code.\nWHAT YOU’LL NEED:\nAccess to a GCP testing environment Install Git, Kubectl and Terraform on your client Install Docker on your client Install GCloud SDK Check the NTP clock \u0026amp; sync status on your client —\u0026gt; important! Clone or download the demo app repo here Step-1: Prepare the GCloud Environment\nTo begin, configure the GCloud environment variables and authentication.\ngcloud init gcloud config set accessibility/screen_reader true gcloud auth application-default login Register GCloud as a Docker credential helper — this is important so our Docker client will have privileged access to interact with GCR. (Later we’ll need to build and push a Helm client image to GCR as required for the pipeline deployment process)\ngcloud auth configure-docker Enable required GCP API services.\ngcloud services enable compute.googleapis.com gcloud services enable servicenetworking.googleapis.com gcloud services enable cloudresourcemanager.googleapis.com gcloud services enable container.googleapis.com gcloud services enable cloudbuild.googleapis.com Update Cloud Build service account with an editor role so it will have required permissions to access GKE and GCR within the project.\nPROJECT_ID=`gcloud config get-value project` CLOUDBUILD_SA=\u0026#34;$(gcloud projects describe $PROJECT_ID --format \u0026#39;value(projectNumber)\u0026#39;)@cloudbuild.gserviceaccount.com\u0026#34; gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/editor Step-2: Launch a GKE Cluster using Terraform\nIf you have been following the series and have already deployed a GKE cluster, you can skip this step and move on to the next. 
Otherwise you can follow this post to build a GKE cluster with Terraform.\nMake sure to deploy an Ingress Controller as there is an Ingress service defined in our Helm Chart!\nkubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-0.32.0/deploy/static/provider/cloud/deploy.yaml Step-3: Initialize Helm for Application Deployment on GKE\nAs mentioned above, for this demo we have encapsulated our demo app into a Helm Chart. Helm is a package management system designed for simplifying and accelerating application deployment on the Kubernetes platform.\nIn version 2, Helm consists of a local client and a Tiller server pod (deployed in the K8s cluster) that interacts with the Kube-apiserver for app deployment. In our example, we’ll first build a customised Helm client docker image and push it to GCR. This image will then be used by Cloud Build to interact with the Tiller server (deployed on GKE) for deploying the pre-packaged Helm chart — as illustrated in the diagram below.\nFirst let’s configure a service account for Tiller and initialize Helm (server component) on our GKE cluster.\nkubectl apply -f ./k8s-helm/tiller.yaml helm init --history-max 200 --service-account tiller We’ll then build and push a customised Helm client image to GCR. This might take a few minutes.\ncd ./k8s-helm/cloud-builders-community/helm docker build -t gcr.io/$PROJECT_ID/helm . docker push gcr.io/$PROJECT_ID/helm On GCR, confirm that a new Helm (client) image has been pushed.\nStep-4: Review the (Cloud Build) Pipeline Code\nBefore we move forward, let’s take a moment to review the pipeline code (as defined in the cloudbuild.yaml). 
There are 4 stages in total in our Cloud Build pipeline:\nBuild a docker image with our demo app Push the new image to GCR Deploy Helm chart (for our demo app) to GKE via GCR Integration Testing The first two stages are straightforward: we’ll use the Google published Cloud Builder docker image to build the node app image and push it to the GCR repository.\n# Build demo app image - name: gcr.io/cloud-builders/docker args: - build - -t - gcr.io/$PROJECT_ID/node-app:$COMMIT_SHA - . # Push demo app image to GCR - name: gcr.io/cloud-builders/docker args: - push - gcr.io/$PROJECT_ID/node-app:$COMMIT_SHA Next we’ll leverage the (previously built) Helm client to interact with our GKE cluster and to deploy the Helm chart (for our node app), with the image repository pointing to the GCR path from the last pipeline stage.\n# Deploy with Helm Chart - name: gcr.io/$PROJECT_ID/helm args: - upgrade - -i - node-app - ./k8s-helm/node-app - --set - image.repository=gcr.io/$PROJECT_ID/node-app,image.tag=$COMMIT_SHA - -f - ./k8s-helm/node-app/values.yaml env: - CLOUDSDK_COMPUTE_REGION=$_CUSTOM_REGION - CLOUDSDK_CONTAINER_CLUSTER=$_CUSTOM_CLUSTER - KUBECONFIG=/workspace/.kube/config - TILLERLESS=false - TILLER_NAMESPACE=kube-system Lastly, we’ll run an integration test to verify the demo app status on our GKE cluster. For our node app there is a built-in health-check URL configured at “/health”, and we’ll be leveraging another Cloud Builder curl image to ping this URL path and expect a return message of \u0026lt;“status”: “ok”\u0026gt;. 
Note: here we should be polling the internal DNS address for the k8s service (of the demo app) so there is no dependency on IP allocations.\n# Integration Testing - name: gcr.io/cloud-builders/kubectl entrypoint: \u0026#39;bash\u0026#39; args: - \u0026#39;-c\u0026#39; - | kubectl delete --wait=true pod curl kubectl run curl --restart=Never --image=gcr.io/cloud-builders/curl --generator=run-pod/v1 -- http://node-app.default.svc.cluster.local/health sleep 15 kubectl logs curl kubectl logs curl | grep OK env: - CLOUDSDK_COMPUTE_REGION=$_CUSTOM_REGION - CLOUDSDK_CONTAINER_CLUSTER=$_CUSTOM_CLUSTER - KUBECONFIG=/workspace/.kube/config Step-4: Create a Cloud Build Trigger by Connecting to GitHub Repository\nNow that we have our GKE cluster ready and Helm image pushed to GCR, the next step is to connect Cloud Build to the GitHub repository and create a CI trigger. On GCP console, go to Cloud Build —\u0026gt; Triggers, select the GitHub repo as below.\nIf this is the first time you are connecting to GitHub in Cloud Build, it will redirect you to an authorization page like below, accept it in order to access your repositories.\nSelect the demo app repository, which also includes the pipeline config (cloudbuild.yaml) file.\nCreate a push trigger in the next page and you should see a summary like this.\nYou can manually run the trigger now to kick off the CI build process. However we’ll be running more thorough testing to verify the end-to-end pipeline automation process in the next section.\nStep-5: Test the CI/CD Pipeline\nIt’s time to test our CI/CD pipeline! First we’ll make a “cosmetic” version change (1.0.0 to 1.0.1) to the Helm chart for our demo app.\nCommit the change and push to the Git repository.\nThis (push event) should have triggered our Cloud Build pipeline. You can jump on the GCP console to monitor the fully automated 4-stage process. 
The pipeline will be completed once the integration test has returned a status of OK.\nOn the GKE cluster we can see our Helm chart v-1.0.1 has been deployed successfully.\nThe deployment and node app are running as expected.\nRetrieve the Ingress public IP and update the local host file for a quick testing. (Note the Ingress URL is defined as “node-app.local”)\n[root@cloud-ops01 nodejs-cloudbuild-demo]# kubectl get ingresses NAME HOSTS ADDRESS PORTS AGE node-app node-app.local 34.87.213.107 80 15m [root@cloud-ops01 nodejs-cloudbuild-demo]# [root@cloud-ops01 nodejs-cloudbuild-demo]# echo \u0026#34;34.87.213.107 node-app.local\u0026#34; \u0026gt;\u0026gt; /etc/hosts Now point your browser to “node-app.local” and you should see the demo app page like below. Congrats, you have just successfully deployed a cloud native CI/CD pipeline on GCP!\n","permalink":"https://route179.dev/2020/06/09/create-a-ci-cd-pipeline-with-gke-gcr-and-cloud-build/","summary":"\u003cp\u003eThis is the second episode of our **Cloud Native DevOps on GCP **series. In the previous chapter, we have built a multi-AZ GKE cluster with Terraform. This time, we’ll create a cloud native CI/CD pipeline leveraging our GKE cluster and Google DevOps tools such as Cloud Build and Google Container Registry (GCR). We’ll create a Cloud Build trigger by connecting to GitHub repository to perform automatic build, test and deployment of a sample micro-service app onto the GKE cluster.\u003c/p\u003e","title":"Cloud Native DevOps on GCP Series Ep2 – Create a CI/CD pipeline with GKE, GCR and Cloud Build"},{"content":"This is the first episode of our Cloud Native DevOps on GCP series. Here we’ll be building an Google Kubernetes Engine (GKE) cluster using Terraform. From my personal experience, GKE has been one of the most scalable and reliable managed Kubernetes solution, and it’s also 100% upstream compliant and certified by CNCF.\nFor this demo I have provided a sample Terraform script at here. 
The target state will look like this:\nSpecifically, we’ll be launching the following GCP/GKE resources:\n1x new VPC for hosting the demo GKE cluster 1x /17 CIDR block as the primary address space for the VPC 2x /18 CIDR blocks for the GKE Pod and Service address spaces 1x GKE high availability cluster across 2x Availability Zones (AZs) 2x GKE worker instance groups (2x nodes each) PREREQUISITES Access to a GCP testing environment Install Git, Kubectl and Terraform on your client Install GCloud SDK Check the NTP clock \u0026amp; sync status on your client —\u0026gt; important! Clone the Terraform Repo here Step-1: Set up the GCloud Environment and Run the Terraform Script\nTo begin, run the interactive GCloud commands below to prepare the GCP environment\ngcloud init gcloud config set accessibility/screen_reader true gcloud auth application-default login Remember to update the terraform.tfvars with your own GCP project_id\nproject_id = \u0026#34;xxxxxxxx\u0026#34; Make sure to enable the GKE API if it is not already enabled\ngcloud services enable container.googleapis.com Now run the Terraform script:\nterraform init terraform apply The whole process should take about 7 to 10 minutes, and you should get an output like this:\nNow register the cluster and update the kubeconfig file\n[root@cloud-ops01 tf-gcp-gke]# gcloud container clusters get-credentials node-pool-cluster-demo --region australia-southeast1 Fetching cluster endpoint and auth data. kubeconfig entry generated for node-pool-cluster-demo. 
Step-2: Verify the GKE Cluster Status\nCheck that we can access the GKE cluster; there should be 4x worker nodes provisioned.\n[root@cloud-ops01 ~]# kubectl get nodes NAME STATUS ROLES AGE VERSION gke-node-pool-cluster-demo-pool-01-03a2c598-34lh Ready \u0026lt;none\u0026gt; 8m59s v1.16.9-gke.2 gke-node-pool-cluster-demo-pool-01-03a2c598-tpwq Ready \u0026lt;none\u0026gt; 9m v1.16.9-gke.2 gke-node-pool-cluster-demo-pool-01-e903c7a8-04cf Ready \u0026lt;none\u0026gt; 9m5s v1.16.9-gke.2 gke-node-pool-cluster-demo-pool-01-e903c7a8-0lt8 Ready \u0026lt;none\u0026gt; 9m5s v1.16.9-gke.2 This can also be verified on the GKE console\nThe 4x worker nodes are provisioned over 2x managed instance groups across two different AZs\nRun kubectl describe nodes and we can see each node has been tagged with a few customised labels based on its unique properties. These are important metadata which can be used for selective Pod/Node deployment and other use cases like affinity or anti-affinity rules.\nStep-3: Deploy GKE Add-on Services\nInstall Metrics-Server to provide cluster-wide resource metrics collection and to support use cases such as Horizontal Pod Autoscaling (HPA) [root@cloud-ops01 tf-gcp-gke]# kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml Wait for a few seconds and we should have resource stats\n[root@cloud-ops01 tf-gcp-gke]# kubectl top nodes NAME CPU(cores) CPU% MEMORY(bytes) MEMORY% gke-node-pool-cluster-demo-pool-01-03a2c598-34lh 85m 4% 798Mi 14% gke-node-pool-cluster-demo-pool-01-03a2c598-tpwq 300m 15% 816Mi 14% gke-node-pool-cluster-demo-pool-01-e903c7a8-04cf 191m 9% 958Mi 16% gke-node-pool-cluster-demo-pool-01-e903c7a8-0lt8 102m 5% 795Mi 14% Next, deploy an NGINX Ingress Controller so we can use L7 URL load balancing and save cost by reducing the required number of external load balancers [root@cloud-ops01 tf-gcp-gke]# kubectl apply -f 
https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-0.32.0/deploy/static/provider/cloud/deploy.yaml On the GCP console we can see that an external Load Balancer has been provisioned in front of the Ingress Controller. Take note of the LB address below; this is the public IP that will be consumed by our ingress services.\nIn addition, we’ll deploy 2x storage classes to provide dynamic persistent storage support for stateful pods and services. Note the different persistent disk (PD) specs (standard \u0026amp; SSD) for different I/O requirements.\n[root@cloud-ops01 tf-gcp-gke]# kubectl create -f ./storage/storageclass/ Step-4: Deploy Sample Apps onto the GKE Cluster for Testing\nWe’ll first deploy the famous Hipster Shop app, which is a cloud-native microservice application developed by Google. kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/microservices-demo/master/release/kubernetes-manifests.yaml Wait for all the Pods to be up and running\n[root@cloud-ops01 tf-gcp-gke]# kubectl get pods NAME READY STATUS RESTARTS AGE adservice-687b58699c-fq9x4 1/1 Running 0 2m16s cartservice-778cffc8f6-dnxmr 1/1 Running 0 2m20s checkoutservice-98cf4f4c-69fqg 1/1 Running 0 2m26s currencyservice-c69c86b7c-mz5zv 1/1 Running 0 2m19s emailservice-5db6c8b59f-jftv7 1/1 Running 0 2m27s frontend-8d8958c77-s9665 1/1 Running 0 2m24s loadgenerator-6bf9fd5bc9-5lsrn 1/1 Running 3 2m19s paymentservice-698f684cf9-7xbjc 1/1 Running 0 2m22s productcatalogservice-789c77b8dc-4tk4w 1/1 Running 0 2m21s recommendationservice-75d7cd8d5c-4x9kl 1/1 Running 0 2m25s redis-cart-5f59546cdd-8tj8f 1/1 Running 0 2m17s shippingservice-7d87945947-nhb5x 1/1 Running 0 2m18s Check the external frontend service; you should see that an LB has been deployed by GKE with a public IP assigned\n[root@cloud-ops01 ~]# kubectl get svc frontend-external NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE frontend-external LoadBalancer 192.168.74.68 35.197.182.62 80:32408/TCP 5m32s You should be able to 
access the app via the LB public IP.\nNext, we’ll deploy the sample Guestbook app to verify the persistent storage setup. [root@cloud-ops01 tf-gcp-gke]# kubectl create ns guestbook-app [root@cloud-ops01 tf-gcp-gke]# kubectl apply -f ./demo-apps/guestbook/ The application requests 2x persistent volumes (PVs) for the redis-master and redis-slave pods. Both PVs should be automatically provisioned through the persistent volume claims (PVCs), using the 2x storage classes we deployed earlier. You should see the STATUS reported as “Bound” for each PV and PVC mapping.\nRetrieve the external IP/DNS for the frontend service of the Guestbook app.\n[root@cloud-ops01 tf-gcp-gke]# kubectl get svc frontend -n guestbook-app NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE frontend LoadBalancer 192.168.127.128 34.87.228.35 80:31006/TCP 23m You should be able to access the Guestbook app now. Enter and submit some messages, then try to destroy and redeploy the app; your data will be kept by the redis PVs.\nLastly, we’ll deploy a modified version of the yelb app to test the NGINX ingress controller [root@cloud-ops01 tf-gcp-gke]# kubectl create ns yelb [root@cloud-ops01 tf-gcp-gke]# kubectl apply -f ./demo-apps/yelb/ You should see an ingress service deployed as per below.\nRetrieve the external IP for the ingress service within the yelb namespace. As mentioned before, this should be the same address as the external LB deployed for the ingress controller.\n[root@cloud-ops01 tf-gcp-gke]# kubectl get ingresses -n yelb NAME HOSTS ADDRESS PORTS AGE yelb-ingress yelb.local 35.189.3.12 80 6m47s Also, notice the ingress host is defined as “yelb.local”. This is the DNS name that the HTTP ingress service will match and route. 
So we’ll update the local hosts file (with the ingress public IP) for a quick test.\n[root@cloud-ops01 tf-aws-eks]# echo \u0026#34;35.189.3.12 yelb.local\u0026#34; \u0026gt;\u0026gt; /etc/hosts And that’s it: the incoming requests to “yelb.local” are now routed via the ingress service to the yelb frontend pod running on our GKE cluster.\n","permalink":"https://route179.dev/2020/06/09/build-a-gke-cluster-with-terraform/","summary":"\u003cp\u003eThis is the first episode of our \u003cstrong\u003eCloud Native DevOps on GCP\u003c/strong\u003e series. Here we’ll be building a Google Kubernetes Engine (GKE) cluster using Terraform. From my personal experience, GKE has been one of the most scalable and reliable managed Kubernetes solutions, and it’s also 100% upstream compliant and \u003ca href=\"https://www.cncf.io/certification/software-conformance/\"\u003ecertified by CNCF\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003eFor this demo I have provided a sample Terraform script \u003ca href=\"https://github.com/sc13912/tf-gcp-gke.git\"\u003eat here\u003c/a\u003e. The target state will look like this:\u003c/p\u003e\n\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/2020/06/09/build-a-gke-cluster-with-terraform/multi-az-gke-2.png\"\u003e\u003c/p\u003e","title":"Cloud Native DevOps on GCP Series Ep1 – Build a GKE Cluster with Terraform"},{"content":"Hi, I\u0026rsquo;m Sheng — a Solutions Architect at AWS Australia, where I help customers accelerate cloud migrations and modernize their infrastructure with cloud-native technologies.\nIn my current role, I focus on AWS hybrid cloud services, platform engineering, and HPC / AI infrastructure.\nThis blog is where I share hands-on labs, deep dives, and field notes from my work in cloud-native infrastructure and AI platform engineering. 
Opinions here are my own.\nFind me elsewhere:\nGitHub: sc13912 LinkedIn: Sheng Chen AWS Publications ","permalink":"https://route179.dev/about/","summary":"About Sheng Chen — Solutions Architect at AWS.","title":"About"},{"content":"A collection of my AWS publications across official AWS channels. View my full author profile on the AWS Blog →\nAWS Blog Posts Containers Deep dive into cluster networking for Amazon EKS Hybrid Nodes Deploy production generative AI at the edge using Amazon EKS Hybrid Nodes with NVIDIA DGX Simplify hybrid Kubernetes networking with Amazon EKS Hybrid Nodes gateway Architecture Secure Amazon Elastic VMware Service (Amazon EVS) with AWS Network Firewall Hybrid Integrating iSCSI Storage with VMware Cloud on AWS Virtual Machines Using Amazon FSx for NetApp ONTAP Application Modernization Using Microservices Architecture with VMware Cloud on AWS Simplify Application Networking with Amazon VPC Lattice and VMware Cloud on AWS Expanding VMware Cloud on AWS Multi-Region Connectivity Using AWS Cloud WAN VMware Cloud on AWS Hybrid Network Design Patterns Integrating Third-Party Firewall Appliances with VMware Cloud on AWS Using VMware Transit Connect AWS re:Post Articles Amazon EKS Enabling Multicast on Amazon EKS with Isovalent Enterprise for Cilium Deploy DeepSeek-R1-0528-671B on Amazon EKS using vLLM Deploying Cilium Networking on Amazon EKS Hybrid Nodes Run Virtual Machine Workloads with KubeVirt on Amazon EKS Hybrid Nodes Unpacking the Cluster Networking for Amazon EKS Hybrid Nodes VMware on AWS Extending Layer 2 Networks into VMware Cloud on AWS using L2VPN with NSX Autonomous Edge Expanding Amazon Elastic VMware Service (Amazon EVS) Global Connectivity with AWS Cloud WAN ","permalink":"https://route179.dev/aws-publications/","summary":"Official AWS blog posts and re:Post articles authored by Sheng Chen.","title":"AWS Publications"}]