Use GitHub Actions to build a GKE cluster with Terraform

Introduction

GitHub Actions is GitHub's in-house managed solution for CI/CD pipelines. It is a great way to start understanding how CI/CD works and to get some first-hand experience, since the only thing you need is a GitHub repository, be it public or private.

Why would you use CI/CD

I bet you know why pipelines are important in an operations job, but I will do a quick recap for those who don't. Say you have several environments that each need to host a single GKE cluster with the same configuration. Setting up those clusters one by one, manually through the console or the gcloud CLI tool, is error prone. And the more often you have to repeat those tasks, the more you will look to automate them.

Using CI/CD is like using a script, but on someone else's computer: generally your company's or cloud provider's servers if self-hosted, or a provider like GitHub/GitLab/Bitbucket for managed services. You can choose between the two types depending on your needs; here are some questions to ask:

  • Does your team have the resources to manage a self-hosted CI/CD service?
  • Do you have data security or on-premise infrastructure requirements?
  • Do you have specific requirements not covered by current CI/CD service providers?
  • Etc.

A pipeline can be triggered by a specific set of events and executes a series of steps that will impact one or several components of your architecture. In this case, we are talking about deploying a piece of infrastructure, but the same applies to an application codebase that is compiled, tested, and deployed into a selected environment.

Using CI/CD across your entire infrastructure will increase its compliance level, enforcing your standards and security measures on all impacted components, thus removing the risk of creating snowflake components (obscure and mysterious pieces of infrastructure too fragile for anyone to touch without increasing your chances of a heart attack).

Infrastructure as Code

Speaking of compliance and standards, we can now introduce the concept of immutability:

Immutability
noun [ U ]
the state of not changing, or being unable to be changed.

Cf. https://dictionary.cambridge.org/dictionary/english/immutability

We will use the concept of immutability to enforce a state in our infrastructure. Many incidents are generated when changing the state of a component to update it through its configuration. Immutability allows us to quickly fix any deployment-related incident by rolling the component back to a state known to be healthy. Those healthy states are known because they are stored as code: to be more precise, each component of your infrastructure has its configuration written and stored in a version control service, which allows you to:

  • Track the entire state and evolution of your environments.
  • Review the changes proposed.
  • Quickly roll back when a faulty configuration is pushed.
  • Add automated testing of your changes through CI and deploy them through CD.

For this post, we will use Terraform as our IaC tool.

Requirements

To follow along, you will need the following:

  • A GitHub repository.
  • A Google Cloud Platform account.
  • A Terraform GCP service account with the Owner role on your project (don't forget to download the credentials file).

Of course, all values given here can be adapted to your own environment.
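If you still need to create that service account, it can be done with the gcloud CLI. This is a sketch with hypothetical names (my-project, terraform) that you should adapt; note that the Owner role is convenient for a demo but far too broad for production use.

```shell
# Create the service account in your project (names are placeholders).
gcloud iam service-accounts create terraform \
  --project my-project \
  --display-name "Terraform"

# Give it the Owner role on the project (demo only; too broad for production).
gcloud projects add-iam-policy-binding my-project \
  --member "serviceAccount:terraform@my-project.iam.gserviceaccount.com" \
  --role "roles/owner"

# Download the credentials file we will use later in this post.
gcloud iam service-accounts keys create account.json \
  --iam-account terraform@my-project.iam.gserviceaccount.com
```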

Adding your terraform code

For this step you can use your own code, but what I will do is paste the example code provided by the Terraform documentation:

resource "google_container_cluster" "primary" {
  name     = "my-gke-cluster"
  location = "us-central1"

  # We can't create a cluster with no node pool defined, but we want to only use
  # separately managed node pools. So we create the smallest possible default
  # node pool and immediately delete it.
  remove_default_node_pool = true
  initial_node_count       = 1

  master_auth {
    username = ""
    password = ""

    client_certificate_config {
      issue_client_certificate = false
    }
  }
}

resource "google_container_node_pool" "primary_preemptible_nodes" {
  name       = "my-node-pool"
  location   = "us-central1"
  cluster    = google_container_cluster.primary.name
  node_count = 1

  node_config {
    preemptible  = true
    machine_type = "e2-medium"

    metadata = {
      disable-legacy-endpoints = "true"
    }

    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
}

This code will create two resources:

  • A google_container_cluster that will be the Kubernetes cluster hosting our apps.
  • A google_container_node_pool that gives us more control over the nodes than the default node pool that could have been created with the cluster resource.

You can save this file as gke.tf in your repository (the workflows later in this post assume the Terraform code lives in an infrastructure/terraform folder).

Now you will need to configure the GCP provider and save it into a providers.tf file:

provider "google" {
  project = var.gcp_project
  region  = "europe-west1"
}

It is also considered a best practice to store your state file in remote storage that Terraform can reach when it runs. For this you will need to create a bucket in the GCP Storage service, then add this Terraform code into a backend.tf file:

terraform {
  backend "gcs" {
    bucket = "my-terraform-bucket"
    prefix = "terraform/state"
  }
}
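The bucket itself is not created by this Terraform code. Assuming the gsutil tool from the Cloud SDK and the placeholder bucket name used above (bucket names must be globally unique, so adapt it), it can be created like this:

```shell
# Create the bucket in the region used in this post.
gsutil mb -l europe-west1 gs://my-terraform-bucket

# Turn on object versioning so earlier state files can be recovered.
gsutil versioning set on gs://my-terraform-bucket
```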

For the sake of completeness, we will use variables (we already used one in providers.tf); save the following values in the file variables.tf:

variable "gcp_project" {
  default = "my-project"
}

We can also lock the versions of Terraform allowed to apply this code by using a file version.tf; it forces us to use Terraform 0.12 or higher:

terraform {
  required_version = ">= 0.12"
}

And finally, we will store the credentials file as credentials/account.json:

{
    "type": "service_account",
    "project_id": "my-serviceaccount",
    "private_key_id": "xxx",
    "private_key": "-----BEGIN PRIVATE KEY-----yyy-----END PRIVATE KEY-----\n",
    "client_email": "terraform@my-project.iam.gserviceaccount.com",
    "client_id": "zzz",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/terraform%40my-project.iam.gserviceaccount.com"
}

I know it's not really secure, but we will talk about storing files in secrets later. I will, however, show you how to store a string in a GitHub secret.
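Before wiring any of this into CI, you can sanity-check the setup locally. A minimal sketch, assuming Terraform 0.12 is installed and the files above are in place:

```shell
# Point Terraform at the service-account key downloaded earlier.
export GOOGLE_APPLICATION_CREDENTIALS=credentials/account.json

# Initialize the GCS backend and download the google provider,
# then preview the changes without applying anything.
terraform init
terraform plan
```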

Setting up your Workflows

Ensuring the formatting of our code

We could have run the terraform fmt command in our folder ourselves, but it is nice to mandate that the code be formatted the way Terraform instructs, as our first compliance standard (besides the code itself).

Adding the formatting workflow

We will create this first workflow by adding a file formatting.yaml in the .github/workflows folder at the root of your repository. As you may notice, GitHub Actions uses the YAML data serialization language to declaratively describe pipelines. The file we will use is as follows:

name: formatting

on: 
  pull_request:
    branches: 
    - '*'

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - name: setup terraform
        run: |
          wget https://releases.hashicorp.com/terraform/0.12.24/terraform_0.12.24_linux_amd64.zip
          unzip terraform_0.12.24_linux_amd64.zip
          sudo mv terraform /usr/bin/
          terraform version
      - name: Checkout
        uses: actions/checkout@v2
      - name: terraform fmt
        run: terraform fmt -write=false -diff -check

Explaining the workflow

Let's take some time to explain what this workflow does. As simple as it looks, the first key, name, gives the workflow its name. GitHub will show the workflow under this name in the Actions tab of your repository (https://github.com/your-organization/your-repo/actions).

The second block of code represents the triggers that will launch our workflow. In this case the on key lets you select the pull_request event, and within that event, we gave all branches as a source to trigger the workflow. By specifying the pull_request event, GitHub will also report the status of the job directly on the pull request itself.

Then comes the biggest block in this workflow: the jobs key lets you declare the jobs that will process the changes made to your code. Each job needs a few parameters to work properly:

  • A name, defined here as terraform. We could add other jobs with different names, for example linters for Python, JSON, or other file types. As it is, this workflow hosts one job, but it can host several.
  • A base to run the job on, with the runs-on key pointing to the latest version of the Ubuntu virtual environment provided by GitHub.
  • The steps, declarative instructions that will be executed in the order set in this workflow. They can be defined by yourself or by the community. Let's analyze each step:
  • setup terraform: this step runs a few shell commands in your virtual environment to download and install Terraform 0.12.24.
  • Checkout: a step maintained by GitHub that checks out your repository so the changes you made can be processed by the job.
  • terraform fmt: runs the formatting check; with the -check and -diff flags, the step fails and prints a diff whenever a file is not formatted the way terraform fmt would write it.

Validating the changes

Now that the formatting workflow is done, we can focus on the step right before applying our code to the infrastructure: the planning of our code. As you know, you should run the terraform plan command before letting Terraform apply anything. The following workflow handles this part.

Adding the dry-run workflow

Here is the code you will need to add in the same folder as before:

name: dry-runs

on: 
  pull_request:
    branches: 
    - '*'

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - name: setup terraform
        run: |
          wget https://releases.hashicorp.com/terraform/0.12.24/terraform_0.12.24_linux_amd64.zip
          unzip terraform_0.12.24_linux_amd64.zip
          sudo mv terraform /usr/bin/
          terraform version
      - name: setup gcloud
        uses: google-github-actions/setup-gcloud@master
        with:
          service_account_email: ${{ secrets.GHUB_ACTION_CI_GCP_EMAIL }}
          service_account_key: ${{ secrets.GHUB_ACTION_CI_GCP_KEY }}
      - name: Checkout
        uses: actions/checkout@v2
      - name: setup checkov
        run: pip3 install checkov
      - name: Terraform security and compliance scan
        run: checkov -d infrastructure/terraform
      - name: terraform init
        run:  cd infrastructure/terraform && terraform init 
        env:
          GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.TERRAFORM_GCLOUD_CREDENTIALS_PATH}}
      - name: terraform plan
        run: cd infrastructure/terraform && terraform plan
        env:
          GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.TERRAFORM_GCLOUD_CREDENTIALS_PATH}}

Explaining the workflow

As with the last workflow, this one is triggered by the same event; however, it uses the GitHub Actions secrets feature to store and use secrets in the workflow. For now, let's look at the steps this job adds compared to the previous one:

  • setup gcloud: this step is a perfect example of an action developed and maintained by the community that you can use inside your workflow; it installs the gcloud CLI tool into your virtual environment. You can also notice that this step uses 2 parameters defined by the maintainers to authenticate to GCP after installing gcloud. The values of those parameters are stored in GitHub Actions secrets (we will see how to create secrets right after this).
  • Terraform security and compliance scan: runs checkov, a static analysis tool that scans the Terraform code for common security and compliance issues (it has to be installed first, as it does not come preinstalled on the runner).
  • terraform init: this step initializes the state file and downloads the providers needed to plan our code. As you can see, we use an environment variable, fed from a secret, to point Terraform at our credentials file.
  • terraform plan: this step lets us know if there is any issue with the code we saved and pushed to our branch. It also needs the environment variable to connect to GCP.

Creating your secrets

To create a secret, go to your repository and select the Settings tab, then choose Secrets and click New repository secret at the top right of the secrets menu. You will need to create the following 3 secrets:

  • GHUB_ACTION_CI_GCP_EMAIL: terraform@my-project.iam.gserviceaccount.com
  • GHUB_ACTION_CI_GCP_KEY: -----BEGIN PRIVATE KEY-----yyy-----END PRIVATE KEY-----
  • TERRAFORM_GCLOUD_CREDENTIALS_PATH: credentials/account.json

You should now be able to run the job without any issues.
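If you prefer the command line, the same secrets can be created with the GitHub CLI. A sketch assuming gh is installed and authenticated against your repository; private-key.txt is a hypothetical file holding the multi-line key value shown above:

```shell
# Create each repository secret from the command line.
gh secret set GHUB_ACTION_CI_GCP_EMAIL --body "terraform@my-project.iam.gserviceaccount.com"

# Multi-line values such as the private key are easiest to pass on stdin.
gh secret set GHUB_ACTION_CI_GCP_KEY < private-key.txt

gh secret set TERRAFORM_GCLOUD_CREDENTIALS_PATH --body "credentials/account.json"
```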

Applying your code

As the final step to apply a change to your code, we will now create the deployment workflow.

Adding the deployment-run workflow

Here is the code to commit to your branch:

name: deployment-runs

on: 
  push:
    branches: 
    - master

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - name: setup terraform
        run: |
          wget https://releases.hashicorp.com/terraform/0.12.24/terraform_0.12.24_linux_amd64.zip
          unzip terraform_0.12.24_linux_amd64.zip
          sudo mv terraform /usr/bin/
          terraform version
      - name: setup gcloud
        uses: google-github-actions/setup-gcloud@master
        with:
          service_account_email: ${{ secrets.GHUB_ACTION_CI_GCP_EMAIL }}
          service_account_key: ${{ secrets.GHUB_ACTION_CI_GCP_KEY }}
      - name: Checkout
        uses: actions/checkout@v2
      - name: terraform init
        run:  cd infrastructure/terraform && terraform init
        env:
          GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.TERRAFORM_GCLOUD_CREDENTIALS_PATH}}
      - name: terraform apply
        run: cd infrastructure/terraform && terraform apply -auto-approve
        env:
          GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.TERRAFORM_GCLOUD_CREDENTIALS_PATH}}

Explaining the workflow

This workflow is pretty much the same as the dry-run one, except for two key differences:

  • The workflow is now triggered by a push on the master branch. If your repository was created in the second half of 2020 or later, you will have to change this value to main.
  • The terraform plan step is now terraform apply. We consider that if the plan matched what you wanted during the dry run, we can automatically approve applying the changes to your infrastructure, hence the -auto-approve flag.

Final Result

With these workflows deployed to our GitHub repository, each time a pull request is opened, the first 2 workflows are triggered and look like this (depending on whether you messed up your code or not, which I did while taking this screenshot):

You can see 3 checks in this screenshot; the one that is not shown is a similar dry run, but for Kubernetes. I will show you how to handle the same type of workflows for Kubernetes in another post.

Hopefully, this post answered some of your questions about GitHub Actions. For more information, you can visit the official documentation at https://docs.github.com/en/free-pro-team@latest/actions.