GPU Test Case: Deploying an NVIDIA T4/P4 GPU on Google Cloud with Terraform Automation for OS Installation

Lionell Jenious
4 min read · Oct 13, 2024


This project demonstrates the automated deployment of an NVIDIA GPU-optimized machine on Google Cloud using Terraform. The scripts were vetted, audited, and passed the terraform plan phase. Because of high costs, limited GPU availability, and an exceeded quota, the actual Compute Engine deployment did not proceed.

All prerequisites and staging were completed during the Terraform planning phase. The project exercises infrastructure automation, GPU provisioning, and OS installation, work that aligns with a real-world Infrastructure Engineer role.

This project demonstrates expertise in:

  • Infrastructure automation using Terraform for GPU-optimized virtual machines.
  • Scripting to automate cloud resource provisioning and OS setup.
  • OS installation with GPU drivers and CUDA support.
  • Version control via GitLab for continuous development.

Step 1: Created and Set Up a GitLab Project

  1. Created a GitLab repository:
  • Created a new GitLab repository named nvidia-gpu-os-install.
  • Cloned the GitLab repository to a local machine using:
git clone https://gitlab.com/brianjenwah/nvidia-gpu-os-install.git
  • Navigated to the project directory, nvidia-gpu-os-install.

Step 2: Created a Directory for Google Cloud Automation

  1. Created the google_gcp_automation/ directory:
  • Navigated to the new directory, where the Google Cloud Terraform files are to reside.
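The step above amounts to two shell commands. This is a minimal sketch that assumes it is run from the root of the cloned repository; the post does not show the exact commands used.

```shell
# Create the Terraform working directory and enter it.
# -p makes the command safe to re-run if the directory already exists.
mkdir -p google_gcp_automation
cd google_gcp_automation
```
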

Step 3: Created Terraform Files

  1. Created the main.tf file to define the resources required to deploy an NVIDIA T4 GPU VM.
  2. Created the variables.tf file to define the input variables for the project ID, zone, and GPU type.
  3. Created the outputs.tf file to output the public IP address of the deployed VM.
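The original files are not reproduced in the post, so the following is only a sketch of what the three files might contain, written from the shell with heredocs into the current directory. The resource name gpu_vm, the instance name, the machine type, the boot image, the disk size, and the variable names project_id, zone, and gpu_type are all assumptions made for illustration.

```shell
# main.tf: provider plus one GPU-attached Compute Engine instance (assumed layout).
cat > main.tf <<'EOF'
terraform {
  required_providers {
    google = {
      source = "hashicorp/google"
    }
  }
}

provider "google" {
  project = var.project_id
  zone    = var.zone
}

resource "google_compute_instance" "gpu_vm" {
  name         = "gpu-test-vm"   # assumed name
  machine_type = "n1-standard-4" # assumed; T4 GPUs attach to N1 machine types
  zone         = var.zone

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2204-lts" # assumed image
      size  = 50
    }
  }

  guest_accelerator {
    type  = var.gpu_type
    count = 1
  }

  # GPU instances cannot live-migrate, so host maintenance must terminate them.
  scheduling {
    on_host_maintenance = "TERMINATE"
  }

  network_interface {
    network = "default"
    access_config {} # requests an ephemeral public IP
  }
}
EOF

# variables.tf: the three inputs named in the step above.
cat > variables.tf <<'EOF'
variable "project_id" {
  type        = string
  description = "Google Cloud project ID"
}

variable "zone" {
  type        = string
  description = "Compute Engine zone"
}

variable "gpu_type" {
  type        = string
  description = "Accelerator type, for example nvidia-tesla-t4"
}
EOF

# outputs.tf: public IP of the VM once it exists.
cat > outputs.tf <<'EOF'
output "public_ip" {
  value = google_compute_instance.gpu_vm.network_interface[0].access_config[0].nat_ip
}
EOF
```

Keeping the GPU type as a variable would allow the same configuration to target a T4 or a P4 by changing a single value.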

Step 4: Defined the Project ID and Variables

  1. Created a terraform.tfvars file to store the project ID and other variable values.
  2. Updated the .gitignore file to exclude .terraform/ and unnecessary files from being tracked in the repository.
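As a rough illustration of this step, the sketch below writes a terraform.tfvars file and appends ignore rules to .gitignore. The project ID is the one that appears in the gcloud command later in this post; the variable names, the zone, and the ignore patterns beyond .terraform/ are assumptions, since the original files are not shown.

```shell
# terraform.tfvars: values for the (assumed) input variables.
cat > terraform.tfvars <<'EOF'
project_id = "nvidia-tesla-t4-gpu-test"
zone       = "us-central1-a"   # assumed zone
gpu_type   = "nvidia-tesla-t4"
EOF

# .gitignore: keep provider plugins and local state out of version control.
# Appending (>>) preserves any rules already in the file.
cat >> .gitignore <<'EOF'
.terraform/
*.tfstate
*.tfstate.*
crash.log
EOF
```
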

Step 5: Initialized Terraform

  1. Initialized Terraform in the google_gcp_automation/ directory with terraform init to download the required provider plugins.

Step 6: Planned the Terraform Deployment

  1. Generated a Terraform plan to preview the resources that would be created. The plan output spanned three pages of screenshots.
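Steps 5 and 6 can be sketched as one small shell helper. The function name plan_gpu_vm and the saved plan file name tfplan are hypothetical; the terraform subcommands and flags are standard, but the post does not show the exact invocations that were used.

```shell
# Hypothetical helper: initialize, validate, and plan in one directory.
# Usage: plan_gpu_vm [terraform_dir]   (defaults to the current directory)
plan_gpu_vm() {
  tf_dir="${1:-.}"
  # init downloads the provider plugins; plan only previews changes.
  terraform -chdir="$tf_dir" init -input=false &&
    terraform -chdir="$tf_dir" validate &&
    terraform -chdir="$tf_dir" plan -input=false -out=tfplan
}
```

Calling plan_gpu_vm google_gcp_automation from the repository root would run all three phases and save the plan. No resources are created, and no charges are incurred, until terraform apply is run, which is where this project stopped.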

Project Summary

In this project, I demonstrated my ability to:

  • Automate infrastructure using Terraform for GPU-optimized virtual machines.
  • Script the deployment of cloud resources and OS setup using Terraform and Google Cloud tools.
  • Perform OS installation with GPU drivers and CUDA support.
  • Manage version control using GitLab for continuous integration and collaboration.

The project highlighted skills relevant to roles that involve bare-metal GPU provisioning, automation, scripting, and cloud infrastructure deployment.

Additional steps performed to set up Git Bash and GitLab on the laptop to sync with the Google Compute Engine API and Terraform

  • Enabled the Compute Engine API (Enable APIs and Services).
  • gcloud auth login: authorized Git Bash to log in and manage Google Cloud Platform via the CLI. The login was successful.
  • gcloud config set project nvidia-tesla-t4-gpu-test: set the active project.
  • Set up GitLab and the Git Bash CLI for local development. git init reported "Reinitialized existing Git repository in C:/Users/Owner/nvidia-gpu-os-install/google_gcp_automation/.git/", and the changes were pushed with git push origin main.
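The Google Cloud side of this setup can be sketched as a small shell helper. The function name setup_gcloud_project is hypothetical, and enabling the API from the CLI with gcloud services enable is an assumption; the post suggests it was enabled through the console.

```shell
# Hypothetical helper: authenticate, select the project, enable Compute Engine.
# Usage: setup_gcloud_project PROJECT_ID
setup_gcloud_project() {
  project_id="${1:-}"
  if [ -z "$project_id" ]; then
    echo "usage: setup_gcloud_project PROJECT_ID" >&2
    return 1
  fi
  # Opens a browser flow to authorize the CLI, then applies project settings.
  gcloud auth login &&
    gcloud config set project "$project_id" &&
    gcloud services enable compute.googleapis.com
}
```

For this project the call would be setup_gcloud_project nvidia-tesla-t4-gpu-test. Depending on how the provider is configured, Terraform may also need Application Default Credentials (gcloud auth application-default login); the post does not say which method was used.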

Written by Lionell Jenious

Cloud Software Network Engineer | AWS | AI/ML | Blockchain | Azure | Google Cloud | VMware | Cloud Computing | DevOps | Software Defined Networks SD-WAN
