Container and Infrastructure Code Testing

Explores practical tools and strategies to automate testing for container and infrastructure code, helping you maintain consistent policies and enhance code quality.

Do Exploit
5 min read · Nov 23, 2024

Understanding Containers and Infrastructure as Code

A container is a single package containing your application, its dependencies, and the operating system files it needs. Containers are portable and can run anywhere there is a container runtime such as Docker. We define a container with a Dockerfile, which details the steps to build the container image, a ready-to-deploy artifact.

Infrastructure as code (IaC) is the practice of provisioning and managing your computing infrastructure using code instead of manual processes and settings, with tools like Terraform.

Challenge

Policies

Having clear policies for your container and infrastructure code ensures that everyone on your team follows the same standards. This leads to higher-quality output. Let’s look at some examples:

Infrastructure Code

Naming Convention

Have you ever struggled to identify who owns a particular cloud resource or what it’s used for? Let’s look at an example in Terraform:

module "compute_instance" {
  source  = "terraform-google-modules/vm/google//modules/compute_instance"
  version = "~> 12.0"

  region            = var.region
  subnetwork        = var.subnetwork
  num_instances     = 1
  hostname          = "instance-disk-snapshot"
  instance_template = module.instance_template.self_link
}

This code creates a virtual machine in Google Cloud Platform (GCP). But can you tell who owns this VM, which project it’s for, or its purpose?

Now, consider this improved version:

module "virtual_machine_google_hangout_development" {
  source  = "terraform-google-modules/vm/google//modules/compute_instance"
  version = "~> 12.0"

  region            = var.region
  subnetwork        = var.subnetwork
  num_instances     = 1
  hostname          = "google-hangout-development"
  instance_template = module.instance_template.self_link
  labels = {
    organization = "google"
    project      = "hangout"
    environment  = "development"
  }
}

By using clear naming conventions and labels, you immediately know:

  • Who owns the virtual machine.
  • Which organization or project it’s part of.
  • What environment it’s in (e.g., development, staging, production).

Allowed 3rd-Party Modules

Using modules in Terraform simplifies your code by bundling resources together. For instance, we used an official GCP Terraform module to create a virtual machine that already includes disks and network interfaces, saving us from declaring each resource individually.

However, be cautious with third-party modules to avoid supply-chain attacks. For example:

GOOD:

  source  = "terraform-google-modules/vm/google//modules/compute_instance"

BAD:

  source  = "terraform-gogle-modules/vm/google//modules/compute_instance"

Did you notice the typo in the “BAD” example? This is a typosquatting attack. Attackers might publish malicious modules with names similar to official ones. If you’re not careful, you could unintentionally include harmful code in your infrastructure.

Container Image

Minimal Vulnerabilities

Using public base images for your containers is convenient, but they often contain vulnerabilities in system libraries.

“The fix can be easy if you’re aware. 20% of images can fix vulnerabilities simply by rebuilding a docker image, 44% by swapping the base image.”

“There is an increase in the number of vulnerabilities reported for system libraries, affecting some of the popular Linux distributions such as Debian, RedHat Enterprise Linux and Ubuntu.”

Source: Snyk.io

One way to mitigate this is to upgrade the OS packages in your Dockerfile:

RUN apt-get update \
    && apt-get upgrade -y \
    && rm -rf /var/lib/apt/lists/*

With these commands, the image is built with the latest available versions of its OS packages, which usually include the most recent security fixes.

Labels

You can use labels to organize your images, record licensing information, or in any way that makes sense for your business or application.

For example:

LABEL org.label-schema.build-date="2024-05-06T22:04:45.107454559Z" \
      org.label-schema.license="Google-Cloud-License" \
      org.label-schema.name="Google Cloud Search" \
      org.label-schema.schema-version="1.0" \
      org.label-schema.url="https://cloud.google.com/search" \
      org.label-schema.usage="https://cloud.google.com/search/docs" \
      org.label-schema.vcs-ref="b82df118650b55a500dcc181889ac35c6d8da7d" \
      org.label-schema.vcs-url="https://github.com/google-cloud/search" \
      org.label-schema.vendor="Google" \
      org.label-schema.version="1.5.3" \
      org.opencontainers.image.created="2024-05-06T22:04:45.107454559Z" \
      org.opencontainers.image.documentation="https://cloud.google.com/search/docs" \
      org.opencontainers.image.licenses="Google-Cloud-License" \
      org.opencontainers.image.revision="b82df118650b55a500dcc181889ac35c6d8da7d" \
      org.opencontainers.image.source="https://github.com/google-cloud/search" \
      org.opencontainers.image.title="Google Cloud Search" \
      org.opencontainers.image.url="https://cloud.google.com/search" \
      org.opencontainers.image.vendor="Google" \
      org.opencontainers.image.version="1.5.3"

These labels provide detailed information about the container image, ensuring it has a clear identity.

Automated Process

Defining policies is a great start, but how do you ensure they’re followed throughout the development lifecycle? Automating policy enforcement is key. By using tools that integrate with your workflow, you can automatically check for compliance every time code is changed, without manual reviews.

Potential Solutions

Introducing Semgrep

Of the tools I compared, Semgrep was the only one that could express the policy I needed, as far as I knew at the time. Semgrep supports 30+ languages, including Terraform, and can run in an IDE, as a pre-commit check, and as part of CI/CD workflows.

Technical Overview: Infrastructure Code Policies

By defining patterns in YAML, Semgrep can look at our code, retrieve the values, and ensure they follow the allowed patterns.

Naming Convention

[Figure: Semgrep rule. Source: https://semgrep.dev/playground/]

This rule ensures that all module names follow your organization’s naming standards.
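The Semgrep rule itself is not reproduced in this text. As a rough illustration of the kind of check it performs, here is a minimal plain-Python sketch (not Semgrep). The naming pattern `<resource>_<organization>_<project>_<environment>`, the list of environments, and the function name are assumptions made for this example, modeled on the improved module name shown earlier.

```python
import re

# Hypothetical convention: <resource>_<organization>_<project>_<environment>,
# for example virtual_machine_google_hangout_development.
MODULE_NAME = re.compile(
    r"^[a-z][a-z_]*_[a-z0-9]+_[a-z0-9]+_(development|staging|production)$"
)

# Captures the name from the opening line of a module block: module "name" {
MODULE_BLOCK = re.compile(r'^\s*module\s+"([^"]+)"\s*\{', re.MULTILINE)


def find_badly_named_modules(terraform_source: str) -> list[str]:
    """Return the module names that do not follow the naming convention."""
    names = MODULE_BLOCK.findall(terraform_source)
    return [name for name in names if MODULE_NAME.match(name) is None]
```

A regular expression over raw text is much cruder than Semgrep's syntax-aware matching, so treat this only as a way to reason about the policy, not as a replacement for the rule.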

Allowed 3rd-Party Modules

[Figure: Semgrep rule. Source: https://semgrep.dev/playground/]

This rule flags any module that doesn’t use approved sources.
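To make the allowlist idea concrete, the following plain-Python sketch (again, not the Semgrep rule) collects every `source` value and reports those outside an approved namespace. The allowlist contents and the function name are assumptions for illustration; a real policy would list whatever namespaces your organization has vetted.

```python
import re

# Hypothetical allowlist: only the official Google module namespace is approved.
ALLOWED_SOURCE_PREFIXES = ("terraform-google-modules/",)

# Captures the value of every `source = "..."` assignment.
SOURCE_LINE = re.compile(r'^\s*source\s*=\s*"([^"]+)"', re.MULTILINE)


def find_unapproved_sources(terraform_source: str) -> list[str]:
    """Return the module sources that do not start with an approved prefix."""
    sources = SOURCE_LINE.findall(terraform_source)
    return [s for s in sources if not s.startswith(ALLOWED_SOURCE_PREFIXES)]
```

Because the check is an allowlist rather than a blocklist, the typosquatted `terraform-gogle-modules` source from the earlier example is rejected without anyone having to anticipate that particular misspelling.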

Introducing Container-Structure-Test

Of the tools I compared, container-structure-test was the simplest to implement, as far as I knew at the time. As the name suggests, container-structure-test provides a framework to validate the structure of a container image.

Technical Overview: Container Image Policies

Container-structure-test runs the target container before evaluating the policy, which lets us write tests that can only be evaluated at runtime. It supports the following test types:

  1. Command Tests
  2. File Existence Tests
  3. File Content Tests
  4. Metadata Test
  5. License Tests

Minimal Vulnerabilities

This test runs upgrades inside the container and checks that no packages need to be upgraded.
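The test configuration is not reproduced in this text. As a sketch of the underlying idea, the Python below (not container-structure-test) runs a simulated upgrade inside an image and reads the summary line that `apt-get` prints. It assumes a Debian or Ubuntu based image whose default user can run `apt-get`, and the function names are hypothetical.

```python
import re
import subprocess

# apt-get ends its report with a line such as:
# "0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded."
SUMMARY = re.compile(r"^(\d+) upgraded, (\d+) newly installed", re.MULTILINE)


def count_pending_upgrades(apt_output: str) -> int:
    """Return how many packages apt-get reports it would upgrade."""
    match = SUMMARY.search(apt_output)
    if match is None:
        raise ValueError("no apt-get summary line found in the output")
    return int(match.group(1))


def simulate_upgrade(image: str) -> str:
    """Run a simulated upgrade in a throwaway container and return its output."""
    command = "apt-get update -qq && apt-get --simulate upgrade"
    result = subprocess.run(
        ["docker", "run", "--rm", image, "sh", "-c", command],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A policy check would then assert that `count_pending_upgrades(simulate_upgrade(image))` is zero. Note that this sketch counts only packages that would be upgraded; packages that are held back appear under "not upgraded" and are ignored here.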

Labels

This test verifies that the specified labels are present in the container image, ensuring the image is properly documented and has a clear identity.
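The same check can be sketched in plain Python (not container-structure-test) by reading the label map that `docker inspect --format '{{json .Config.Labels}}' IMAGE` prints. The set of required labels and the function name below are assumptions chosen for illustration; pick whichever labels your own policy mandates.

```python
import json

# Hypothetical policy: every image must carry these labels with a non-empty value.
REQUIRED_LABELS = (
    "org.opencontainers.image.title",
    "org.opencontainers.image.source",
    "org.opencontainers.image.version",
)


def find_missing_labels(labels_json: str, required=REQUIRED_LABELS) -> list[str]:
    """Return the required labels that are absent or empty.

    labels_json is the JSON text of the image's label map. An image without
    any labels yields the JSON value null, which is treated as an empty map.
    """
    labels = json.loads(labels_json) or {}
    return [key for key in required if not labels.get(key)]
```

An empty result means the image satisfies the policy; anything else names the labels that still need to be added to the Dockerfile.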

Conclusion

Implementing automated testing tools like Semgrep and Container-Structure-Test enhances the reliability and security of your container and infrastructure code. By integrating these tools into your development workflow, you can enforce policies consistently, reduce manual errors, and maintain high-quality standards across your projects.

References

  1. Top ten most popular docker images each contain at least 30 vulnerabilities
  2. Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
  3. What is Infrastructure as Code?
  4. Docker object labels
