Implementing Infrastructure as Code Security Scanning: A Comprehensive Guide

by James Vasile

Hey guys! Today, we're diving deep into the world of Infrastructure as Code (IaC) security scanning. It's super important to make sure our infrastructure code is rock-solid and doesn't have any sneaky misconfigurations. We're talking about scanning Docker configurations, GitHub Actions workflows, and deployment scripts to keep everything secure. This is a MEDIUM priority task, so we're aiming to get it done within 30 days.

Why IaC Security Scanning Matters

IaC security scanning is crucial because it catches security misconfigurations, hardcoded secrets, insecure defaults, and compliance violations before they reach production. Think of it as a safety net for your infrastructure: it checks that our infrastructure code follows security best practices before anything gets deployed.

The Current Risks We're Facing

Right now, we've got a few potential risks lurking in our infrastructure code:

  • Dockerfile security: Our Dockerfiles might contain security anti-patterns that could be exploited.
  • GitHub Actions permissions: Our workflows could have excessive permissions, making them a target for attacks.
  • Docker Compose configurations: These configurations might expose unnecessary ports, creating vulnerabilities.
  • Shell script vulnerabilities: Our shell scripts might have command injection vulnerabilities, which could lead to serious security breaches.

To tackle these risks, we need a robust IaC scanning pipeline that can detect and prevent these issues. So, let's break down the implementation requirements step by step.

Implementation Requirements: Building Our IaC Security Fortress

Alright, let's get into the nitty-gritty of how we're going to implement this IaC security scanning. We've got a few key areas to cover, so let's dive in!

1. Setting Up the IaC Security Scanning Pipeline

First up, we're creating a .github/workflows/iac-security-scan.yml file. This is where the magic happens! This pipeline will automatically scan our code whenever there's a push or pull request that touches specific files like Dockerfiles, Docker Compose files, GitHub Actions workflows, shell scripts, Terraform files, and Kubernetes YAML files.

name: Infrastructure as Code Security Scan

on:
  push:
    paths:
      - 'Dockerfile*'
      - 'docker-compose*.yml'
      - '.github/workflows/*.yml'
      - '**/*.sh'
      - '**/*.tf'
      - 'kubernetes/*.yaml'
  pull_request:
    paths:
      - 'Dockerfile*'
      - 'docker-compose*.yml'
      - '.github/workflows/*.yml'
      - '**/*.sh'
      - '**/*.tf'
      - 'kubernetes/*.yaml'

jobs:
  iac-security:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write
      pull-requests: write

    steps:
    - name: Checkout code
      uses: actions/checkout@v4

    # Trivy IaC scanning (pin third-party actions to a release tag or commit SHA rather than @master)
    - name: Run Trivy IaC scanner
      uses: aquasecurity/trivy-action@master
      with:
        scan-type: 'config'
        scan-ref: '.'
        format: 'sarif'
        output: 'trivy-iac.sarif'
        severity: 'CRITICAL,HIGH,MEDIUM'
        exit-code: '1'
        skip-dirs: 'node_modules,venv,.git'

    - name: Upload Trivy IaC results
      if: always()
      uses: github/codeql-action/upload-sarif@v3
      with:
        sarif_file: 'trivy-iac.sarif'
        category: 'iac-scanning-trivy'

    # Checkov comprehensive IaC scanning
    - name: Run Checkov scanner
      id: checkov
      uses: bridgecrewio/checkov-action@master
      with:
        directory: .
        quiet: false
        soft_fail: false
        framework: all
        output_format: sarif
        output_file_path: reports/checkov.sarif
        skip_check: CKV_DOCKER_1  # Example skip

    - name: Upload Checkov results
      if: always()
      uses: github/codeql-action/upload-sarif@v3
      with:
        sarif_file: reports/checkov.sarif
        category: 'iac-scanning-checkov'

    # Hadolint for Dockerfile
    - name: Run Hadolint Dockerfile Linter
      uses: hadolint/hadolint-action@v3.1.0
      with:
        dockerfile: Dockerfile
        format: sarif
        output-file: hadolint.sarif
        no-fail: false

    - name: Upload Hadolint results
      if: always()
      uses: github/codeql-action/upload-sarif@v3
      with:
        sarif_file: hadolint.sarif
        category: 'dockerfile-linting'

    # ShellCheck for shell scripts
    - name: Run ShellCheck
      uses: ludeeus/action-shellcheck@master
      with:
        severity: error
        format: gcc
      env:
        SHELLCHECK_OPTS: -e SC2034 -e SC2154

    # KICS for comprehensive IaC scanning
    - name: Run KICS scanner
      uses: checkmarx/kics-github-action@v1.7.0
      with:
        path: '.'
        output_path: 'reports/'
        platform_type: Dockerfile,Ansible,CloudFormation,Kubernetes,Terraform
        fail_on: high
        enable_comments: true

    # GitHub Actions security check
    - name: Check GitHub Actions permissions
      run: |
        echo "## 🔐 GitHub Actions Security Analysis" > actions-report.md
        echo "" >> actions-report.md
        
        # Check for excessive permissions
        for workflow in .github/workflows/*.yml; do
          echo "### $(basename "$workflow")" >> actions-report.md
          
          # Check for write-all permissions
          if grep -q "permissions: write-all" "$workflow"; then
            echo "❌ CRITICAL: Workflow has write-all permissions!" >> actions-report.md
          fi
          
          # Count references to the secrets context
          secret_count=$(grep -oE 'secrets\.[A-Za-z_][A-Za-z0-9_]*' "$workflow" | wc -l)
          echo "📊 Secrets used: $secret_count" >> actions-report.md
          
          # Count third-party actions (grep -E has no lookahead, so filter with -v instead)
          third_party=$(grep -E '^[[:space:]]*(- )?uses: ' "$workflow" | grep -vcE 'uses: (\./|actions/|github/)' || true)
          echo "📦 Third-party actions: $third_party" >> actions-report.md
          
          echo "" >> actions-report.md
        done

    # Generate comprehensive report
    - name: Generate IaC Security Report
      if: github.event_name == 'pull_request'
      run: |
        echo "## 🏗️ Infrastructure as Code Security Scan Results" > iac-report.md
        echo "" >> iac-report.md
        
        # Summarize findings
        echo "### Summary" >> iac-report.md
        echo "- Dockerfile issues: $(cat hadolint.sarif | jq '.runs[0].results | length')" >> iac-report.md
        echo "- IaC misconfigurations: $(cat trivy-iac.sarif | jq '.runs[0].results | length')" >> iac-report.md
        echo "- Shell script issues: Checked" >> iac-report.md
        
        # Add actions report
        cat actions-report.md >> iac-report.md

    - name: Comment PR with results
      if: github.event_name == 'pull_request'
      uses: actions/github-script@v7
      with:
        script: |
          const fs = require('fs');
          const report = fs.readFileSync('iac-report.md', 'utf8');
          await github.rest.issues.createComment({
            issue_number: context.issue.number,
            owner: context.repo.owner,
            repo: context.repo.repo,
            body: report
          });

This workflow uses a bunch of cool tools like Trivy, Checkov, Hadolint, ShellCheck, and KICS to scan our code. It also checks for GitHub Actions security issues and generates a comprehensive report that gets posted as a comment on pull requests. Pretty neat, huh?
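
The report step counts findings with jq. Here's a rough sketch of the same idea in Python, in case you want a breakdown per severity level instead of a single number. The function name `summarize_sarif` and the sample document are made up for illustration and are not part of the workflow; the only SARIF fields it relies on are `runs[].results[].level`. Unlike the jq one-liner, it sums over every run in the file, not just the first.

```python
import json
from collections import Counter


def summarize_sarif(sarif: dict) -> Counter:
    """Count SARIF results per level across every run in the document."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        # "results" may be missing or null when a tool reports nothing.
        for result in run.get("results") or []:
            # Results without an explicit level are counted as "warning" here.
            counts[result.get("level", "warning")] += 1
    return counts


# Hypothetical, hand-written SARIF fragment used only for illustration.
sample = json.loads("""
{
  "version": "2.1.0",
  "runs": [
    {"results": [{"ruleId": "DL3007", "level": "warning"},
                 {"ruleId": "DL3002", "level": "error"}]},
    {"results": [{"ruleId": "CKV_DOCKER_2"}]}
  ]
}
""")

summary = summarize_sarif(sample)
print(dict(summary))
```

In CI you would load the real file instead, for example with `json.load(open("trivy-iac.sarif"))`.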

2. Dockerfile Security Best Practices

Next up, we're creating a Dockerfile.security-template. This template is like a blueprint for building secure Docker images. It includes a ton of security best practices to make sure our containers are locked down tight.

# Security-hardened Dockerfile template
# Use specific version with SHA256 hash
ARG PYTHON_VERSION=3.13.0
FROM python:${PYTHON_VERSION}-slim@sha256:XXXXXX AS builder

# Metadata
LABEL maintainer="security@teslaontarget.com"
LABEL security.scan="trivy,hadolint"
LABEL version="1.0.0"

# Security: Don't run as root
RUN groupadd -g 1000 appgroup && \
    useradd -r -u 1000 -g appgroup -s /sbin/nologin -c "App user" appuser

# Security: Update base image
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y --no-install-recommends \
    ca-certificates && \
    # Clean up
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Security: Use virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Copy requirements first for layer caching
COPY --chown=appuser:appgroup requirements.txt .

# Security: To verify package integrity, pin hashes in requirements.txt
# and add --require-hashes to the second pip install
RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt

# Production stage
FROM python:${PYTHON_VERSION}-slim@sha256:XXXXXX

# Copy user from builder
COPY --from=builder /etc/passwd /etc/passwd
COPY --from=builder /etc/group /etc/group

# Copy virtual environment
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Security: Set up app directory
WORKDIR /app
RUN chown appuser:appgroup /app

# Copy application
COPY --chown=appuser:appgroup . .

# Security configurations
USER appuser

# Security: Read-only root filesystem
# Uncomment if your app supports it
# RUN chmod -R 555 /app

# Security: No new privileges
# Note: This is a runtime setting, not a build step. Add this to docker-compose.yml instead:
# security_opt:
#   - no-new-privileges:true

# Security: Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD python -c "import sys; sys.exit(0)" || exit 1

# Security: Drop capabilities
# Note: Add this to docker-compose.yml instead:
# cap_drop:
#   - ALL
# cap_add:
#   - NET_BIND_SERVICE

# Security: Resource limits (set in docker-compose.yml)
# deploy:
#   resources:
#     limits:
#       cpus: '1.0'
#       memory: 512M

ENTRYPOINT ["python", "-m", "teslaontarget"]

Some key highlights of this template include:

  • Running as a non-root user: This reduces the attack surface if a container is compromised.
  • Using specific image versions with SHA256 hashes: This ensures we're using the exact image we expect.
  • Updating the base image: This keeps our containers patched against known vulnerabilities.
  • Using virtual environments: This isolates dependencies and prevents conflicts.
  • Implementing health checks: This helps ensure our containers are running smoothly.
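
To make a few of these highlights concrete, here's a minimal sketch of how you might check a Dockerfile for them in Python. It's nowhere near a replacement for Hadolint or Trivy: `audit_dockerfile` is a hypothetical helper, it only looks at `FROM`, `USER`, and `HEALTHCHECK`, and it doesn't understand stage references such as `FROM builder`.

```python
def audit_dockerfile(text: str) -> list[str]:
    """Return findings for three practices: digest pinning, non-root user, health check."""
    findings = []
    lines = [ln.strip() for ln in text.splitlines()]
    # Ignore blank lines and comments.
    instructions = [ln for ln in lines if ln and not ln.startswith("#")]

    for ln in instructions:
        if ln.upper().startswith("FROM "):
            # Skip flags such as --platform=... to find the image reference.
            image = next((t for t in ln.split()[1:] if not t.startswith("--")), "")
            if "@sha256:" not in image:
                findings.append(f"base image not pinned by digest: {image}")

    users = [ln.split()[1] for ln in instructions
             if ln.upper().startswith("USER ") and len(ln.split()) > 1]
    # Only the last USER instruction decides who the container runs as.
    if not users or users[-1].split(":")[0] in ("root", "0"):
        findings.append("final USER is missing or root")

    if not any(ln.upper().startswith("HEALTHCHECK ") for ln in instructions):
        findings.append("no HEALTHCHECK instruction")
    return findings


# Hypothetical Dockerfiles used only to exercise the checks.
insecure = 'FROM python:latest\nCOPY . .\nCMD ["python", "app.py"]\n'
hardened = (
    "FROM python:3.13.0-slim@sha256:XXXXXX\n"
    "USER appuser\n"
    'HEALTHCHECK CMD python -c "import sys; sys.exit(0)"\n'
)

for finding in audit_dockerfile(insecure):
    print(finding)
```

The insecure sample trips all three checks, while the hardened one returns an empty list.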

3. Docker Compose Security

We're also creating a docker-compose.security.yml file to define secure configurations for our Docker Compose setups. This file includes settings like a non-root user, a read-only root filesystem, dropped capabilities, resource limits, and network isolation.

version: '3.8'

services:
  teslaontarget:
    image: ghcr.io/joshuafuller/teslaontarget:latest  # Pin a version tag or digest for production
    
    # Security: Run as a non-root UID:GID (this is not a user namespace remap)
    user: "1000:1000"
    
    # Security: Read-only root filesystem
    read_only: true
    
    # Security: Temp directories for runtime
    tmpfs:
      - /tmp
      - /var/run
    
    # Security: Drop all capabilities
    cap_drop:
      - ALL
    
    # Security: Add only required capabilities
    cap_add:
      - NET_RAW  # If needed for network operations
    
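    # Security: No new privileges
    security_opt:
      - no-new-privileges:true
      # Docker's default seccomp profile applies when no seccomp option is set.
      # Avoid seccomp:unconfined, which disables syscall filtering; point to a
      # custom profile file instead if the default is too strict.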
    
    # Security: Resource limits
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
    
    # Security: Restart policy
    restart: on-failure:3
    
    # Security: Network isolation
    networks:
      - teslanet
    
    # Security: Limited environment variables
    env_file:
      - .env.production
    environment:
      - PYTHONDONTWRITEBYTECODE=1
      - PYTHONUNBUFFERED=1
    
    # Security: Volume mounts with restrictions
    volumes:
      - type: volume
        source: tesla_data
        target: /data
        read_only: false
      - type: volume
        source: tesla_logs
        target: /logs
        read_only: false

volumes:
  tesla_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /secure/data/tesla
  
  tesla_logs:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /secure/logs/tesla

networks:
  teslanet:
    driver: bridge
    ipam:
      config:
        - subnet: 172.28.0.0/16

By setting these configurations, we can significantly reduce the risk of vulnerabilities in our containerized applications.
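
If you want to spot-check a Compose service for these settings yourself, something like the following could work. It's a sketch under a few assumptions: `audit_service` is a hypothetical helper, it takes a service that has already been parsed into a plain dict (for example with a YAML library, which isn't shown here), and the list of checks is only a sample of what the scanners cover.

```python
def audit_service(name: str, service: dict) -> list[str]:
    """Flag a few risky settings in one already-parsed Compose service."""
    findings = []
    if service.get("privileged"):
        findings.append(f"{name}: privileged mode is enabled")
    if service.get("network_mode") == "host":
        findings.append(f"{name}: uses the host network")
    if "ALL" in service.get("cap_add", []):
        findings.append(f"{name}: adds ALL capabilities")
    if "ALL" not in service.get("cap_drop", []):
        findings.append(f"{name}: does not drop ALL capabilities")
    if not service.get("read_only"):
        findings.append(f"{name}: root filesystem is writable")
    # A missing "user" key means the image default, which is often root.
    if str(service.get("user", "0")).split(":")[0] in ("0", "root"):
        findings.append(f"{name}: may run as root")
    limits = service.get("deploy", {}).get("resources", {}).get("limits")
    if not limits:
        findings.append(f"{name}: no resource limits")
    return findings


# Mirrors the hardened service definition above.
hardened = {
    "user": "1000:1000",
    "read_only": True,
    "cap_drop": ["ALL"],
    "cap_add": ["NET_RAW"],
    "deploy": {"resources": {"limits": {"cpus": "1.0", "memory": "512M"}}},
}
# Hypothetical risky service, for contrast.
risky = {"image": "example/app:latest", "privileged": True, "network_mode": "host"}

for finding in audit_service("demo", risky):
    print(finding)
```

The hardened dict passes every check; the risky one gets flagged for privileged mode, host networking, missing cap_drop, a writable root filesystem, a possible root user, and missing limits.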

4. GitHub Actions Security

GitHub Actions workflows are powerful, but they can also be a security risk if not configured properly. That's why we're creating a .github/workflows/security-check.yml file to validate workflow permissions and check for hardcoded secrets.

name: Workflow Security Check

on:
  pull_request:
    paths:
      - '.github/workflows/*.yml'

jobs:
  check-workflows:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Validate workflow permissions
      run: |
        # Check for excessive permissions
        for workflow in .github/workflows/*.yml; do
          echo "Checking $workflow..."
          
          # Ensure no write-all
          if grep -q "permissions: write-all" "$workflow"; then
            echo "::error file=$workflow::Workflow has write-all permissions"
            exit 1
          fi
          
          # Check for pinned actions
          if grep -E "uses: .+@(main|master|latest)" "$workflow"; then
            echo "::warning file=$workflow::Action should use SHA or tag, not branch"
          fi
        done
    
    - name: Check for hardcoded secrets
      run: |
        # Scan for potential secrets
        if grep -rE "(password|secret|token|key)\s*[:=]\s*['\"][^'\"]+['\"]" .github/; then
          echo "::error::Potential hardcoded secrets found"
          exit 1
        fi

This workflow checks for excessive permissions like write-all and ensures that actions are pinned to specific versions (using SHAs or tags) instead of branches. It also scans for potential hardcoded secrets, which is a big no-no!
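
The grep checks are quick, but they're easy to get subtly wrong. Here's a sketch of the same checks in Python, assuming you're happy to treat the workflow as plain text. `check_workflow` is a hypothetical name, and the pinning rule is deliberately stricter than the grep version: it flags anything that isn't pinned to a full 40-character commit SHA, so version tags get reported too.

```python
import re

# Matches "uses: owner/repo@ref" with or without a leading list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)", re.MULTILINE)
SHA_RE = re.compile(r"@[0-9a-f]{40}$")
WRITE_ALL_RE = re.compile(r"^\s*permissions:\s*write-all\s*$", re.MULTILINE)


def check_workflow(text: str) -> dict:
    """Report write-all permissions, unpinned actions, and third-party actions."""
    refs = [ref.strip("'\"") for ref in USES_RE.findall(text)]
    # Local actions ("./...") and Docker image references are out of scope here.
    remote = [r for r in refs if not r.startswith(("./", "docker://"))]
    return {
        "write_all": bool(WRITE_ALL_RE.search(text)),
        "unpinned": [r for r in remote if not SHA_RE.search(r)],
        "third_party": [r for r in remote if not r.startswith(("actions/", "github/"))],
    }


# Hypothetical workflow text; the 40-character SHA is a placeholder, not a real commit.
sample = """\
permissions: write-all
jobs:
  build:
    steps:
      - uses: actions/checkout@v4
      - uses: aquasecurity/trivy-action@master
      - name: pinned
        uses: actions/checkout@0123456789abcdef0123456789abcdef01234567
"""

report = check_workflow(sample)
print(report)
```

Returning a dict instead of exiting right away lets the caller decide which findings are errors and which are only warnings, like the `::error` and `::warning` split in the workflow.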

5. Shell Script Security

Shell scripts can be a breeding ground for vulnerabilities if not written carefully. To address this, we're creating a scripts/security-check.sh script that includes several security best practices.

#!/bin/bash
set -euo pipefail  # Security: Fail on errors

# Security: Validate inputs
validate_input() {
    local input="$1"
    if [[ "$input" =~ [^a-zA-Z0-9_-] ]]; then
        echo "Error: Invalid input detected" >&2
        exit 1
    fi
}

# Security: Use quotes to prevent word splitting
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly CONFIG_FILE="${CONFIG_FILE:-/etc/teslaontarget/config}"

# Security: Check file permissions
if [[ -w "$CONFIG_FILE" ]]; then
    echo "Warning: Config file is writable"
fi

# Security: Don't use eval or source untrusted files
# Bad: eval "$user_input"
# Good: case statement or if/then validation

# Security: Sanitize environment
unset LD_PRELOAD
export PATH="/usr/local/bin:/usr/bin:/bin"

# Security: Use arrays for commands
cmd_args=('--flag' 'value' '--other-flag')
some_command "${cmd_args[@]}"

This script includes practices like:

  • Failing on errors: set -euo pipefail makes the script exit when a command fails (-e), when an unset variable is referenced (-u), or when any command in a pipeline fails (pipefail).
  • Validating inputs: The validate_input() function checks inputs for invalid characters.
  • Using quotes: Quotes prevent word splitting and ensure variables are treated as single units.
  • Avoiding eval and source: These commands can be dangerous if used with untrusted input.
  • Sanitizing the environment: Unsetting LD_PRELOAD and setting a safe PATH helps prevent malicious code execution.
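
The same habits carry over when you drive commands from Python instead of bash. This is a minimal sketch, not part of the script above: `validate_input` mirrors the allow-list idea (and, unlike the bash version, also rejects empty strings), while the hypothetical `run_tool` passes arguments as a list with no shell involved and drops LD_PRELOAD from the child environment. The child process here is just the Python interpreter echoing its argument back, as a stand-in for a real tool.

```python
import os
import re
import subprocess
import sys

SAFE_INPUT = re.compile(r"[A-Za-z0-9_-]+")


def validate_input(value: str) -> str:
    """Allow only letters, digits, underscore, and hyphen."""
    if not SAFE_INPUT.fullmatch(value):
        raise ValueError(f"invalid input: {value!r}")
    return value


def run_tool(name: str) -> str:
    """Run a stand-in command with a validated argument and a cleaned environment."""
    # Copy the environment without LD_PRELOAD, like "unset LD_PRELOAD" in bash.
    env = {k: v for k, v in os.environ.items() if k != "LD_PRELOAD"}
    # Arguments go in a list and shell=False (the default), so nothing is
    # re-parsed by a shell and word splitting cannot happen.
    args = [sys.executable, "-c", "import sys; print(sys.argv[1])", validate_input(name)]
    completed = subprocess.run(args, check=True, capture_output=True, text=True, env=env)
    return completed.stdout.strip()


print(run_tool("deploy-1"))
```

Passing something like `"x; rm -rf /"` raises ValueError before any process is started.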

6. IaC Security Policy

To ensure consistency and enforce security standards, we're creating a .iac-security-policy.yaml file. This policy defines the required and forbidden practices for Dockerfiles, GitHub Actions, Docker Compose configurations, and shell scripts.

# Infrastructure as Code Security Policy
version: 1.0

docker:
  required:
    - no-root-user
    - pinned-base-image
    - no-sudo
    - health-check
    - no-upgrade-in-run
    - minimal-packages
  
  forbidden:
    - latest-tag
    - curl-bash-pattern
    - hardcoded-secrets
    - permissive-copy

github-actions:
  required:
    - pinned-actions
    - minimal-permissions
    - no-pull-request-target
  
  forbidden:
    - write-all-permissions
    - unverified-actions
    - script-injection

compose:
  required:
    - resource-limits
    - no-privileged
    - user-specified
    - restart-policy
  
  forbidden:
    - host-network-mode
    - cap-add-all
    - volume-root-mount

scripts:
  required:
    - set-euo-pipefail
    - quote-variables
    - validate-inputs
  
  forbidden:
    - eval-usage
    - unquoted-variables
    - curl-to-bash

This policy helps us maintain a consistent security posture across our infrastructure code.
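
A policy file only helps if something evaluates it. Here's one way that could look, as a sketch: the `evaluate` function is hypothetical, the policy is a trimmed copy of the docker section written as a Python dict, and the `observed` set is assumed to come from some mapping of scanner findings onto the policy's rule names, which isn't shown here.

```python
# Trimmed copy of the "docker" section of the policy, as a Python dict.
POLICY = {
    "docker": {
        "required": ["no-root-user", "pinned-base-image", "health-check"],
        "forbidden": ["latest-tag", "curl-bash-pattern", "hardcoded-secrets"],
    },
}


def evaluate(policy: dict, category: str, observed: set) -> dict:
    """Compare observed practices against the required and forbidden lists."""
    rules = policy[category]
    return {
        "missing_required": sorted(set(rules.get("required", [])) - observed),
        "forbidden_present": sorted(set(rules.get("forbidden", [])) & observed),
    }


# Hypothetical observations for one Dockerfile (assumed to come from scanner output).
observed = {"no-root-user", "health-check", "latest-tag"}

result = evaluate(POLICY, "docker", observed)
print(result)
```

A CI step could then fail whenever either list in the result is non-empty.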

7. Compliance Scanning

Compliance is key, especially when it comes to security. We're creating a .compliance/cis-docker-compliance.yaml file to ensure we're following CIS Docker Benchmark recommendations.

# CIS Docker Benchmark Compliance
checks:
  # 4.1 Ensure a user for the container has been created
  - id: CIS-Docker-4.1
    severity: HIGH
    
  # 4.6 Ensure HEALTHCHECK instructions have been added
  - id: CIS-Docker-4.6
    severity: MEDIUM
    
  # 5.3 Ensure containers are not run with root privileges
  - id: CIS-Docker-5.3
    severity: CRITICAL
    
  # 5.12 Ensure readonly root filesystem
  - id: CIS-Docker-5.12
    severity: HIGH

This file defines checks for things like running containers as non-root, adding health checks, and ensuring a read-only root filesystem.
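
Since each check carries a severity, a natural next step is deciding which failures should block a merge. The sketch below assumes a simple ordering from LOW to CRITICAL and a hypothetical `blocking_failures` helper; the check IDs and severities are copied from the file above, and how a check is detected as failed is left out.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]

# IDs and severities copied from the compliance file above.
CHECKS = {
    "CIS-Docker-4.1": "HIGH",
    "CIS-Docker-4.6": "MEDIUM",
    "CIS-Docker-5.3": "CRITICAL",
    "CIS-Docker-5.12": "HIGH",
}


def blocking_failures(failed_ids: list[str], threshold: str = "HIGH") -> list[str]:
    """Return the failed checks whose severity is at or above the threshold."""
    floor = SEVERITY_ORDER.index(threshold)
    return sorted(
        check_id for check_id in failed_ids
        if SEVERITY_ORDER.index(CHECKS[check_id]) >= floor
    )


# Hypothetical scan outcome: two checks failed.
failed = ["CIS-Docker-4.6", "CIS-Docker-5.3"]

print(blocking_failures(failed))
print(blocking_failures(failed, threshold="MEDIUM"))
```

With the default threshold only the CRITICAL check blocks; lowering the threshold to MEDIUM makes the health-check finding block as well.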

Measuring Success: How Will We Know We've Nailed It?

Alright, so how do we know if we've successfully implemented IaC security scanning? Here are the success criteria we're shooting for:

8. Success Criteria

  • [ ] All IaC files scanned on every change
  • [ ] Zero critical misconfigurations in production
  • [ ] Dockerfile follows security best practices
  • [ ] GitHub Actions use minimal permissions
  • [ ] Shell scripts pass shellcheck
  • [ ] Compliance with CIS benchmarks
  • [ ] Mean remediation time < 7 days
  • [ ] Security policy enforced

We want to make sure that every change to our IaC files triggers a scan, that we have zero critical misconfigurations in production, and that our Dockerfiles, GitHub Actions, and shell scripts all follow security best practices. Compliance with CIS benchmarks is also a must, and we want to remediate any issues quickly (within 7 days on average). Plus, we need to ensure our security policy is actually enforced.
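
For the remediation target, the arithmetic is simple enough to sketch. This assumes you can export each finding as a pair of dates (when it was opened and when it was fixed); the function name and the sample dates are invented for illustration.

```python
from datetime import date
from typing import Optional


def mean_remediation_days(findings: list[tuple[date, date]]) -> Optional[float]:
    """Average number of days between a finding being opened and fixed."""
    durations = [(fixed - opened).days for opened, fixed in findings]
    if not durations:
        return None  # Nothing remediated yet, so there is no mean to report.
    return sum(durations) / len(durations)


# Hypothetical findings: (opened, fixed).
findings = [
    (date(2024, 1, 1), date(2024, 1, 4)),    # 3 days
    (date(2024, 1, 10), date(2024, 1, 19)),  # 9 days
    (date(2024, 2, 1), date(2024, 2, 7)),    # 6 days
]

mean_days = mean_remediation_days(findings)
print(mean_days)  # 6.0
print(mean_days < 7)  # True
```

Note that one slow fix (9 days here) can still sit inside a mean that meets the target, so it may be worth tracking the maximum as well.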

9. Testing Requirements

To make sure our IaC scanning is working as expected, we've got some testing requirements:

  • [ ] Test with intentionally insecure Dockerfile
  • [ ] Verify CI fails on misconfigurations
  • [ ] Test permission validation in workflows
  • [ ] Validate script security checks
  • [ ] Confirm compliance scanning

We'll test with an intentionally insecure Dockerfile to see if our scanners pick up the issues. We'll also verify that our CI pipeline fails when misconfigurations are detected. Testing permission validation in workflows and script security checks is crucial, and we'll confirm that our compliance scanning is working correctly.

Documentation: Sharing the Knowledge

Last but not least, we need to document our IaC security practices. This helps ensure that everyone on the team understands the guidelines and can contribute to maintaining a secure infrastructure.

10. Documentation

We're creating a docs/IAC_SECURITY.md file that serves as our Infrastructure as Code Security Guide.

# Infrastructure as Code Security Guide

## Dockerfile Security
- Always run as non-root user
- Pin base images with SHA256
- Minimize layers and packages
- Use multi-stage builds
- Implement health checks

## GitHub Actions Security
```yaml
# Good: Minimal permissions
permissions:
  contents: read

# Bad: Excessive permissions
permissions: write-all
```

## Local IaC Scanning
```bash
# Scan Dockerfile
hadolint Dockerfile

# Scan with Checkov
checkov -f docker-compose.yml

# Scan all IaC
trivy config .
```

## Security Checklist
- [ ] Non-root user in containers
- [ ] Pinned dependencies
- [ ] Resource limits set
- [ ] No hardcoded secrets
- [ ] Minimal permissions
- [ ] Regular security scans

This guide covers Dockerfile security, GitHub Actions security, local IaC scanning tips, and a security checklist. It's a one-stop-shop for all things IaC security!

Resources: Where to Learn More

To wrap things up, here are some helpful references for diving deeper into IaC security:

References

  • Hadolint - Dockerfile Linter: https://github.com/hadolint/hadolint
  • Checkov - IaC Scanner: https://www.checkov.io/
  • KICS - Keeping IaC Secure: https://kics.io/
  • CIS Docker Benchmark: https://www.cisecurity.org/benchmark/docker
  • GitHub Actions Security Hardening: https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions
  • OWASP Docker Security: https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html

By implementing these practices and leveraging these resources, we can build a strong and secure infrastructure. Let's get to it, guys!