How to Write Terraform Script
Introduction
Terraform has become the de facto standard for Infrastructure as Code (IaC) across cloud-native organizations. Its declarative syntax, provider ecosystem, and state management make it indispensable for automating infrastructure provisioning. But with great power comes great responsibility. A single misconfigured Terraform script can lead to outages, security breaches, compliance violations, or costly over-provisioning. Writing Terraform scripts you can trust isn't about writing more code; it's about writing better, more deliberate, and more resilient code.
This guide presents 10 proven practices for writing Terraform scripts you can trust: scripts that are maintainable, secure, predictable, and auditable. Whether you're a beginner scaling your first cloud environment or an experienced engineer managing thousands of resources across multiple regions, these principles will elevate your IaC maturity and reduce the risk of human error.
Trust in Terraform doesn't emerge by accident. It's engineered through discipline, tooling, and adherence to industry best practices. By the end of this article, you'll have a clear roadmap to transform your Terraform workflows from fragile and reactive to robust and reliable.
Why Trust Matters
Infrastructure as Code is not a luxury; it's a necessity. But code without trust is dangerous. A Terraform script that works in development may fail catastrophically in production. A module that deploys cleanly today might break tomorrow due to an upstream provider change, an unvetted variable, or an untracked state drift. Trust in Terraform means confidence that your infrastructure will behave exactly as intended, every time, without manual intervention.
Loss of trust leads to three critical failures: operational instability, security vulnerabilities, and organizational friction. Teams that don't trust their Terraform scripts resort to manual changes ("ClickOps"), which introduce configuration drift, violate audit trails, and create technical debt. According to a 2023 DevOps Report, 68% of infrastructure outages in cloud environments were traced back to misconfigured IaC, not hardware failures or network issues.
Trust is built through predictability, repeatability, and verifiability. A trusted Terraform script:
- Produces identical infrastructure every time it's applied, regardless of environment.
- Is reviewed, tested, and validated before deployment.
- Uses version-controlled, modular components that are reusable and documented.
- Minimizes surface area for attack and complies with security policies.
- Can be rolled back safely if something goes wrong.
Without these traits, Terraform becomes a liability. With them, it becomes your most powerful asset. The following 10 practices form the foundation of that trust.
Top 10 Practices for Writing Terraform Scripts You Can Trust
1. Use Version Control and Branching Strategies
Version control is the bedrock of trust in any codebase, including Terraform. Never let Terraform configurations exist only on a production server or a single local machine. Always use Git (or another version control system) to track every change.
Adopt a branching strategy such as Git Flow or GitHub Flow. Create feature branches for new infrastructure changes, open pull requests for review, and require at least one approval before merging into the main branch. This enforces accountability and peer review.
Use protected branches to prevent force pushes and ensure that only CI/CD pipelines with passing tests can merge into production branches. Include commit messages that follow conventional commits (e.g., feat: add S3 bucket for logs, fix: correct IAM policy permissions). This makes audits and rollbacks easier.
Tag releases for major infrastructure versions (e.g., v1.2.0). This allows you to roll back to a known-good state if a deployment causes unexpected behavior. Version control isn't just about history; it's about trust through transparency.
2. Modularize Your Code
Monolithic Terraform configurations are unmaintainable and untrustworthy. As your infrastructure grows, putting all resources in a single main.tf file becomes a nightmare to debug, test, or reuse.
Break your infrastructure into reusable, purpose-driven modules. Each module should encapsulate a single logical component: a VPC, a database, a Kubernetes cluster, or a set of IAM roles. Modules should have clearly defined inputs and outputs, with minimal internal assumptions.
For example, instead of hardcoding an RDS instance in your main configuration, create a module called rds-postgres that accepts variables like instance_class, storage_size, and backup_retention. Then call that module from your environment-specific configurations (dev, staging, prod).
Modularization improves trust by enabling testing in isolation. You can unit-test a module independently of the rest of your stack. It also promotes reuse across teams and projects, reducing duplication and inconsistency. Use the Terraform Registry to consume community modules, but always vet them before use. Prefer private modules for sensitive or custom infrastructure to maintain control and security.
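As a rough sketch of the rds-postgres example above, an environment configuration might call the module like this. The directory layout, the input values, and the `endpoint` output are hypothetical; the sketch assumes a local module that declares those three inputs and that output.

```hcl
# environments/prod/main.tf (hypothetical layout)
module "rds_postgres" {
  # Assumed local path; a private registry or Git source is referenced the same way.
  source = "../../modules/rds-postgres"

  # Inputs named in the text; the values are illustrative only.
  instance_class   = "db.t3.medium"
  storage_size     = 100
  backup_retention = 7
}

# Re-export a module output so other configurations can consume it.
# Assumes the module declares an output named "endpoint".
output "db_endpoint" {
  value = module.rds_postgres.endpoint
}
```

The dev and staging configurations would call the same module with smaller values, so the only differences between environments are the inputs.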
3. Validate Inputs with Variable Definitions and Validation Blocks
One of the most common causes of Terraform failures is invalid or unexpected input. Hardcoded values, typos in variable names, or missing required fields can lead to silent failures or destructive changes.
Always define variables with explicit types, descriptions, and default values where appropriate. Use validation blocks to enforce constraints at plan time. For example:
variable "instance_type" {
  description = "EC2 instance type for web servers"
  type        = string

  validation {
    condition     = contains(["t3.micro", "t3.small", "t3.medium"], var.instance_type)
    error_message = "Instance type must be t3.micro, t3.small, or t3.medium."
  }
}
This prevents accidental deployment of oversized or unsupported instances. Similarly, validate CIDR ranges, port numbers, and string lengths. Use sensitive = true for secrets like API keys, and never hardcode them in source files.
Validation at the variable level catches errors before Terraform even attempts to create resources. This reduces the chance of partial deployments and rollbacks. Trust is built when your code refuses to proceed unless inputs are correct.
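The same pattern extends to the CIDR ranges and port numbers mentioned above. The following is a minimal sketch; the variable names `vpc_cidr` and `app_port` are made up for illustration.

```hcl
variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string

  validation {
    # cidrhost() raises an error on malformed CIDR input;
    # can() converts that error into false so the validation fails cleanly.
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "The vpc_cidr value must be a valid CIDR block, for example 10.0.0.0/16."
  }
}

variable "app_port" {
  description = "TCP port the application listens on"
  type        = number
  default     = 443

  validation {
    condition     = var.app_port >= 1 && var.app_port <= 65535
    error_message = "The app_port value must be between 1 and 65535."
  }
}
```

A value such as `vpc_cidr = "10.0.0.0/33"` would then be rejected during planning, before any resource is touched.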
4. Enforce Code Style and Linting
Consistent code style isn't just about aesthetics; it's about readability, maintainability, and error prevention. A poorly formatted Terraform file can hide logic errors, misaligned resource blocks, or missing commas.
Use terraform fmt to automatically format your code. Add this step to your CI pipeline: terraform fmt -check -diff. This ensures every contributor follows the same formatting rules.
Additionally, use tfsec or checkov to lint your code for security misconfigurations. These tools scan for common vulnerabilities: publicly accessible S3 buckets, unencrypted EBS volumes, overly permissive IAM policies, or SSH access from 0.0.0.0/0.
Integrate linting into your pull request workflow. If a PR fails linting, it cannot be merged. This creates a culture of quality and prevents bad configurations from ever reaching production.
Trust grows when every line of code is inspected, not just for functionality, but for safety and consistency.
5. Use Remote State with Locking and Encryption
Terraform state is the single source of truth for your infrastructure. If state is lost, corrupted, or accessed concurrently without locking, your infrastructure can become inconsistent or destroyed.
Never use local state in production. Always store state remotely using a backend like Terraform Cloud, AWS S3 with DynamoDB locking, or Azure Blob Storage. Remote backends provide:
- Centralized state management
- State locking to prevent concurrent applies
- Versioning and encryption at rest
- Access control via IAM or RBAC
Enable state versioning on your S3 bucket and use server-side encryption (SSE-S3 or SSE-KMS). Restrict access to state files using bucket policies and IAM roles. Only CI/CD pipelines or authorized operators should have write access.
Regularly back up your state files. Consider using a separate state bucket for disaster recovery. If state is compromised, you can restore from a known-good version. Trust in Terraform depends on the integrity of your state. Treat it like a database, because it is.
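A minimal sketch of the S3 plus DynamoDB setup described above might look like the following. The bucket name, state key, region, and table name are placeholders. Recent Terraform releases also support S3-native lock files, so check the backend documentation for the version you run.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"        # hypothetical bucket name
    key            = "prod/network/terraform.tfstate" # hypothetical state path
    region         = "us-east-1"
    encrypt        = true                             # server-side encryption at rest
    dynamodb_table = "example-terraform-locks"        # hypothetical lock table
  }
}

# Versioning for the state bucket. This would normally live in a separate
# bootstrap configuration, since the bucket must exist before the backend is used.
resource "aws_s3_bucket_versioning" "state" {
  bucket = "example-terraform-state"

  versioning_configuration {
    status = "Enabled"
  }
}
```

With versioning enabled, an earlier state object can be restored from the bucket if the current one is damaged.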
6. Test Your Infrastructure with Terratest or InSpec
Unit tests for code? Yes. Unit tests for infrastructure? Absolutely.
Terraform scripts should be tested just like application code. Use Terratest (Go-based) or InSpec (Ruby-based) to write automated tests that verify the actual state of your infrastructure after deployment.
For example, write a Terratest that:
- Deploys a VPC module
- Verifies that 3 subnets were created across 2 availability zones
- Confirms that the internet gateway is attached
- Checks that a security group allows HTTPS on port 443
Run these tests in your CI pipeline before merging. If a change breaks expected behavior, the pipeline fails. This prevents regressions and ensures that infrastructure changes meet defined requirements.
Testing transforms Terraform from a "hope it works" tool into a verified, reliable system. Trust is earned when you can prove, not just assume, that your infrastructure behaves correctly.
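Terratest tests are written in Go and InSpec tests in Ruby. To stay in Terraform's own language, the sketch below swaps in the native `terraform test` command that ships with Terraform 1.6 and later; it is not Terratest. The input names and the `aws_subnet.this` resource address are assumptions about how a hypothetical VPC module is written.

```hcl
# tests/vpc.tftest.hcl (run with `terraform test` from the module directory)

variables {
  # Hypothetical module inputs.
  vpc_cidr     = "10.0.0.0/16"
  subnet_count = 3
}

run "creates_three_subnets" {
  # Evaluate against a plan only, so nothing is provisioned.
  command = plan

  assert {
    # Assumes the module declares `resource "aws_subnet" "this"` with
    # `count = var.subnet_count`.
    condition     = length(aws_subnet.this) == 3
    error_message = "The VPC module should create exactly 3 subnets."
  }
}
```

A plan-only run checks what Terraform intends to create. Verifying the real deployed state, as Terratest does, needs a run that applies the configuration (the default when `command` is omitted) in a disposable test account.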
7. Implement Policy as Code with Sentinel or Open Policy Agent
Even with testing and validation, human error or misconfiguration can slip through. Policy as Code (PaC) enforces organizational rules automatically, regardless of who writes the Terraform.
Use Sentinel (for Terraform Cloud) or Open Policy Agent (OPA) with Conftest to define policies that reject non-compliant configurations. For example:
- All S3 buckets must have versioning enabled.
- No public IP addresses are allowed on EC2 instances in production.
- All IAM policies must follow the principle of least privilege.
These policies are evaluated against the output of terraform plan, before apply runs. If a policy is violated, the operation is blocked with a clear error message.
Policy as Code removes subjectivity from compliance. It ensures that security and governance standards are enforced uniformly across teams, environments, and regions. Trust increases when rules are automated, not left to human judgment.
8. Use Secrets Management and Avoid Hardcoding
Hardcoding passwords, API keys, or access tokens in Terraform files is a critical security flaw. These secrets can be exposed in version control, CI logs, or backups.
Always use external secrets management tools: AWS Secrets Manager, HashiCorp Vault, Azure Key Vault, or Google Secret Manager. Reference secrets in Terraform using data sources or environment variables.
Example with AWS Secrets Manager:
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/database/password"
}

resource "aws_rds_cluster" "example" {
  # Other required arguments (engine, master_username, and so on) are omitted for brevity.
  master_password = data.aws_secretsmanager_secret_version.db_password.secret_string
}
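Never use terraform.tfvars for secrets. Use .tfvars files only for non-sensitive configuration. Use environment variables (TF_VAR_*) or input prompts for sensitive values during local development. Keep in mind that values read through data sources or variables are still stored in Terraform state, which is one more reason to encrypt the state backend and restrict access to it.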
Rotate secrets regularly and automate their renewal using infrastructure automation. Trust is broken when credentials are exposed. Secure secrets management is non-negotiable for production-grade Terraform.
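For the environment-variable route, a sensitive input can be declared as in this small sketch. The variable name `db_password` is illustrative.

```hcl
variable "db_password" {
  description = "Master password for the database"
  type        = string
  sensitive   = true # redacts the value in plan and apply output
}
```

During local development the value could be supplied by exporting `TF_VAR_db_password` in the shell before running Terraform, which keeps it out of every file in the repository.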
9. Review and Approve Changes with CI/CD Pipelines
Manual approval of Terraform plans is a best practice, but manual execution is not. Automate deployment through CI/CD pipelines to ensure consistency, auditability, and repeatability.
Use GitHub Actions, GitLab CI, Jenkins, or CircleCI to run the following steps:
- terraform init
- terraform validate
- terraform plan -out=tfplan
- Run tfsec/checkov/terratest
- Require manual approval before terraform apply
- Store plan file securely and apply only from the pipeline
This workflow ensures that:
- Changes are reviewed before execution
- Plans are generated from a clean environment
- Apply is never run locally on a developer machine
- All actions are logged and traceable
CI/CD pipelines eliminate the "it worked on my machine" problem. They create a single source of truth for deployment. Trust is built when infrastructure changes are predictable, auditable, and repeatable, not ad hoc.
10. Document Everything: Modules, Variables, Outputs, and Dependencies
Code is not self-documenting. A week after writing a complex Terraform module, even you may forget why a certain variable exists or what an output represents.
Document every module using a README.md file that includes:
- Purpose and use cases
- Input variables with descriptions, types, and defaults
- Output values and their meaning
- Dependencies (e.g., requires AWS provider v4.0+)
- Example usage
- Known limitations
Use terraform-docs, an open-source tool that is separate from Terraform itself, to auto-generate Markdown from your variable and output definitions. Keep documentation in sync with code.
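Generated documentation is only as good as the descriptions in the code. As a small sketch, a module's inputs and outputs might be annotated like this; the names and the `aws_db_instance.this` resource are hypothetical.

```hcl
variable "backup_retention" {
  description = "Number of days to retain automated database backups."
  type        = number
  default     = 7
}

output "endpoint" {
  description = "Connection endpoint of the PostgreSQL instance."
  # Assumes the module declares `resource "aws_db_instance" "this"`.
  value = aws_db_instance.this.endpoint
}
```

The description, type, and default fields are the parts that end up in the generated inputs and outputs tables.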
Document state management procedures, backup schedules, and rollback steps. Create a runbook for common operations: how to add a new environment, how to rotate secrets, how to recover from state corruption.
Trust is not just technical; it's cultural. When knowledge is shared, onboarding is faster, errors are fewer, and teams collaborate more effectively. Well-documented Terraform is maintainable Terraform. And maintainable Terraform is trusted Terraform.
Comparison Table
The following table compares the impact of adopting versus neglecting each of the 10 trust-building practices. Use this as a maturity assessment for your Terraform environment.
| Practice | Adopted Impact | Neglected Impact |
|---|---|---|
| Version Control & Branching | Full audit trail, rollback capability, team collaboration | Configuration drift, lost changes, no accountability |
| Modularization | Reusable, testable, scalable components | Monolithic, unmaintainable code; duplication and inconsistency |
| Variable Validation | Prevents invalid inputs before deployment | Runtime failures, partial deployments, manual fixes |
| Code Linting | Consistent style, early detection of security issues | Hard-to-read code, undetected misconfigurations |
| Remote State with Locking | Safe concurrent access, encrypted storage, versioning | State corruption, race conditions, data loss |
| Infrastructure Testing | Verified behavior, regression prevention | Assumed correctness, silent failures in production |
| Policy as Code | Automated compliance, enforced security standards | Manual audits, inconsistent policies, compliance violations |
| Secrets Management | Secure credential handling, no exposure in code | Secret leaks, credential theft, regulatory penalties |
| CI/CD Pipelines | Automated, auditable, repeatable deployments | Manual, error-prone, undocumented changes |
| Comprehensive Documentation | Fast onboarding, knowledge retention, team alignment | Knowledge silos, high turnover cost, confusion |
Organizations that adopt all 10 practices achieve enterprise-grade Terraform maturity. Those neglecting even two or three are at high risk of failure. Use this table to identify gaps in your current workflow and prioritize improvements.
FAQs
Can I use Terraform without version control?
No, you should not. Version control is not optional; it's foundational. Without it, you lose the ability to track changes, collaborate safely, or roll back from errors. Terraform state is fragile; without Git, you're managing infrastructure by guesswork.
How often should I update my Terraform provider versions?
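Update providers only after testing. Major version changes can introduce breaking changes. Pin providers with version constraints in the required_providers block of your terraform block, commit the .terraform.lock.hcl dependency lock file, and test upgrades in a non-production environment first. Use terraform providers lock to record checksums for every platform your team uses so that everyone installs the same provider builds.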
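Provider version constraints are declared in the configuration itself. A minimal sketch follows; the specific version numbers are illustrative, not recommendations.

```hcl
terraform {
  required_version = ">= 1.5.0" # illustrative constraint on the Terraform CLI

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # accepts any 5.x release, rejects 6.0 and later
    }
  }
}
```

After terraform init, the exact versions that were selected are written to .terraform.lock.hcl. Committing that file is what keeps every teammate and pipeline on the same provider release.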
Is it safe to use community modules from the Terraform Registry?
They can be, but always audit them. Review the module's source code, check its last update, verify the maintainer's reputation, and scan for security issues. Never use a module without understanding what it does. Prefer modules with active maintenance, good documentation, and tests.
What's the difference between terraform validate and terraform plan?
terraform validate checks syntax and configuration correctness without connecting to cloud providers. terraform plan simulates the changes Terraform will make by comparing your configuration with the current state. Both are essential: validate catches errors early; plan reveals impact.
How do I handle environment-specific differences (dev vs prod)?
Use separate Terraform workspaces or directories with environment-specific variable files (e.g., dev.tfvars, prod.tfvars). Avoid conditional logic within modules. Keep modules generic and inject environment values externally. This ensures modules remain reusable and testable.
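A minimal sketch of this approach: the configuration declares generic inputs, and each environment supplies its own values through a variable file. The variable names and values are illustrative.

```hcl
# variables.tf: identical for every environment
variable "instance_type" {
  description = "EC2 instance type for web servers"
  type        = string
}

variable "instance_count" {
  description = "Number of web servers to run"
  type        = number
}

# dev.tfvars would contain:
#   instance_type  = "t3.micro"
#   instance_count = 1

# prod.tfvars would contain:
#   instance_type  = "t3.medium"
#   instance_count = 3
```

Running `terraform plan -var-file=dev.tfvars` or `terraform plan -var-file=prod.tfvars` then selects the environment without any conditional logic in the code.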
Can Terraform detect and fix drift automatically?
Terraform does not automatically fix drift. It detects drift during plan and will propose changes to reconcile state. However, drift should be investigated, not ignored. Use tools like Terraform Cloud's drift detection or custom scripts to alert on unexpected changes. Fix drift manually after root cause analysis.
What should I do if my Terraform state is corrupted?
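Restore from a backup. Always maintain state backups, for example through bucket versioning on the remote backend. If no backup exists, rebuild the state by re-importing the existing resources with terraform import, using your documentation and resource IDs; recreating the infrastructure from scratch is the last resort. Never edit state files directly unless absolutely necessary, and never without a full backup. State corruption is a disaster; prevention is critical.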
Do I need to test every single resource in my Terraform code?
No, but you should test critical and high-risk components. Focus on networking, security groups, IAM policies, and data persistence. Use integration tests to verify end-to-end behavior rather than unit-testing every single resource. Prioritize based on impact and failure probability.
How do I prevent Terraform from destroying resources unexpectedly?
Use terraform plan -out=tfplan to generate a plan file, then review it carefully before applying. Use lifecycle blocks (e.g., prevent_destroy = true) on critical resources. Implement policy as code to block destructive changes in production. Always require manual approval before apply.
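A lifecycle guard on a critical resource can be sketched as follows. The resource and bucket names are hypothetical.

```hcl
resource "aws_s3_bucket" "audit_logs" {
  bucket = "example-audit-logs" # hypothetical bucket name

  lifecycle {
    # Any plan that would destroy this bucket fails with an error
    # instead of proceeding.
    prevent_destroy = true
  }
}
```

This guard only applies while the resource block is present in the configuration. If the whole block is deleted, the setting goes with it, so plan review and policy checks are still needed.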
Should I use Terraform for everything?
No. Terraform excels at provisioning and managing infrastructure resources. It's not ideal for application deployment, configuration management, or runtime orchestration. Use Ansible, Chef, or Kubernetes operators for those tasks. Combine tools appropriately: Terraform for infrastructure, other tools for software.
Conclusion
Writing Terraform scripts you can trust is not about writing more code; it's about writing smarter, safer, and more deliberate code. The 10 practices outlined in this guide form a comprehensive framework for building infrastructure that is predictable, secure, maintainable, and auditable.
Trust is earned through discipline: versioning every change, validating every input, testing every module, securing every secret, and documenting every decision. It's not enough to deploy infrastructure; you must be able to explain, reproduce, and defend it.
Organizations that adopt these practices reduce outages, accelerate deployment velocity, improve compliance, and empower engineers to innovate with confidence. Terraform becomes a force multiplier, not a liability.
Start small. Pick one practice, perhaps version control or variable validation, and implement it today. Then add another. Over time, these habits compound into a culture of reliability. Your infrastructure will thank you.
The goal is not perfect code. The goal is trustworthy code. And with these principles, you're not just writing Terraform; you're building the foundation of resilient cloud systems.