Summary: AWS cost optimization is the continuous practice of reducing cloud spending while maintaining performance, through right-sizing EC2 instances, committing to Reserved Instances or Savings Plans (up to 72% savings), enabling auto-scaling, eliminating idle resources, and embedding FinOps governance. According to Flexera's 2025 State of the Cloud Report, organizations waste an average of 32% of cloud spend. The 12 best practices in this guide directly address that waste.
Cloud infrastructure is no longer a back-office IT expense — it is a direct driver of business competitiveness. Yet despite this strategic importance, most organizations on AWS are significantly overspending. The Flexera 2025 State of the Cloud Report found that 32% of cloud spend is wasted on over-provisioned and idle resources. For a company with a $1 million annual AWS bill, that is $320,000 going nowhere every year.
The good news: AWS cost optimization in 2026 is more achievable than ever, thanks to a maturing toolset (from the AWS Cost Optimization Hub to Graviton4 instances offering up to 40% better price-performance) and a growing body of FinOps practices that turn cloud spend from an uncontrolled variable into a disciplined, measurable asset.
This guide covers 12 AWS cost optimization best practices for 2026, organized from foundational quick wins to advanced FinOps maturity. Whether you are just starting to optimize or looking to move beyond right-sizing into unit economics, you will find a clear, actionable path.
What is AWS cost optimization?
AWS cost optimization is the practice of reducing overall cloud spend while maximizing business value. It is not just cutting costs; it is ensuring that every dollar is utilized properly and attached to necessary business outcomes.
In other words, it means applying the financial discipline of the cloud's pay-as-you-go model so that you pay only for capacity that delivers value.
AWS cost optimization rests on four main pillars.
- Right-sizing: This involves analyzing your application's actual resource demands (CPU, memory, storage) and matching them to the smallest, cheapest EC2 instance type that meets them. By eliminating over-provisioning, you avoid paying for compute capacity that sits idle and get better performance per dollar spent.
- Elasticity and automation: Use AWS features like auto-scaling and serverless technologies (Lambda) to adjust capacity dynamically based on real-time demand. Resources scale up for peak hours and back down when demand drops, so you pay for compute and storage only when users actively need it.
- Pricing model utilization: The best way to reduce high, consistent usage costs is to commit to capacity through Reserved Instances (RIs) or Savings Plans. These commitments offer significant discounts, up to 72% off standard on-demand pricing, rewarding organizations for stable, long-term cloud usage.
- Continuous governance: This is about establishing an organizational culture and processes to regularly monitor, track, and optimize cloud spending. It includes identifying and eliminating ghost resources, enforcing resource tagging for cost allocation, and setting up billing alerts to catch unexpected charges such as egress fees.
Following these AWS cost optimization best practices ensures that organizations gain visibility, avoid waste, and align cloud spending with real business value.
Why AWS costs spiral out of control
AWS's pay-as-you-go model is powerful precisely because it is frictionless — resources spin up in seconds and scale on demand. But that same frictionlessness means costs accumulate just as silently. Three structural reasons explain why even well-run engineering teams end up overspending:
- Provision-once, forget-forever: Infrastructure is sized for peak load at launch and rarely revisited. CPU and memory drift into chronic underutilization as traffic patterns evolve.
- Invisible shared costs: Data egress fees, NAT gateway charges, cross-AZ transfer costs, and CloudWatch log ingestion quietly inflate the bill outside engineers' mental models.
- No FinOps culture: Engineering teams optimize for performance; finance teams track total spend. Without a shared FinOps practice that gives both teams real-time cost visibility, waste goes undetected for months.
The result is a pattern Gartner describes as cloud cost drift — budgets overshoot by 10–30% year-over-year not because organizations are scaling, but because they are funding infrastructure that no longer serves a business purpose.
AWS cost optimization is the discipline that closes this gap. Below are the 12 best practices that do so most effectively in 2026.
12 AWS cost optimization best practices for 2026
1. Establish visibility before you optimize anything
You cannot optimize what you cannot see. Before implementing any other practice in this list, set up a cost visibility foundation using AWS-native tooling:
- AWS Cost Explorer — Visualize spending by service, account, region, and tag. Enable hourly granularity for accurate anomaly detection.
- AWS Cost Optimization Hub (launched 2023, free) — Aggregates rightsizing recommendations from Compute Optimizer, Trusted Advisor, and Savings Plans Recommendations into a single consolidated view with estimated savings in dollars.
- AWS Cost Anomaly Detection — Uses machine learning to alert you within hours of unusual spend spikes, preventing runaway costs from going undetected until the end-of-month invoice.
Action: Enable all three tools and configure budget alerts. Set a monthly budget threshold at 110% of your previous month's spend, with SNS notifications to your engineering Slack channel. This single step has prevented five- and six-figure surprise bills for Kellton's clients.
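The 110% threshold in the action above is simple arithmetic. Here is a minimal sketch, assuming last month's spend is already known (for example, read from Cost Explorer). The function names and the sample figures are hypothetical and only illustrate the calculation; they are not part of any AWS API.

```python
def budget_threshold(previous_month_spend: float, percent: float = 110.0) -> float:
    """Return the alert threshold as a percentage of last month's spend."""
    if previous_month_spend < 0:
        raise ValueError("spend cannot be negative")
    return previous_month_spend * percent / 100.0


def should_alert(month_to_date_spend: float, threshold: float) -> bool:
    """True once month-to-date spend has reached the threshold."""
    return month_to_date_spend >= threshold


if __name__ == "__main__":
    # Sample figures only, not real billing data.
    threshold = budget_threshold(80_000.0)
    print(f"Alert threshold: ${threshold:,.2f}")
    print("Alert now?", should_alert(91_500.0, threshold))
```

In practice the threshold would be entered into an AWS Budgets alert rather than checked by hand; the sketch only shows how the number is derived.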
2. Implement mandatory resource tagging from day one
Without consistent tagging, your AWS bill is a wall of service names with no business context. A robust tagging strategy transforms cost data into cost intelligence.
Mandatory tag keys to enforce across all resources:
- Owner — the team or individual responsible
- Project or Product — the business initiative the resource serves
- Environment — Production, Staging, Dev, UAT
- CostCenter — for chargeback or showback reporting
- Lifecycle — for identifying time-limited resources (e.g., Temporary)
Action: Use AWS Organizations Service Control Policies (SCPs) to block resource creation without mandatory tags. Use AWS Config rules to flag non-compliant resources. This is non-negotiable infrastructure for any FinOps practice — tagging unlocks every downstream cost-attribution and optimization workflow.
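To make the tagging rule concrete, the sketch below checks a resource's tags against the mandatory keys listed above. It is a plain-Python illustration, not an SCP or an AWS Config rule: the function names and sample tag values are hypothetical, and it assumes that either Project or Product satisfies the second requirement.

```python
from __future__ import annotations

# Each tuple is one requirement; any key in the tuple satisfies it.
REQUIRED_TAGS = [
    ("Owner",),
    ("Project", "Product"),
    ("Environment",),
    ("CostCenter",),
    ("Lifecycle",),
]


def missing_tags(tags: dict[str, str]) -> list[str]:
    """Return the mandatory tag requirements a resource does not meet."""
    # A tag with an empty or whitespace-only value counts as absent.
    present = {key for key, value in tags.items() if value and value.strip()}
    return [
        " or ".join(group)
        for group in REQUIRED_TAGS
        if not present.intersection(group)
    ]


def non_compliant(resources: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Map resource ID to its missing requirements, skipping compliant resources."""
    report = {}
    for resource_id, tags in resources.items():
        gaps = missing_tags(tags)
        if gaps:
            report[resource_id] = gaps
    return report
```

A check like this could run in a scheduled job over an inventory export, complementing the preventive controls with a detective one.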
3. Right-size EC2, RDS, and Lambda continuously
Right-sizing is the process of matching your instance type and size to actual resource consumption — not to theoretical peak. It is the highest-ROI quick win for most organizations because compute is typically the largest AWS line item.
How to right-size in 2026:
- Use AWS Compute Optimizer to get ML-powered instance recommendations based on 14 days of CloudWatch metrics. Compute Optimizer now supports EC2, EBS, Lambda, ECS on Fargate, and RDS.
- For EC2: look for instances with average CPU utilization below 40% — these are candidates for downsizing to the next smaller instance type.
- For Lambda: AWS Lambda Power Tuning (open-source tool) runs your function at multiple memory settings and finds the configuration with the best cost-performance ratio.
- For RDS: use Performance Insights to identify underutilized databases. Consider Aurora Serverless v2 for variable database workloads — it scales in 0.5 ACU increments and charges only for capacity used.
Action: Run Compute Optimizer monthly. Create a "right-sizing sprint" in your engineering team's quarterly planning to implement the top 20% of recommendations — which typically account for 80% of potential compute savings.
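The 40% CPU rule of thumb above can be sketched as a simple filter. This is an illustration under stated assumptions: the `InstanceUsage` record and the sample data are hypothetical, the averages are assumed to come from CloudWatch, and a real right-sizing decision would also weigh memory, network, and peak load, as Compute Optimizer does.

```python
from dataclasses import dataclass


@dataclass
class InstanceUsage:
    instance_id: str
    avg_cpu_percent: float  # average over the lookback window
    monthly_cost: float     # in USD


def downsizing_candidates(fleet, cpu_threshold=40.0):
    """Return instances below the CPU threshold, most expensive first."""
    under_used = [i for i in fleet if i.avg_cpu_percent < cpu_threshold]
    return sorted(under_used, key=lambda i: i.monthly_cost, reverse=True)


if __name__ == "__main__":
    # Sample data only.
    fleet = [
        InstanceUsage("i-a", 12.0, 300.0),
        InstanceUsage("i-b", 75.0, 900.0),
        InstanceUsage("i-c", 35.0, 600.0),
    ]
    for candidate in downsizing_candidates(fleet):
        print(candidate.instance_id, candidate.avg_cpu_percent, candidate.monthly_cost)
```

Sorting by cost puts the largest potential savings at the top, which fits the idea of working through the highest-impact recommendations first.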
4. Migrate eligible workloads to AWS Graviton4 instances
AWS Graviton4 (available from 2024 in select regions) delivers up to 40% better price-performance compared to equivalent x86-based instances — and up to 30% better performance than Graviton3. This is now one of the highest-leverage cost optimizations available in 2026, especially for compute-intensive workloads.
Best candidates for Graviton migration:
- Web applications running on Node.js, Python, or Java (Spring Boot)
- Containerized microservices on ECS or EKS
- Read replicas on Amazon RDS (low-risk migration)
- AWS Lambda functions (arm64 runtime is 20% cheaper than x86)
Action: Start with non-production environments and stateless services. Use AWS Compute Optimizer's Graviton migration recommendations (available in Cost Optimization Hub) to identify your top candidates. AWS has also run a free trial of t4g.small (Graviton2) instances, which is useful for testing at no cost; check whether it is still available for your account.
5. Replace on-demand pricing with Reserved Instances and Savings Plans for stable workloads
On-demand pricing is the most expensive way to run predictable, steady-state workloads. Committing to capacity through Reserved Instances (RIs) or Savings Plans is the single largest cost reduction lever for most organizations — and it requires no architectural changes.
Reserved Instances vs Savings Plans vs Spot Instances: which to use when
| Option | Max savings vs on-demand | Flexibility | Commitment | Best for |
|---|---|---|---|---|
| Standard Reserved Instances | Up to 72% | Low (fixed instance type & region) | 1 or 3 years | Steady-state EC2, RDS, ElastiCache with predictable usage |
| Convertible Reserved Instances | Up to 54% | Medium (can exchange instance family) | 1 or 3 years | Steady-state workloads that may change instance type over time |
| Compute Savings Plans | Up to 66% | High (applies across EC2, Fargate, Lambda, regions) | 1 or 3 years (hourly spend commitment) | Most organizations — best balance of savings and flexibility |
| EC2 Instance Savings Plans | Up to 72% | Low (specific instance family + region) | 1 or 3 years | High-volume, stable instance families in a single region |
| Spot Instances | Up to 90% | N/A (can be interrupted with 2-min notice) | None | Batch jobs, CI/CD pipelines, ML training, fault-tolerant workloads |
| On-Demand | Baseline (0%) | Full (start/stop any time) | None | Unpredictable or short-lived |
Action: Analyze 6–12 months of CloudWatch and Cost Explorer usage data to identify your stable "baseline" compute. Cover this baseline with Compute Savings Plans (1-year commitment for most teams). Layer Spot Instances on top for interruptible workloads. Use on-demand only for burst capacity and new workloads until their pattern stabilizes.
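Why cover only the stable baseline? The sketch below models a Savings Plan as an hourly dollar commitment that absorbs usage at a discount, with anything above it billed on demand. It is a simplification: it assumes one flat discount rate (real rates vary by instance family, region, term, and payment option), and the 25% rate and usage figures are illustrative, not AWS pricing.

```python
def hourly_cost(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Cost of one hour, given usage priced at on-demand rates.

    The commitment is paid whether or not it is used. It absorbs
    commitment / (1 - discount) dollars of on-demand-equivalent usage.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    covered = commitment / (1.0 - discount)
    overflow = max(0.0, on_demand_usage - covered)  # billed at on-demand rates
    return commitment + overflow


def total_cost(usage_by_hour, commitment: float, discount: float) -> float:
    """Sum the hourly cost over a series of hourly on-demand usage figures."""
    return sum(hourly_cost(u, commitment, discount) for u in usage_by_hour)


if __name__ == "__main__":
    usage = [10.0, 10.0, 20.0]  # sample hourly usage in on-demand dollars
    for commitment in (0.0, 7.5, 15.0):
        print(commitment, total_cost(usage, commitment, discount=0.25))
```

With these sample numbers, committing at the baseline (7.5 per hour) lowers the total, while committing for the peak (15 per hour) costs more than pure on-demand because the unused commitment is still paid.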
6. Use Spot Instances for batch, CI/CD, and ML training workloads
Spot Instances use spare AWS capacity and cost up to 90% less than on-demand. They are interrupted with a 2-minute warning when AWS needs the capacity back — which makes them inappropriate for stateful production databases but ideal for a large class of engineering workloads.
Spot Instance use cases in 2026:
- CI/CD pipelines (Jenkins, GitHub Actions runners) — each build is stateless and can restart automatically
- Machine learning training jobs on SageMaker — training can be checkpointed and resumed
- Batch data processing with AWS Batch — auto-managed retry logic handles interruptions
- Test and QA environments — non-critical by definition
- Containerized microservices on EKS/ECS — use a mix of on-demand and Spot node groups, with Spot for stateless services
Action: Enable Spot Instance diversification — request multiple instance types and sizes across multiple Availability Zones using EC2 Fleet or EKS managed node groups. This dramatically reduces interruption frequency. Use AWS Spot Placement Score to find the regions and instance families with the most available capacity.
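Diversification amounts to offering AWS many instance type and Availability Zone combinations to choose from. The sketch below only builds that list of combinations, in the `InstanceType` / `AvailabilityZone` shape that EC2 Fleet launch template overrides use; it does not call AWS. The helper name and the sample instance types and zones are hypothetical.

```python
from itertools import product


def diversified_overrides(instance_types, availability_zones):
    """Build one override per (instance type, Availability Zone) pair."""
    return [
        {"InstanceType": instance_type, "AvailabilityZone": zone}
        for instance_type, zone in product(instance_types, availability_zones)
    ]


if __name__ == "__main__":
    # Sample values only; choose types that fit your workload.
    overrides = diversified_overrides(
        ["m6i.large", "m5.large", "m6a.large"],
        ["us-east-1a", "us-east-1b"],
    )
    print(len(overrides), "capacity pools")
```

Each pair corresponds to a separate Spot capacity pool, so three instance types across two zones give the fleet six pools to draw from instead of one.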
7. Implement auto-scaling and serverless for variable workloads
Fixed-size infrastructure running 24/7 is the cloud equivalent of leaving every light in your office on overnight. Auto-scaling and serverless architectures ensure you pay only for what you actively use — which for most web applications means paying for 40–60% less compute than a fixed fleet would require.
Key AWS auto-scaling capabilities:
- EC2 Auto Scaling Groups — scale horizontally based on CPU, memory, or custom CloudWatch metrics. Use target tracking to maintain a specific average utilization.
- Application Auto Scaling — covers ECS, DynamoDB, Aurora, and Lambda Provisioned Concurrency.
- AWS Lambda — serverless functions charge per invocation and per millisecond of execution. Zero cost when idle. Ideal for event-driven, asynchronous, and infrequent workloads.
- AWS Instance Scheduler — automatically stop non-production EC2 and RDS instances outside business hours. A Dev/Test environment running 5 days × 9 hours instead of 7 days × 24 hours saves 73% of compute costs with zero refactoring.
Action: Audit every non-production environment running 24/7. Deploy AWS Instance Scheduler to stop them outside business hours this week. This is the fastest ROI action in this entire list — implementation takes 2 hours and savings start the same day.
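The 73% figure quoted for Instance Scheduler follows directly from the running hours. A minimal sketch of the arithmetic, assuming cost scales linearly with hours run (storage that persists while instances are stopped is ignored); the function name is hypothetical.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def schedule_savings(days_per_week: int, hours_per_day: int) -> float:
    """Fraction of always-on compute cost saved by a running schedule."""
    running_hours = days_per_week * hours_per_day
    if not 0 <= running_hours <= HOURS_PER_WEEK:
        raise ValueError("schedule must fit within one week")
    return 1.0 - running_hours / HOURS_PER_WEEK


if __name__ == "__main__":
    # 5 days x 9 hours = 45 of 168 hours running.
    print(f"{schedule_savings(5, 9):.1%} saved")
```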
8. Optimize S3 storage costs with Intelligent-Tiering and lifecycle policies
S3 is often the second or third largest AWS cost line item for data-heavy organizations — and it is frequently over-provisioned in expensive storage tiers. Three optimizations reduce S3 costs dramatically without touching application code.
- S3 Intelligent-Tiering: automatically moves objects between frequent-access, infrequent-access, and archive tiers based on access patterns (objects not accessed for 30 days move to the infrequent-access tier). No retrieval fees on transition. Enabled by choosing the Intelligent-Tiering storage class at upload or by transitioning existing objects with a lifecycle rule.
- S3 Lifecycle Policies: transition objects to S3-IA after 30 days, Glacier after 90 days, and delete after 365 days (or your data retention requirement). Glacier Instant Retrieval storage costs up to 68% less than S3 Standard-IA.
- S3 Storage Lens: identifies cold data, redundant buckets, and non-current object versions that are silently accumulating storage costs across accounts.
Action: Enable S3 Storage Lens across your AWS Organization. Identify the top 10 buckets by storage cost and apply Intelligent-Tiering to all buckets with objects older than 30 days. Set lifecycle rules to delete non-current object versions after 30 days — this alone commonly reduces S3 costs by 20–30%.
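The lifecycle schedule described above (S3-IA at 30 days, Glacier at 90, delete at 365, non-current versions removed after 30) can be written as a lifecycle configuration document. The sketch below builds that document as a Python dictionary in the shape the S3 lifecycle API accepts (for example, as the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration`); it does not call AWS. The function name and rule ID are hypothetical, and the day counts should be replaced by your own retention requirements.

```python
def lifecycle_configuration(ia_days=30, glacier_days=90, expire_days=365,
                            noncurrent_days=30):
    """Build an S3 lifecycle configuration for tiering and expiry."""
    # S3 requires objects to be at least 30 days old before moving to
    # Standard-IA, and the steps must happen in increasing order.
    if not 30 <= ia_days < glacier_days < expire_days:
        raise ValueError("expected 30 <= ia_days < glacier_days < expire_days")
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",      # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},      # empty prefix: whole bucket
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_days},
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
            }
        ]
    }


if __name__ == "__main__":
    import json
    print(json.dumps(lifecycle_configuration(), indent=2))
```

Review the expiry rule carefully before applying anything like this to a real bucket, since expiration permanently deletes objects.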
9. Minimize data egress costs with architecture and caching
Data egress (traffic leaving the AWS network to the internet, to end users, or to other cloud providers) is one of the most common surprise charges on AWS bills. AWS charges up to $0.09 per GB of outbound data from most regions, and inter-AZ transfer within the same region costs $0.01/GB in each direction, or $0.02/GB in total (often overlooked).
Egress cost reduction strategies:
- Amazon CloudFront (CDN): caches content at 450+ edge locations globally. Data served from CloudFront costs significantly less than data served directly from EC2 or S3. Critical for media, web assets, and API responses.
- VPC Endpoints: replace NAT Gateway for private subnet traffic to S3 and DynamoDB. Gateway endpoints for these two services carry no hourly or data processing charge, while NAT Gateway charges $0.045 per GB processed. For high-volume S3 access, this is a major saving.
- Keep dependent services in the same Availability Zone: inter-AZ traffic incurs charges in both directions. For services that can tolerate an AZ outage, co-location in one AZ eliminates these costs, though it trades away high availability.
- AWS Global Accelerator vs CloudFront: use CloudFront for cacheable content (static assets, API responses). Use Global Accelerator for dynamic, uncacheable TCP/UDP traffic requiring low latency.
Action: In AWS Cost Explorer, filter by "Data Transfer" service and identify your top egress line items. If NAT Gateway data processing is high, deploy VPC endpoints for S3 and DynamoDB immediately — this is a 30-minute change with permanent cost elimination.
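To size the NAT Gateway opportunity before making the change, the sketch below estimates the monthly data processing charge and the part that would disappear if S3 and DynamoDB traffic moved to gateway endpoints. It uses the $0.045/GB figure quoted above, which varies by region; the function names, traffic volume, and traffic share are hypothetical sample inputs.

```python
NAT_PROCESSING_USD_PER_GB = 0.045  # figure quoted in this article; varies by region


def nat_processing_cost(gb_per_month: float) -> float:
    """Monthly NAT Gateway data processing charge for a traffic volume."""
    return gb_per_month * NAT_PROCESSING_USD_PER_GB


def endpoint_savings(gb_per_month: float, s3_dynamodb_share: float) -> float:
    """Monthly processing charge avoided by routing S3/DynamoDB traffic
    through gateway endpoints instead of the NAT Gateway."""
    if not 0.0 <= s3_dynamodb_share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return nat_processing_cost(gb_per_month * s3_dynamodb_share)


if __name__ == "__main__":
    # Sample inputs: 10,000 GB per month, 60% of it bound for S3 or DynamoDB.
    print(f"NAT processing: ${nat_processing_cost(10_000):,.2f}")
    print(f"Avoidable:      ${endpoint_savings(10_000, 0.6):,.2f}")
```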
10. Hunt and eliminate ghost resources systematically
Ghost resources are AWS assets that are no longer attached to any running workload but continue to accrue charges. They are the byproduct of fast-moving engineering teams and the absence of a resource lifecycle governance process.
Most common ghost resources to audit:
- Unattached EBS volumes: persist after EC2 instance termination. Typical cost: $0.08–$0.10/GB-month, so a 1 TB unattached volume costs roughly $80–$100/month.
- Old EBS snapshots: initial full-volume snapshots are large and expensive. Implement AWS Data Lifecycle Manager to automatically archive or delete snapshots on a schedule.
- Idle Elastic IP addresses: since February 2024, AWS charges $0.005/hr (~$3.60/month) for every public IPv4 address, so an EIP that is not attached to a running instance is pure waste.
- Zombie load balancers: Application Load Balancers with no healthy targets still charge ~$16/month plus LCU hours.
- Unused Elastic Network Interfaces and old CloudWatch log groups with default retention settings (never expire).
Action: Run AWS Trusted Advisor's cost optimization checks monthly (included with Business and Enterprise support plans). For automation, use open-source tools like cloud-nuke or aws-nuke (with appropriate safeguards) to identify and delete ghost resources across all regions. Tag all temporary resources with Lifecycle: Temporary and an ExpiryDate tag, then enforce automated cleanup via Lambda.
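The tag-driven cleanup described above needs a selection step: which temporary resources are past their expiry? A minimal sketch of that step follows, assuming the ExpiryDate tag holds an ISO date such as 2026-01-31 (the article does not prescribe a format). The function name and sample resource IDs are hypothetical, and the sketch only selects resources; it deletes nothing.

```python
from datetime import date


def expired_temporary(resources, today):
    """Return IDs of resources tagged Lifecycle=Temporary whose
    ExpiryDate is before `today`.

    `resources` maps a resource ID to its tag dictionary.
    """
    expired = []
    for resource_id, tags in resources.items():
        if tags.get("Lifecycle") != "Temporary":
            continue
        raw_expiry = tags.get("ExpiryDate")
        if not raw_expiry:
            continue  # no expiry recorded; leave for manual review
        try:
            expiry = date.fromisoformat(raw_expiry)
        except ValueError:
            continue  # malformed date; leave for manual review
        if expiry < today:
            expired.append(resource_id)
    return sorted(expired)
```

Skipping resources with missing or malformed dates is a deliberate safeguard: an automated cleanup job should act only on resources it can classify with certainty.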
11. Optimize AI and ML workload costs — the 2026 priority
In 2026, AI workload costs have become a major — and fast-growing — budget line item for technology organizations. Without specific optimization, inference workloads can consume 60–80% of an ML team's AWS budget.
AWS AI/ML cost optimization best practices for 2026:
- Use AWS Inferentia2 chips for inference — purpose-built for deep learning inference, Inferentia2 delivers up to 50% lower cost-per-inference than GPU instances for popular models (Llama, Stable Diffusion, BERT).
- Use AWS Trainium2 for model training — Trainium2 offers up to 50% lower training cost than comparable GPU instances for large language models.
- Leverage Amazon SageMaker Savings Plans — up to 64% off on-demand SageMaker instance pricing for a 1- or 3-year commitment.
- Use Amazon Bedrock for inference instead of self-hosted models — for organizations running open-source models on EC2 GPU instances, switching to Bedrock's API pay-per-token model often reduces cost for low-to-medium volume inference.
- Rightsize GPU instances — Compute Optimizer now provides GPU utilization recommendations. Many ML workloads run on p3 or p4 instances at 20–30% GPU utilization; migrating to g5 or inf2 instances that match actual GPU memory requirements saves 40–60%.
Action: If your organization has GPU instances running in AWS, enable GPU utilization metrics in CloudWatch immediately. If average GPU utilization is below 60%, run Compute Optimizer and consider Inferentia2 migration for inference workloads.
12. Embed FinOps culture: make cost a first-class engineering metric
Every other best practice in this list is a tactic. FinOps is the strategy that sustains them. Without organizational processes that make cloud cost a real-time, visible metric for engineering teams, cost optimizations decay — teams re-provision oversized instances, ghost resources accumulate again, and on-demand spending creeps back up.
Building a FinOps culture in practice:
- Unit economics over total spend — shift your KPI from "total AWS bill" to "AWS cost per active user" or "cloud cost per transaction." A growing AWS bill that accompanies growing revenue is healthy; unit cost degradation is a problem. Cost Explorer custom reports and third-party tools like CloudZero or Apptio Cloudability enable this.
- Weekly cost reviews in engineering standups — 15 minutes per week reviewing cost anomalies and Trusted Advisor findings builds cost awareness organically.
- Cost gates in CI/CD pipelines — tools like Infracost can estimate the AWS cost of a Terraform plan before it is applied, surfacing cost implications in pull requests before infrastructure changes are deployed.
- FinOps Center of Excellence (CoE) — a small cross-functional team (1–2 cloud architects, a finance partner, and engineering team leads) that owns cloud cost governance, reviews commitments, and drives quarterly optimization sprints.
Action: Identify one engineer per team to be the "cloud cost champion." Give them read access to Cost Explorer with a tag-filtered view of their team's spend. Cost champions who can see their team's weekly spend report identify and fix inefficiencies 3x faster than teams relying on centralized cloud ops reviews.
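The unit-economics KPI described in this section can be computed with very little code. The sketch below derives cost per unit (for example, per active user) and its change between two periods; the function names and the sample figures are hypothetical.

```python
def cost_per_unit(aws_spend: float, units: float) -> float:
    """AWS spend divided by a business unit count (users, transactions)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return aws_spend / units


def unit_cost_change(prev_spend, prev_units, curr_spend, curr_units) -> float:
    """Relative change in unit cost between two periods (negative = improvement)."""
    previous = cost_per_unit(prev_spend, prev_units)
    current = cost_per_unit(curr_spend, curr_units)
    return (current - previous) / previous


if __name__ == "__main__":
    # Sample figures: the bill grows 20% while active users grow 60%.
    change = unit_cost_change(100_000, 50_000, 120_000, 80_000)
    print(f"Unit cost change: {change:+.0%}")
```

In this sample the total bill rises while cost per user falls, which is the healthy pattern the section describes; a rising unit cost alongside a flat bill would be the warning sign.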
How cloud migration strategy affects long-term AWS costs
AWS cost optimization does not begin after migration — it begins before the first workload moves. The architectural choices made during migration determine whether your AWS environment is inherently cost-efficient or inherently wasteful.
The lift-and-shift trap
A "lift-and-shift" migration — rehosting on-premise VMs on AWS EC2 with no changes — is the fastest migration path but the most expensive long-term outcome. Legacy applications were designed for a fixed, always-on data center cost model. They do not natively use auto-scaling, serverless functions, managed databases, or storage tiering. The result: oversized EC2 instances running 24/7, replicating your old data center's cost structure in the cloud at a higher per-unit price.
The right migration strategy for cost optimization
Before migrating any workload, assess it against the 6 R's framework and choose the strategy that maximizes long-term cost efficiency:
- Retire — decommission workloads no longer needed. The cheapest infrastructure is infrastructure you do not run.
- Rehost (lift-and-shift) — acceptable only for applications with a defined refactoring plan. Time-box technical debt cleanup to the next 6–12 months.
- Replatform — move to a managed AWS service with minimal code changes. Example: migrate on-premise SQL Server to Amazon RDS. Eliminates OS patching, storage management, and backup costs.
- Refactor — redesign for cloud-native architecture (microservices, Lambda, containers). Highest upfront effort; highest long-term cost efficiency.
- Repurchase — replace legacy software with a SaaS alternative. Converts CapEx to OpEx and eliminates hosting costs entirely.
- Retain — keep on-premise temporarily. Some workloads (regulatory, latency-sensitive) are cheaper on-premise. Hybrid architectures using AWS Direct Connect can optimize costs for these cases.
Build your secure landing zone before migrating
A Secure Landing Zone — built with AWS Control Tower or custom Infrastructure-as-Code — establishes cost governance, security baselines, and account structure before your first workload migrates. It enforces mandatory tagging, creates environment-level AWS accounts (production, staging, development) with separate cost budgets, and prevents the security retrofit costs that blindside many post-migration teams.
Kellton's cloud migration practice always begins with a Landing Zone blueprint, followed by an application portfolio assessment that assigns each workload its optimal "R" before a single VM moves.
Conclusion
Cloud migration to AWS offers transformative business benefits, but the statistics tell a different story: roughly two-thirds of projects fail, proving that “click, create” ease is a dangerous illusion. Success does not come from simply moving your data; it is defined by strategic planning and disciplined execution.
The failures we have detailed, from the costly “lift and shift” and unexpected data egress fees to the drain of the skills gap and post-migration security retrofitting, all arise from poor visibility and a rush to meet deadlines. By proactively building a Secure Landing Zone, embedding FinOps practices, and investing in continuous cost optimization, organizations can move beyond the fear of the unknown. Treat your migration as a continuous business transformation, not a one-time IT project, to unlock AWS’s full potential and secure your long-term ROI.
Frequently asked questions: AWS cost optimization 2026
Q1. What is AWS cost optimization?
AWS cost optimization is the continuous practice of reducing cloud spending while maintaining performance and reliability. It involves right-sizing EC2 instances, committing to Reserved Instances or Savings Plans for up to 72% savings, enabling auto-scaling, eliminating idle resources, and embedding FinOps governance across engineering and finance teams. The AWS Well-Architected Framework defines cost optimization as one of its six pillars.
Q2. How can I reduce my AWS bill?
To reduce your AWS bill, start with visibility: enable AWS Cost Optimization Hub and Cost Anomaly Detection. Then eliminate waste (delete ghost resources, stop non-production environments after hours). Then right-size instances using Compute Optimizer. Then commit to Reserved Instances or Savings Plans for your stable baseline compute. Finally, migrate eligible workloads to Graviton instances and use Spot Instances for interruptible workloads. This sequence typically reduces AWS spend by 25–45% within 90 days.
Q3. How much can you save with AWS Reserved Instances?
AWS Reserved Instances save up to 72% compared to on-demand pricing for a 3-year, all-upfront commitment. For a 1-year commitment, savings typically range from 30–40%. Compute Savings Plans offer up to 66% savings with more flexibility across instance families, regions, and services like Lambda and Fargate. Most organizations use both together: Savings Plans for the flexible baseline, and Standard RIs for specific high-volume instance families.
Q4. What is AWS FinOps?
AWS FinOps (Cloud Financial Operations) is an organizational practice that brings financial accountability to cloud spending. It bridges engineering, finance, and business teams to ensure every dollar of AWS spend is attributed, measured, and optimized. Core FinOps activities include resource tagging, cost allocation, unit economics tracking (cost per user, cost per transaction), and continuous rightsizing. The FinOps Foundation certifies practitioners and publishes the FOCUS specification for cloud cost data standardization.
Q5. Is AWS Cost Explorer free?
AWS Cost Explorer's basic dashboard and reporting features are included at no charge. Each API request costs $0.01. The AWS Cost Optimization Hub, which aggregates rightsizing recommendations from Compute Optimizer, Trusted Advisor, and Savings Plans Recommendations, is also free to use. AWS Trusted Advisor's cost optimization checks require a Business or Enterprise support plan.
