How We Saved 60% on Data Infrastructure by Migrating from Redshift to Apache Iceberg
Where Did We Start?
When I joined the company, I noticed we were spending heavily on Redshift, yet its performance wasn’t impressive. Tuning could have improved things, but Redshift offered nothing outstanding; to me, it felt functionally outdated.
My goal was to find a solution that:
- Maintains (or improves) performance
- Significantly reduces costs
- Provides greater flexibility and new opportunities
- Meets our data security requirements by keeping data within our AWS account
Challenges with Redshift
Beyond the high costs, we faced several challenges and missed opportunities with Redshift:
- Scaling up and down is slow and lacks flexibility.
- Since storage and compute are coupled, querying data with other tools isn’t straightforward. In most cases, extracting data to S3 is necessary, which is both time-consuming and expensive.
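To illustrate the second point, here is a minimal sketch of what exporting Redshift data to S3 looks like with `UNLOAD`. The table, bucket, and IAM role names are placeholders, not our actual setup:

```sql
-- Hypothetical example: exporting a Redshift table to S3 as Parquet.
-- Every query that needs the data outside Redshift pays this cost first.
UNLOAD ('SELECT * FROM analytics.events')
TO 's3://my-data-bucket/exports/events/'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-unload-role'
FORMAT AS PARQUET
PARTITION BY (event_date);
```

With an open table format, the data already lives in S3 in an open format, so this extra hop disappears entirely.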
Why Choose an Open Table Format (OTF), and Why Iceberg?
OTFs have gained traction in recent years. Iceberg stood out as a robust solution with contributions from multiple companies, making it less dependent on a single vendor compared to Delta Lake.
Why Athena, Spark, and Trino?
One of the key advantages of OTFs is the ability to use multiple tools on the same data.
While every data warehouse has its pros and cons, there is no “one-size-fits-all” solution. In our case, Athena is our primary transformation engine due to its cost-effectiveness, ease of use, and zero maintenance.
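As a sketch of what a transformation step looks like in Athena, here is a CTAS statement that materializes an Iceberg table (table, bucket, and column names are illustrative, not from our actual pipelines):

```sql
-- Hypothetical Athena CTAS creating a partitioned Iceberg table.
CREATE TABLE analytics.daily_orders
WITH (
  table_type   = 'ICEBERG',
  is_external  = false,
  location     = 's3://my-lake-bucket/warehouse/daily_orders/',
  format       = 'PARQUET',
  partitioning = ARRAY['day(order_ts)']
) AS
SELECT order_id, customer_id, order_ts, amount
FROM raw.orders
WHERE order_ts >= DATE '2024-01-01';
```

Because Athena is serverless and billed per data scanned, a step like this has a predictable cost and requires no cluster to maintain.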
Iceberg + Athena (Powered by Trino)
✅ The Good
✔️ Simple pricing model, making cost estimation easy
✔️ Serverless, allowing parallel query execution (with some limitations)
✔️ Good integration with AWS Glue Data Catalog
✔️ Decent SQL functionality, largely documented in Trino’s documentation
⚠️ The Bad
❌ Limited support for sorting — there is no way to define a sort order at table creation or during compaction
❌ Query performance can be slower than self-managed Trino for complex joins and aggregations
❌ No transparency on resource allocation — it’s essentially a black box
💀 The Ugly
☠️ Heavy queries (especially those with large window functions) can trigger “Resource Exhaustion” errors
☠️ Default query timeout is 30 minutes (configurable up to 240 minutes)
☠️ Many Trino functions are unavailable in Athena, and documentation of exactly which ones is hard to find
☠️ Only integrates with AWS Glue Data Catalog
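Athena does at least ship its own Iceberg maintenance statements, which soften some of the above. A minimal sketch (table name is a placeholder):

```sql
-- Compact small files via bin-packing; note that Athena's OPTIMIZE
-- cannot apply a sort order, only rewrite files by size.
OPTIMIZE analytics.daily_orders REWRITE DATA USING BIN_PACK
WHERE order_ts >= current_date - interval '7' day;

-- Expire old snapshots and clean up unreferenced files,
-- according to the table's retention properties.
VACUUM analytics.daily_orders;
```

These run as ordinary queries, so routine table maintenance can be scheduled without any extra infrastructure.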
Iceberg + Trino
✅ The Good
✔️ Easy to deploy on K8s (using Helm charts) or Amazon EMR
✔️ Advanced table management features (sorting, bucketing, etc.)
✔️ Native Iceberg support with optimizations like partition pruning and predicate pushdown
✔️ Good resource and query monitoring, including query progress indicators
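The table-management point is where Trino pulls ahead of Athena: the Trino Iceberg connector lets you declare a sort order and trigger compaction directly. A sketch, assuming an Iceberg catalog named `iceberg` (all names are illustrative):

```sql
-- Sorted, partitioned Iceberg table -- sort orders like this
-- are exactly what Athena cannot express.
CREATE TABLE iceberg.analytics.daily_orders (
  order_id    BIGINT,
  customer_id BIGINT,
  order_ts    TIMESTAMP(6),
  amount      DOUBLE
)
WITH (
  partitioning = ARRAY['day(order_ts)'],
  sorted_by    = ARRAY['customer_id']
);

-- Compact small files in place.
ALTER TABLE iceberg.analytics.daily_orders
EXECUTE optimize(file_size_threshold => '128MB');
```

Sorting by a frequently filtered column improves min/max pruning on data files, which is why this capability matters for read performance.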
⚠️ The Bad
❌ Memory-intensive queries require careful tuning
❌ Managing a Trino cluster adds operational overhead
❌ Cost estimation is difficult when running on K8s, as it depends on auto-scaling settings
💀 The Ugly
☠️ Performance issues with large-scale data writes and compactions, which can be a blocker for big-data workloads
☠️ Deploying on K8s requires fine-tuning memory and auto-scaling, and garbage collection can be problematic
Iceberg + Spark
✅ The Good
✔️ Native integration between Spark and Iceberg
✔️ Strong data processing capabilities for large-scale datasets
✔️ Supports complex transformations and analytics on Iceberg tables
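Two things we lean on Spark for are upserts and heavy compaction, both first-class in the Spark–Iceberg integration. A minimal Spark SQL sketch, assuming a catalog named `lake` and placeholder table names:

```sql
-- Upsert pattern: MERGE INTO is natively supported on Iceberg tables.
MERGE INTO lake.analytics.daily_orders t
USING updates u
  ON t.order_id = u.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Heavy compaction via Iceberg's Spark procedure -- this is where
-- Spark outperforms Trino for large rewrites.
CALL lake.system.rewrite_data_files(
  table   => 'analytics.daily_orders',
  options => map('target-file-size-bytes', '134217728')
);
```

In practice we route large writes and compactions through Spark and leave interactive reads to Athena and Trino.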
⚠️ The Bad
❌ Requires more setup and configuration compared to managed services like Athena
❌ May need additional tools for performance tuning and monitoring
💀 The Ugly
☠️ Potential compatibility issues between different versions of Spark and Iceberg
☠️ Careful version management is needed, especially when using Spark on AWS Glue
Other Options We Considered
Snowflake
- A strong candidate, but it wouldn’t reduce costs — it would likely increase them compared to Redshift
- Didn’t meet our data security requirements, as we’d have to store data in Snowflake
StarRocks
An interesting option, but after a POC in April 2024, we found:
- The solution was not mature enough for our needs
- No production-ready Helm charts, making K8s deployment difficult
- Limited SQL functionality
Although StarRocks has likely evolved since then, we felt it didn’t have the community support and stability we needed at the time.
Additional Tools in Our Stack
- Data orchestration with Airflow
- Airbyte (added during migration for data integration)
- Looker
- Redash
Final Thoughts
Migrating from Redshift to Iceberg with Athena, Trino, and Spark allowed us to cut costs by 60% while gaining flexibility. Each tool in our stack serves a specific purpose, balancing cost, performance, and scalability.
If you’re considering a similar migration, understanding the trade-offs of each tool is key. No single solution fits all use cases, but leveraging the strengths of multiple technologies can create a more efficient and cost-effective data infrastructure.
What’s Next?
In this post, I focused on why we migrated from Redshift to Iceberg with Athena, Trino, and Spark, along with the trade-offs of each tool.
In my next post, I’ll dive into the migration process itself — how we executed the transition, the challenges we faced, and the conclusions we drew from the experience. Stay tuned!