Workload

Performance tuning & optimization for Redshift → Snowflake

Redshift tuning habits (DISTKEY/SORTKEY, VACUUM/ANALYZE, WLM queues) don’t translate. We tune Snowflake layout, query shapes, and warehouse posture so pruning works, credits stay stable, and refresh SLAs hold under real concurrency.

Book assessment See migration approach

Back to pair page

At a glance

Input: Redshift Performance tuning & optimization logic
Output: Snowflake equivalent (validated)
Common pitfalls: Carrying over DIST/SORT thinking: assuming physical distribution patterns translate; Snowflake needs pruning-first predicates and evidence-driven clustering.
Clustering by folklore: clustering keys chosen without evidence from predicates/join keys.
Full-target MERGE: missing apply boundaries causes large scans and credit spikes.

Context

Why this breaks

Redshift performance tuning is often encoded in physical design and platform operations: DISTKEY/SORTKEY choices, VACUUM/ANALYZE, and WLM behavior. After migration, teams keep Redshift-era query shapes and expect Snowflake to behave similarly—then costs spike and SLAs slip because Snowflake’s cost/perf model is different. In Snowflake, credit burn is driven by warehouse usage and scan footprint, and MERGE/apply workloads can become full-target scans without bounded scope.

Common post-cutover symptoms:

Queries scan too much because predicates don’t prune effectively
Join-heavy reports reshuffle large datasets; BI latency increases
MERGE/apply jobs scan full targets; credit burn becomes unpredictable
BI refresh contends with batch loads without warehouse isolation
Improvements regress because there are no baselines or gates

Approach

How conversion works

Baseline top workloads: identify the most expensive and most business-critical queries/pipelines (dashboards, marts, incremental loads).
Diagnose root causes: pruning effectiveness, join patterns, large scans, and MERGE scope.
Tune data layout: pruning-friendly clustering where justified by access paths (not folklore).
Rewrite for bounded applies: staged apply, bounded MERGE scopes, and pruning-friendly predicates.
Warehouse posture: isolate batch vs BI warehouses, tune concurrency and sizing, and implement credit guardrails.
Regression gates: store baselines and enforce thresholds so improvements persist.

Supported constructs

Representative tuning levers we apply for Redshift → Snowflake workloads.

Source	Target	Notes
DISTKEY/SORTKEY-era performance assumptions	Pruning-first predicates + evidence-driven clustering	Replace physical tuning with scan-footprint reduction.
WLM queue and concurrency settings	Warehouse isolation + concurrency posture	Stabilize refresh SLAs under peak BI load.
UPSERT jobs and incremental loads	Bounded MERGE scopes + staged apply	Avoid full-target scans and unpredictable credit burn.
Repeated BI scans	Pre-aggregation/materialization strategy	Reduce repeated large scans and stabilize dashboards.
Vacuum/analyze maintenance habits	Evidence-based tuning via query/warehouse metrics	Use baselines and regression gates instead of maintenance folklore.
Ad-hoc expensive queries	Governance guardrails + regression gates	Prevent credit blowups from unmanaged changes.

How workload changes

Topic	Redshift	Snowflake
Primary cost driver	Cluster utilization and WLM behavior	Warehouse credits + scan footprint
Data layout impact	DIST/SORT keys can hide suboptimal SQL	Clustering is optional and evidence-driven
Incremental apply	DELETE+INSERT UPSERT conventions	Bounded MERGE/apply windows
Concurrency planning	WLM queues and limits	Warehouse isolation + concurrency policies

Primary cost driver: Pruning and bounded scans dominate credit burn.

Data layout impact: Use clustering only when predicates/join keys justify it.

Incremental apply: Correctness and cost depend on apply window design.

Concurrency planning: Peak BI refresh needs explicit warehouse posture.

Examples

Illustrative Snowflake optimization patterns after Redshift migration: enforce pruning-friendly filters, bound MERGEs, isolate warehouses, and store baselines for regression gates.

-- Pruning-first query shape (avoid wrapping filter columns)
SELECT
  region,
  SUM(revenue) AS rev
FROM MART.FACT_ORDERS
WHERE EVENT_DATE BETWEEN DATE '2025-01-01' AND DATE '2025-01-31'
GROUP BY 1;

Avoid

Common pitfalls

Carrying over DIST/SORT thinking: assuming physical distribution patterns translate; Snowflake needs pruning-first predicates and evidence-driven clustering.
Clustering by folklore: clustering keys chosen without evidence from predicates/join keys.
Full-target MERGE: missing apply boundaries causes large scans and credit spikes.
No warehouse isolation: BI and batch share warehouses; tail latency and spend spikes follow.
Over-materialization: too many intermediates without controlling refresh cost.
No regression gates: performance improves once, then regresses silently.

Proof

Validation approach

Baseline capture: runtime, scan footprint, and credits for top queries/pipelines.
Pruning checks: confirm pruning-friendly predicates and reduced scan footprint on representative parameters.
Before/after evidence: demonstrate improvements in runtime and credit burn; document tradeoffs.
Correctness guardrails: golden queries and KPI aggregates ensure tuning doesn’t change semantics.
Regression thresholds: define alerts (e.g., +30% credits or +30% runtime) and enforce via CI or scheduled checks.
Operational monitors: dashboards for warehouse utilization, credit burn, failures, and refresh SLA adherence.

Execution

Migration steps

A sequence that improves performance while protecting semantics.

01
Identify top cost and SLA drivers
Rank queries and pipelines by credits, runtime, and business criticality (dashboards, batch windows). Select a tuning backlog with clear owners.
02
Create baselines and targets
Capture current Snowflake metrics (runtime, credit burn, scan footprint) and define improvement targets. Freeze golden outputs so correctness doesn’t regress.
03
Tune layout and apply posture
Apply evidence-driven clustering where beneficial and redesign apply windows so MERGEs are bounded and pruning remains effective.
04
Rewrite for pruning and reuse
Apply pruning-aware rewrites, reduce repeated large scans with materializations where needed, and scope MERGEs to affected windows.
05
Warehouse posture and governance
Isolate batch and BI warehouses, tune concurrency, and implement guardrails to prevent credit blowups from new queries.
06
Add regression gates
Codify performance thresholds and alerting so future changes don’t reintroduce high credit burn or missed SLAs. Monitor post-cutover metrics continuously.

Workload Assessment

Replace Redshift tuning with Snowflake-native pruning

We identify your highest-cost migrated workloads, tune pruning and apply windows, and deliver before/after evidence with regression thresholds—so performance improves and stays stable.

Book assessment See migration approach

Optimization Program

Prevent credit blowups with regression gates

Get an optimization backlog, bounded apply patterns, and performance gates (credit/runtime thresholds) so future releases don’t reintroduce slow dashboards or high credit burn.

Book assessment Explore workloads

FAQ

Frequently asked questions

Why did credits increase after moving from Redshift to Snowflake?+

Most often because DIST/SORT-era query shapes and UPSERT conventions translate into larger scans in Snowflake. We tune pruning-friendly predicates, bound MERGE scopes, and isolate warehouses to stabilize credit burn.

Do we need clustering everywhere?+

No. Clustering is most useful for large tables with stable, selective predicates. We use evidence from query patterns and scan behavior before recommending clustering keys.

Can you optimize MERGE/apply pipelines too?+

Yes. We scope MERGEs to affected windows, design staging boundaries, and validate performance with credit/runtime baselines to prevent unpredictable scans.

How do you prevent BI refresh from impacting batch pipelines?+

We isolate warehouses (or use workload-aware sizing and concurrency limits) so BI and batch don’t contend. Monitoring and regression gates catch credit spikes and SLA regressions early.

Redshift to Snowflake migration

Migration Pair

End-to-end approach: what breaks, validation gates, and cutover plan.

View page

ETL / pipeline migration

Workload

Migrate Redshift pipelines with restartability and bounded MERGE patterns.

View page

SQL / query migration

Workload

Convert Redshift SQL to Snowflake with semantic parity and golden-query validation.

View page