Snowflake SQL queries to BigQuery
Translate Snowflake-specific constructs—QUALIFY, VARIANT/FLATTEN, IFF/DECODE, time semantics, and MERGE/upsert patterns—into BigQuery Standard SQL with validation gates that catch semantic drift.
- Input
- Snowflake SQL / query migration logic
- Output
- BigQuery equivalent (validated)
- Common pitfalls
- QUALIFY drift: window ordering is underspecified (no stable tie-breaker), so 'top-1' rows change under retries or parallelism.
- JSON type surprises: values are extracted as strings (or NULL) and silently change filter selectivity unless cast at the extraction boundary.
- Timezone assumptions: session timezone and NTZ/LTZ behavior are implicit in Snowflake; BigQuery requires explicit timezone conversions when business meaning depends on local time.
Why Snowflake → BigQuery SQL breaks in real migrations
Most Snowflake SQL estates contain implicit correctness rules—NULL handling, casting, timezone assumptions, and semi-structured extraction—that can compile in BigQuery yet drift in meaning. The goal is not just syntactic translation, but a provable parity contract for your highest-impact queries.
Common symptoms after cutover:
- KPI drift from unstable QUALIFY/window ordering or missing tie-breakers
- JSON/VARIANT extraction returns different types (string vs numeric) and breaks filters
- Timestamp logic shifts because session timezone was implicit in Snowflake
- Full-table scans and cost spikes from missing partition pruning patterns
How conversion works
We convert Snowflake SQL to BigQuery by combining deterministic rewrite rules with a review-and-validate loop for the constructs that commonly carry hidden business meaning.
- Inventory & classify queries/views/models by consumer (BI, ELT, app), complexity, and risk patterns (QUALIFY, JSON, timezone, MERGE).
- Rewrite with rule-anchored mappings: function equivalents, identifier/quoting normalization, and explicit cast strategies.
- Apply pattern libraries for QUALIFY/window filters, FLATTEN→UNNEST, VARIANT→JSON, and timezone normalization.
- Validate & gate with compilation, catalog checks, and golden-query parity + regression tests for edge windows.
Supported constructs
Representative constructs we commonly handle for Snowflake → BigQuery SQL conversion (exact coverage depends on your estate).
| Source | Target | Notes |
|---|---|---|
| QUALIFY + windowed filters | QUALIFY (BigQuery) with deterministic ORDER BY | Tie-breakers enforced to avoid nondeterministic drift. |
| VARIANT access (col:path / GET / PARSE_JSON) | JSON_VALUE/JSON_QUERY + explicit casts | Extraction boundary defines types; ambiguous paths are review-marked. |
| FLATTEN + LATERAL joins | UNNEST patterns (INNER/LEFT as required) | Empty arrays preserved when needed (LEFT JOIN UNNEST). |
| LISTAGG / STRING aggregation | STRING_AGG with ORDER BY | Ordering preserved; distinct/NULL handling made explicit. |
| IFF / DECODE / NVL | IF / CASE / COALESCE | Explicit casts added where BigQuery coercion differs. |
| DATEADD/DATEDIFF/DATE_TRUNC | DATE_ADD/DATE_DIFF/DATE_TRUNC | Unit semantics normalized; timestamp vs date handled explicitly. |
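As an illustration of the FLATTEN → UNNEST row above, here is a hedged sketch; the table and column names (`orders`, `line_items`, the `$.sku` path) are hypothetical placeholders, so adjust them to your schema:

```sql
-- Snowflake: LATERAL FLATTEN explodes an array inside a VARIANT column
-- SELECT o.order_id, f.value:sku::STRING AS sku
-- FROM orders o, LATERAL FLATTEN(input => o.line_items) f;

-- BigQuery sketch: LEFT JOIN UNNEST keeps orders whose array is empty,
-- matching FLATTEN(..., OUTER => TRUE); use a plain comma/INNER join
-- to match the default (inner) FLATTEN behavior.
SELECT
  o.order_id,
  JSON_VALUE(item, '$.sku') AS sku
FROM orders AS o
LEFT JOIN UNNEST(JSON_QUERY_ARRAY(o.line_items)) AS item;
```

The INNER-vs-LEFT choice is a semantic decision, not a syntax detail: inner FLATTEN drops parents with empty arrays, and a comma join with UNNEST does the same, so the conversion has to match the original intent explicitly.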
How the workload changes in BigQuery
| Topic | Snowflake | BigQuery |
|---|---|---|
| Dialect + functions | Snowflake SQL + rich VARIANT helpers | BigQuery Standard SQL + JSON functions |
| Semi-structured | VARIANT + FLATTEN patterns | JSON + UNNEST patterns |
| Performance model | Micro-partitions + clustering effects | Partition pruning + clustering + slot usage |
Snowflake patterns → BigQuery equivalents
Examples show deterministic windowing, VARIANT/JSON extraction, and FLATTEN→UNNEST rewrites. Adjust paths and types to your schema.
```sql
-- Snowflake: QUALIFY + deterministic dedupe
SELECT *
FROM events
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY business_key
  ORDER BY event_ts DESC, src_lsn DESC NULLS LAST, ingested_at DESC
) = 1;
```
Make ordering deterministic
If a query uses QUALIFY or windowed dedupe, enforce explicit tie-breakers (event time + stable offset + ingest time). BigQuery will happily execute an underspecified ORDER BY—until results drift.
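BigQuery also supports QUALIFY, so the Snowflake dedupe above carries over almost verbatim; this sketch reuses the same hypothetical `events` columns. Two details are worth spelling out: BigQuery's documented QUALIFY form expects an accompanying WHERE, GROUP BY, or HAVING clause (a `WHERE TRUE` satisfies it), and NULL ordering defaults differ between engines (Snowflake sorts NULLs first under DESC, BigQuery sorts them last), so writing NULLS LAST keeps the tie-break contract explicit on both sides:

```sql
-- BigQuery: deterministic dedupe with QUALIFY
SELECT *
FROM events
WHERE TRUE  -- satisfies QUALIFY's accompanying-clause requirement
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY business_key
  ORDER BY event_ts DESC, src_lsn DESC NULLS LAST, ingested_at DESC
) = 1;
```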
Treat JSON types as a contract
Snowflake VARIANT access frequently relies on implicit casts. In BigQuery, choose JSON_VALUE vs JSON_QUERY intentionally and cast to the target scalar type at the extraction boundary.
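A hedged sketch of that extraction boundary, assuming a hypothetical `payload` column and JSON paths:

```sql
-- Snowflake: payload:user.age::INT casts implicitly at access time.
-- BigQuery: JSON_VALUE returns a STRING scalar (or NULL); cast it
-- explicitly, and use JSON_QUERY for subtrees/arrays.
SELECT
  SAFE_CAST(JSON_VALUE(payload, '$.user.age') AS INT64) AS user_age,
  JSON_QUERY(payload, '$.user.address')       AS address_subtree
FROM events;
```

SAFE_CAST returns NULL instead of erroring on malformed values, which roughly mirrors Snowflake's TRY_CAST; use plain CAST when you want bad data to fail loudly instead of silently thinning a filter.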
Common pitfalls to avoid
- Cost regression: missing partition filters and join reordering cause unbounded scans; prune-aware rewrites should be part of conversion, not an afterthought.
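For the timezone pitfall, the converted query has to name the business timezone explicitly; in this sketch `America/New_York` is an assumed example, not a recommendation:

```sql
-- BigQuery: bucket a TIMESTAMP into local business days explicitly,
-- instead of relying on a session timezone as Snowflake NTZ/LTZ logic does.
SELECT
  DATE(event_ts, 'America/New_York') AS local_day,
  COUNT(*) AS events
FROM events
GROUP BY local_day;
```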
Validation and reconciliation gates
We validate SQL conversion as a repeatable system: compile correctness, semantic parity for golden queries, and regression protection for edge cases (ties, late data windows, null-heavy cohorts).
Execution checks
- Compiles under BigQuery Standard SQL
- Explicit casts + deterministic ordering enforced
Parity checks
- Golden-query output parity (agreed windows + parameters)
- KPI aggregates by key dimensions; checksum aggregates for critical facts
- Sampling diffs on edge cohorts (ties, null-heavy segments, timezone boundary days)
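One way to implement the checksum gate is to run the same shaped aggregate on both engines over the agreed window and diff the outputs; `orders`, the date window, and the grouping column below are placeholders:

```sql
-- BigQuery side of a parity check (the Snowflake side runs an
-- equivalent hash aggregate over identical keys and window).
SELECT
  region,
  COUNT(*)    AS row_cnt,
  SUM(amount) AS amount_sum,
  BIT_XOR(FARM_FINGERPRINT(CAST(order_id AS STRING))) AS key_checksum
FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY region
ORDER BY region;
```

An order-insensitive aggregate like BIT_XOR over per-row fingerprints catches missing or duplicated rows without requiring both engines to return rows in the same order.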
Performance checks
- Partition pruning verified on top queries
- Cost/latency baselines captured; regression thresholds defined
Migration steps
- 01
Inventory and prioritize the SQL estate
Collect BI extracts, view DDL, dbt/ELT models, and app-embedded SQL. Rank by business impact, frequency, and risk patterns (QUALIFY, VARIANT, timezones, MERGE).
- 02
Define the parity contract
Lock in semantic expectations: NULL rules, casting strategy, timezone contract, tie-breakers for window logic, and JSON typing. Select a ‘golden query’ set for sign-off.
- 03
Convert with rules + review markers
Apply deterministic mappings for the common cases; flag ambiguous intent (implicit casts, JSON path uncertainty, collation expectations) with inline review markers.
- 04
Run validation gates
Compile in BigQuery, run smoke tests, execute golden queries, and compare aggregates/checksums over agreed windows. Fail fast on drift and fix at the pattern level.
- 05
Performance-safe rewrites
Add pruning-aware patterns (partition filters, join order, pre-aggregation) so converted queries don’t become unbounded scans. Confirm cost/latency with representative volumes.
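A pruning-aware rewrite usually means filtering directly on the partitioning column with a simple range predicate; this sketch assumes a hypothetical `events` table partitioned on `event_ts`:

```sql
-- Before (scans every partition):
-- SELECT user_id, COUNT(*) FROM events
-- WHERE user_id = @user_id GROUP BY user_id;

-- After: a direct range predicate on the partition column enables pruning.
SELECT user_id, COUNT(*) AS cnt
FROM events
WHERE event_ts >= TIMESTAMP('2024-01-01')
  AND event_ts <  TIMESTAMP('2024-02-01')
  AND user_id = @user_id
GROUP BY user_id;
```

Verifying the rewrite is cheap: a dry run reports bytes scanned before and after, which makes the pruning effect part of the evidence trail rather than an assumption.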
We inventory your Snowflake SQL estate, convert a representative slice, and produce parity evidence on golden queries—plus a risk register for the constructs that carry business meaning.
Get a conversion plan, review markers, and validation artifacts (compile + parity + regression) so cutover is gated by evidence and rollback criteria.