Workload

Validation & reconciliation for Snowflake -> BigQuery

Turn "looks close" into a measurable parity contract. We prove correctness for batch and incremental systems with golden queries, KPI diffs, and integrity simulations-then gate cutover with rollback-ready criteria.

At a glance
Input
Snowflake Validation & reconciliation logic
Output
BigQuery equivalent (validated)
Context

Why this breaks

Most migration failures aren't syntax problems; they're undetected semantic drift. Teams cut over because queries compile, dashboards "look right," and spot checks pass. Weeks later, KPIs diverge because the correctness rules were implicit and never stress-tested.

Common drift drivers in Snowflake -> BigQuery:

  • Implicit time contracts: session timezone/NTZ behavior vs BigQuery’s explicit conversions
  • Type coercion differences: CASE/COALESCE branches and join keys silently cast differently
  • Windowing ambiguity: top-1/QUALIFY logic without deterministic tie-breakers
  • Semi-structured extraction: VARIANT access relies on implicit casts; BigQuery requires explicit JSON typing
  • Incremental behavior: retries, reprocessing windows, and late-arrival corrections weren’t validated

Validation must treat the system as incremental + operational, not a one-time batch.

Approach

How conversion works

  1. Define the parity contract: which outputs must match (tables, KPIs, dashboards) and what tolerances apply (exact vs threshold).
  2. Build the validation dataset: golden inputs, edge cohorts (ties, null-heavy segments), and representative time windows (including boundary days).
  3. Run execution gates: compile + run checks, schema/type alignment, dependency readiness.
  4. Run parity gates: KPI aggregates, checksum aggregates, dimensional rollups, and targeted row-level sampling diffs.
  5. Validate incremental integrity (where applicable): idempotency reruns, late-arrival injections, and backfill simulations.
  6. Gate cutover: define pass/fail thresholds, canary strategy, rollback triggers, and monitoring for post-cutover regressions.
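The tolerance model from step 1 can be sketched in code. This is a minimal illustration, not a prescribed schema: the metric names, tolerance kinds, and thresholds below are assumptions you would replace with your own contract.

```python
# Hypothetical parity-contract sketch; metric names and thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class KpiTolerance:
    metric: str
    kind: str            # "exact", "abs" (absolute), or "pct" (relative to source)
    threshold: float = 0.0

def within_tolerance(tol: KpiTolerance, source: float, target: float) -> bool:
    """Return True when the target value satisfies the contract for this metric."""
    diff = abs(source - target)
    if tol.kind == "exact":
        return diff == 0
    if tol.kind == "abs":
        return diff <= tol.threshold
    if tol.kind == "pct":
        # Relative tolerance against the source value; a zero source must match exactly.
        return diff == 0 if source == 0 else diff / abs(source) <= tol.threshold
    raise ValueError(f"unknown tolerance kind: {tol.kind}")

# Example contract: counts must match exactly, revenue may drift up to 0.1%.
contract = [
    KpiTolerance("order_count", "exact"),
    KpiTolerance("gross_revenue", "pct", 0.001),
]
```

Writing the contract down as data (rather than prose) is what makes threshold disputes resolvable: the gate either passes or it doesn't.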

Supported constructs

Representative validation and reconciliation mechanisms we apply in Snowflake -> BigQuery migrations.

  • Golden queries / reports -> golden query harness + repeatable parameter sets. Codifies business sign-off into runnable tests.
  • Row counts and profiles -> partition-level counts + null/min/max/distinct profiles. Cheap, high-signal early drift detection.
  • KPI validation -> aggregate diffs by key dimensions + tolerance thresholds. Aligns validation with business outcomes.
  • Row-level diffs -> targeted sampling diffs + edge cohort tests. Use deep diffs only where aggregates signal drift.
  • Incremental pipelines -> idempotency reruns + late-arrival/backfill simulations. Validates behavior under retries and late data.
  • Operational sign-off -> canary gates + rollback criteria + monitors. Prevents a "successful cutover" from becoming a long tail of KPI debates.

How workload changes

  • Drift drivers: Snowflake often tolerates session-level time & implicit casting; BigQuery requires explicit timezone conversions and casts. Validation must focus on these explicitness boundaries.
  • Cost of validation: Snowflake bills warehouse credits (some diff styles are cheaper); BigQuery bills bytes scanned + slot time, so design layered gates. Start with aggregates/profiles before deep diffs.
  • Incremental semantics: Snowflake Streams/Tasks often encode retry/late behavior; in BigQuery that behavior must be simulated explicitly. Reruns/late injections are mandatory gates.
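The layered-gate principle can be sketched as a short harness: run the cheap checks first and stop at the first failure, so expensive deep diffs only run when everything upstream passes. The gate names and zero-argument checks here are placeholders.

```python
# Sketch of a layered gate runner; gate names and check callables are illustrative.
def run_gates(gates):
    """gates: ordered list of (name, zero-arg check) pairs, cheapest first.
    Returns (passed, failed_gate_name)."""
    for name, check in gates:
        if not check():
            # Stop early: don't pay for deep diffs after a cheap check fails.
            return False, name
    return True, None

# Usage: counts/profiles first, KPI diffs next, row-level diffs last.
ok, failed = run_gates([
    ("row_counts", lambda: True),
    ("kpi_diffs", lambda: True),
    ("row_level_diffs", lambda: True),
])
```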

Examples

Illustrative patterns for parity checks in BigQuery. Replace datasets, keys, and KPI definitions to match your migration.

-- Partition/window row counts (BigQuery)
SELECT
  DATE(event_ts) AS d,
  COUNT(*) AS row_count  -- `rows` is a reserved word in BigQuery; use a safe alias
FROM `proj.mart.fact_orders`
WHERE DATE(event_ts) BETWEEN @start_date AND @end_date
GROUP BY 1
ORDER BY 1;
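Once the same per-day counts are pulled from each system, the comparison itself is simple. A Python sketch, assuming each system's results have been exported as date -> count mappings (the dict inputs below stand in for real query results):

```python
# Illustrative partition-count reconciliation; inputs stand in for query results.
def diff_counts(source: dict, target: dict) -> list:
    """Return (partition, source_count, target_count) for every partition that differs.
    Partitions missing on one side count as 0, so gaps surface as mismatches."""
    mismatches = []
    for key in sorted(set(source) | set(target)):
        s, t = source.get(key, 0), target.get(key, 0)
        if s != t:
            mismatches.append((key, s, t))
    return mismatches
```

An empty result means Gate 2's count check passes for the window; any mismatch names the exact partition to investigate before escalating to deeper diffs.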
Avoid

Common pitfalls

  • Over-reliance on spot checks: a few sampled rows miss drift in edge cohorts and ties.
  • No tolerance model: teams argue over “small” diffs because thresholds were never defined.
  • Ignoring incremental behavior: parity looks fine on a static snapshot but fails under retries/late data.
  • Unstable ordering: window functions without complete ORDER BY; results drift under parallelism.
  • Wrong level of comparison: comparing raw rows when the business cares about KPI rollups (or vice versa).
  • Cost-blind validation: exhaustive row-level diffs can become expensive; use layered gates (cheap->deep).
Proof

Validation approach

Gate set (layered)

Gate 0 - Readiness

  • BigQuery datasets, permissions, and target schemas exist
  • Dependencies (UDFs/routines, reference tables) are deployed

Gate 1 - Execution

  • Converted jobs compile and run reliably
  • Deterministic ordering + explicit casts enforced in high-risk areas

Gate 2 - Structural parity

  • Row counts by partition/window
  • Null/min/max/distinct profiles for key columns
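A minimal sketch of such a profile for one column, assuming the values have already been fetched; in practice you would compute the same statistics in SQL on both sides to avoid moving data:

```python
# Sketch of a cheap Gate 2 column profile; the field names are illustrative.
def profile(values: list) -> dict:
    """Null/min/max/distinct profile for one column's values."""
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "distinct": len(set(non_null)),
    }
```

Comparing these five numbers per key column between Snowflake and BigQuery catches most type-coercion and null-handling drift before any row-level work.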

Gate 3 - KPI parity

  • KPI aggregates by key dimensions
  • Top-N rankings and funnel metrics validated on edge windows

Gate 4 - Integrity (incremental systems)

  • Idempotency: rerun same batch -> no net change
  • Late-arrival: inject late updates -> only expected rows change
  • Backfill: historical windows replay without drift
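The idempotency gate can be sketched with an order-independent checksum captured before and after rerunning the same batch; equal checksums mean no net change. Hashing each row's `repr` is a simplifying assumption here, a real harness would use a stable serialization of the row's columns.

```python
# Sketch of a Gate 4 idempotency check; row encoding via repr() is illustrative.
import hashlib

def table_checksum(rows: list) -> str:
    """Order-independent checksum: hash each row, then hash the sorted digests."""
    digests = sorted(hashlib.sha256(repr(r).encode()).hexdigest() for r in rows)
    return hashlib.sha256("".join(digests).encode()).hexdigest()

def idempotent(before_rows: list, after_rows: list) -> bool:
    """Rerunning the same batch should produce no net change in the target table."""
    return table_checksum(before_rows) == table_checksum(after_rows)
```

The same checksum comparison, restricted to the affected keys, doubles as the "only expected rows change" assertion for late-arrival injections.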

Gate 5 - Cutover & monitoring

  • Canary rollout criteria + rollback triggers
  • Post-cutover monitors: latency, scan bytes/slot time, diff sentinels on critical KPIs
Execution

Migration steps

A practical sequence for making validation fast, repeatable, and dispute-proof.
  1. Define the parity contract
     Decide what must match (tables, dashboards, KPIs), at what granularity, and with what tolerance thresholds. Identify golden queries and business sign-off owners.

  2. Create validation datasets and edge cohorts
     Select representative time windows, boundary days, and cohorts that trigger edge behavior (ties, null-heavy segments, timezone boundaries, schema drift).

  3. Implement layered gates
     Start with cheap checks (counts/profiles), then KPI diffs, then deep diffs only where needed. Codify gates into runnable jobs so validation is repeatable.

  4. Validate incremental integrity
     Run idempotency reruns, late-arrival injections, and backfill simulations. These are the scenarios that typically break after cutover if not tested.

  5. Gate cutover and monitor
     Establish canary/rollback criteria and post-cutover monitors for critical KPIs and pipeline health (latency, failures, scan bytes/slot time).

Workload Assessment
Make parity measurable before you cut over

We define your parity contract, build the golden-query set, and implement layered reconciliation gates, including reruns and late-data simulations, so KPI drift is caught before production cutover.

Cutover Readiness
Gate cutover with evidence and rollback criteria

Get a validation plan, runnable gates, and sign-off artifacts (diff reports, thresholds, monitors) so Snowflake -> BigQuery cutover is controlled and dispute-proof.

FAQ

Frequently asked questions

Do we need row-level diffs for everything?
Usually no. A layered approach is faster and cheaper: start with counts/profiles and KPI aggregates, then do targeted row-level diffs only where aggregates signal drift or for high-risk entities.
How do you handle small KPI differences?
We define tolerance thresholds up front (exact vs %/absolute). If differences exceed thresholds, we trace back using dimensional rollups and targeted sampling to isolate the drift source.
What if the system is incremental with late-arriving data?
Then validation must include simulations: idempotency reruns, late-arrival injections, and backfill windows. These gates prove the migrated system behaves correctly under operational stress.
How does validation tie into cutover?
We turn gates into cutover criteria: pass/fail thresholds, canary rollout, rollback triggers, and post-cutover monitors. Cutover becomes an evidence-based decision, not a debate.