Migration

Teradata → BigQuery migration

Move Teradata workloads (SQL/BTEQ scripts, macros, stored procedures, volatile tables, and WLM-shaped concurrency) to BigQuery with predictable conversion and verified parity. SmartMigrate makes semantic and performance differences explicit, produces reconciliation evidence you can sign off on, and gates cutover with rollback-ready criteria—so production outcomes are backed by proof, not optimism.

At a glance
Scope
  • Query and schema conversion
  • Semantic and type alignment
  • Validation and cutover readiness
Risk areas
  • Teradata-specific SQL semantics (QUALIFY, TOP, set ops, date logic)
  • Primary Index, AMP distribution, and stats assumptions
  • Volatile tables and session-based workflows
Deliverables
  • Prioritized execution plan
  • Parity evidence and variance log
  • Rollback-ready cutover criteria
Qualifier

Is this migration right for you?

This approach is ideal if
  • You have a mature Teradata estate with hundreds/thousands of queries and BI dependencies
  • You rely on macros, procedures, and BTEQ/TPT orchestration in production
  • Performance and concurrency predictability matter for SLAs
  • You require provable parity and reconciliation evidence before cutover
  • You want cost and performance governance in BigQuery from day one
This approach may not be a fit if
  • You’re doing a simple one-time table copy with no semantic dependency on existing queries
  • You don’t need reconciliation evidence or rollback guarantees
  • There are no critical consumers depending on strict output parity
Risk map

What breaks in a Teradata → BigQuery migration

These are the common “gotchas” that cause silent result drift, operational breaks, or cost/performance surprises if not handled explicitly.
  • Teradata-specific SQL semantics (QUALIFY, TOP, set ops, date logic)

    Teradata patterns look familiar but behave differently in BigQuery around filtering, ranking, NULL handling, and date/timestamp semantics—often surfacing as “same query, different KPI.” (See the rewrite sketch after this list.)

  • Primary Index, AMP distribution, and stats assumptions

    Teradata performance depends on PI choices, AMP distribution, and collected statistics. BigQuery performance is driven by partitioning/clustering, join strategy, and scan economics. If you port queries without a physical design plan, performance can swing wildly.

  • Volatile tables and session-based workflows

    Many Teradata pipelines rely on volatile tables and session logic. These need deliberate redesign (staging datasets, temp tables, materialization strategy) or workloads break. (A temp-table equivalent is sketched after this list.)

  • Macros, stored procedures, and operational SQL (BTEQ/TPT)

    Teradata estates are rarely “just SQL.” Macros, stored procedures, BTEQ scripts, FastLoad/MultiLoad/TPT jobs, and scheduler glue must be migrated with an execution plan—or production orchestration fails post-cutover.

  • Concurrency and workload management mismatch

    Teradata workload management patterns don’t translate 1:1. Without a plan for BigQuery concurrency (on-demand vs reservations, slot contention, BI bursts), SLAs become unpredictable.

  • Hidden BI coupling and “business meaning” drift

    Dashboards often depend on undocumented behavior (rounding, casting, ordering, time semantics). If you don’t lock a “parity contract,” correctness becomes a debate at cutover.
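
A minimal sketch of two such rewrites, using hypothetical table and column names (sales, customer_id, sale_ts, amount):

  -- Teradata: TOP plus QUALIFY-based dedup
  SELECT TOP 10 * FROM sales ORDER BY amount DESC;

  SELECT *
  FROM sales
  QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY sale_ts DESC) = 1;

  -- BigQuery: TOP becomes LIMIT, and ordering must be explicit to be deterministic
  SELECT * FROM sales ORDER BY amount DESC LIMIT 10;

  -- BigQuery supports QUALIFY, but only when WHERE, GROUP BY, or HAVING is present;
  -- WHERE TRUE is the usual idiom when no other filter exists
  SELECT *
  FROM sales
  WHERE TRUE
  QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY sale_ts DESC) = 1;

  -- Teradata volatile staging such as
  --   CREATE VOLATILE TABLE stage AS (SELECT ...) WITH DATA ON COMMIT PRESERVE ROWS;
  -- maps to a temp table inside a BigQuery script or session:
  CREATE TEMP TABLE stage AS
  SELECT customer_id, SUM(amount) AS total_amount
  FROM sales
  GROUP BY customer_id;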

Mitigation

Every breakage above follows the same mitigation pattern: assess the affected workloads, rewrite where needed, then validate the results with parity checks.
Flow

Migration Flow

Extract → Plan → Convert → Reconcile → Cutover to BigQuery, with exception handling, validation gates, and a rollback path

[Flow diagram: SmartMigrate’s end-to-end migration pipeline]
Conversion

SQL & Workload Conversion Overview

Teradata → BigQuery migration is not just “SQL translation.” The objective is to preserve business meaning while aligning to BigQuery’s execution model and cost structure. SmartMigrate converts what is deterministic, flags ambiguity, and structures the remaining work so engineering teams can resolve exceptions quickly.

What we automate vs. what we flag:

  • Automated:
    Common Teradata SQL patterns into BigQuery SQL, routine DDL mapping, straightforward join/aggregation conversions, safe function rewrites, and baseline view conversion.
  • Flagged as “review required”:
    QUALIFY + window edge cases, implicit casts, NULL-sensitive logic, volatile table workflows, macros/procedures, and performance-sensitive query shapes (an implicit-cast example follows this list).
  • Manual by design:
    Final partitioning/clustering strategy, hot-query rewrites, macro/procedure execution decisions, and concurrency/slot strategy.
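
For instance, one pattern that lands in the “review required” bucket is implicit casting. The sketch below uses a hypothetical orders table with a numeric order_id:

  -- Teradata: implicit casting lets a string literal compare against a numeric column
  SELECT * FROM orders WHERE order_id = '12345';

  -- BigQuery: comparing INT64 to STRING fails at compile time; the cast must be
  -- explicit, and SAFE_CAST returns NULL instead of erroring on malformed values
  SELECT * FROM orders WHERE order_id = SAFE_CAST('12345' AS INT64);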
Risk

Common failure modes

  • QUALIFY and ranking drift
    Analytic filters and window logic translate syntactically but change edge-case outputs.
  • PI/AMP assumptions carried over
    Teradata physical tuning is treated as “schema,” and BigQuery performance collapses.
  • Volatile table workflow breaks
    Session-based pipelines lose their staging semantics and fail mid-stream.
  • Macro/procedure gaps
    Macros and stored procedures are deferred, and operational workloads break after cutover.
  • Load tool mismatch
    FastLoad/MultiLoad/TPT patterns aren’t re-homed cleanly, causing slow loads and fragile backfills.
  • Implicit CAST behavior
    Differences in numeric/timestamp casting silently change aggregates and join matches (see the division sketch after this list).
  • Stats-driven plan surprises
    Teradata stats expectations don’t carry; BigQuery needs partitioning/clustering/materialization choices.
  • Cost model whiplash
    Queries tuned for Teradata become expensive in BigQuery due to scan patterns and missing pruning.
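
One concrete instance of implicit-cast drift is division semantics, which differ between the two engines:

  -- Teradata: integer / integer performs integer division
  SELECT 1/2;        -- returns 0

  -- BigQuery: / on integers returns FLOAT64; DIV() preserves truncating semantics
  SELECT 1/2;        -- returns 0.5
  SELECT DIV(1, 2);  -- returns 0

Any ratio, average, or allocation built on this pattern changes value silently unless the rewrite uses DIV() or the parity contract accepts the new semantics.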
Proof

Validation and reconciliation you can sign off on

In a Teradata → BigQuery migration, success must be measurable. We validate correctness in layers: first ensuring translated workloads compile and execute reliably, then proving that outputs match expected business meaning via reconciliation.

Validation is driven by pre-agreed thresholds and a defined set of golden queries and datasets. This makes sign-off objective: when reconciliation passes, cutover is controlled; when it fails, you get a precise delta report that identifies where semantics, type mapping, or query logic needs adjustment.

Checks included (typical set):

  • Row counts by table and key partitions where applicable
  • Null distribution + basic profiling (min/max, distinct counts where appropriate)
  • Checksums/hashes for stable subsets where feasible (sketched below)
  • Aggregate comparisons by key dimensions (day, region, customer/product keys)
  • Sampling diffs: top-N, edge partitions, known corner cases
  • Query result parity for golden queries (reports and KPI queries)
  • Post-cutover SLA monitoring plan (latency, bytes scanned, failure rates, concurrency/slots)
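
One sketch of the checksum approach, using a hypothetical dataset.events table (it assumes rows serialize identically on both sides, which type mapping can affect):

  -- Row count plus an order-independent content hash per day
  SELECT
    DATE(event_ts) AS event_day,
    COUNT(*) AS row_count,
    BIT_XOR(FARM_FINGERPRINT(TO_JSON_STRING(t))) AS content_hash
  FROM dataset.events AS t
  GROUP BY event_day;

Running the same aggregation over the baseline extract and the migrated table yields per-day pairs that either match or pinpoint the partitions to diff further.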
Cost & speed

Performance optimization in BigQuery

Partitioning strategy (ingestion-time or column-based):
Align partition keys to dominant filters to improve pruning and reduce bytes scanned.
Clustering on high-selectivity columns:
Improve performance for common predicates and joins when partitions alone are not sufficient.
Materialized views and summary tables:
Stabilize BI workloads and reduce repeated full-scan aggregations.
Concurrency and slot planning:
Decide on on-demand vs reservations based on workload mix and required predictability.
Cost hygiene via selectivity and pruning:
Encourage explicit column selection, predicate placement, and partition filters.
Query plan observability:
Use query logs and INFORMATION_SCHEMA to detect regressions and validate tuning outcomes (see the sketch below).
Join strategy tuning:
Reduce large intermediate results by managing join cardinality and filter ordering.
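
A minimal sketch of the first two levers plus query-log observability; the dataset.sales schema is hypothetical, and the region qualifier depends on where your jobs run:

  -- Partitioned + clustered target table
  CREATE TABLE dataset.sales
  (
    sale_ts     TIMESTAMP,
    customer_id INT64,
    region      STRING,
    amount      NUMERIC
  )
  PARTITION BY DATE(sale_ts)
  CLUSTER BY customer_id, region
  OPTIONS (require_partition_filter = TRUE);  -- every query must prune partitions

  -- Find the most expensive queries of the last 7 days
  SELECT user_email, query, total_bytes_processed
  FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
  WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  ORDER BY total_bytes_processed DESC
  LIMIT 20;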
Do the work

Teradata → BigQuery migration checklist

  • Parity contract exists
    Do you have signed-off golden queries/reports + thresholds (including QUALIFY/window edge cases and casting/time behavior) before conversion starts?
  • Operational SQL is in scope (not “later”)
    Do you have a plan for macros, procedures, BTEQ scripts, and load jobs (TPT/FastLoad/MultiLoad) with operational parity?
  • Volatile-table workflows are accounted for
    Have you identified session-based staging patterns and decided the BigQuery equivalents (temp tables, staging datasets, materialization)?
  • Performance strategy replaces PI/AMP assumptions
    Have you defined how partitioning/clustering/materialization will replace Teradata physical tuning for hot workloads?
  • Cutover is rollback-safe under real concurrency and cost
    Are the parallel run, canary gates, rollback criteria, and BigQuery guardrails (bytes scanned, slots, latency) ready?
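
A cutover gate from the last item can be as small as one query that must return zero rows. The sketch below compares daily revenue in the migrated table against a hypothetical baseline snapshot exported from Teradata, with an illustrative 0.01% tolerance:

  WITH bq AS (
    SELECT DATE(sale_ts) AS d, SUM(amount) AS revenue
    FROM dataset.sales
    GROUP BY d
  ),
  base AS (
    SELECT d, revenue
    FROM dataset.teradata_baseline_daily_revenue
  )
  SELECT
    COALESCE(bq.d, base.d) AS d,
    bq.revenue   AS bq_revenue,
    base.revenue AS baseline_revenue
  FROM bq
  FULL OUTER JOIN base ON bq.d = base.d
  WHERE bq.d IS NULL
     OR base.d IS NULL
     OR ABS(bq.revenue - base.revenue) / NULLIF(base.revenue, 0) > 0.0001;
  -- zero rows returned = gate passes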
FAQ

Frequently asked questions

What are the biggest differences between Teradata and BigQuery SQL?
Both support analytics SQL, but practical differences show up in Teradata-specific constructs (e.g., QUALIFY patterns), type casting and date/time behavior, and execution/performance models. Teradata relies on PI/AMP distribution and statistics; BigQuery relies on partitioning/clustering, scan economics, and slot-based concurrency. Reliable migration requires handling semantics—not just syntax.
How do you handle QUALIFY and window function differences?
We translate patterns, then validate with targeted edge-case datasets and golden queries where ranking/sessionization/report logic is sensitive. We explicitly enforce ordering and frames where needed so outputs are stable and auditable.
What happens to macros, procedures, and BTEQ/TPT scripts?
We inventory and classify operational SQL and orchestration dependencies, translate what is straightforward, and re-home execution into appropriate patterns (e.g., Dataform/dbt + Composer/Workflows) so schedules and operational behavior remain intact.
How do you validate that results are correct after SQL translation?
We use layered validation: compiler/execution checks, then reconciliation checks (row counts, profiling, aggregates by key dimensions, sampling diffs) and golden query parity. Thresholds are defined up front so correctness is measurable and auditable.
How do you estimate BigQuery cost after migration?
Cost is driven by bytes scanned, concurrency, and repeated query patterns. We analyze query shapes and scan patterns, then recommend partitioning/clustering/materialization and slot strategy. Post-migration, we monitor bytes scanned and regressions.
Can we migrate with minimal downtime?
Often yes—via a parallel run window with controlled cutover gates and a rollback-ready plan. The approach depends on batch vs CDC movement, SLA requirements, and downstream consumer behavior. Cutover is gated by reconciliation thresholds and operational readiness.
Migration Acceleration
Book a Teradata → BigQuery migration assessment

Get a migration plan you can execute—with validation built in. We’ll inventory your Teradata estate (including macros, procedures, BTEQ/TPT jobs, and volatile-table workflows), convert representative workloads, surface risks in SQL translation and type mapping, and define a validation and reconciliation approach tied to your SLAs. You’ll also receive a cutover plan with rollback criteria and performance optimization guidance for BigQuery.