Workload

Teradata stored procedures, macros & UDFs to BigQuery

Re-home procedural logic (BTEQ-era routines, Teradata stored procedures, macros, and UDFs) into BigQuery routines with an explicit behavior contract and validation harness so results don’t drift after cutover.

At a glance
Input
Teradata stored procedure, macro, and UDF logic
Output
BigQuery equivalent (validated)
Common pitfalls
  • Macro misuse: macros used as “functions” with hidden dependencies; migration must make dependencies explicit.
  • Row-by-row loops/cursors: common in procedural code; often needs set-based refactors for BigQuery cost/runtime.
  • Session coupling: reliance on session variables, volatile tables, or execution context assumptions.
Context

Why this breaks

Teradata procedural assets are rarely “just code.” They carry operational behavior from decades of platform constraints: macros used as reusable query units, stored procedures that manage control tables and restartability, and UDFs that encode business logic and typing assumptions. When migrated naïvely, routines may compile in BigQuery but drift in results or operational behavior.

Common symptoms after migration:

  • Procedures compile but results drift due to NULL/type coercion differences inside branches (example below)
  • Macro expansion behavior changes; parameter and identifier resolution becomes inconsistent
  • Dynamic SQL behaves differently (quoting, binding, or execution context)
  • Error handling and return codes no longer match, breaking orchestration and restart flows
  • Side effects (control tables, audit logs) aren’t modeled, so reruns double-apply or silently skip
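
A small, concrete instance of coercion drift (illustrative values): Teradata truncates INTEGER-by-INTEGER division, while BigQuery's / operator returns FLOAT64, so the same expression can change results unless the conversion makes the intent explicit:

-- BigQuery (GoogleSQL); values are illustrative
SELECT
  7 / 2             AS float_quotient,     -- 3.5 in BigQuery; Teradata INTEGER division yields 3
  DIV(7, 2)         AS integer_quotient,   -- 3, matches Teradata integer-division semantics
  SAFE_DIVIDE(7, 0) AS guarded_quotient;   -- NULL instead of a divide-by-zero error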

A successful migration turns implicit behavior into a testable contract and validates it under real rerun/backfill scenarios.

Approach

How conversion works

  1. Inventory & classify routines: stored procedures, macros, SQL UDFs, and their call graph (ETL jobs, BI tools, schedulers).
  2. Extract the behavior contract: inputs/outputs, side effects, state assumptions, error semantics, and performance constraints.
  3. Choose target form per asset:
    • BigQuery SQL UDF for pure expressions
    • BigQuery JavaScript UDF for complex transformation logic
    • BigQuery stored procedure (SQL scripting) for multi-statement control flow
    • Refactor to set-based SQL where procedural loops/cursors exist
  4. Translate macro patterns into views/UDFs/procedures with explicit parameter contracts (sketched after this list).
  5. Validate with a harness: golden inputs/outputs, branch coverage, failure-mode tests, and side-effect assertions (control/audit tables).
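
Building on step 4, a reusable reporting macro often maps to a BigQuery table function so the parameter contract is explicit. A minimal sketch, assuming a hypothetical util dataset and project.dataset.sales table:

-- Hypothetical macro replacement: explicit parameter, explicit output columns.
CREATE OR REPLACE TABLE FUNCTION util.daily_sales(run_date DATE)
AS (
  SELECT store_id, SUM(amount) AS total_amount
  FROM `project.dataset.sales`
  WHERE sale_date = run_date
  GROUP BY store_id
);

-- Callers query it like a table, so the dependency is visible and testable.
SELECT * FROM util.daily_sales(DATE '2024-01-31');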

Supported constructs

Representative Teradata procedural constructs we commonly migrate to BigQuery routines (exact coverage depends on your estate).

Source | Target | Notes
Teradata SQL UDFs | BigQuery SQL UDFs | Pure expression logic mapped with explicit casts and NULL behavior.
Teradata stored procedures | BigQuery stored procedures (SQL scripting) | Control flow and side effects modeled explicitly.
Teradata macros | Views/UDFs/procedures with explicit parameter contracts | Macro expansion replaced by reusable, testable assets.
Control tables for restartability | Applied-window tracking + idempotency markers | Reruns are safe and auditable under failures/backfills.
Dynamic SQL patterns | EXECUTE IMMEDIATE with parameter binding | Normalize identifier resolution; reduce injection and drift.
Volatile tables inside routines | Temp tables / staged materializations | Concurrency-safe without hidden session coupling.
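
For the dynamic SQL row above, a minimal EXECUTE IMMEDIATE sketch with parameter binding (dataset, table, and variable names are hypothetical). Values are bound with USING; identifiers cannot be bound, so they must be validated or allowlisted before being spliced into the statement text:

DECLARE target_table STRING DEFAULT 'project.dataset.daily_sales';  -- assumed allowlisted upstream
DECLARE run_date DATE DEFAULT DATE '2024-01-31';
DECLARE row_count INT64;

-- Bind the value with USING; only the validated identifier is formatted into the string.
EXECUTE IMMEDIATE FORMAT("""
  SELECT COUNT(*) FROM `%s` WHERE sale_date = @run_date
""", target_table)
INTO row_count
USING run_date AS run_date;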

How workload changes

Topic | Teradata | BigQuery
Reusable logic | Macros and stored procedures often act as reusable building blocks | Routines must have explicit inputs/outputs and dependency contracts
State and restartability | Control tables + operational conventions encode rerun behavior | Idempotency markers + deterministic ordering enforced
Performance model | Procedural loops sometimes tolerated under platform constraints | Set-based refactors often required for cost/latency
Dynamic SQL | String-built SQL patterns common | Prefer parameter binding and explicit identifier rules
Reusable logic: We make hidden dependencies explicit and testable.
State and restartability: Reruns and backfills are validated via simulations.
Performance model: Row-by-row is the most common hidden cost after migration (see the set-based sketch below).
Dynamic SQL: Parameter binding reduces injection risk and semantic drift.
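
As a sketch of the performance point: a cursor that posted payments to balances one row at a time usually collapses into a single set-based MERGE (table names are hypothetical):

DECLARE run_date DATE DEFAULT DATE '2024-01-31';

-- One MERGE replaces a loop that updated balances one row at a time.
MERGE `project.dataset.customer_balance` AS t
USING (
  SELECT customer_id, SUM(amount) AS delta
  FROM `project.dataset.payments`
  WHERE payment_date = run_date
  GROUP BY customer_id
) AS s
ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET balance = t.balance + s.delta
WHEN NOT MATCHED THEN INSERT (customer_id, balance) VALUES (s.customer_id, s.delta);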

Examples

Representative patterns for translating Teradata UDF/macro logic into BigQuery routines. Adjust datasets, types, and identifiers to match your environment.

-- Teradata UDF (illustrative)
-- (Exact syntax varies by TD version; treat as representative)
CREATE FUNCTION util.safe_div (n DECIMAL(18,6), d DECIMAL(18,6))
RETURNS DECIMAL(18,6)
LANGUAGE SQL
CONTAINS SQL
DETERMINISTIC
RETURN CASE WHEN d IS NULL OR d = 0 THEN NULL ELSE n / d END;
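
A matching BigQuery SQL UDF, with NUMERIC standing in for DECIMAL(18,6) and the util dataset assumed to exist; in many cases the wrapper can be replaced outright by the built-in SAFE_DIVIDE, which already returns NULL on a zero divisor:

-- BigQuery equivalent (illustrative)
CREATE OR REPLACE FUNCTION util.safe_div(n NUMERIC, d NUMERIC)
RETURNS NUMERIC
AS (
  CASE WHEN d IS NULL OR d = 0 THEN NULL ELSE n / d END
);

-- Often simplified to the built-in: SELECT SAFE_DIVIDE(n, d);
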
Avoid

Common pitfalls

  • Macro misuse: macros used as “functions” with hidden dependencies; migration must make dependencies explicit.
  • Row-by-row loops/cursors: common in procedural code; often needs set-based refactors for BigQuery cost/runtime.
  • Session coupling: reliance on session variables, volatile tables, or execution context assumptions.
  • Exception semantics mismatch: differences in how failures propagate and what counts as success (see the sketch after this list).
  • Type drift in branches: CASE/IF returns mixed types; BigQuery may require explicit casts.
  • Dynamic SQL injection risk: string concatenation without parameter binding and escaping.
  • Side effects not modeled: control-table updates and audit logging missing, breaking restartability.
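
For the exception-semantics and side-effect pitfalls, BigQuery scripting's BEGIN ... EXCEPTION WHEN ERROR block can log the failure explicitly and still propagate it; a minimal sketch, assuming hypothetical target and audit tables:

BEGIN
  -- Representative routine body (hypothetical tables).
  INSERT INTO `project.dataset.daily_totals`
  SELECT sale_date, SUM(amount) FROM `project.dataset.sales` GROUP BY sale_date;
EXCEPTION WHEN ERROR THEN
  -- Record the failure as an explicit side effect, then re-raise so orchestration
  -- still sees a non-success outcome (matching the old return-code contract).
  INSERT INTO `project.dataset.routine_audit` (routine_name, error_message, failed_at)
  VALUES ('load_daily_totals', @@error.message, CURRENT_TIMESTAMP());
  RAISE;
END;
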
Proof

Validation approach

  • Compile + interface checks: every routine compiles in BigQuery; argument/return types match the contract.
  • Golden tests: curated input sets validate outputs, including NULL-heavy and boundary cases (example after this list).
  • Branch + failure-mode coverage: exercise key branches and expected failures (missing rows, invalid inputs).
  • Side-effect verification: assert expected writes to control/log/audit tables and idempotency under retries.
  • Integration replay: run routines within representative ETL slices and compare downstream KPI aggregates.
  • Performance gate: confirm no hidden row-by-row scans; set-based refactors validated with cost/latency baselines.
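
Golden tests can stay in plain SQL: ASSERT aborts the script when a case fails, which keeps the harness replayable. A sketch against the safe_div routine from the example above (expected values are illustrative goldens):

-- Each failed ASSERT stops the harness and names the failing case.
ASSERT util.safe_div(10, 2) = 5       AS 'safe_div: exact division';
ASSERT util.safe_div(10, 0) IS NULL   AS 'safe_div: zero divisor returns NULL';
ASSERT util.safe_div(NULL, 3) IS NULL AS 'safe_div: NULL numerator returns NULL';
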
Execution

Migration steps

A sequence that keeps behavior explicit, testable, and safe to cut over.
  1. Inventory routines and build the call graph

    Collect Teradata stored procedures, macros, and UDFs, plus their callers and dependencies. Identify side effects: control tables, audit logs, temp/volatile usage, and dynamic SQL paths.

  2. Define the behavior contract

    For each asset, specify inputs/outputs, error behavior, side effects, restartability semantics, and performance expectations. Choose the BigQuery target form (UDF/view/procedure/refactor).

  3. Convert logic with safety patterns

    Rewrite control flow, casts, and dynamic SQL using parameter binding and explicit identifier rules. Replace cursor/loop logic with set-based SQL where feasible to control cost and runtime.

  4. Build a validation harness

    Create golden inputs, boundary cases, and failure-mode tests. Validate outputs and side effects deterministically so behavior isn’t debated during cutover.

  5. Integrate, replay, and cut over

    Run routines within representative ETL slices, compare downstream KPIs, and verify rerun/restart behavior. Cut over behind gates with rollback-ready criteria.

Workload Assessment
Migrate Teradata routines with a test harness

We inventory your Teradata procedures/macros/UDFs, choose the right BigQuery target form per asset, migrate a representative subset, and deliver a harness that proves parity, including side effects and restart behavior.

Migration Acceleration
Cut over routines with proof-backed sign-off

Get an actionable conversion plan, review markers for ambiguous intent, and validation artifacts so Teradata routine cutover is gated by evidence and rollback criteria.

FAQ

Frequently asked questions

What happens to Teradata macros in BigQuery?
We migrate macros based on intent: reusable query macros often become views or UDFs; operational macros may become stored procedures or orchestration steps. The key is making inputs, dependencies, and side effects explicit.
Do we have to rewrite everything as BigQuery stored procedures?
Not necessarily. Many routines can be simplified into set-based SQL, views, or SQL UDFs. We choose the target form per asset to minimize operational risk and cost.
How do you prove behavior parity?
We build a replayable harness: golden inputs/outputs, branch coverage, expected failure cases, and side-effect checks. Integration replay validates downstream metrics before cutover.
Will performance change after migration?
It can. Row-by-row loops are the most common risk. We refactor to set-based SQL where possible and gate performance with cost/latency baselines so routines don’t become hidden BigQuery spend.