Medallion Architecture

A layered data pipeline pattern organizing data into Bronze (raw), Silver (modeled), and Gold (consumption) layers of progressive refinement.

Last updated: 2026-04-12

Overview

The medallion architecture organizes a data platform into three progressive layers. Databricks popularized the Bronze/Silver/Gold naming; dbt uses staging/intermediate/marts; other teams say raw/curated/refined or landing/transform/serve. The branding varies but the core idea is identical: data flows from source through increasing levels of refinement until it’s ready for consumption.

The pattern answers “where does this model live?” — but deliberately doesn’t prescribe what each layer contains. That loose definition is both its strength (flexible) and its weakness (teams diverge on what belongs where).

The Three Layers

Layer | Also called | What lives here | Who consumes it
Bronze | Raw / Landing / Staging | Source data as-received; mechanical cleaning only | Data engineers debugging quality issues; auditors tracing original values
Silver | Curated / Transform / Intermediate | Business data model: facts, dimensions, conformed entities | Data scientists needing full grain; analysts exploring unmodeled questions
Gold | Refined / Serve / Marts | Consumption-ready: governed metrics, pre-joined wide tables, semantic layer | Analysts, dashboards, BI tools, AI tools; anyone who wants pre-defined metrics without writing SQL

Combining Medallion with Kimball and Semantic Layer

The medallion architecture settles project structure, but not modeling methodology or consumption pattern. Three patterns answer three different questions:

Pattern | Question it answers
Medallion | Where does this model live?
Kimball dimensional modeling | How do I represent the business accurately?
Semantic layer | How do I expose governed metrics to consumers?

When mapped intentionally:

  • Bronze = staging: Normalize column names, cast types, deduplicate, handle nulls. No business interpretation — just a clean representation of what each source provided.
  • Silver = Kimball: Fact tables at declared grains, dimension tables, SCDs, conformed dimensions. The authoritative answer to “what is a customer?”
  • Gold = semantic layer: Governed metric definitions (revenue, CAC, ROAS) as first-class objects. BI tools query here. Definitions are centralized, not reimplemented per dashboard.
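
The Bronze step in the list above (normalize names, cast types, deduplicate, handle nulls) can be sketched in plain Python. This is a minimal illustration, not a prescribed implementation: the column names ("Campaign ID", "Spend (USD)", "Day") and the helper functions are hypothetical, and a real pipeline would more likely express the same steps in SQL or a dataframe library.

```python
import re
from datetime import date


def normalize_name(name: str) -> str:
    """Turn a source column name such as 'Spend (USD)' into 'spend_usd'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean_row(raw: dict) -> dict:
    """Mechanical cleaning only: rename columns, map blanks to None, cast types."""
    row = {}
    for key, value in raw.items():
        value = value.strip() if isinstance(value, str) else value
        row[normalize_name(key)] = value if value != "" else None
    # Hypothetical columns: cast only when a value is present.
    if row.get("spend_usd") is not None:
        row["spend_usd"] = float(row["spend_usd"])
    if row.get("day") is not None:
        row["day"] = date.fromisoformat(row["day"])
    return row


def stage(raw_rows: list) -> list:
    """Clean every row and drop exact duplicates, keeping first occurrences."""
    seen = set()
    staged = []
    for raw in raw_rows:
        row = clean_row(raw)
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            staged.append(row)
    return staged


# Hypothetical ad-platform export with one duplicate and one blank value.
raw_rows = [
    {"Campaign ID": "c1", "Spend (USD)": "12.50", "Day": "2026-04-01"},
    {"Campaign ID": "c1", "Spend (USD)": "12.50", "Day": "2026-04-01"},
    {"Campaign ID": "c2", "Spend (USD)": "", "Day": "2026-04-01"},
]
staged = stage(raw_rows)
```

Nothing here interprets the data: no row is filtered for business reasons and no metric is computed, which is the line the Bronze layer is meant to hold.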

OBTs and Dimensional Models Coexist

One-big-table (OBT) models and dimensional models are not either/or. OBTs are consumption artifacts that belong in Gold, downstream of Silver’s facts and dimensions. You get the rigor of dimensional modeling and the usability of pre-joined wide tables. One feeds the other.
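
As a rough sketch of "one feeds the other", the snippet below derives a wide Gold table from a Silver fact and dimension by pre-joining them. The table contents, the campaign_key surrogate key, and the build_obt helper are all assumptions made for illustration.

```python
# Silver: a dimension keyed by a hypothetical surrogate key, and a fact table.
campaign_dim = {
    1: {"campaign_name": "Spring Sale", "channel": "search"},
    2: {"campaign_name": "Brand Video", "channel": "social"},
}
fact_ad_spend = [
    {"campaign_key": 1, "day": "2026-04-01", "spend": 100.0},
    {"campaign_key": 2, "day": "2026-04-01", "spend": 50.0},
    {"campaign_key": 1, "day": "2026-04-02", "spend": 75.0},
]


def build_obt(facts: list, dim: dict) -> list:
    """Gold: one wide row per fact row, with dimension attributes pre-joined."""
    return [{**fact, **dim[fact["campaign_key"]]} for fact in facts]


obt_ad_spend = build_obt(fact_ad_spend, campaign_dim)
```

The OBT keeps the fact table's grain (one row per fact row) and only adds columns, so it can be rebuilt from Silver at any time.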

Concrete Example: Marketing Attribution

Bronze  ← Ad platform exports, clickstream events, conversion events
          (cast types, normalize names, deduplicate, fix timezones)

Silver  ← campaign_dim (conformed across all ad platforms)
          fact_ad_spend (campaign/channel/day grain)
          fact_conversions (event grain)

Gold    ← metric view: total_spend, total_conversions, CAC, ROAS
          (defined once; every dashboard gets the same number)
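
The Gold step of this example can be sketched as a single function that every consumer calls, so the metric formulas exist in one place. The formulas are assumptions for illustration: CAC is taken as total spend divided by total conversions (treating each conversion as one acquisition), ROAS as attributed revenue divided by total spend, and the revenue field on fact_conversions is hypothetical.

```python
def safe_div(numerator: float, denominator: float):
    """Return None instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else None


def metric_view(spend_rows: list, conversion_rows: list) -> dict:
    """Single definition of the Gold metrics, computed from Silver facts."""
    total_spend = sum(row["spend"] for row in spend_rows)
    total_conversions = len(conversion_rows)
    total_revenue = sum(row["revenue"] for row in conversion_rows)
    return {
        "total_spend": total_spend,
        "total_conversions": total_conversions,
        "cac": safe_div(total_spend, total_conversions),
        "roas": safe_div(total_revenue, total_spend),
    }


# Hypothetical Silver rows at the grains named in the diagram above.
fact_ad_spend = [
    {"campaign": "c1", "channel": "search", "day": "2026-04-01", "spend": 100.0},
    {"campaign": "c2", "channel": "social", "day": "2026-04-01", "spend": 50.0},
]
fact_conversions = [
    {"campaign": "c1", "revenue": 120.0},
    {"campaign": "c1", "revenue": 90.0},
    {"campaign": "c2", "revenue": 240.0},
]
metrics = metric_view(fact_ad_spend, fact_conversions)
```

With these sample rows, total spend is 150.0 across 3 conversions, giving a CAC of 50.0 and a ROAS of 3.0. In practice the same definitions would live in a semantic layer or metric view rather than application code.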

Connections

Sources