Dimensional Modeling

table_chart

Fact Tables

Transaction Fact Table

Records individual business events at the most atomic grain. One row per transaction (sale, order, click). Captures the 'who, what, when, where, how much' of each event. Most common fact table type. Supports detailed drill-down analysis. Can grow very large but provides maximum flexibility for analysis.

Key Features
  • One row per atomic event
  • Highest level of detail
  • Foreign keys to all dimensions
  • Additive measures (quantity, amount)
  • Timestamp for event occurrence
Use Cases
  • Point-of-sale transactions
  • Order line items
  • Clickstream events
  • Financial transactions
Related Concepts
Periodic SnapshotAccumulating SnapshotFactless Fact
Periodic Snapshot Fact Table

Captures state at regular intervals (daily, weekly, monthly). One row per period per entity being measured. Shows cumulative or point-in-time metrics. Ideal for inventory, account balances, pipeline status. Predictable row count based on periods and entities. Complements transaction facts for trend analysis.

Key Features
  • Fixed time intervals
  • Point-in-time measurements
  • Predictable table growth
  • Semi-additive measures common
  • Period dimension required
Use Cases
  • Daily inventory levels
  • Monthly account balances
  • Weekly pipeline snapshots
  • End-of-day positions
Related Concepts
Transaction FactAccumulating SnapshotAggregate Table
Accumulating Snapshot Fact Table

Tracks lifecycle of a process with defined milestones. Single row updated as entity progresses through stages. Multiple date foreign keys for each milestone. Lag calculations between milestones. Row is updated (not inserted) as process advances. Best for processes with clear start/end and measurable stages.

Key Features
  • Multiple milestone dates per row
  • Row updates as process progresses
  • Lag metrics between stages
  • Workflow/process tracking
  • Defined lifecycle stages
Use Cases
  • Order fulfillment pipeline
  • Insurance claim processing
  • Student enrollment lifecycle
  • Loan application workflow
Related Concepts
Transaction FactPeriodic SnapshotProcess Mining
Factless Fact Table

Records events or relationships without numeric measures. Captures occurrences or coverage. Two types: event tracking (student attendance) and coverage (product promotions). Foreign keys only, no measures. COUNT(*) is the implicit measure. Essential for tracking what didn't happen or eligibility.

Key Features
  • No numeric measures
  • Foreign keys only
  • Event occurrence tracking
  • Coverage/eligibility recording
  • COUNT as implicit measure
Use Cases
  • Student class attendance
  • Promotion coverage by product
  • Employee training completion
  • What products were NOT sold
Related Concepts
Bridge TableTransaction FactCoverage Table
grain

Grain & Granularity

Grain Declaration

The most critical design decision - defines what one row represents. Must be declared before choosing dimensions or facts. Expressed as 'one row per X'. Determines what questions can be answered. Too coarse loses detail; too fine wastes storage. Cannot mix grains in same fact table.

Key Features
  • Defines row meaning precisely
  • Set before dimension/fact selection
  • Constrains all design decisions
  • Documented in data dictionary
  • Immutable once data loaded
Use Cases
  • One row per order line item
  • One row per day per product per store
  • One row per customer per month
  • One row per click per session
Related Concepts
Aggregate GrainAtomic GrainDerived Grain
Atomic Grain

Lowest level of detail captured by source system. Most granular data possible. Provides maximum analytical flexibility. Supports any aggregation or drill-down. Often required for compliance and audit. Higher storage costs but future-proof for unknown requirements.

Key Features
  • Finest level of detail
  • Maximum query flexibility
  • Supports all aggregations
  • Required for drill-through
  • Source system fidelity
Use Cases
  • Transaction-level analysis
  • Audit and compliance
  • Ad-hoc exploration
  • Machine learning features
Related Concepts
Summary GrainAggregated GrainRolled-up Grain
Aggregate Grain

Pre-summarized data at higher level than atomic. Improves query performance for common analyses. Must be derived from atomic facts. Aggregate navigation in BI tools. Separate aggregate fact tables or materialized views. Trade-off: faster queries vs storage and ETL complexity.

Key Features
  • Pre-computed summaries
  • Faster query response
  • Derived from atomic grain
  • Maintained via ETL or materialized views
  • Aggregate awareness in BI tools
Use Cases
  • Daily sales summaries
  • Monthly rollups
  • Category-level aggregates
  • Dashboard performance
Related Concepts
Atomic GrainOLAP CubeMaterialized View
Mixed Grain (Anti-pattern)

Combining different grains in same fact table - a design flaw. Causes incorrect aggregations and confusing results. Example: mixing order header and line items. Creates NULL foreign keys and broken joins. Always separate into distinct fact tables. Consolidate in views or reporting layer if needed.

Key Features
  • Multiple row meanings
  • NULL dimension keys
  • Incorrect SUM/COUNT results
  • Confusing for analysts
  • ETL complexity
Use Cases
  • Recognize and refactor
  • Split into separate facts
  • Create header/line fact pairs
  • Document grain violations
Related Concepts
Separate Fact TablesHeader/Detail FactsProper Grain Design
category

Dimension Types

Conformed Dimension

Dimension shared across multiple fact tables and subject areas. Single definition of business entity (customer, product, date). Enables cross-process analysis and enterprise reporting. Bus architecture foundation. Same keys, attributes, and hierarchies everywhere. Master data management integration.

Key Features
  • Shared across fact tables
  • Single source of truth
  • Enables cross-functional analysis
  • Consistent hierarchies
  • Enterprise bus architecture
Use Cases
  • Customer dimension across sales/service
  • Product dimension across inventory/sales
  • Date dimension for all facts
  • Geography dimension enterprise-wide
Related Concepts
Local DimensionRole-playing DimensionOutrigger
Role-Playing Dimension

Same physical dimension used multiple times in one fact with different meanings. Common for date dimension (order date, ship date, due date). Implemented as views or aliases. Each role has different foreign key in fact. Avoids dimension duplication while supporting multiple perspectives.

Key Features
  • Single physical dimension
  • Multiple logical roles
  • Views/aliases for each role
  • Separate FK per role in fact
  • Common with Date dimension
Use Cases
  • Order date vs ship date vs due date
  • Ship-from vs ship-to location
  • Buyer vs seller customer
  • Primary vs secondary contact
Related Concepts
Separate DimensionsJunk DimensionBridge Table
Junk Dimension

Combines low-cardinality flags and indicators into single dimension. Avoids cluttering fact table with multiple flag columns. Cartesian product of all flag combinations. Typically small (dozens to hundreds of rows). Improves fact table compression. Named 'junk' but actually good design.

Key Features
  • Low-cardinality attributes
  • Flags and indicators combined
  • Small dimension table
  • Improves fact table design
  • Pre-computed combinations
Use Cases
  • Yes/No flags (is_returned, is_promoted)
  • Status codes (new, approved, rejected)
  • Transaction type indicators
  • Order flags combination
Related Concepts
Flags in FactSeparate Mini-dimensionsDegenerate Dimension
Degenerate Dimension

Dimension key stored in fact table with no corresponding dimension table. Typically transaction identifiers (order number, invoice number). No attributes to store in dimension. Provides drill-through to source. Occupies dimension position but has no join. Common in transaction fact tables.

Key Features
  • Key only, no dimension table
  • Transaction identifiers
  • Stored directly in fact
  • Enables source drill-through
  • No attributes to describe
Use Cases
  • Order number
  • Invoice number
  • Transaction ID
  • Control numbers
Related Concepts
Mini-dimensionOutriggerSeparate Dimension
Outrigger Dimension

Secondary dimension joined to primary dimension (not fact). Normalizes dimension hierarchy. Snowflake extension of star schema. Used when dimension attribute has its own attributes. Date dimension as outrigger to other dimensions. Trade-off: normalization vs query complexity.

Key Features
  • Dimension joins to dimension
  • Hierarchy normalization
  • Reduces dimension redundancy
  • Adds join complexity
  • Snowflake characteristic
Use Cases
  • Date attributes on dimension changes
  • Geography hierarchy normalized
  • Reference data linked to dimension
  • Shared sub-dimension
Related Concepts
Denormalized in DimensionBridge TableStar Schema
Bridge Table

Resolves many-to-many relationships between fact and dimension. Sits between fact and multi-valued dimension. Contains weighting factors for allocation. Examples: multiple diagnoses per patient, multiple categories per product. Enables M:M without violating star schema.

Key Features
  • Many-to-many resolution
  • Weighting factors
  • Between fact and dimension
  • Group keys for filtering
  • Allocation percentages
Use Cases
  • Multiple diagnoses per visit
  • Multiple sales reps per deal
  • Product in multiple categories
  • Multi-valued attributes
Related Concepts
Junk DimensionFactless FactArray in Dimension
history

Slowly Changing Dimensions (SCD)

SCD Type 0 - Retain Original

Never update dimension attributes. Original value preserved forever. Simplest approach - no change handling needed. Used for attributes that should never change or where original value matters. Often used for original customer segment or first order date.

Key Features
  • No updates ever
  • Original value preserved
  • Simplest implementation
  • No history tracking
  • Insert-only pattern
Use Cases
  • Original customer acquisition channel
  • First order date
  • Initial credit score
  • Registration source
Related Types
Type 1Type 2Type 3
SCD Type 1 - Overwrite

Replace old value with new value. No history preserved. Current state only. Simplest change handling. Used when history is irrelevant or corrections to bad data. Historical reports will show new values retroactively. Fast and space-efficient.

Key Features
  • Simple UPDATE statement
  • No history preserved
  • Current value only
  • Historical reports affected
  • Corrections and fixes
Use Cases
  • Error corrections
  • Name spelling fixes
  • Attributes where history irrelevant
  • Reference data updates
Related Types
Type 0Type 2Type 6
SCD Type 2 - Add New Row

Insert new row for each change, preserving full history. Surrogate key changes, natural key stays same. Effective date range (start/end dates) or current flag. Most common for historical analysis. Enables point-in-time reporting. Dimension table grows with changes.

Key Features
  • Full history preserved
  • New surrogate key per version
  • Effective date ranges
  • Current record flag
  • Point-in-time analysis
Use Cases
  • Customer address history
  • Product price changes
  • Employee department transfers
  • Status changes over time
Related Types
Type 1Type 3Type 6
SCD Type 3 - Add New Column

Add column to store previous value. Limited history (typically current and previous only). Same row, new columns. Quick access to both values. Simple queries comparing before/after. Limited to small number of changes tracked.

Key Features
  • Previous value column
  • Limited history depth
  • Same row updated
  • Easy before/after comparison
  • Simple implementation
Use Cases
  • Previous and current address
  • Before/after reorganization
  • Last price vs current price
  • Simple A/B comparisons
Related Types
Type 2Type 1Type 6
SCD Type 4 - Mini-Dimension

Rapidly changing attributes split into separate mini-dimension. Keeps main dimension stable. Mini-dimension has its own surrogate key in fact. Used for attributes that change frequently (age bands, income ranges). Reduces Type 2 row explosion.

Key Features
  • Separate rapidly changing attributes
  • Stable main dimension
  • Mini-dim FK in fact table
  • Reduces row proliferation
  • Band/range attributes
Use Cases
  • Customer age bands
  • Credit score ranges
  • Income brackets
  • Loyalty tier levels
Related Types
Type 2Type 6Junk Dimension
SCD Type 6 - Hybrid (1+2+3)

Combines Types 1, 2, and 3. New row for history (Type 2), current value column on all rows (Type 1 behavior), previous value column (Type 3). Maximum flexibility for analysis. Complex to implement and maintain. Also called 'Type 2+' or 'Hybrid SCD'.

Key Features
  • Full history (Type 2)
  • Current value on all rows
  • Previous value column
  • Maximum query flexibility
  • Complex ETL logic
Use Cases
  • Current vs historical analysis
  • Trend analysis with current state
  • Flexible reporting requirements
  • Analytical data warehouses
Related Types
Type 2Type 3Type 4
functions

Measures & Metrics

Additive Measures

Can be summed across all dimensions. Most common and useful measure type. Examples: sales amount, quantity, cost. Freely aggregated by any dimension. Foundation of most business metrics. Sum, count operations always valid.

Key Features
  • Sum across all dimensions
  • Most flexible for analysis
  • Standard aggregations work
  • Revenue, quantity, cost
  • Foundation metrics
Use Cases
  • Total sales amount
  • Units sold
  • Total cost
  • Transaction counts
Related Concepts
Semi-additiveNon-additiveCalculated Measures
Semi-Additive Measures

Can be summed across some dimensions but not all (typically not time). Balance and inventory measures. Snapshot values that shouldn't be summed over time. Use AVG, MIN, MAX, or LAST for time dimension. Common in periodic snapshot facts.

Key Features
  • Additive across most dimensions
  • Not additive across time
  • Balance/inventory measures
  • Snapshot point-in-time values
  • Special time aggregation needed
Use Cases
  • Account balances
  • Inventory levels
  • Headcount snapshots
  • Pipeline values
Related Concepts
AdditiveNon-additivePeriod-end Values
Non-Additive Measures

Cannot be summed across any dimension. Ratios, percentages, unit prices. Must be calculated from additive components. Store components, calculate ratio in reporting. Examples: profit margin, conversion rate, average price.

Key Features
  • Cannot be summed
  • Ratios and percentages
  • Calculate from components
  • Store numerator/denominator
  • Derived measures
Use Cases
  • Profit margin percentage
  • Conversion rates
  • Unit prices
  • Averages and ratios
Related Concepts
AdditiveSemi-additiveCalculated Metrics
Derived/Calculated Measures

Computed from other measures at query time. Not stored in fact table. Business logic in BI tool or view. Examples: profit (revenue - cost), margin (profit/revenue). Keeps fact table clean. Trade-off between storage and compute.

Key Features
  • Calculated at query time
  • Not physically stored
  • Business logic in reports
  • Derived from stored measures
  • Flexible definitions
Use Cases
  • Profit = Revenue - Cost
  • Year-over-year growth
  • Moving averages
  • Variance calculations
Related Concepts
Stored MeasuresPre-calculatedETL-computed