Feature Stores
Open-Source Feature Stores
Feast
Open-source feature store for managing and serving ML features, with support for multiple online and offline stores.
- Python SDK for feature definition
- Online and offline stores
- Point-in-time correct joins
- Feature versioning
- Supports Redis, DynamoDB, BigQuery, Snowflake
- Real-time ML applications
- Batch feature engineering
- Multi-cloud deployments
- Team feature collaboration
Hopsworks
Enterprise feature store built on a data-centric AI platform, supporting Python and Spark with built-in data quality checks.
- Data validation and quality checks
- Feature monitoring
- Time travel queries
- Streaming and batch pipelines
- Feature lineage tracking
- Enterprise ML platforms
- Regulated industries
- Feature governance
- Multi-model serving
Feathr
LinkedIn's open-source feature store, supporting batch, streaming, and real-time feature computation.
- Anchor-Derivation framework
- Time-based aggregations
- Streaming feature joins
- Multi-data source support
- Azure-native with cloud-agnostic design
- LinkedIn-scale features
- Streaming ML pipelines
- Time-series features
- Azure ML integration
Featuretools
Automated feature engineering library that creates features from relational and temporal data.
- Deep feature synthesis (DFS)
- Automated temporal aggregations
- Entity relationships
- Feature primitives library
- Integration with Pandas
- Automated feature generation
- Time-series features
- Relational data
- Feature exploration
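Deep feature synthesis automates the kind of stacked aggregations across related tables that one would otherwise write by hand. A minimal pandas sketch of the equivalent manual aggregations, using hypothetical customer/transaction data:

```python
import pandas as pd

# Hypothetical relational data: customers (parent) and transactions (child).
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [10.0, 20.0, 5.0, 7.0, 3.0],
})

# DFS stacks aggregation primitives across the entity relationship;
# the hand-written equivalent of a few of those primitives:
features = transactions.groupby("customer_id")["amount"].agg(
    amount_sum="sum", amount_mean="mean", amount_count="count"
).reset_index()

print(features.loc[features.customer_id == 1, "amount_sum"].item())   # 30.0
print(features.loc[features.customer_id == 2, "amount_count"].item()) # 3
```

The value of automating this is combinatorial: with many tables and many primitives, the number of candidate features grows quickly, and generating them by hand becomes the bottleneck.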
Managed Feature Store Services
Tecton
Enterprise feature platform founded by creators of Uber's Michelangelo, with real-time streaming features and an operational ML focus.
- Real-time streaming transformations
- Feature serving with SLA guarantees
- Automatic backfilling
- Drift detection and monitoring
- Native Spark and Flink support
- Real-time ML applications
- Fraud detection
- Recommendation systems
- High-scale feature serving
Amazon SageMaker Feature Store
Fully managed feature store on AWS with low-latency online access and offline storage of historical features.
- Online and offline stores
- Point-in-time queries
- Feature groups with schemas
- Built-in data quality
- Integration with SageMaker ecosystem
- AWS ML workflows
- SageMaker model training
- Production ML on AWS
- Multi-team feature sharing
Databricks Feature Store
Unified feature store integrated with Delta Lake, providing feature serving for Databricks ML workflows.
- Delta Lake integration
- Automatic feature lookup
- Feature lineage tracking
- Online store with CosmosDB/DynamoDB
- Unity Catalog integration
- Databricks ML workflows
- Spark-based feature pipelines
- Lakehouse architectures
- Multi-workspace sharing
Vertex AI Feature Store
Google Cloud's managed feature store with low-latency serving, streaming ingestion, and BigQuery integration.
- Bigtable-backed online serving
- BigQuery offline storage
- Streaming feature ingestion
- Feature monitoring
- Explainable AI integration
- GCP ML workflows
- Real-time predictions
- BigQuery ML integration
- AutoML features
Core Feature Store Capabilities
Feature Management
- Feature Definition: Declarative feature schemas
- Feature Registry: Centralized feature catalog
- Feature Versioning: Track feature changes over time
- Feature Discovery: Search and explore features
- Feature Lineage: Track data provenance
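A toy illustration of what a registry with versioning and discovery provides; names and structure are illustrative, not any particular product's API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureDefinition:
    """Declarative feature schema: what it is, not how it is computed."""
    name: str
    dtype: str
    version: int
    description: str = ""
    owner: str = ""

class FeatureRegistry:
    """Centralized catalog keyed by (name, version)."""
    def __init__(self):
        self._features = {}

    def register(self, feat: FeatureDefinition):
        self._features[(feat.name, feat.version)] = feat

    def latest(self, name: str) -> FeatureDefinition:
        versions = [v for (n, v) in self._features if n == name]
        return self._features[(name, max(versions))]

    def search(self, keyword: str):
        # Discovery: match on name or description.
        return [f for f in self._features.values()
                if keyword in f.name or keyword in f.description]

registry = FeatureRegistry()
registry.register(FeatureDefinition("avg_rating", "float", 1, "7-day average driver rating"))
registry.register(FeatureDefinition("avg_rating", "float", 2, "30-day average driver rating"))
print(registry.latest("avg_rating").version)  # 2
```

Real registries add lineage metadata and access control on top of this core mapping, but the lookup-by-name-and-version contract is the same.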
Data Serving
- Online Serving: Low-latency feature retrieval
- Offline Serving: Batch feature generation
- Point-in-Time Joins: Prevent data leakage
- Feature Caching: Optimize serving performance
- Multi-Store Support: Redis, DynamoDB, Cassandra
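A point-in-time join can be sketched with `pandas.merge_asof`, which matches each training row to the most recent feature value at or before its timestamp; the data here is hypothetical:

```python
import pandas as pd

# Label events: what we want to predict, each with an observation timestamp.
labels = pd.DataFrame({
    "driver_id": [1, 1],
    "event_timestamp": pd.to_datetime(["2024-01-01 10:00", "2024-01-01 14:00"]),
    "label": [0, 1],
})

# Feature values as they were recorded over time.
features = pd.DataFrame({
    "driver_id": [1, 1, 1],
    "event_timestamp": pd.to_datetime(
        ["2024-01-01 09:00", "2024-01-01 12:00", "2024-01-01 15:00"]),
    "avg_rating": [4.2, 4.5, 3.9],
})

# Point-in-time join: each label row gets the latest feature value
# at or before its own timestamp, never a future one.
training = pd.merge_asof(
    labels.sort_values("event_timestamp"),
    features.sort_values("event_timestamp"),
    on="event_timestamp",
    by="driver_id",
    direction="backward",
)
print(training["avg_rating"].tolist())  # [4.2, 4.5] — the 15:00 value never leaks
```

A naive join on `driver_id` alone would attach the 15:00 rating to the 14:00 label, silently leaking future information into training.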
Feature Engineering
- Transformations: SQL, Pandas, PySpark
- Streaming Features: Real-time aggregations
- Batch Features: Historical aggregations
- On-Demand Features: Computed at request time
- Feature Pipelines: Orchestrated transformations
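On-demand features are computed per request rather than materialized in storage; a minimal sketch combining stored features with the request payload, with all names and values hypothetical:

```python
# Precomputed features as served from the online store (illustrative values).
stored = {"user_7d_spend": 120.0, "user_7d_orders": 4}

def on_demand_features(stored: dict, request: dict) -> dict:
    """Derived at request time from stored features plus live context;
    never written back to the store."""
    avg_order = stored["user_7d_spend"] / max(stored["user_7d_orders"], 1)
    return {
        "avg_order_value_7d": avg_order,
        "cart_vs_avg_ratio": request["cart_total"] / avg_order,
    }

result = on_demand_features(stored, {"cart_total": 60.0})
print(result)  # {'avg_order_value_7d': 30.0, 'cart_vs_avg_ratio': 2.0}
```

The pattern matters because some inputs (the current cart total) only exist at request time, so the feature cannot be precomputed by any batch or streaming pipeline.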
Monitoring & Quality
- Data Quality: Validation and constraints
- Feature Drift: Detect distribution changes
- Data Freshness: Monitor staleness
- Metrics & Observability: Feature usage tracking
- Alerting: Notify on anomalies
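Feature drift is often quantified with the Population Stability Index (PSI); a stdlib-only sketch, where the equal-width binning and the 0.2 alarm threshold are common conventions rather than a standard:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference (training) sample
    and a live (serving) sample. PSI > 0.2 is a common drift rule of thumb."""
    lo, hi = min(expected), max(expected)
    span = (hi - lo) or 1.0

    def bin_fractions(data):
        counts = [0] * bins
        for x in data:
            idx = min(max(int((x - lo) / span * bins), 0), bins - 1)
            counts[idx] += 1
        # small epsilon keeps empty bins from producing log(0)
        return [(c + 1e-6) / (len(data) + 1e-6 * bins) for c in counts]

    p, q = bin_fractions(expected), bin_fractions(actual)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

reference = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in reference]
print(psi(reference, reference) < 0.01)  # True — no drift against itself
print(psi(reference, shifted) > 0.2)     # True — shift flagged as drift
```

In practice the reference distribution is snapshotted at training time, and serving-side PSI per feature feeds the alerting system above.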
Feature Store Architecture Patterns
| Component | Online Store | Offline Store | Purpose |
|---|---|---|---|
| Storage | Redis, DynamoDB, Cassandra | S3, BigQuery, Snowflake, Redshift | Low-latency vs batch access |
| Latency | < 10ms | Minutes | Real-time vs batch serving |
| Data Volume | Limited | Unlimited | Hot vs cold data |
| Use Case | Model inference, real-time predictions | Model training, batch predictions | Serving vs training |
| Consistency | Eventually consistent | Strongly consistent | Fresh vs historical |
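The online/offline split in the table can be mimicked in a few lines; here a dict and a list stand in for Redis and an append-only offline log, purely for illustration:

```python
class DualStore:
    """Sketch of the dual-store write path: every write fans out to
    an append-only history (offline) and a latest-value map (online)."""
    def __init__(self):
        self.online = {}   # entity key -> freshest feature row
        self.offline = []  # append-only history for training

    def write(self, key, features, ts):
        row = {"key": key, "ts": ts, **features}
        self.offline.append(row)  # every write lands in history
        latest = self.online.get(key)
        if latest is None or ts >= latest["ts"]:
            self.online[key] = row  # online keeps only the newest value

store = DualStore()
store.write("driver:1", {"avg_rating": 4.2}, ts=100)
store.write("driver:1", {"avg_rating": 4.5}, ts=200)
print(store.online["driver:1"]["avg_rating"])  # 4.5
print(len(store.offline))                      # 2
```

The asymmetry in the table falls out of this shape: the online map is small and O(1) to read but holds no history, while the log grows without bound and is only practical to scan in batch.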
Why Use a Feature Store?
Reduce Time to Production
Reuse features across models and teams, avoiding duplicate work and accelerating development.
Training-Serving Consistency
Ensure that features computed during training exactly match those used in production inference.
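The usual mechanism is registering one transformation and executing it in both paths; a minimal sketch with a hypothetical feature function:

```python
def order_value_bucket(order_total: float) -> int:
    """Shared feature logic: defined once, executed in both paths."""
    return min(int(order_total // 50), 5)

# Offline path: applied over a historical batch to build training data.
training_rows = [order_value_bucket(v) for v in [10.0, 120.0, 480.0]]

# Online path: the same function applied to a single live request.
serving_row = order_value_bucket(120.0)

print(serving_row == training_rows[1])  # True — identical logic, identical value
```

The failure mode this prevents is a reimplementation of the feature in the serving service (say, a different bucket boundary) that silently skews every production prediction.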
Centralized Feature Management
Single source of truth for all features with versioning, documentation, and governance.
Data Quality & Governance
Built-in validation, monitoring, and access controls for feature data.
Team Collaboration
Data scientists discover and reuse features created by other teams.
Point-in-Time Correctness
Prevent data leakage with accurate historical feature values for training.
