Grafana Stack
Core Stack
Grafana: Open-source analytics and interactive visualization platform for querying, visualizing, alerting on, and exploring metrics, logs, and traces from multiple data sources through customizable dashboards.
Prometheus: Open-source monitoring system with a dimensional data model, a flexible query language (PromQL), an efficient time series database, and pull-based metrics collection with service discovery.
Alertmanager: Handles alerts sent by the Prometheus server by deduplicating and grouping them, routing them to the correct receiver integration, and applying silences and inhibition rules.
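The deduplication and grouping step described above can be illustrated with a small sketch. This is a simplified model, not the alert handler's actual implementation: the function name `group_alerts`, the alert dictionaries, and the label names are all hypothetical, and real routing, silencing, and inhibition are left out.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Deduplicate alerts by full label set, then group them by selected labels."""
    # Deduplicate: alerts with identical label sets are treated as one alert.
    unique = {}
    for alert in alerts:
        fingerprint = tuple(sorted(alert["labels"].items()))
        unique[fingerprint] = alert  # a later duplicate replaces the earlier one

    # Group: alerts that share values for every group_by label land together.
    groups = defaultdict(list)
    for alert in unique.values():
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical alerts; the second entry is an exact duplicate of the first.
alerts = [
    {"labels": {"alertname": "HighLatency", "cluster": "eu", "instance": "a"}},
    {"labels": {"alertname": "HighLatency", "cluster": "eu", "instance": "a"}},
    {"labels": {"alertname": "HighLatency", "cluster": "eu", "instance": "b"}},
    {"labels": {"alertname": "DiskFull", "cluster": "us", "instance": "c"}},
]
grouped = group_alerts(alerts, group_by=["alertname", "cluster"])
```

Grouping by a small set of labels means one notification can cover many related alerts, which is the main lever for reducing notification noise.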
Logging
Loki: Horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It indexes only the labels attached to each log stream, not the log contents, which keeps it cost-effective and easy to operate.
Promtail: Agent that ships local log contents to a Loki instance by discovering targets, attaching labels, and pushing log streams to Loki with batching and retries.
Loki Operator: Kubernetes operator that handles the deployment, configuration, and lifecycle management of Loki on Kubernetes.
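To make the "labels plus log lines" model above concrete, here is a minimal sketch that builds a request body in the JSON shape accepted by Loki's HTTP push endpoint (`/loki/api/v1/push`): a list of streams, each with a label set and a list of `[timestamp, line]` pairs, where the timestamp is a string of Unix nanoseconds. The helper name `build_push_payload` and the label values are hypothetical, and nothing is sent over the network.

```python
import json


def build_push_payload(labels, entries):
    """Build a push body: one stream with a label set and timestamped lines.

    `entries` is a list of (unix_nanoseconds, text) pairs.
    """
    values = [[str(unix_ns), text] for unix_ns, text in entries]
    return {"streams": [{"stream": dict(labels), "values": values}]}


# Hypothetical labels and a single log line.
payload = build_push_payload(
    {"job": "nginx", "env": "prod"},
    [(1700000000000000000, "GET /health 200")],
)
body = json.dumps(payload)  # this string would be the HTTP POST body
```

Only the label set identifies the stream; the log text itself travels as an opaque value, which reflects the design of indexing labels rather than contents.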
Tracing
Tempo: High-volume distributed tracing backend that needs only object storage to operate and integrates with Grafana, Prometheus, and Loki so traces can be correlated with metrics and logs.
Grafana Agent: Lightweight telemetry collector for sending metrics, logs, and traces to Grafana Cloud or open-source deployments, designed to use fewer resources than full-featured collectors.
OpenTelemetry: Native support for the OpenTelemetry Protocol (OTLP), enabling standardized collection of traces, metrics, and logs with vendor-neutral instrumentation and correlation across signals.
Metrics Storage
Mimir: Horizontally scalable, highly available, multi-tenant long-term storage for Prometheus metrics, built for very large deployments and compatible with Prometheus.
Cortex: Horizontally scalable, multi-tenant Prometheus-as-a-Service project for long-term metric storage with high availability and durability. Mimir began as a fork of Cortex and supersedes it within the Grafana stack.
Prometheus TSDB: Time series database built into Prometheus for efficient local storage and retrieval of metrics, with configurable retention and compression.
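One reason time series compress well is that samples usually arrive at a steady interval. The sketch below shows delta-of-delta encoding of timestamps, a technique from the family of encodings that Prometheus's chunk format draws on. It only demonstrates the arithmetic; the real on-disk format packs these values into variable-length bit fields, and the function names here are hypothetical.

```python
def delta_of_delta(timestamps):
    """Encode timestamps as: first value, then the change in successive deltas."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    prev_delta = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        encoded.append(delta - prev_delta)  # near zero for evenly spaced samples
        prev_delta = delta
    return encoded


def restore(encoded):
    """Invert delta_of_delta and recover the original timestamps."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        timestamps.append(timestamps[-1] + delta)
    return timestamps


# Samples scraped every 15 units, with the last one arriving 1 unit late.
samples = [1000, 1015, 1030, 1045, 1061]
encoded = delta_of_delta(samples)  # [1000, 15, 0, 0, 1]
```

After the first two entries, a regular scrape interval yields mostly zeros, and small values like these take very few bits to store.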
Data Sources & Integration
Prometheus data source: Native Grafana integration with Prometheus for querying metrics with PromQL, with support for alerting, annotations, and template variables for dynamic dashboards.
Loki data source: Built-in Grafana support for querying Loki logs with LogQL, enabling log exploration, filtering, and correlation with metrics and traces in unified dashboards.
Tempo data source: Native Grafana integration for querying distributed traces in Tempo, enabling trace visualization, service dependency graphs, and correlation with logs and metrics.
Plugins: Plugin ecosystem supporting 150+ data sources, including SQL databases, cloud providers, time series databases, and custom applications.
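Under the data source integrations above, queries are ultimately HTTP requests against each backend. As a rough illustration, the sketch below builds range-query URLs for Prometheus (`/api/v1/query_range`) and Loki (`/loki/api/v1/query_range`), both of which take `query`, `start`, `end`, and `step` parameters. The helper name, the localhost base URLs, and the metric and label names are assumptions for the example, and no request is actually made.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def range_query_url(base_url, path, query, start, end, step):
    """Return a range-query URL with the query expression safely URL-encoded."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}{path}?{params}"


# Hypothetical PromQL expression; Prometheus accepts Unix seconds for start/end.
promql = 'sum(rate(http_requests_total{job="api"}[5m]))'
prom_url = range_query_url(
    "http://localhost:9090", "/api/v1/query_range",
    promql, 1700000000, 1700003600, "60s",
)

# Hypothetical LogQL expression; Loki accepts Unix nanoseconds for start/end.
logql = '{job="api"} |= "error"'
loki_url = range_query_url(
    "http://localhost:3100", "/loki/api/v1/query_range",
    logql, 1700000000000000000, 1700003600000000000, "60s",
)
```

Encoding the expression with `urlencode` matters because PromQL and LogQL use braces, quotes, and pipes that are not valid in a raw URL.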
Visualization & Dashboards
Dashboards: Customizable, interactive dashboards that combine multiple visualizations, with support for templating, annotations, time range controls, and sharing across teams.
Panels: Visualization library including graphs, tables, heatmaps, gauges, and custom panel plugins for building tailored views of time series and tabular data.
Explore: Ad-hoc query interface for investigating metrics, logs, and traces without building dashboards, suited to troubleshooting and getting to know your data.
Annotations: Event markers overlaid on graphs to correlate deployments, incidents, and other events with metric changes, created either manually or automatically.
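Automated annotation creation, as mentioned above, usually means a deploy script posting an event to Grafana's annotations HTTP API (`POST /api/annotations`), whose JSON body carries an epoch-millisecond `time`, a list of `tags`, and a `text` field. The sketch below only builds such a body. The function name, service name, and version string are hypothetical, and authentication and the HTTP call itself are omitted.

```python
import json


def deployment_annotation(service, version, time_ms):
    """Build an annotation body marking a deployment at the given time."""
    return {
        "time": time_ms,                  # epoch milliseconds
        "tags": ["deployment", service],  # tags let dashboards filter annotations
        "text": f"Deployed {service} {version}",
    }


# Hypothetical deployment event.
annotation = deployment_annotation("checkout", "v1.4.2", 1700000000000)
annotation_body = json.dumps(annotation)  # would be sent as the POST body
```

Tagging each annotation consistently is what allows a dashboard to overlay only the relevant events, for example deployments of one service.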
Enterprise Features
Grafana OnCall: On-call management platform with alert routing, escalation policies, schedules, and integrations for managing incidents and reducing alert fatigue.
Grafana Incident: Incident management solution integrated with Grafana for declaring, managing, and resolving incidents, with automated runbooks and post-incident analysis.
Grafana SLO: Service Level Objective (SLO) tracking for measuring service reliability, error budgets, and Service Level Indicator (SLI) compliance, with automated alerting on error-budget burn.
Grafana Cloud: Fully managed Grafana, Prometheus, Loki, and Tempo as a service, with global data centers, automatic scaling, and integrated observability across the stack.
Access control: Enterprise-grade role-based access control (RBAC) with team permissions, LDAP/SAML/OAuth integration, and audit logging for compliance and security governance.
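The error-budget idea behind the SLO tracking described in this section comes down to simple arithmetic: an SLO target of 99.9% leaves 0.1% of requests as the allowed failure budget. The following is a generic sketch of that calculation, not the product's own implementation; the function name and the request counts are made up for illustration.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Return allowed failures and the fraction of the error budget consumed.

    `slo_target` is a fraction, e.g. 0.999 for a 99.9% objective.
    """
    allowed_failures = (1 - slo_target) * total_requests
    if allowed_failures == 0:
        consumed = float("inf") if failed_requests else 0.0
    else:
        consumed = failed_requests / allowed_failures
    return {
        "allowed_failures": allowed_failures,
        "budget_consumed": consumed,       # 1.0 means the budget is exhausted
        "budget_remaining": 1 - consumed,  # negative once the SLO is violated
    }


# Hypothetical window: 1,000,000 requests, 250 failures, 99.9% objective.
report = error_budget(0.999, 1_000_000, 250)
```

In this example roughly 1,000 failures are allowed, so 250 failures consume about a quarter of the budget. Alerting on budget burn amounts to watching how fast `budget_consumed` grows relative to the length of the SLO window.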
