# Data transformation
Data transformation allows you to process, enrich, and modify observability data to better suit your analysis and monitoring needs. The Giant Swarm Observability Platform offers several transformation approaches at different stages of the data pipeline.
## Transformation approaches

### Server-side transformations

Transform data before storage to improve performance and create derived metrics:
- Recording rules: Pre-compute complex PromQL expressions as new time series
- Relabeling rules: Modify, filter, or enrich metrics and logs during collection
- Data parsing: Extract structured data from logs and add contextual information
### Client-side transformations

Transform data during visualization for specific dashboard requirements:
- Grafana transformations: Real-time data processing in dashboards and panels
## Recording rules
Recording rules pre-compute frequently used or expensive PromQL expressions and store results as new time series. This improves dashboard performance and enables complex aggregations for alerting.
Recording rules are created using the same PrometheusRule resources as alerting rules and are covered in detail in our alert rules documentation.
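As a minimal sketch (the group name, metric, and tenant label here are illustrative), a recording rule that pre-computes a per-job request rate looks like this:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    observability.giantswarm.io/tenant: my_team
  name: my-recording-rules
  namespace: my-namespace
spec:
  groups:
    - name: my-application.rules
      rules:
        # Store the pre-computed per-job request rate as a new series,
        # so dashboards query it instead of re-evaluating the expression
        - record: job:http_requests:rate5m
          expr: sum by (job) (rate(http_requests_total[5m]))
```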
### Key benefits for data transformation
- Performance optimization: Pre-calculate expensive aggregations to speed up dashboards
- Simplified queries: Break complex expressions into manageable, reusable components
- Custom metrics creation: Combine multiple metrics into business-relevant indicators
- Consistent calculations: Ensure identical computation across dashboards and alerts
For comprehensive guidance on creating and managing recording rules, including examples and best practices, see the recording rules section in our alert rules documentation.
## Relabeling rules

Relabeling rules modify metric and log labels during collection, allowing you to filter data, add context, or standardize naming conventions before storage.

### Metrics relabeling

Configure relabeling in ServiceMonitors and PodMonitors to transform metrics during scraping. Use relabelings to rewrite target labels before the scrape, and metricRelabelings to rewrite metric names and labels after the scrape:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    observability.giantswarm.io/tenant: my_team
  name: application-metrics
  namespace: my-namespace
spec:
  endpoints:
    - path: /metrics
      port: metrics
      # Transform target labels during collection
      relabelings:
        # Add environment label based on namespace
        - sourceLabels: [__meta_kubernetes_namespace]
          targetLabel: environment
          regex: "production-(.*)"
          replacement: "prod"
      # Transform scraped samples (metric names and labels are only
      # available at this stage, after the scrape)
      metricRelabelings:
        # Drop sensitive metrics
        - sourceLabels: [__name__]
          regex: "secret_.*|password_.*"
          action: drop
        # Rename metric labels
        - sourceLabels: [application_name]
          targetLabel: app
          action: replace
  selector:
    matchLabels:
      app: my-application
```
### Log relabeling

Configure relabeling in PodLogs resources to enrich log metadata:
```yaml
apiVersion: monitoring.grafana.com/v1alpha2
kind: PodLogs
metadata:
  name: application-logs
  namespace: my-namespace
spec:
  relabelings:
    # Set tenant for data routing
    - action: replace
      replacement: my_team
      targetLabel: giantswarm_observability_tenant
    # Add application version from pod labels
    - sourceLabels: [__meta_kubernetes_pod_label_version]
      targetLabel: app_version
      action: replace
    # Extract service name from pod name
    - sourceLabels: [__meta_kubernetes_pod_name]
      targetLabel: service
      regex: "(.+)-[0-9a-f]+-[0-9a-z]{5}"
      replacement: "${1}"
  selector:
    matchLabels:
      app: my-application
```
For detailed relabeling configuration, see the Prometheus relabeling documentation.
## Data parsing and enrichment

Transform unstructured logs into structured data using LogQL parsers and extract meaningful information for analysis.

### JSON parsing

Extract fields from JSON-formatted logs:
```logql
# Parse JSON logs and extract specific fields
{app="my-application"}
  | json
  | level = "error"
  | line_format "{{.timestamp}} [{{.level}}] {{.component}}: {{.message}}"
```
### Pattern extraction

Use the LogQL pattern parser to extract fields from unstructured logs:
```logql
# Extract HTTP request details from access logs
{job="nginx"}
  | pattern `<ip> - - [<timestamp>] "<method> <uri> <protocol>" <status> <bytes>`
  | status >= 400
```
### Label enhancement

Add contextual information during log processing:
```logql
# Add severity based on log level
{app="my-application"}
  | json level
  | label_format severity=`{{ if eq .level "error" }}critical{{ else if eq .level "warn" }}warning{{ else }}info{{ end }}`
```
For advanced LogQL techniques, see our advanced LogQL tutorial.
## Grafana transformations

Grafana transformations process data client-side during visualization, enabling real-time calculations and formatting without modifying stored data.

### Common transformation use cases
- Calculate derived values: Create ratios, percentages, or growth rates
- Merge data sources: Combine metrics and logs in single visualizations
- Format for presentation: Rename fields, apply units, or create custom formatting
- Filter and aggregate: Focus on specific data subsets or summary statistics
### Example transformations

Calculate error percentage:

1. Query total requests: `sum(rate(http_requests_total[5m]))`
2. Query error requests: `sum(rate(http_requests_total{status=~"5.."}[5m]))`
3. Apply the “Add field from calculation” transformation
4. Set the formula to `Error Rate = (Error Requests / Total Requests) * 100` (see the server-side equivalent below)
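When the same ratio is needed in several dashboards, the calculation above can instead run server-side. As a sketch, a single PromQL expression (which could also back a recording rule, as shown earlier) would be:

```promql
# Server-side equivalent of the client-side error-rate calculation
100 * sum(rate(http_requests_total{status=~"5.."}[5m]))
    / sum(rate(http_requests_total[5m]))
```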
Merge time series data:

1. Query multiple metrics with different time ranges
2. Apply the “Merge” transformation to combine series
3. Use “Organize fields” to rename and reorder columns
### Performance considerations
- Use sparingly: Client-side transformations impact dashboard performance
- Prefer recording rules: For frequently used calculations, create recording rules instead
- Consider data volume: Large datasets may cause browser performance issues
For comprehensive transformation examples, see the Grafana transformations documentation.
## Best practices

### Performance optimization
- Use recording rules for expensive calculations used in multiple dashboards
- Apply relabeling early in the pipeline to reduce storage and network overhead
- Limit transformation complexity in Grafana to maintain dashboard responsiveness
### Data quality
- Validate transformations in test environments before production deployment
- Monitor transformation impact on resource usage and query performance
- Document transformation logic for maintenance and troubleshooting
### Security and compliance
- Drop sensitive data early in the pipeline using relabeling rules
- Standardize labeling across teams to improve data discoverability
- Use tenant isolation to ensure data transformation doesn’t cross tenant boundaries
## Next steps
- Learn how to create effective dashboards with your transformed data
- Set up alerting rules using recording rules for better performance
For questions about data transformation, contact your Giant Swarm support team or explore our community resources.