Last modified October 7, 2025
Observe your clusters and apps
Giant Swarm’s observability platform gives you immediate visibility into your clusters and applications. Every cluster comes with automatic monitoring configured out of the box, collecting metrics and logs that help you understand system health and application performance. Additionally, you can request distributed tracing to help you understand request flows across your microservices.
This guide will walk you through your first steps with the observability platform, from accessing pre-built dashboards to setting up custom monitoring for your applications.
What you get automatically
Every Giant Swarm cluster includes:
- Platform metrics: CPU, memory, disk, and network metrics from all cluster nodes
- Kubernetes metrics: Pod status, deployments, services, and cluster events
- Application logs: Automatic collection from all pods in your cluster
- Distributed tracing: Trace collection for understanding request flows. To enable on your clusters, ask your account engineer.
- Pre-built dashboards: Ready-to-use visualizations for infrastructure and platform components
- Default alerts: Proactive notifications for common infrastructure issues
- Secure access: Integration with your organization’s identity provider
Prerequisites
Before you start:
- Running workload cluster: If you don’t have one, create a workload cluster first
- Sample application: Deploy the `hello-world` application or use your own instrumented application
- Local tools: Install `jq` for command-line JSON processing
Important: If you’re using your own application, ensure it’s instrumented to export metrics. Adding new metrics impacts platform costs, so choose your metrics thoughtfully.
Step 1: Access your observability platform
Start by accessing Grafana, your main interface for observability data:
- Find your Grafana URL: Check the ingress resource in the `monitoring` namespace of your management cluster
- Log in: Use your organization’s identity provider credentials
- Explore pre-built dashboards: Navigate to Dashboards → Giant Swarm Public Dashboards
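To find the Grafana URL from the command line, you can inspect the ingress resource mentioned above. A minimal sketch, assuming `kubectl` is configured against your management cluster (the exact ingress name may differ in your installation):

```shell
# List ingress resources in the monitoring namespace;
# the HOSTS column shows your Grafana hostname
kubectl get ingress --namespace monitoring

# Or extract just the hostnames with jq
kubectl get ingress --namespace monitoring --output json \
  | jq -r '.items[].spec.rules[].host'
```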
You’ll immediately see dashboards showing:
- Cluster overview: Resource usage, node health, and capacity
- Kubernetes metrics: Pod status, deployment health, and service performance
- Infrastructure metrics: CPU, memory, disk, and network across all nodes
- Platform components: Ingress controllers, DNS, and other system services
Step 2: Set up application metrics collection
To monitor your specific applications, you need to configure metrics collection. For the hello-world application, enable the built-in service monitor by setting serviceMonitor.enabled to true in your Helm values.
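For the `hello-world` chart, that toggle looks like the following sketch in your Helm values (check the chart’s `values.yaml` for the exact key path in your version):

```yaml
# values.yaml for the hello-world chart
serviceMonitor:
  enabled: true
```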
For custom applications, create a ServiceMonitor resource:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    observability.giantswarm.io/tenant: my-team
    app.kubernetes.io/instance: my-service
  name: my-service
  namespace: my-namespace
spec:
  endpoints:
    - interval: 60s
      path: /metrics
      port: web
  selector:
    matchLabels:
      app.kubernetes.io/instance: my-service
```
Key configuration points:
- Tenant label: The `observability.giantswarm.io/tenant: my-team` label is required for metrics routing and data isolation
- Scrape interval: Metrics are collected every 60 seconds (adjust based on your needs)
- Metrics endpoint: The `/metrics` path should expose Prometheus-format metrics
- Port reference: Use the port name from your service definition
- Label selector: Must match your application’s labels for discovery
Apply this configuration to start collecting metrics from your application.
⚠️ Need more advanced ingestion? See our comprehensive Data Ingestion guide for logs, metrics, and traces.
Step 3: Explore your metrics
Once metrics collection is configured, explore your data using Grafana’s Explore view:
- Open Explore: In Grafana, click Explore in the left sidebar
- Select data source: Choose the appropriate Prometheus data source
- Query your metrics: Use PromQL to query application metrics

Try these sample queries to get started:
- `rate(http_requests_total[5m])` - Request rate over the last five minutes
- `histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))` - 95th percentile response time (rate the buckets first so the quantile reflects recent traffic, not the all-time distribution)
- `up{job="my-service"}` - Check if your service is up and running
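To build intuition for what `rate()` returns, here is an illustrative Python sketch (not platform code) that computes a per-second rate from counter samples the way Prometheus does conceptually:

```python
def per_second_rate(samples):
    """Approximate PromQL rate(): per-second increase of a counter over a
    window, given (timestamp_seconds, counter_value) samples in order."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    increase = v1 - v0
    if increase < 0:       # counter reset: Prometheus compensates precisely;
        increase = v1      # this sketch uses the simplest approximation
    return increase / (t1 - t0)

# A counter that went from 1200 to 1500 requests over 300 s -> 1.0 req/s
print(per_second_rate([(0, 1200), (300, 1500)]))
```

This is also why `rate()` is the right tool for ever-increasing counters like `http_requests_total`: it turns raw cumulative values into a traffic rate and tolerates counter resets.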
New to PromQL? Check our Advanced PromQL Tutorial for detailed guidance.
Step 4: Monitor application logs
Giant Swarm automatically collects logs from all pods. To access your application logs:
- Navigate to Explore: Select the Loki data source
- Use LogQL queries: Search and filter your logs
Try these sample queries:
- `{namespace="my-namespace"}` - All logs from your namespace
- `{app="my-app"} |= "error"` - Error messages from your application
- `{namespace="my-namespace"} | json | level="error"` - Structured log parsing for error levels
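Conceptually, the `| json | level="error"` pipeline parses each log line as JSON and keeps only lines whose `level` field is `error`. An illustrative Python sketch (not how Loki is implemented, just the semantics):

```python
import json

def filter_errors(raw_lines):
    """Mimic LogQL's `| json | level="error"`: parse each line as JSON,
    keep entries labelled level=error, skip unparseable lines."""
    out = []
    for line in raw_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # LogQL flags a parse error label; we simply skip
        if entry.get("level") == "error":
            out.append(entry["msg"])
    return out

lines = [
    '{"level": "info",  "msg": "started"}',
    '{"level": "error", "msg": "db timeout"}',
    'not json at all',
]
print(filter_errors(lines))  # -> ['db timeout']
```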
Want to learn more? Our Advanced LogQL Tutorial covers complex log queries and analysis.
Step 5: Set up distributed tracing (Optional)
Distributed tracing helps you understand how requests flow through your microservices architecture. Giant Swarm’s observability platform supports tracing through Tempo and OpenTelemetry.
Important: Tracing is available in alpha from cluster release v31 and fully supported from v33. It must be enabled per installation; ask your account engineer to enable tracing in your observability platform.
Prerequisites for tracing
- Cluster version: v31+ (alpha) or v33+ (full support)
- Tracing enabled: Contact your account engineer to enable tracing for your cluster
- Application instrumentation: Your applications must be instrumented with OpenTelemetry
Configure trace collection
Once tracing is enabled, configure your applications to send traces to the cluster-local OpenTelemetry (OTLP) endpoint. The example below uses environment variables to configure the OTLP endpoint, but your application might need to be configured differently.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-traced-app
spec:
  selector:
    matchLabels:
      app: my-traced-app
  template:
    metadata:
      labels:
        # Required for trace data routing
        observability.giantswarm.io/tenant: my-team
        app: my-traced-app
    spec:
      containers:
        - name: app
          image: my-app:latest
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://otlp-gateway.kube-system.svc:4318" # 4318 for http, or 4317 for grpc
            - name: OTEL_RESOURCE_ATTRIBUTES
              value: "service.name=my-traced-app"
```
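Under the hood, an OTLP/HTTP exporter POSTs spans to the gateway’s `/v1/traces` path. A minimal Python sketch of the JSON payload shape for a single span (field names follow the OTLP specification; the IDs and timestamps are placeholders, not values your SDK would produce):

```python
import json

# Sketch of an OTLP/HTTP JSON payload carrying one server span.
payload = {
    "resourceSpans": [{
        "resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "my-traced-app"}},
        ]},
        "scopeSpans": [{"spans": [{
            "traceId": "0123456789abcdef0123456789abcdef",  # 16-byte hex
            "spanId": "0123456789abcdef",                   # 8-byte hex
            "name": "GET /hello",
            "kind": 2,  # SPAN_KIND_SERVER
            "startTimeUnixNano": "1700000000000000000",
            "endTimeUnixNano": "1700000000050000000",
        }]}],
    }],
}

# An exporter would POST this (or its protobuf form) to
#   http://otlp-gateway.kube-system.svc:4318/v1/traces
print(json.dumps(payload)[:60])
```

In practice the OpenTelemetry SDK builds and sends this for you; the env vars in the Deployment above tell it where to deliver the data and which `service.name` to attach.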
Explore traces with TraceQL
Once traces are flowing, explore them in Grafana:
- Navigate to Explore: Select the Tempo data source
- Use TraceQL queries: Search and analyze your traces
Try these example queries:
- `{resource.service.name = "my-traced-app"}` - All traces from the service “my-traced-app”
- `{trace.duration > 5s}` - Find slow requests
- `{status = error}` - Find failed traces
- `{span.http.method = "POST" && span.http.status_code >= 400}` - Failed HTTP requests
Service graph insights
Tempo automatically generates service graphs from your trace data, showing:
- Service topology: How your services communicate
- Request rates: Traffic between services
- Error rates: Failed requests between services
- Performance metrics: Response times and throughput
Access the service graph in Grafana under the Tempo data source’s “Service Graph” tab.
Want to learn more? Our Advanced TraceQL Tutorial covers complex trace queries and analysis.
Step 6: Review pre-built dashboards
The platform provides comprehensive pre-built dashboards for immediate insights. Access them in Grafana under Dashboards → Giant Swarm Public Dashboards:
Infrastructure dashboards:
- Cluster overview: High-level cluster health and resource usage
- Node metrics: Individual node performance and capacity
- Kubernetes resources: Pod, deployment, and service status
Platform component dashboards:
- Ingress controller: Request rates, response times, and error rates
- DNS performance: Query success rates and response times
- Flux GitOps: Deployment status and reconciliation metrics
Application insights:
Below is an example showing `hello-world` application metrics, including the cardinality of the `promhttp_metric_handler_requests_total` metric:

These dashboards give you instant visibility into system health and help identify patterns or issues quickly.
Step 7: Create your first custom dashboard
While pre-built dashboards provide great starting points, you’ll likely want custom visualizations for your specific applications and workflows.
Method 1: Create in Grafana UI (Quick start)
- Create new dashboard: In Grafana, click + → Dashboard
- Add panels: Choose visualization types (graphs, tables, stats)
- Configure queries: Use PromQL for metrics, LogQL for logs, or TraceQL for traces
- Save dashboard: Name and organize your dashboard
Your dashboard automatically persists in the platform’s PostgreSQL storage with regular backups.
Method 2: GitOps approach (Recommended)
For production environments, treat dashboards as code:
- Export dashboard JSON: Use Share → Export from any dashboard
- Store in Git: Version control your dashboard definitions
- Deploy via ConfigMaps: Use Kubernetes resources to deploy dashboards
- Automate updates: Integrate with your CI/CD pipeline
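A dashboard ConfigMap can look like the following sketch. The `grafana_dashboard: "1"` label is the common Grafana sidecar convention; verify the labels and namespace your installation expects against the dashboard creation guide:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-service-dashboard
  namespace: my-namespace
  labels:
    # Label the Grafana dashboard sidecar watches for (convention; confirm
    # the exact label for your installation)
    grafana_dashboard: "1"
data:
  # Key is the dashboard file name; value is the exported dashboard JSON
  my-service.json: |
    { "title": "My Service", "panels": [] }
```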
Download our example dashboard or check our comprehensive dashboard creation guide for detailed instructions.
Import the example dashboard
To quickly get started with a custom dashboard:
- Open import: Go to Dashboards → New → Import

- Upload JSON: Load the example dashboard file

- Explore the result: Your new dashboard shows application-specific metrics

The example dashboard demonstrates key application metrics like error rates, success rates, and request distribution by HTTP status code.
What’s next
Now that you’re monitoring your clusters and applications, explore these advanced capabilities:
Enhance your observability
- Set up alerting: Get notified before issues impact users
- Learn advanced querying: Master PromQL and LogQL for deeper insights
- Configure data ingestion: Collect custom logs, metrics, and traces
- Explore data transformation: Optimize metrics storage and processing
Integrate external tools
- Import/export data: Connect external systems and analysis tools
- Set up multi-tenancy: Organize data access for teams and environments
Ready to explore platform security? Learn more in the security overview.
Need help, got feedback?
We listen in your Slack support channel. You can also reach us at support@giantswarm.io. And of course, we welcome your pull requests!