Last modified July 24, 2025
Observe your clusters and apps
Giant Swarm’s observability platform gives you immediate visibility into your clusters and applications. Every cluster comes with automatic monitoring configured out of the box, collecting metrics, logs, and traces that help you understand system health and application performance.
This guide will walk you through your first steps with the observability platform, from accessing pre-built dashboards to setting up custom monitoring for your applications.
What you get automatically
Every Giant Swarm cluster includes:
- Platform metrics: CPU, memory, disk, and network metrics from all cluster nodes
- Kubernetes metrics: Pod status, deployments, services, and cluster events
- Application logs: Automatic collection from all pods in your cluster
- Pre-built dashboards: Ready-to-use visualizations for infrastructure and platform components
- Default alerts: Proactive notifications for common infrastructure issues
- Secure access: Integration with your organization’s identity provider
Prerequisites
Before you start:
- Running workload cluster: If you don’t have one, create a workload cluster first
- Sample application: Deploy the `hello-world` application or use your own instrumented application
- Local tools: Install `jq` for command-line JSON processing
Important: If you’re using your own application, ensure it’s instrumented to export metrics. Adding new metrics impacts platform costs, so choose your metrics thoughtfully.
Step 1: Access your observability platform
Start by accessing Grafana, your main interface for observability data:
- Find your Grafana URL: Check the ingress resource in the `monitoring` namespace of your management cluster (see the command sketch after this list)
- Log in: Use your organization’s identity provider credentials
- Explore pre-built dashboards: Navigate to Dashboards → Giant Swarm Public Dashboards
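To carry out the first step, the command below is a minimal sketch, assuming your kubeconfig points at the management cluster; the exact ingress name can differ per installation:

```sh
# List ingresses in the monitoring namespace; the Grafana hostname
# appears in the HOSTS column.
kubectl get ingress --namespace monitoring

# Or extract the hostnames directly with jq (from the prerequisites)
kubectl get ingress --namespace monitoring --output json \
  | jq -r '.items[].spec.rules[].host'
```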
You’ll immediately see dashboards showing:
- Cluster overview: Resource usage, node health, and capacity
- Kubernetes metrics: Pod status, deployment health, and service performance
- Infrastructure metrics: CPU, memory, disk, and network across all nodes
- Platform components: Ingress controllers, DNS, and other system services
Step 2: Set up application metrics collection
To monitor your specific applications, you need to configure metrics collection. For the `hello-world` application, enable the built-in service monitor by setting `serviceMonitor.enabled` to `true` in your Helm values.
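For example, a minimal Helm values snippet for this, assuming the chart nests the setting exactly as `serviceMonitor.enabled`, looks like:

```yaml
# values.yaml (excerpt) for the hello-world chart
serviceMonitor:
  enabled: true
```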
For custom applications, create a `ServiceMonitor` resource:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    observability.giantswarm.io/tenant: my-team
    app.kubernetes.io/instance: my-service
  name: my-service
  namespace: my-namespace
spec:
  endpoints:
    - interval: 60s
      path: /metrics
      port: web
  selector:
    matchLabels:
      app.kubernetes.io/instance: my-service
```
Key configuration points:
- Tenant label: The `observability.giantswarm.io/tenant: my-team` label is required for metrics routing and data isolation
- Scrape interval: Metrics are collected every 60 seconds (adjust based on your needs)
- Metrics endpoint: The `/metrics` path should expose Prometheus-format metrics
- Port reference: Use the port name from your service definition
- Label selector: Must match your application’s labels for discovery
Apply this configuration to start collecting metrics from your application.
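Assuming you saved the manifest above as `servicemonitor.yaml` (the filename is up to you), applying and verifying it looks like this:

```sh
# Create the ServiceMonitor in its namespace
kubectl apply -f servicemonitor.yaml

# Confirm it was created and is available for discovery
kubectl get servicemonitor --namespace my-namespace
```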
⚠️ Need more advanced ingestion? See our comprehensive Data Ingestion guide for metrics and logs.
Step 3: Explore your metrics
Once metrics collection is configured, explore your data using Grafana’s Explore view:
- Open Explore: In Grafana, click Explore in the left sidebar
- Select data source: Choose the appropriate Prometheus data source
- Query your metrics: Use PromQL to query application metrics
Try these sample queries to get started:
- `rate(http_requests_total[5m])` - Request rate over the last five minutes
- `histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))` - 95th percentile response time
- `up{job="my-service"}` - Check if your service is up and running
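These building blocks compose naturally. As a sketch (the metric name and the `status` label are illustrative; check what your application actually exports), the share of requests answered with a 5xx status over the last five minutes can be computed like this:

```promql
# Fraction of requests that returned a server error in the last 5 minutes
sum(rate(http_requests_total{status=~"5.."}[5m]))
  /
sum(rate(http_requests_total[5m]))
```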
New to PromQL? Check our Advanced PromQL Tutorial for detailed guidance.
Step 4: Monitor application logs
Giant Swarm automatically collects logs from all pods. To access your application logs:
- Navigate to Explore: Select the Loki data source
- Use LogQL queries: Search and filter your logs
Try these sample queries:
- `{namespace="my-namespace"}` - All logs from your namespace
- `{app="my-app"} |= "error"` - Error messages from your application
- `{namespace="my-namespace"} | json | level="error"` - Structured log parsing for error levels
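Log queries can be aggregated in a similar way. The sketch below (label values are illustrative) counts log lines containing "error" per five-minute window across a namespace:

```logql
# Number of "error" log lines per 5-minute window
sum(count_over_time({namespace="my-namespace"} |= "error" [5m]))
```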
Want to learn more? Our Advanced LogQL Tutorial covers complex log queries and analysis.
Step 5: Review pre-built dashboards
The platform provides comprehensive pre-built dashboards for immediate insights. Access them in Grafana under Dashboards → Giant Swarm Public Dashboards:
Infrastructure dashboards:
- Cluster overview: High-level cluster health and resource usage
- Node metrics: Individual node performance and capacity
- Kubernetes resources: Pod, deployment, and service status
Platform component dashboards:
- Ingress controller: Request rates, response times, and error rates
- DNS performance: Query success rates and response times
- Flux GitOps: Deployment status and reconciliation metrics
Application insights:
For example, you can inspect `hello-world` application metrics, including the cardinality of the `promhttp_metric_handler_requests_total` metric.
These dashboards give you instant visibility into system health and help identify patterns or issues quickly.
Step 6: Create your first custom dashboard
While pre-built dashboards provide great starting points, you’ll likely want custom visualizations for your specific applications and workflows.
Method 1: Create in Grafana UI (Quick start)
- Create new dashboard: In Grafana, click + → Dashboard
- Add panels: Choose visualization types (graphs, tables, stats)
- Configure queries: Use PromQL for metrics or LogQL for logs
- Save dashboard: Name and organize your dashboard
Your dashboard automatically persists in the platform’s PostgreSQL storage with regular backups.
Method 2: GitOps approach (Recommended)
For production environments, treat dashboards as code:
- Export dashboard JSON: Use Share → Export from any dashboard
- Store in Git: Version control your dashboard definitions
- Deploy via ConfigMaps: Use Kubernetes resources to deploy dashboards (see the sketch after this list)
- Automate updates: Integrate with your CI/CD pipeline
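As a minimal sketch of the ConfigMap step: the manifest below assumes dashboards are discovered from ConfigMaps carrying a discovery label. The label key shown is the common Grafana sidecar convention and may differ on your installation; the dashboard creation guide documents the exact labels to use.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-service-dashboard
  namespace: my-namespace
  labels:
    # Assumed discovery label; confirm the expected label in the
    # dashboard creation guide for your installation.
    grafana_dashboard: "1"
data:
  # Paste the JSON exported via Share → Export as the value below.
  my-service-dashboard.json: |
    {}
```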
Download our example dashboard or check our comprehensive dashboard creation guide for detailed instructions.
Import the example dashboard
To quickly get started with a custom dashboard:
- Open import: Go to Dashboards → New → Import
- Upload JSON: Load the example dashboard file
- Explore the result: Your new dashboard shows application-specific metrics
The example dashboard demonstrates key application metrics like error rates, success rates, and request distribution by HTTP status code.
What’s next
Now that you’re monitoring your clusters and applications, explore these advanced capabilities:
Enhance your observability
- Set up alerting: Get notified before issues impact users
- Learn advanced querying: Master PromQL and LogQL for deeper insights
- Configure data ingestion: Collect custom metrics and logs
- Explore data transformation: Optimize metrics storage and processing
Integrate external tools
- Import/export data: Connect external systems and analysis tools
- Set up multi-tenancy: Organize data access for teams and environments
Ready to explore platform security? Learn more in the security overview.
Need help, got feedback?
We listen to your Slack support channel. You can also reach us at support@giantswarm.io. And of course, we welcome your pull requests!