Last modified July 24, 2025

Observe your clusters and apps

Your Giant Swarm clusters are automatically connected to the observability platform the moment they’re created. This means you can start monitoring your infrastructure and applications immediately, without any additional setup or configuration.

This tutorial shows you how to access and use the monitoring capabilities that are ready and waiting for you.

Access your Grafana instance

Your observability platform is accessible through your dedicated Grafana instance:

Find your Grafana URL: Your Grafana instance is available at https://grafana.<your-base-domain>
Log in: Use your organization’s single sign-on (SSO) credentials
Start exploring: You’ll see a familiar Grafana interface with data already flowing in

🔗 Need help with access? Check our data exploration guide for detailed authentication steps.

Explore pre-configured dashboards

Giant Swarm provides production-ready dashboards that give you immediate insights into your clusters and workloads.

Platform dashboards

These dashboards show the health and performance of your Giant Swarm platform:

Cluster Overview: High-level view of all your clusters, their status, and resource utilization
Node Monitoring: Detailed metrics for cluster nodes including CPU, memory, disk, and network
Kubernetes Monitoring: Pod status, deployments, services, and Kubernetes-specific metrics
Platform Health: Giant Swarm platform components and their operational status

Application dashboards

Monitor your workloads with dashboards designed for application observability:

Application Performance: Response times, throughput, and error rates for your services
Resource Usage: CPU, memory, and storage consumption by your applications
Network Traffic: Service-to-service communication and external connectivity
Log Analysis: Application logs with filtering and search capabilities

Finding and using dashboards

Browse by folder: Dashboards are organized in folders like “Platform”, “Applications”, and “Infrastructure”
Use search: Find specific dashboards using the search bar
Filter by tags: Dashboards are tagged by cluster, namespace, or application type
Bookmark favorites: Star the dashboards you use most frequently

Monitor cluster health

Key metrics to watch

Your clusters generate hundreds of metrics, but here are the most important ones to start with:

Cluster-level health:

Overall cluster status and node availability
Resource utilization across the cluster
Network connectivity and performance

Node-level health:

CPU and memory usage per node
Disk space and I/O performance
Network traffic and errors

Application health:

Pod restart counts and failure rates
Resource requests vs. actual usage
Service response times and error rates

Quick health checks

Use these dashboards for rapid health assessment:

Start with the Cluster Overview to get a bird’s-eye view
Drill down to Node Monitoring if you see resource issues
Check Application Performance for workload-specific problems
Review Platform Health if there are cluster-wide issues

View application logs

Accessing logs

Your application logs are automatically collected and available in Grafana:

Go to the Explore tab in Grafana
Select the Loki data source for log queries
Use LogQL queries to filter and search your logs
Apply time ranges to focus on specific incidents

Common log queries

Here are some useful LogQL queries to get you started:

# All logs from a specific cluster
{cluster_id="your-cluster-name"}

# Application logs by namespace
{cluster_id="your-cluster-name", namespace="your-app-namespace"}

# Error logs only
{cluster_id="your-cluster-name"} |= "error"

# Logs from specific pods
{cluster_id="your-cluster-name", pod=~"your-app-.*"}

🎓 Want to learn more? Check our Advanced LogQL Tutorial for powerful log analysis techniques.

Get alerted on issues

Default alert rules

Your clusters come with essential alert rules already configured:

Node down: Get notified when cluster nodes become unavailable
High resource usage: Alerts when CPU or memory usage exceeds thresholds
Pod failures: Notifications about failing or repeatedly restarting pods
Disk space: Warnings when storage is running low

Viewing active alerts

Check the Alerting section in Grafana’s main menu
Review Alert Rules to see what’s being monitored
Check Alert Groups to see current alert status
Configure contact points to receive notifications

Customizing alerts

While default alerts cover essential monitoring, you can customize them for your needs:

Adjust thresholds based on your applications’ normal behavior
Add application-specific alerts for business metrics
Configure notification channels like Slack, email, or PagerDuty
Set up alert silences during maintenance windows

🔧 Ready to customize? Learn more in our alert management documentation.

Monitor application performance

Application metrics

If your applications expose metrics (using Prometheus format), they’re automatically collected and available for monitoring:

HTTP request metrics: Response times, status codes, throughput
Business metrics: Custom metrics specific to your application logic
Runtime metrics: Garbage collection, memory usage, thread counts (for applicable languages)

Adding metrics to your apps

To get metrics from your applications:

Expose metrics endpoint: Configure your app to serve metrics at /metrics
Add ServiceMonitor: Create a Kubernetes ServiceMonitor resource to tell the platform about your metrics
View in Grafana: Your metrics will appear in the Prometheus data source within minutes

Example ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-metrics
  namespace: my-app-namespace
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
  - port: metrics
    path: /metrics

🚀 Need help with metrics? Our data ingestion guide covers this in detail.

Next steps

Now that you’re monitoring your clusters and applications, here are some next steps to get even more value from the observability platform:

Explore deeper

Learn advanced querying with PromQL and LogQL
Create custom dashboards tailored to your team’s needs
Set up multi-tenancy to organize data by team or environment

Integrate external data

Import external logs from SaaS applications or other infrastructure
Export data to external monitoring tools or analytics platforms
Transform data to match your organization’s standards

Scale your monitoring

Configure advanced alerting with complex conditions and routing
Manage alert routing to ensure the right teams get notified
Organize teams with separate Grafana organizations

The observability platform grows with your needs - start simple and add complexity as your monitoring requirements evolve.

Need help, got feedback?

We listen to your Slack support channel. You can also reach us at support@giantswarm.io. And of course, we welcome your pull requests!

Giant Swarm Offerings