Young Gyu Kim <credemol@gmail.com>

Automating Grafana Resource Provisioning with Helm Charts

On this page

Overview
Grafana Datasources
- Key Capabilities
- Example: Prometheus Datasource Configuration
Grafana Dashboard Providers
- Key Capabilities
- Example: Dashboard Provider Configuration
Grafana Dashboards
Grafana Alerting
Contact Points
- Key Capabilities
- Example: Email Contact Point Configuration
Alert Rules
- Key Capabilities
- Example: Alert Rules for JVM Memory and CPU Usage
Conclusion

Overview

This guide outlines how to provision Grafana resources using Helm charts in a Kubernetes environment. The following Grafana components are covered:

Datasources
Dashboard Providers
Dashboards
Alerting Contact Points
Alerting Rules

For more details on Grafana provisioning, refer to the official documentation:

https://grafana.com/docs/grafana/latest/administration/provisioning/

Grafana Datasources

A Grafana datasource defines how Grafana connects to external systems such as Prometheus, Elasticsearch, or MySQL to retrieve data for dashboards and panels.

Key Capabilities

Connect to external data sources
Configure authentication and access modes
Query and visualize metrics
Support various data source types (e.g., Prometheus, Loki)

Example: Prometheus Datasource Configuration

Here, we will configure a Prometheus datasource for Grafana using a Helm chart. This datasource will be used to query metrics collected by Prometheus.

values.yaml - datasources

datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
    - name: o11y-prometheus
      type: prometheus
      url: http://prometheus:9090
      uid: o11y-prometheus
      # Access mode - proxy (server in the UI) or direct (browser in the UI).
      access: proxy
      isDefault: true

Figure 1. Grafana Datasource

Grafana Dashboard Providers

Dashboard providers in Grafana enable you to load dashboards from files at startup. This allows dashboards to be managed declaratively, which is ideal for CI/CD pipelines or GitOps workflows.

Key Capabilities

Load dashboards automatically from JSON files
Group dashboards into folders
Control editability and deletion
Integrate with Infrastructure as Code tools

Example: Dashboard Provider Configuration

The example below shows how to configure a dashboard provider in Grafana using a Helm chart. This configuration will load dashboards from a specified directory and make them available in Grafana.

values.yaml - dashboardProviders

dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
      - name: default
        orgId: 1
        folder: ''
        type: file
        disableDeletion: false
        editable: true
        options:
          path: /var/lib/grafana/dashboards/default

Grafana Dashboards

Dashboards are visual interfaces that display data through panels such as graphs, tables, and gauges. They provide real-time observability and insights for developers and operators

Key Capabilities

Aggregate and visualize telemetry data
Organize data into panels and layouts
Share and reuse across teams

Example 1: Inline JSON Dashboard Configuration

RAW_JSON is a JSON representation of a Grafana dashboard. The example below shows how to configure a dashboard in Grafana using a Helm chart. This configuration will load a dashboard from a JSON file and make it available in Grafana.

values.yaml - dashboards

dashboards:
  # default is the name of the dashboard provider
  default:
    some-dashboard:
      json: |
        $RAW_JSON

This way is more straightforward for simple dashboards, but it can become cumbersome for larger or more complex dashboards. In this document, we will focus on using ConfigMaps to manage Grafana dashboards, which is a more scalable approach.

Example 2: Using a ConfigMap to Load Dashboards

The example below shows how to configure a dashboard in Grafana using a Helm chart with the dashboardConfigMaps section. This configuration will load a dashboard from a ConfigMap and make it available in Grafana.

values.yaml - dashboardConfigMaps

dashboardsConfigMaps:
  # default is the name of the dashboard provider
  # grafana-default-dashboards is the name of the ConfigMap containing the dashboard JSON files
  default: grafana-default-dashboards

Creating the ConfigMap

Use the following command to generate a ConfigMap containing all dashboard JSON files:

$ kubectl -n o11y create configmap grafana-default-dashboards --from-file=./dashboards/

This command creates a ConfigMap named grafana-default-dashboards in the o11y namespace, containing all JSON files from the ./dashboards/ directory. Grafana will automatically load these dashboards when it starts.

View Details - jvm-memory-dashboard.json

jvm-memory-dashboard.json

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": {
          "type": "grafana",
          "uid": "-- Grafana --"
        },
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 0,
  "id": 3,
  "links": [],
  "panels": [
    {
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 3,
      "panels": [],
      "title": "JVM Metrics",
      "type": "row"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "o11y-prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisBorderShow": false,
            "axisCenteredZero": false,
            "axisColorMode": "text",
            "axisLabel": "Memory(MB)",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "barWidthFactor": 0.6,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "insertNulls": false,
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green"
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": [
          {
            "__systemRef": "hideSeriesFrom",
            "matcher": {
              "id": "byNames",
              "options": {
                "mode": "exclude",
                "names": [
                  "otel-spring-example-57d5cc6b88-p6hj9"
                ],
                "prefix": "All except:",
                "readOnly": true
              }
            },
            "properties": [
              {
                "id": "custom.hideFrom",
                "value": {
                  "legend": false,
                  "tooltip": false,
                  "viz": false
                }
              }
            ]
          }
        ]
      },
      "gridPos": {
        "h": 7,
        "w": 11,
        "x": 0,
        "y": 1
      },
      "id": 2,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom",
          "showLegend": true
        },
        "tooltip": {
          "hideZeros": false,
          "mode": "single",
          "sort": "none"
        }
      },
      "pluginVersion": "12.0.0",
      "targets": [
        {
          "editorMode": "code",
          "expr": "sum by(pod) (jvm_memory_used_bytes{jvm_memory_type=\"heap\"} / (1024 * 1024))",
          "legendFormat": "__auto",
          "range": true,
          "refId": "A"
        }
      ],
      "title": "JVM Used Memory(MB)",
      "type": "timeseries"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "o11y-prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisBorderShow": false,
            "axisCenteredZero": false,
            "axisColorMode": "text",
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "barWidthFactor": 0.6,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "insertNulls": false,
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green"
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 11,
        "x": 11,
        "y": 1
      },
      "id": 1,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom",
          "showLegend": true
        },
        "tooltip": {
          "hideZeros": false,
          "mode": "single",
          "sort": "none"
        }
      },
      "pluginVersion": "12.0.0",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "o11y-prometheus"
          },
          "editorMode": "code",
          "expr": "(\n                                      sum by(pod) (jvm_memory_used_bytes{jvm_memory_type=\"heap\"})\n                                      /\n                                      sum by(pod) (jvm_memory_limit_bytes{jvm_memory_type=\"heap\"})\n                                    ) * 100",
          "legendFormat": "__auto",
          "range": true,
          "refId": "A"
        }
      ],
      "title": "Java Memory Usage(%)",
      "type": "timeseries"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "o11y-prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisBorderShow": false,
            "axisCenteredZero": false,
            "axisColorMode": "text",
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "barWidthFactor": 0.6,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "insertNulls": false,
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green"
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 11,
        "x": 0,
        "y": 8
      },
      "id": 4,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom",
          "showLegend": true
        },
        "tooltip": {
          "hideZeros": false,
          "mode": "single",
          "sort": "none"
        }
      },
      "pluginVersion": "12.0.0",
      "targets": [
        {
          "editorMode": "code",
          "expr": "sum by (pod)(avg_over_time(jvm_cpu_recent_utilization_ratio[2m]))",
          "legendFormat": "__auto",
          "range": true,
          "refId": "A"
        }
      ],
      "title": "Average CPU Utilization Ratio(2m)",
      "type": "timeseries"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "o11y-prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisBorderShow": false,
            "axisCenteredZero": false,
            "axisColorMode": "text",
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "barWidthFactor": 0.6,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "insertNulls": false,
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green"
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": [
          {
            "__systemRef": "hideSeriesFrom",
            "matcher": {
              "id": "byNames",
              "options": {
                "mode": "exclude",
                "names": [
                  "otel-spring-example-57d5cc6b88-gwkkr"
                ],
                "prefix": "All except:",
                "readOnly": true
              }
            },
            "properties": [
              {
                "id": "custom.hideFrom",
                "value": {
                  "legend": false,
                  "tooltip": false,
                  "viz": false
                }
              }
            ]
          }
        ]
      },
      "gridPos": {
        "h": 7,
        "w": 11,
        "x": 11,
        "y": 8
      },
      "id": 5,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom",
          "showLegend": true
        },
        "tooltip": {
          "hideZeros": false,
          "mode": "single",
          "sort": "none"
        }
      },
      "pluginVersion": "12.0.0",
      "targets": [
        {
          "editorMode": "code",
          "expr": "sum by (pod)(jvm_thread_count)",
          "legendFormat": "__auto",
          "range": true,
          "refId": "A"
        }
      ],
      "title": "Thread Count per Pod",
      "type": "timeseries"
    }
  ],
  "preload": false,
  "schemaVersion": 41,
  "tags": [],
  "templating": {
    "list": []
  },
  "time": {
    "from": "now-1h",
    "to": "now"
  },
  "timepicker": {},
  "timezone": "browser",
  "title": "JVM Metrics Dashboard",
  "uid": "2549ec10-8ff8-43b3-899b-9b88d0d62b7d",
  "version": 17
}

Figure 2. Grafana Dashboard

Grafana Alerting

Grafana’s unified alerting system enables you to define alert rules and notification channels declaratively.

values.yaml - alerting structure

alerting:
  policies.yaml: {}
  rules.yaml: {}
  contactpoints.yaml: {}
  templates.yaml: {}
  mutetimes.yaml: {}

This guide focuses on contactpoints.yaml and rules.yaml.

Unlike dashboards, alerting resources are typically not managed via ConfigMaps but instead provisioned directly from Helm values.

Contact Points

Contact points define how alerts are delivered to external systems such as email, Slack, or PagerDuty.

Key Capabilities

Define alert notification channels
Configure contact methods for teams or services
Enable alert routing and escalation

Example: Email Contact Point Configuration

The example below shows how to configure a contact point in Grafana using a Helm chart. This configuration will create an email contact point that sends notifications to a specified email address.

values.yaml - contactpoints

alerting:
  contactpoints.yaml:
    apiVersion: 1
    contactPoints:
      - orgId: 1
        name: service-operators
        receivers:
          - uid: cen4b1ckf03cwb
            type: email
            settings:
              addresses: nsalexamy@gmail.com
              singleEmail: false
            disableResolveMessage: false

Figure 3. Grafana ContactPoints

Alert Rules

Alert rules define the logic and thresholds that determine when an alert should be triggered.

Key Capabilities

Monitor specific metrics using PromQL or expressions
Define thresholds for alerting
Trigger actions via contact points

Example: Alert Rules for JVM Memory and CPU Usage

The example below shows how to configure alert rules in Grafana using a Helm chart. This configuration will create alert rules that monitor JVM memory usage and CPU utilization, triggering notifications when thresholds are exceeded.

values.yaml - rules

# in alerting section
  rules.yaml:
    apiVersion: 1
    groups:
      - orgId: 1
        name: java-metrics-evaluation
        folder: java-metrics
        interval: 1m
        rules:
          - uid: een4b7bt3iu4gf
            title: High JVM Memory Usage
            condition: C
            data:
              - refId: A
                relativeTimeRange:
                  from: 600
                  to: 0
                datasourceUid: o11y-prometheus
                model:
                  editorMode: code
                  expr: |-
                    (
                                      sum by(pod) (jvm_memory_used_bytes{jvm_memory_type="heap"})
                                      /
                                      sum by(pod) (jvm_memory_limit_bytes{jvm_memory_type="heap"})
                                    ) * 100
                  instant: true
                  intervalMs: 1000
                  legendFormat: __auto
                  maxDataPoints: 43200
                  range: false
                  refId: A
              - refId: C
                datasourceUid: __expr__
                model:
                  conditions:
                    - evaluator:
                        params:
                          - 80
                        type: gt
                      operator:
                        type: and
                      query:
                        params:
                          - C
                      reducer:
                        params: [ ]
                        type: last
                      type: query
                  datasource:
                    type: __expr__
                    uid: __expr__
                  expression: A
                  intervalMs: 1000
                  maxDataPoints: 43200
                  refId: C
                  type: threshold
            dashboardUid: ""
            panelId: 0
            noDataState: NoData
            execErrState: Error
            for: 1m
            isPaused: false
            notification_settings:
              receiver: service-operators
      - orgId: 1
        name: jvm-cpu-evaluation
        folder: java-metrics
        interval: 30s
        rules:
          - uid: den6wkxi4n75sf
            title: high-jvm-cpu
            condition: C
            data:
              - refId: A
                relativeTimeRange:
                  from: 600
                  to: 0
                datasourceUid: o11y-prometheus
                model:
                  editorMode: code
                  expr: avg_over_time(jvm_cpu_recent_utilization_ratio[2m])
                  instant: true
                  intervalMs: 1000
                  legendFormat: __auto
                  maxDataPoints: 43200
                  range: false
                  refId: A
              - refId: C
                datasourceUid: __expr__
                model:
                  conditions:
                    - evaluator:
                        params:
                          - 0.6
                        type: gt
                      operator:
                        type: and
                      query:
                        params:
                          - C
                      reducer:
                        params: [ ]
                        type: last
                      type: query
                  datasource:
                    type: __expr__
                    uid: __expr__
                  expression: A
                  intervalMs: 1000
                  maxDataPoints: 43200
                  refId: C
                  type: threshold
            noDataState: NoData
            execErrState: Error
            for: 1m
            isPaused: false
            notification_settings:
              receiver: service-operators

Figure 4. Grafana Alert Rules

Conclusion

This guide demonstrated how to provision and manage Grafana resources using Helm charts, covering datasources, dashboard providers, dashboards, contact points, and alert rules. This approach enables repeatable, automated deployment of observability tooling within Kubernetes environments.

To explore advanced configurations, refer to the official Grafana documentation.

You can also view this document in web format at: https://nsalexamy.github.io/service-foundry/pages/documents/o11y-foundry/grafana-resource-provisioning/