Commit 1e8b10ef by Diana Payton Committed by GitHub

Docs: Add best practices (#27057)

* added content

* adding content

* Update best-practices-for-creating-dashboards.md

* moved strategies.md

* updating content

* Update best-practices-for-creating-dashboards.md

* Update best-practices-for-managing-dashboards.md

* Update best-practices-for-creating-dashboards.md

* Update best-practices-for-managing-dashboards.md

* Update best-practices-for-creating-dashboards.md

* content updates

* Update common-observability-strategies.md

* Update best-practices-for-managing-dashboards.md

* Update best-practices-for-managing-dashboards.md

* Update dashboard-management-maturity-levels.md

* updated menu

* Update dashboard-management-maturity-levels.md

* added Torkel tips

* Update best-practices-for-creating-dashboards.md

* Update docs/sources/best-practices/best-practices-for-creating-dashboards.md

Co-authored-by: Peter Holmberg <peterholmberg@users.noreply.github.com>

* Update docs/sources/best-practices/best-practices-for-managing-dashboards.md

Co-authored-by: Peter Holmberg <peterholmberg@users.noreply.github.com>

* Update docs/sources/best-practices/best-practices-for-managing-dashboards.md

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Update docs/sources/best-practices/dashboard-management-maturity-levels.md

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Update docs/sources/best-practices/dashboard-management-maturity-levels.md

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Update best-practices-for-creating-dashboards.md

* DavKal updates

* Update best-practices-for-creating-dashboards.md

Co-authored-by: Peter Holmberg <peterholmberg@users.noreply.github.com>
Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
parent 66c6b20c
+++
title = "Best practices"
description = "Best practices for working with Grafana"
type = "docs"
[menu.docs]
weight = 200
+++
# Best practices
This section provides information about best practices for intermediate Grafana administrators and users.
+++
title = "Best practices for creating dashboards"
description = "Best practices for creating dashboards in Grafana"
type = "docs"
[menu.docs]
weight = 200
+++
# Best practices for creating dashboards
This page outlines some best practices to follow when creating Grafana dashboards.
## Before you begin
Here are some principles to consider before you create a dashboard.
### A dashboard should tell a story or answer a question
What story are you trying to tell with your dashboard? Try to create a logical progression of data, such as large to small or general to specific. What is the goal for this dashboard? (Hint: If the dashboard doesn't have a goal, then ask yourself if you really need the dashboard.)
Keep your graphs simple and focused on answering the question that you are asking. For example, if your question is "which servers are in trouble?", then maybe you don't need to show all the server data. Just show data for the ones in trouble.
### Dashboards should reduce cognitive load, not add to it
_Cognitive load_ is basically how hard you need to think about something in order to figure it out. Make your dashboard easy to interpret. Other users and future you (when you're trying to figure out what broke at 2AM) will appreciate it.
Ask yourself:
- Can I tell what exactly each graph represents? Is it obvious, or do I have to think about it?
- If I show this to someone else, how long will it take them to figure it out? Will they get lost?
### Have a monitoring strategy
It's easy to make new dashboards. It's harder to optimize dashboard creation and adhere to a plan, but it's worth it. This strategy should govern both your overall dashboard scheme and enforce consistency in individual dashboard design.
Refer to [Common observability strategies]({{< relref "common-observability-strategies.md" >}}) and [Dashboard management maturity levels]({{< relref "dashboard-management-maturity-levels.md" >}}) for more information.
### Write it down
Once you have a strategy or design guidelines, write them down to help maintain consistency over time. Check out this [Wikimedia runbook example](https://wikitech.wikimedia.org/wiki/Performance/Runbook/Grafana_best_practices).
## Best practices to follow
- When creating a new dashboard, make sure it has a meaningful name.
- If you are creating a dashboard to play or experiment, then put the word `TEST` or `TMP` in the name.
- Consider including your name or initials in the dashboard name or as a tag so that people know who owns the dashboard.
- Remove temporary experiment dashboards when you are done with them.
- If you create many related dashboards, think about how to cross-reference them for easy navigation. Refer to [Best practices for managing dashboards]({{< relref "best-practices-for-managing-dashboards.md" >}}) for more information.
- Grafana retrieves data from a data source. A basic understanding of [data sources]({{< relref "../features/datasources/data-sources.md" >}}) in general and your specific is important.
- Avoid unnecessary dashboard refreshing to reduce the load on the network or backend. For example, if your data changes every hour, then you don't need to set the dashboard refresh rate to 30 seconds.
- Use the left and right Y-axes when displaying time series with different units or ranges.
- Add documentation to dashboards and panels.
- To add documentation to a dashboard, add a [Text panel visualization]({{< relref "../panels/visualizations/text-panel.md" >}}) to the dashboard. Record things like the purpose of the dashboard, useful resource links, and any instructions users might need to interact with the dashboard. Check out this [Wikimedia example](https://grafana.wikimedia.org/d/000000066/resourceloader?orgId=1).
- To add documentation to a panel, [edit the panel settings]({{< relref "../panels/add-a-panel.md#edit-panel-settings" >}}) and add a description. Any text you add will appear if you hover your cursor over the small `i` in the top left corner of the panel.
- Reuse your dashboards and enforce consistency by using [templates and variables]({{< relref "../variables/templates-and-variables.md" >}}).
- Be careful with stacking graph data. The visualizations can be misleading, and hide important data. We recommend turning it off in most cases.
+++
title = "Best practices for managing dashboards"
description = "Best practices for managing dashboards in Grafana"
type = "docs"
[menu.docs]
weight = 200
+++
# Best practices for managing dashboards
This page outlines some best practices to follow when managing Grafana dashboards.
## Before you begin
Here are some principles to consider before you start managing dashboards.
### Strategic observability
There are several [common observability strategies]({{< relref "common-observability-strategies.md" >}}). You should research them and decide whether one of them works for you or if you want to come up with your own. Either way, have a plan, write it down, and stick to it.
Adapt your strategy to changing needs as necessary.
### Maturity level
What is your dashboard maturity level? Analyze your current dashboard setup and compare it to the [Dashboard management maturity levels]({{< relref "dashboard-management-maturity-levels.md" >}}). Understanding where you are can help you decide how to get to where you want to be.
## Best practices to follow
- Avoid dashboard sprawl, meaning the uncontrolled growth of dashboards. Dashboard sprawl negatively affects time to find the right dashboard. Duplicating dashboards and changing “one thing” (worse: keeping original tags) is the easiest kind of sprawl.
- Periodically review the dashboards and remove unnecessary ones.
- If you create a temporary dashboard, perhaps to test something, prefix the name with `TEST: `. Delete the dashboard when you are finished.
- Copying dashboards with no significant changes is not a good idea.
- You miss out on updates to the original dashboard, such as documentation changes, bug fixes, or additions to metrics.
- In many cases copies are being made to simply customize the view by setting template parameters. This should instead be done by maintaining a link to the master dashboard and customizing the view with [URL parameters]({{< relref "../variables/variable-types/url-variables.md" >}}).
- When you must copy a dashboard, clearly rename it and _do not_ copy the dashboard tags. Tags are important metadata for dashboards that are used during search. Copying tags can result in false matches.
- Maintain a dashboard of dashboards or cross-reference dashboards. This can be done in several ways:
- Create dashboard links, panel, or data links. Links can go to other dashboards or to external systems. For more information, refer to [Linking]({{< relref "../linking/_index.md" >}}).
- Add a [Dashboard list panel]({{< relref "../panels/visualizations/dashboard-list-panel.md" >}}). You can then customize what you see by doing tag or folder searches.
- Add a [Text panel]({{< relref "../panels/visualizations/text-panel.md" >}}) and use markdown to customize the display.
+++
title = "Monitoring strategies"
description = "Common monitoring strategies"
title = "Common observability strategies"
description = "Common observability strategies"
keywords = ["grafana", "intro", "guide", "concepts", "methods"]
aliases = ["/docs/grafana/latest/getting-started/strategies/"]
type = "docs"
[menu.docs]
weight = 500
......@@ -52,3 +53,5 @@ This method is similar to the RED method, but it includes saturation.
- **Traffic -** How much demand is placed on your system
- **Errors -** Rate of requests that are failing
- **Saturation -** How "full" your system is
[Here's an example from Grafana Play](https://play.grafana.org/d/000000109/the-four-golden-signals?orgId=1).
+++
title = "Dashboard management maturity model"
description = "Explanation of dashboard management maturity model"
type = "docs"
[menu.docs]
weight = 200
+++
# Dashboard management maturity model
_Dashboard management maturity_ refers to how well-designed and efficient your dashboard ecosystem is. We recommend periodically reviewing your dashboard setup to gauge where you are and how you can improve.
Broadly speaking, dashboard maturity can be defined as low, medium, or high.
Much of the content for this topic was taken from the KubeCon 2019 talk [Fool-Proof Kubernetes Dashboards for Sleep-Deprived Oncalls](https://www.youtube.com/watch?v=YE2aQFiMGfY).
## Low - default state
At this stage, you have no coherent dashboard management strategy. Almost everyone starts here.
How can you tell you are here?
- Everyone can modify your dashboards.
- Lots of copied dashboards, little to no dashboard reuse.
- One-off dashboards that hang around forever.
- No version control (dashboard JSON in version control).
- Lots of browsing for dashboards, searching for the right dashboard. This means lots of wasted time trying to find the dashboard you need.
- Not having any alerts to direct you to the right dashboard.
## Medium - methodical dashboards
At this stage, you are starting to manage your dashboard use with methodical dashboards. You might have laid out a strategy, but there are some things you could improve.
How can you tell you are here?
- Prevent sprawl by using template variables. For example, you don't need a separate dashboard for each node, you can use query variables. Even better, you can make the data source a template variable too, so you can reuse the same dashboard across different clusters and monitoring backends.
Refer to the list of [Variable examples]({{< relref "../variables/variable-examples.md" >}}) if you want some ideas.
- Methodical dashboards according to an [observability strategy]({{< relref "common-observability-strategies.md" >}}).
- Hierarchical dashboards with drill-downs to the next level.
{{< imgbox max-width="100%" img="/img/docs/best-practices/drill-down-example.png" caption="Example of using drill-down" >}}
- Dashboard design reflects service hierarchies. The example shown below uses the RED method (request and error rate on the left, latency duration on the right) with one row per service. The row order reflects the data flow.
{{< imgbox max-width="100%" img="/img/docs/best-practices/service-hierarchy-example.png" caption="Example of a service hierarchy" >}}
- Compare like to like: split service dashboards when the magnitude differs. Make sure aggregated metrics don't drown out important information.
- Expressive charts with meaningful use of color and normalizing axes where you can.
- Example of meaningful color: Blue means it's good, red means it's bad. [Thresholds]({{< relref "../panels/thresholds.md" >}}) can help with that.
- Example of normalizing axes: When comparing CPU usage, measure by percentage rather than raw number, because machines can have a different number of cores. Normalizing CPU usage by the number of cores reduces cognitive load because the viewer can trust that at 100% all cores are being used, without having to know the number of CPUs.
- Directed browsing cuts down on "guessing."
- Template variables make it harder to “just browse” randomly or aimlessly.
- Most dashboards should be linked to by alerts.
- Browsing is directed with links. For more information, refer to [Linking]({{< relref "../linking/_index.md" >}}).
- Version-controlled dashboard JSON.
## High - optimized use
At this stage, you have optimized your dashboard management use with a consistent and thoughtful strategy. It requires maintenance, but the results are worth it.
- Actively reducing sprawl.
- Regularly review existing dashboards to make sure they are still relevant.
- Only approved dashboards added to master dashboard list.
- Tracking dashboard use. If you're an Enterprise user, you might take advantage of [Usage insights]({{< relref "../enterprise/usage-insights.md" >}}).
- Consistency by design.
- Use scripting libraries to generate dashboards, ensure consistency in pattern and style.
- grafonnet (Jsonnet)
- grafanalib (Python)
- No editing in the browser. Dashboard viewers change views with variables.
- Browsing for dashboards is the exception, not the rule.
- Perform experimentation and testing in a separate Grafana instance dedicated to that purpose, not your production instance. When a dashboard in the test environment is proven useful, then add that dashboard to your main Grafana instance.
......@@ -9,4 +9,4 @@ weight = 100
# Getting started
These guides will help you get started and acquainted with Grafana. To learn more about Grafana in general, refer to [What is Grafana?]({{< relref "what-is-grafana.md" >}}).
These guides will help beginners get started and acquainted with Grafana. To learn more about Grafana in general, refer to [What is Grafana?]({{< relref "what-is-grafana.md" >}}).
......@@ -15,6 +15,17 @@
link: /getting-started/strategies/
- name: Glossary
link: /getting-started/glossary/
- name: Best practices
link: /best-practices/
children:
- name: Best practices for creating dashboards
link: /best-practices/best-practices-for-creating-dashboards/
- name: Best practices for managing dashboards
link: /best-practices/best-practices-for-managing-dashboards/
- name: Observability strategies
link: /best-practices/common-observability-strategies/
- name: Dashboard management maturity model
link: /best-practices/dashboard-management-maturity-model/
- name: Installation
link: /installation/
children:
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment