Commit 7db42f0a by Totalus Committed by GitHub

Transformations: Adding group by and aggregate on multiple fields transformation

* Adding Occurences transformer

* Adding test for Occurences Transformer

* Cleanup. Adding a test.

* Adding doc

* Modifying UI to support custom calculations options

* Implementing data transformation

* Finalizing calculations implementation

* Cleanup

* Using Fields instead of arrays in data grouping

* Renaming transformation to GroupBy

* Adding some doc

* Apply suggestions (solving TS typing errors)

Co-authored-by: Marcus Andersson <systemvetaren@gmail.com>

* Tweaking UI

* Preventing of selecting twice the same field name.

* Removing console print. No calculations by default.

* Forgot to add the current value to the GroupBy selector

* Solving some typing issues and Prettier errors

* Cleanup

* Updating test

* Ensure proper copy of options (solves some issues)

* Check if the fields exist in the data before processing

* Adding missing import in test file

* If group by field not specified, return all data untouched.

* Adding another missing import in test

* Minor updates

* Implementing GroupBy multiple fields + Improve field typing

* Removing console prints

* Allowing the exact number of fields to be added as aggregation

* Centering remove button icon

* Cleanup

* Correcting TS error

* Changing transformer options structure

* Sorting so GroupBy fields appear on top

* Cleanup

* Simplifying some operations. Adding curly brackets.

* Changing some labels on the UI

* Updating test

* Cleanup

* Updating doc

* Fixed field list. Storing options as Record instead of Array.

* Update test

* Cleaned up the group by editor UI code.

* changed the transform to a table layout instead of a flexbox layout.

* cleaned up group by transformer.

* removed unused imports.

* Added some more tests.

* Added one more test and cleaned up code.

* fixed failing test.

* Fixed so we have the proper casing on naming.

* fixed so we don't wrap on the first row.

Co-authored-by: Marcus Andersson <systemvetaren@gmail.com>
Co-authored-by: Torkel Ödegaard <torkel@grafana.com>
Co-authored-by: Marcus Andersson <marcus.andersson@grafana.com>
parent fe6d399f
......@@ -74,6 +74,7 @@ Grafana comes with the following transformations:
- [Join by field (outer join)](#join-by-field-outer-join)
- [Add field from calculation](#add-field-from-calculation)
- [Labels to fields](#labels-to-fields)
- [Group By](#group-by)
- [Series to rows](#series-to-rows)
- [Debug transformations](#debug-transformations)
......@@ -222,6 +223,67 @@ After I apply the transformation, my labels appear in the table as fields.
{{< docs-imagebox img="/img/docs/transformations/labels-to-fields-after-7-0.png" class="docs-image--no-shadow" max-width= "1100px" >}}
### Group By
This transformation groups the data by the values of one or more specified fields (columns) and runs calculations on each group. The available calculations are the same as those in the Reduce transformation.
Here's an example of original data.
| Time | Server ID | CPU Temperature | Server Status
|---------------------|-------------|-----------------|----------
| 2020-07-07 11:34:20 | server 1 | 80 | Shutdown
| 2020-07-07 11:34:20 | server 3 | 62 | OK
| 2020-07-07 10:32:20 | server 2 | 90 | Overload
| 2020-07-07 10:31:22 | server 3 | 55 | OK
| 2020-07-07 09:30:57 | server 3 | 62 | Rebooting
| 2020-07-07 09:30:05 | server 2 | 88 | OK
| 2020-07-07 09:28:06 | server 1 | 80 | OK
| 2020-07-07 09:25:05 | server 2 | 88 | OK
| 2020-07-07 09:23:07 | server 1 | 86 | OK
This transformation works in two steps. First, you specify one or more fields to group the data by. Rows with the same values in those fields are grouped together, as if you had sorted them. For instance, if we `Group By` the `Server ID` field, the data is grouped this way:
| Time | Server ID | CPU Temperature | Server Status
|---------------------|-------------|-----------------|----------
| 2020-07-07 11:34:20 | **server 1** | 80 | Shutdown
| 2020-07-07 09:28:06 | **server 1** | 80 | OK
| 2020-07-07 09:23:07 | **server 1** | 86 | OK
|
| 2020-07-07 10:32:20 | server 2 | 90 | Overload
| 2020-07-07 09:30:05 | server 2 | 88 | OK
| 2020-07-07 09:25:05 | server 2 | 88 | OK
|
| 2020-07-07 11:34:20 | ***server 3*** | 62 | OK
| 2020-07-07 10:31:22 | ***server 3*** | 55 | OK
| 2020-07-07 09:30:57 | ***server 3*** | 62 | Rebooting
All rows with the same value of `Server ID` are grouped together.
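As a rough illustration of this first step, the following sketch groups rows by one or more fields using a composite key. It is not Grafana's implementation: the `Row` type, the `groupRows` helper, and the property names are hypothetical and exist only for this example. The key is built by stringifying the array of group-by values, similar in spirit to how the transformer in this commit builds its group key.

```typescript
// Hypothetical row shape for this example; not a Grafana type.
type Row = Record<string, string | number>;

// Groups rows by the values of the given fields, preserving first-seen order.
function groupRows(rows: Row[], groupByFields: string[]): Map<string, Row[]> {
  const groups = new Map<string, Row[]>();
  for (const row of rows) {
    // String([...]) joins the values with commas, e.g. "server 1,OK".
    const key = String(groupByFields.map((name) => row[name]));
    const group = groups.get(key);
    if (group) {
      group.push(row);
    } else {
      groups.set(key, [row]);
    }
  }
  return groups;
}

// A subset of the example table above.
const rows: Row[] = [
  { serverId: 'server 1', temperature: 80, status: 'Shutdown' },
  { serverId: 'server 3', temperature: 62, status: 'OK' },
  { serverId: 'server 2', temperature: 90, status: 'Overload' },
  { serverId: 'server 3', temperature: 55, status: 'OK' },
  { serverId: 'server 1', temperature: 80, status: 'OK' },
];

// One group per server.
const byServer = groupRows(rows, ['serverId']);
// One group per distinct (server, status) pair.
const byServerAndStatus = groupRows(rows, ['serverId', 'status']);
```

A comma-joined key can collide when the values themselves contain commas (for example, `['a,b', 'c']` and `['a', 'b,c']` both stringify to `a,b,c`), so a key like this suits simple values only.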
After choosing which field you want to group your data by, you can add calculations on the other fields, and each calculation is applied to each group of rows. For instance, we might want to calculate the average `CPU Temperature` for each of those servers. To do so, we add the _mean_ calculation on the `CPU Temperature` field to get the following:
| Server ID | CPU Temperature (mean)
|-----------|--------------------------
| server 1 | 82
| server 2 | 88.7
| server 3 | 59.7
We can add more than one calculation. For instance:
- For field `Time`, we can calculate the *Last* value, to know when the last data point was received for each server
- For field `Server Status`, we can calculate the *Last* value to know the last state value for each server
- For field `CPU Temperature`, we can also calculate the *Last* value to know the latest monitored temperature for each server
We would then get:
| Server ID | CPU Temperature (mean) | CPU Temperature (last) | Time (last) | Server Status (last)
|-----------|-------------------------- |------------------------|------------------|----------------------
| server 1 | 82 | 80 | 2020-07-07 11:34:20 | Shutdown
| server 2 | 88.7 | 90 | 2020-07-07 10:32:20 | Overload
| server 3 | 59.7 | 62 | 2020-07-07 11:34:20 | OK
This transformation allows you to extract key information from your time series and display it in a convenient way.
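The two steps described above can be sketched in plain TypeScript. This is a minimal illustration under stated assumptions, not the transformer's actual code: the `Reading` type and the `mean` helper are hypothetical, and "last" is interpreted here as the reading with the greatest timestamp, which is a choice made for this example only.

```typescript
// Hypothetical shape for one row of the example table; not a Grafana type.
interface Reading {
  time: string;
  serverId: string;
  temperature: number;
  status: string;
}

const readings: Reading[] = [
  { time: '2020-07-07 11:34:20', serverId: 'server 1', temperature: 80, status: 'Shutdown' },
  { time: '2020-07-07 11:34:20', serverId: 'server 3', temperature: 62, status: 'OK' },
  { time: '2020-07-07 10:32:20', serverId: 'server 2', temperature: 90, status: 'Overload' },
  { time: '2020-07-07 10:31:22', serverId: 'server 3', temperature: 55, status: 'OK' },
  { time: '2020-07-07 09:30:57', serverId: 'server 3', temperature: 62, status: 'Rebooting' },
  { time: '2020-07-07 09:30:05', serverId: 'server 2', temperature: 88, status: 'OK' },
  { time: '2020-07-07 09:28:06', serverId: 'server 1', temperature: 80, status: 'OK' },
  { time: '2020-07-07 09:25:05', serverId: 'server 2', temperature: 88, status: 'OK' },
  { time: '2020-07-07 09:23:07', serverId: 'server 1', temperature: 86, status: 'OK' },
];

// Arithmetic mean of a non-empty list of numbers.
const mean = (values: number[]): number => values.reduce((sum, v) => sum + v, 0) / values.length;

// Step 1: group rows by Server ID, keeping first-seen order.
const groups = new Map<string, Reading[]>();
for (const reading of readings) {
  const group = groups.get(reading.serverId) ?? [];
  group.push(reading);
  groups.set(reading.serverId, group);
}

// Step 2: compute one summary row per group.
const summary = Array.from(groups, ([serverId, group]) => {
  // Assumption: "last" is the row with the greatest timestamp. The fixed-width
  // "YYYY-MM-DD HH:MM:SS" format makes plain string comparison sufficient.
  const latest = group.reduce((a, b) => (b.time > a.time ? b : a));
  return {
    serverId,
    temperatureMean: mean(group.map((r) => r.temperature)),
    temperatureLast: latest.temperature,
    timeLast: latest.time,
    statusLast: latest.status,
  };
});

console.log(summary);
```

Each entry in `summary` corresponds to one row of the final table, with the group-by field first and one property per calculation.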
## Series to rows
> **Note:** This documentation refers to a Grafana 7.1 feature.
......
......@@ -12,6 +12,7 @@ import { seriesToRowsTransformer } from './transformers/seriesToRows';
import { renameFieldsTransformer } from './transformers/rename';
import { labelsToFieldsTransformer } from './transformers/labelsToFields';
import { ensureColumnsTransformer } from './transformers/ensureColumns';
import { groupByTransformer } from './transformers/groupBy';
import { mergeTransformer } from './transformers/merge';
export const standardTransformers = {
......@@ -30,5 +31,6 @@ export const standardTransformers = {
renameFieldsTransformer,
labelsToFieldsTransformer,
ensureColumnsTransformer,
groupByTransformer,
mergeTransformer,
};
import { toDataFrame } from '../../dataframe/processDataFrame';
import { groupByTransformer, GroupByTransformerOptions, GroupByOperationID } from './groupBy';
import { mockTransformationsRegistry } from '../../utils/tests/mockTransformationsRegistry';
import { transformDataFrame } from '../transformDataFrame';
import { Field, FieldType } from '../../types';
import { DataTransformerID } from './ids';
import { ArrayVector } from '../../vector';
import { ReducerID } from '../fieldReducer';
import { DataTransformerConfig } from '@grafana/data';
describe('GroupBy transformer', () => {
beforeAll(() => {
mockTransformationsRegistry([groupByTransformer]);
});
it('should not apply transformation if config is missing group by fields', () => {
const testSeries = toDataFrame({
name: 'A',
fields: [
{ name: 'time', type: FieldType.time, values: [3000, 4000, 5000, 6000, 7000, 8000] },
{ name: 'message', type: FieldType.string, values: ['one', 'two', 'two', 'three', 'three', 'three'] },
{ name: 'values', type: FieldType.string, values: [1, 2, 2, 3, 3, 3] },
],
});
const cfg: DataTransformerConfig<GroupByTransformerOptions> = {
id: DataTransformerID.groupBy,
options: {
fields: {
message: {
operation: GroupByOperationID.aggregate,
aggregations: [ReducerID.count],
},
},
},
};
const result = transformDataFrame([cfg], [testSeries]);
expect(result[0]).toBe(testSeries);
});
it('should group values by message', () => {
const testSeries = toDataFrame({
name: 'A',
fields: [
{ name: 'time', type: FieldType.time, values: [3000, 4000, 5000, 6000, 7000, 8000] },
{ name: 'message', type: FieldType.string, values: ['one', 'two', 'two', 'three', 'three', 'three'] },
{ name: 'values', type: FieldType.string, values: [1, 2, 2, 3, 3, 3] },
],
});
const cfg: DataTransformerConfig<GroupByTransformerOptions> = {
id: DataTransformerID.groupBy,
options: {
fields: {
message: {
operation: GroupByOperationID.groupBy,
aggregations: [],
},
},
},
};
const result = transformDataFrame([cfg], [testSeries]);
const expected: Field[] = [
{
name: 'message',
type: FieldType.string,
values: new ArrayVector(['one', 'two', 'three']),
config: {},
},
];
expect(result[0].fields).toEqual(expected);
});
it('should group values by message and summarize values', () => {
const testSeries = toDataFrame({
name: 'A',
fields: [
{ name: 'time', type: FieldType.time, values: [3000, 4000, 5000, 6000, 7000, 8000] },
{ name: 'message', type: FieldType.string, values: ['one', 'two', 'two', 'three', 'three', 'three'] },
{ name: 'values', type: FieldType.string, values: [1, 2, 2, 3, 3, 3] },
],
});
const cfg: DataTransformerConfig<GroupByTransformerOptions> = {
id: DataTransformerID.groupBy,
options: {
fields: {
message: {
operation: GroupByOperationID.groupBy,
aggregations: [],
},
values: {
operation: GroupByOperationID.aggregate,
aggregations: [ReducerID.sum],
},
},
},
};
const result = transformDataFrame([cfg], [testSeries]);
const expected: Field[] = [
{
name: 'message',
type: FieldType.string,
values: new ArrayVector(['one', 'two', 'three']),
config: {},
},
{
name: 'values (sum)',
type: FieldType.number,
values: new ArrayVector([1, 4, 9]),
config: {},
},
];
expect(result[0].fields).toEqual(expected);
});
it('should group by and compute a few calculations for each group of values', () => {
const testSeries = toDataFrame({
name: 'A',
fields: [
{ name: 'time', type: FieldType.time, values: [3000, 4000, 5000, 6000, 7000, 8000] },
{ name: 'message', type: FieldType.string, values: ['one', 'two', 'two', 'three', 'three', 'three'] },
{ name: 'values', type: FieldType.string, values: [1, 2, 2, 3, 3, 3] },
],
});
const cfg: DataTransformerConfig<GroupByTransformerOptions> = {
id: DataTransformerID.groupBy,
options: {
fields: {
message: {
operation: GroupByOperationID.groupBy,
aggregations: [],
},
time: {
operation: GroupByOperationID.aggregate,
aggregations: [ReducerID.count, ReducerID.last],
},
values: {
operation: GroupByOperationID.aggregate,
aggregations: [ReducerID.sum],
},
},
},
};
const result = transformDataFrame([cfg], [testSeries]);
const expected: Field[] = [
{
name: 'message',
type: FieldType.string,
values: new ArrayVector(['one', 'two', 'three']),
config: {},
},
{
name: 'time (count)',
type: FieldType.number,
values: new ArrayVector([1, 2, 3]),
config: {},
},
{
name: 'time (last)',
type: FieldType.time,
values: new ArrayVector([3000, 5000, 8000]),
config: {},
},
{
name: 'values (sum)',
type: FieldType.number,
values: new ArrayVector([1, 4, 9]),
config: {},
},
];
expect(result[0].fields).toEqual(expected);
});
it('should group values in data frames individually', () => {
const testSeries = [
toDataFrame({
name: 'A',
fields: [
{ name: 'time', type: FieldType.time, values: [3000, 4000, 5000, 6000, 7000, 8000] },
{ name: 'message', type: FieldType.string, values: ['one', 'two', 'two', 'three', 'three', 'three'] },
{ name: 'values', type: FieldType.string, values: [1, 2, 2, 3, 3, 3] },
],
}),
toDataFrame({
name: 'B',
fields: [
{ name: 'time', type: FieldType.time, values: [3000, 4000, 5000, 6000, 7000, 8000] },
{ name: 'message', type: FieldType.string, values: ['one', 'two', 'two', 'three', 'three', 'three'] },
{ name: 'values', type: FieldType.string, values: [0, 2, 5, 3, 3, 2] },
],
}),
];
const cfg: DataTransformerConfig<GroupByTransformerOptions> = {
id: DataTransformerID.groupBy,
options: {
fields: {
message: {
operation: GroupByOperationID.groupBy,
aggregations: [],
},
values: {
operation: GroupByOperationID.aggregate,
aggregations: [ReducerID.sum],
},
},
},
};
const result = transformDataFrame([cfg], testSeries);
const expectedA: Field[] = [
{
name: 'message',
type: FieldType.string,
values: new ArrayVector(['one', 'two', 'three']),
config: {},
},
{
name: 'values (sum)',
type: FieldType.number,
values: new ArrayVector([1, 4, 9]),
config: {},
},
];
const expectedB: Field[] = [
{
name: 'message',
type: FieldType.string,
values: new ArrayVector(['one', 'two', 'three']),
config: {},
},
{
name: 'values (sum)',
type: FieldType.number,
values: new ArrayVector([0, 7, 8]),
config: {},
},
];
expect(result[0].fields).toEqual(expectedA);
expect(result[1].fields).toEqual(expectedB);
});
});
import { DataTransformerID } from './ids';
import { DataFrame, FieldType, Field } from '../../types/dataFrame';
import { DataTransformerInfo } from '../../types/transformations';
import { getFieldDisplayName } from '../../field/fieldState';
import { ArrayVector } from '../../vector/ArrayVector';
import { guessFieldTypeForField } from '../../dataframe/processDataFrame';
import { reduceField, ReducerID } from '../fieldReducer';
import { MutableField } from '../../dataframe/MutableDataFrame';
export enum GroupByOperationID {
aggregate = 'aggregate',
groupBy = 'groupby',
}
export interface GroupByFieldOptions {
aggregations: ReducerID[];
operation: GroupByOperationID | null;
}
export interface GroupByTransformerOptions {
fields: Record<string, GroupByFieldOptions>;
}
export const groupByTransformer: DataTransformerInfo<GroupByTransformerOptions> = {
id: DataTransformerID.groupBy,
name: 'Group by',
description: 'Group the data by field values, then process calculations for each group',
defaultOptions: {
fields: {},
},
/**
* Return a modified copy of the series. If the transform is not or should not
* be applied, just return the input series
*/
transformer: (options: GroupByTransformerOptions) => {
const hasValidConfig = Object.keys(options.fields).find(
name => options.fields[name].operation === GroupByOperationID.groupBy
);
return (data: DataFrame[]) => {
if (!hasValidConfig) {
return data;
}
const processed: DataFrame[] = [];
for (const frame of data) {
const groupByFields: Field[] = [];
for (const field of frame.fields) {
if (shouldGroupOnField(field, options)) {
groupByFields.push(field);
}
}
if (groupByFields.length === 0) {
continue; // No group by field in this frame, ignore the frame
}
// Group the values by fields and groups so we can get all values for a
// group for a given field.
const valuesByGroupKey: Record<string, Record<string, MutableField>> = {};
for (let rowIndex = 0; rowIndex < frame.length; rowIndex++) {
const groupKey = String(groupByFields.map(field => field.values.get(rowIndex)));
const valuesByField = valuesByGroupKey[groupKey] ?? {};
if (!valuesByGroupKey[groupKey]) {
valuesByGroupKey[groupKey] = valuesByField;
}
for (const field of frame.fields) {
const fieldName = getFieldDisplayName(field);
if (!valuesByField[fieldName]) {
valuesByField[fieldName] = {
name: fieldName,
type: field.type,
config: { ...field.config },
values: new ArrayVector(),
};
}
valuesByField[fieldName].values.add(field.values.get(rowIndex));
}
}
const fields: Field[] = [];
const groupKeys = Object.keys(valuesByGroupKey);
for (const field of groupByFields) {
const values = new ArrayVector();
const fieldName = getFieldDisplayName(field);
for (const key of groupKeys) {
const valuesByField = valuesByGroupKey[key];
values.add(valuesByField[fieldName].values.get(0));
}
fields.push({
name: field.name,
type: field.type,
config: {
...field.config,
},
values: values,
});
}
// Then, for each configured calculation, compute and add a new field (column)
for (const field of frame.fields) {
if (!shouldCalculateField(field, options)) {
continue;
}
const fieldName = getFieldDisplayName(field);
const aggregations = options.fields[fieldName].aggregations;
const valuesByAggregation: Record<string, any[]> = {};
for (const groupKey of groupKeys) {
const fieldWithValuesForGroup = valuesByGroupKey[groupKey][fieldName];
const results = reduceField({
field: fieldWithValuesForGroup,
reducers: aggregations,
});
for (const aggregation of aggregations) {
if (!Array.isArray(valuesByAggregation[aggregation])) {
valuesByAggregation[aggregation] = [];
}
valuesByAggregation[aggregation].push(results[aggregation]);
}
}
for (const aggregation of aggregations) {
const aggregationField: Field = {
name: `${fieldName} (${aggregation})`,
values: new ArrayVector(valuesByAggregation[aggregation]),
type: FieldType.other,
config: {},
};
aggregationField.type = detectFieldType(aggregation, field, aggregationField);
fields.push(aggregationField);
}
}
processed.push({
fields,
length: groupKeys.length,
});
}
return processed;
};
},
};
const shouldGroupOnField = (field: Field, options: GroupByTransformerOptions): boolean => {
const fieldName = getFieldDisplayName(field);
return options?.fields[fieldName]?.operation === GroupByOperationID.groupBy;
};
const shouldCalculateField = (field: Field, options: GroupByTransformerOptions): boolean => {
const fieldName = getFieldDisplayName(field);
return (
options?.fields[fieldName]?.operation === GroupByOperationID.aggregate &&
Array.isArray(options?.fields[fieldName].aggregations) &&
options?.fields[fieldName].aggregations.length > 0
);
};
const detectFieldType = (aggregation: string, sourceField: Field, targetField: Field): FieldType => {
switch (aggregation) {
case ReducerID.allIsNull:
return FieldType.boolean;
case ReducerID.last:
case ReducerID.lastNotNull:
case ReducerID.first:
case ReducerID.firstNotNull:
return sourceField.type;
default:
return guessFieldTypeForField(targetField) ?? FieldType.string;
}
};
......@@ -17,4 +17,5 @@ export enum DataTransformerID {
filterByRefId = 'filterByRefId',
noop = 'noop',
ensureColumns = 'ensureColumns',
groupBy = 'groupBy',
}
......@@ -94,6 +94,7 @@ export const getSelectStyles = stylesFactory((theme: GrafanaTheme) => {
`,
multiValueRemove: css`
margin: 0 ${theme.spacing.xs};
cursor: pointer;
`,
};
});
import React, { useMemo, useCallback } from 'react';
import { css, cx } from 'emotion';
import {
DataTransformerID,
standardTransformers,
TransformerRegistyItem,
TransformerUIProps,
ReducerID,
SelectableValue,
} from '@grafana/data';
import { getAllFieldNamesFromDataFrames } from './OrganizeFieldsTransformerEditor';
import { Select, StatsPicker, stylesFactory } from '@grafana/ui';
import {
GroupByTransformerOptions,
GroupByOperationID,
GroupByFieldOptions,
} from '@grafana/data/src/transformations/transformers/groupBy';
interface FieldProps {
fieldName: string;
config?: GroupByFieldOptions;
onConfigChange: (config: GroupByFieldOptions) => void;
}
export const GroupByTransformerEditor: React.FC<TransformerUIProps<GroupByTransformerOptions>> = ({
input,
options,
onChange,
}) => {
const fieldNames = useMemo(() => getAllFieldNamesFromDataFrames(input), [input]);
const onConfigChange = useCallback(
(fieldName: string) => (config: GroupByFieldOptions) => {
onChange({
...options,
fields: {
...options.fields,
[fieldName]: config,
},
});
},
[options]
);
return (
<div>
{fieldNames.map((key: string) => (
<GroupByFieldConfiguration
onConfigChange={onConfigChange(key)}
fieldName={key}
config={options.fields[key]}
key={key}
/>
))}
</div>
);
};
const options = [
{ label: 'Group by', value: GroupByOperationID.groupBy },
{ label: 'Calculate', value: GroupByOperationID.aggregate },
];
export const GroupByFieldConfiguration: React.FC<FieldProps> = ({ fieldName, config, onConfigChange }) => {
const styles = getStyling();
const onChange = useCallback(
(value: SelectableValue<GroupByOperationID | null>) => {
onConfigChange({
aggregations: config?.aggregations ?? [],
operation: value?.value ?? null,
});
},
[config, onConfigChange]
);
return (
<div className={cx('gf-form-inline', styles.row)}>
<div className={cx('gf-form', styles.fieldName)}>
<div className={cx('gf-form-label', styles.rowSpacing)}>{fieldName}</div>
</div>
<div className={cx('gf-form', styles.cell)}>
<div className={cx('gf-form-spacing', styles.rowSpacing)}>
<Select
className="width-12"
options={options}
value={config?.operation}
placeholder="Ignored"
onChange={onChange}
isClearable
menuPlacement="bottom"
/>
</div>
</div>
{config?.operation === GroupByOperationID.aggregate && (
<div className={cx('gf-form', 'gf-form--grow', styles.calculations)}>
<StatsPicker
className={cx('flex-grow-1', styles.rowSpacing)}
placeholder="Select Stats"
allowMultiple
stats={config.aggregations}
onChange={stats => {
onConfigChange({ ...config, aggregations: stats as ReducerID[] });
}}
menuPlacement="bottom"
/>
</div>
)}
</div>
);
};
const getStyling = stylesFactory(() => {
const cell = css`
display: table-cell;
`;
return {
row: css`
display: table-row;
`,
cell: cell,
rowSpacing: css`
margin-bottom: 4px;
`,
fieldName: css`
${cell}
min-width: 250px;
white-space: nowrap;
`,
calculations: css`
${cell}
width: 99%;
`,
};
});
export const groupByTransformRegistryItem: TransformerRegistyItem<GroupByTransformerOptions> = {
id: DataTransformerID.groupBy,
editor: GroupByTransformerEditor,
transformation: standardTransformers.groupByTransformer,
name: standardTransformers.groupByTransformer.name,
description: standardTransformers.groupByTransformer.description,
};
......@@ -6,6 +6,7 @@ import { organizeFieldsTransformRegistryItem } from '../components/TransformersU
import { seriesToFieldsTransformerRegistryItem } from '../components/TransformersUI/SeriesToFieldsTransformerEditor';
import { calculateFieldTransformRegistryItem } from '../components/TransformersUI/CalculateFieldTransformerEditor';
import { labelsToFieldsTransformerRegistryItem } from '../components/TransformersUI/LabelsToFieldsTransformerEditor';
import { groupByTransformRegistryItem } from '../components/TransformersUI/GroupByTransformerEditor';
import { mergeTransformerRegistryItem } from '../components/TransformersUI/MergeTransformerEditor';
import { seriesToRowsTransformerRegistryItem } from '../components/TransformersUI/SeriesToRowsTransformerEditor';
......@@ -19,6 +20,7 @@ export const getStandardTransformers = (): Array<TransformerRegistyItem<any>> =>
seriesToRowsTransformerRegistryItem,
calculateFieldTransformRegistryItem,
labelsToFieldsTransformerRegistryItem,
groupByTransformRegistryItem,
mergeTransformerRegistryItem,
];
};