Group By
Introduction
Functions
Group By combines the original data into groups and calculates an aggregated value for each group. Currently, six calculation methods are available: Sum, AVG, Max, Min, Count, and Dedupe.
Application Scenarios
Group By is mainly used to run aggregation calculations over data of the same type. For example, in student performance statistics, the average score of each class is calculated from the scores of all students.
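The class-average scenario above can be sketched in plain Python. This is only an illustration of the grouping idea, not how Data Factory works internally; the rows, the field names (`class`, `student`, `score`), and the helper `average_by_class` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows standing in for the Student Scores form.
student_scores = [
    {"class": "Class 1", "student": "Ann", "score": 90},
    {"class": "Class 1", "student": "Ben", "score": 80},
    {"class": "Class 2", "student": "Cara", "score": 70},
    {"class": "Class 2", "student": "Dan", "score": 75},
]


def average_by_class(rows):
    """Group rows by class, then average the scores in each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["class"]].append(row["score"])
    return {cls: mean(scores) for cls, scores in groups.items()}


class_averages = average_by_class(student_scores)
print(class_averages)
```

Here the class plays the role of the grouping field and the score plays the role of the aggregation field, with AVG as the calculation method.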
Preview
An example is shown below:
Setting Procedure
Data Source Form
This example uses the Student Scores form as the data source for Group By:
Creating a Data Stream
Create a data stream in Data Factory.
Selecting a Data Source
Rename the data stream, add one input node, and select Student Scores as the input source.
Adding Group by
Add a Group By data processing node on the design page, and then connect Student Scores to the Group By node.
Node Configuration
Node configuration consists of the Grouping Field settings and the Aggregation Field settings.
1. Grouping Field
Grouping fields group the data by the values of a field in the form: records with the same value are combined into one group. A Group By node supports multiple grouping fields, and the added grouping fields can be sorted, deleted, and renamed.
For Date&Time fields, you can also set the grouping method.
Grouping fields support the following field types:
Field Type | Supported Field | Method |
Text | | Combine fields with the same data into one group |
Date&Time | | |
Member | | Combine fields with the same data into one group |
Department | | Combine fields with the same data into one group |
Note: Date&Time fields need their grouping method set separately. For all other field types, records with the same data are placed in the same group.
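As a rough sketch of grouping by more than one field, the Python below builds a composite key from a text field and a Date&Time field. The choice to bucket dates by month is an assumption made for illustration; the grouping methods actually offered for Date&Time fields are not listed here. The records, field names, and helpers (`group_key`, `group_records`) are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records; the field names are illustrative only.
records = [
    {"class": "Class 1", "exam_date": date(2024, 3, 5), "score": 90},
    {"class": "Class 1", "exam_date": date(2024, 3, 20), "score": 80},
    {"class": "Class 1", "exam_date": date(2024, 4, 2), "score": 85},
    {"class": "Class 2", "exam_date": date(2024, 3, 5), "score": 70},
]


def group_key(record):
    """Composite key: exact match on the text field, month bucket on the date."""
    month = record["exam_date"].strftime("%Y-%m")  # assumed granularity
    return (record["class"], month)


def group_records(rows):
    """Collect the scores of all rows that share the same composite key."""
    groups = defaultdict(list)
    for row in rows:
        groups[group_key(row)].append(row["score"])
    return dict(groups)


grouped = group_records(records)
for key, scores in grouped.items():
    print(key, scores)
```

With two grouping fields, a group is formed for each distinct combination of values, so the two March exams of Class 1 fall into one group while its April exam forms another.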
2. Aggregation Field
Aggregation fields are the fields on which aggregation calculations are performed. A Group By node supports multiple aggregation fields, and for each added field you can sort it, delete it, rename it, and set its aggregation method.
Different fields support different calculation methods:
Field Types | Supported Fields | Methods |
Text | | |
Number | | |
Date&Time | | |
Member | | |
Department | | |
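To make the six calculation methods concrete, here is a minimal Python sketch that applies each of them to the values of one group. The mapping is an approximation rather than the product's implementation. In particular, reading Dedupe as "count of distinct values" is an assumption, and the sample `scores` list is hypothetical.

```python
# Hypothetical scores belonging to a single group.
scores = [90, 80, 80, 70]

# Each method name from this article mapped to a plain-Python equivalent.
aggregations = {
    "Sum": sum,
    "AVG": lambda values: sum(values) / len(values),
    "Max": max,
    "Min": min,
    "Count": len,
    # Assumption: Dedupe is read here as the number of distinct values.
    "Dedupe": lambda values: len(set(values)),
}

results = {name: func(scores) for name, func in aggregations.items()}
for name, value in results.items():
    print(name, value)
```

Sum and AVG only make sense for numeric values, whereas Count and a distinct count can be applied to any kind of value, which is one reason different field types support different methods.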
Renaming a Node
To label the calculation rule of each processing step in the data stream more clearly, you can rename the data stream nodes.
Saving and Exporting the Node
If you need further calculations, you can add other data processing nodes. Once processing is complete, connect an output node to output the data, and then save the result.
Demonstration
The data processed by Group By can be displayed through the dashboard.