What is the Group By Function?
Group By combines source data into groups and calculates an aggregated value for each group. Six aggregation methods are currently available: Sum, AVG, Max, Min, Count, and Dedupe.
When to Use the Group By Function?
Use Group By when you need aggregated statistics for records that share the same value in a field. For example, in student performance statistics, the average score of each class is calculated from the scores of all students.
What Does It Look Like?
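The class-average example can be sketched in plain Python. This is only an illustration of the grouping idea, not how Data Factory works internally; the field names (`class`, `score`) and the sample rows are hypothetical stand-ins for the Student Scores form.

```python
from collections import defaultdict


def average_by_class(rows):
    """Group score rows by class and return the average score per class."""
    totals = defaultdict(lambda: [0.0, 0])  # class -> [running sum, row count]
    for row in rows:
        bucket = totals[row["class"]]
        bucket[0] += row["score"]
        bucket[1] += 1
    return {cls: total / count for cls, (total, count) in totals.items()}


# Hypothetical sample rows standing in for the Student Scores form.
student_scores = [
    {"student": "Ann", "class": "Class 1", "score": 90},
    {"student": "Ben", "class": "Class 1", "score": 80},
    {"student": "Cho", "class": "Class 2", "score": 70},
]

class_averages = average_by_class(student_scores)
print(class_averages)
```

Each distinct class becomes one output row, which mirrors what a Group By node does with a single grouping field and one AVG aggregation field.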
Here is an example:
How to Set It Up?
Step 1 Choosing Data Source Form
We use the Student Scores as the data source for grouping by:
Step 2 Creating a Data Stream
Create a data stream in Data Factory.
Step 3 Selecting a Data Source
Rename the data stream, add one input node, and select Student Scores as the input source.
Step 4 Adding Group by
Add a Group By data processing node on the design page, and then connect Student Scores to the Group By node.
Step 5 Node Configuration
Node configuration consists of two parts: the Grouping Field settings and the Aggregation Field settings.
1. Grouping Field
Grouping fields determine how records are grouped: records with the same value in a grouping field are combined into one group. A Group By node supports multiple grouping fields, and added grouping fields can be sorted, deleted, and renamed.
For Date&Time fields, the grouping method can be set on the field.
Grouping Fields support the following types of fields:
Field Type | Supported Field | Method |
Text | | Combine fields with the same data into one group |
Date&Time | | |
Member | | Combine fields with the same data into one group |
Department | | Combine fields with the same data into one group |
Note:
Except for Date&Time fields, whose grouping method must be set separately, all field types are grouped by combining records with the same value into one group.
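Grouping by more than one field can be pictured as building a composite key per record. The sketch below is a hedged illustration only: the helper names are hypothetical, and bucketing dates by year and month is an assumed granularity, since the options offered for Date&Time fields are not listed here.

```python
from collections import defaultdict
from datetime import date


def group_rows(rows, key_funcs):
    """Split rows into groups keyed by the tuple of values from key_funcs."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(func(row) for func in key_funcs)
        groups[key].append(row)
    return dict(groups)


def by_class(row):
    # Text-like field: identical values fall into the same group.
    return row["class"]


def by_exam_month(row):
    # Assumption: Date&Time values are bucketed by year and month.
    return row["exam_date"].strftime("%Y-%m")


# Hypothetical sample rows.
rows = [
    {"class": "Class 1", "exam_date": date(2024, 3, 4), "score": 90},
    {"class": "Class 1", "exam_date": date(2024, 3, 18), "score": 80},
    {"class": "Class 1", "exam_date": date(2024, 4, 2), "score": 85},
    {"class": "Class 2", "exam_date": date(2024, 3, 4), "score": 70},
]

groups = group_rows(rows, [by_class, by_exam_month])
for key, members in groups.items():
    print(key, len(members))
```

The two March exams of Class 1 land in the same group even though their dates differ, which is why a date field needs its own grouping rule rather than exact-value matching.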
2. Aggregation Field
Aggregation fields define the aggregation calculations performed on each group. A Group By node supports multiple aggregation fields, and added fields can be sorted, deleted, renamed, and assigned an aggregation method.
Different fields support different calculation methods:
Field Types | Supported Fields | Methods |
Text | | |
Number | | |
Date&Time | | |
Member | | |
Department | | |
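The six aggregation methods named earlier can be illustrated with a small Python sketch. The `AGGREGATIONS` mapping and the `aggregate` helper are hypothetical names, and Dedupe is assumed here to mean counting distinct values, which the text above does not define. Which methods apply to which field type is not modelled.

```python
from collections import defaultdict

# Hypothetical mapping from method names to Python callables.
AGGREGATIONS = {
    "Sum": sum,
    "AVG": lambda values: sum(values) / len(values),
    "Max": max,
    "Min": min,
    "Count": len,
    # Assumption: Dedupe counts the distinct values in a group.
    "Dedupe": lambda values: len(set(values)),
}


def aggregate(rows, group_field, value_field, method):
    """Group rows by group_field and apply one aggregation to value_field."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_field]].append(row[value_field])
    func = AGGREGATIONS[method]
    return {key: func(values) for key, values in grouped.items()}


# Hypothetical sample rows.
rows = [
    {"class": "Class 1", "score": 90},
    {"class": "Class 1", "score": 80},
    {"class": "Class 1", "score": 80},
    {"class": "Class 2", "score": 70},
]

for method in AGGREGATIONS:
    print(method, aggregate(rows, "class", "score", method))
```

Count and Dedupe differ only when a group contains repeated values, as Class 1 does in the sample rows.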
Step 6 Renaming a Node
To label the calculation rule applied at each processing step in the data stream, you can rename the data stream nodes.
Step 7 Saving and Exporting the Node
If you need further calculations, add other data processing nodes. Once processing is complete, connect an output node to output the data, and then save the result.
Demonstration
The data processed by Group By can be displayed through the dashboard.