Data Stream
Introduction
Functions
In a data factory, you can create multiple data streams for data processing.
In other words, Data Factory is where you process data, and data streams are the means of doing so.
Application Scenarios
With a data stream, you can select multiple forms as data sources, freely combine data processing nodes to process those forms, and produce outputs. There are six types of data processing nodes:
- Join: combining two forms by appending columns based on matching values to create a new form.
- Union: combining multiple forms by appending rows based on matching values to create a new form.
- Group by: combining original data into groups and calculating categorized data based on aggregation calculations.
- Data Filter: adding filter conditions, filtering out useless data, and processing data as required.
- Field Settings: displaying, hiding, sorting, and renaming fields, and adding calculated fields.
- Pivot: transforming a one-dimensional form into a two-dimensional form.
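To make the difference between Join and Union concrete, here is a minimal Python sketch that treats each form as a list of records. The sample forms, the field names, and the helpers `join` and `union` are hypothetical and only illustrate the idea; they are not the product's API. The join shown is an inner join on a single key, which is an assumption about how matching works.

```python
# Hypothetical sample forms, represented as lists of records (dicts).
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 120},
    {"order_id": 2, "customer_id": "C2", "amount": 80},
]
customers = [
    {"customer_id": "C1", "name": "Alice"},
    {"customer_id": "C2", "name": "Bob"},
]
archived_orders = [{"order_id": 3, "customer_id": "C1", "amount": 50}]


def join(left, right, key):
    """Append the columns of `right` to `left` where `key` values match (inner join)."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def union(*forms):
    """Append the rows of several forms into one form."""
    return [dict(row) for form in forms for row in form]


joined = join(orders, customers, "customer_id")   # wider: gains the "name" column
stacked = union(orders, archived_orders)          # longer: gains one more row
```

Join makes the result wider (more fields per record), while Union makes it longer (more records with the same fields).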
Use Conditions
Data Factory is an advanced function. The free version lets you create a data stream. For other versions, see the official website.
Where to Create
Select an app and click App Management > Data Factory > New Data Stream.
Designing Data Streams
Click New Data Stream to enter the design page.
The design page is divided into three parts: the node selection area, the data stream design area, and the node setting area.
- Node Selection Area: You can drag and drop nodes to the data stream design area.
- Data Stream Design Area: When you select a node, the node setting area will pop up from the bottom of the page.
- Node Setting Area: You can configure each node so that the data stream processes data according to the configured rules.
Setting Procedure
A data stream consists of input nodes, data processing nodes, and output nodes, which are linked by connectors.
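The structure described above can be pictured as a small pipeline: data enters through an input node, passes through processing nodes in connector order, and ends at an output node. The sketch below is a simplified, linear illustration in Python; `run_stream`, `keep_paid`, `rename_amount`, and the sample fields are hypothetical names, and a real data stream may also branch or merge several inputs.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Form = List[Record]
Node = Callable[[Form], Form]


def run_stream(input_form: Form, processing_nodes: List[Node]) -> Form:
    """Pass the input through each processing node in order and return the output."""
    data = input_form
    for node in processing_nodes:
        data = node(data)
    return data


def keep_paid(form: Form) -> Form:
    # Data Filter (hypothetical rule): keep only records whose status is "paid".
    return [row for row in form if row["status"] == "paid"]


def rename_amount(form: Form) -> Form:
    # Field Settings (hypothetical rule): rename the "amount" field to "total".
    return [
        {("total" if k == "amount" else k): v for k, v in row.items()}
        for row in form
    ]


input_form = [
    {"id": 1, "status": "paid", "amount": 10},
    {"id": 2, "status": "open", "amount": 20},
]
output = run_stream(input_form, [keep_paid, rename_amount])
```

The order of the list passed to `run_stream` plays the role of the connectors: each node receives exactly what the previous node produced.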
Setting Input Nodes
Use input nodes to select the data you want to process. You can drag and drop input nodes to the data stream design area.
After you add a data stream, an input node and an output node are added by default. Click the input node to select forms and the fields in those forms.
Adding and Setting Data Processing Nodes
Drag and drop nodes to the data stream design area. Six types of data processing nodes are provided for processing data.
Each node and its function are as follows:
| Node Name | Function |
| --- | --- |
| Join | Combining two forms by appending columns based on matching values to create a new form. |
| Union | Combining multiple forms by appending rows based on matching values to create a new form. |
| Group by | Combining original data into groups and calculating categorized data based on aggregation calculations. |
| Data Filter | Adding filter conditions, filtering out useless data, and processing data as required. |
| Field Settings | Displaying, hiding, sorting, and renaming fields, and adding calculated fields. |
| Pivot | Transforming a one-dimensional form into a two-dimensional form. |
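Group by and Pivot are the two nodes that aggregate data, so a small example may help. The Python sketch below assumes a sum aggregation and uses hypothetical sample data and helper names (`group_by_sum`, `pivot_sum`); the product may offer other aggregations.

```python
from collections import defaultdict

# Hypothetical one-dimensional form: one record per sale.
sales = [
    {"region": "East", "quarter": "Q1", "amount": 100},
    {"region": "East", "quarter": "Q2", "amount": 150},
    {"region": "West", "quarter": "Q1", "amount": 90},
    {"region": "East", "quarter": "Q1", "amount": 30},
]


def group_by_sum(form, key, value):
    """Group records by `key` and sum `value` within each group."""
    totals = defaultdict(int)
    for row in form:
        totals[row[key]] += row[value]
    return dict(totals)


def pivot_sum(form, row_key, col_key, value):
    """Turn the flat form into a two-dimensional table: rows by `row_key`, columns by `col_key`."""
    table = defaultdict(lambda: defaultdict(int))
    for row in form:
        table[row[row_key]][row[col_key]] += row[value]
    return {r: dict(cols) for r, cols in table.items()}


by_region = group_by_sum(sales, "region", "amount")
by_region_and_quarter = pivot_sum(sales, "region", "quarter", "amount")
```

Group by collapses the form along one dimension (region), while Pivot spreads a second dimension (quarter) across columns.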
Setting Output Nodes
Output nodes store the processed data. You must name each output node so that different outputs can be told apart.
Saving Data Streams
Save the data stream design and give the data stream a name.