Data Stream Update Troubleshooting

Introduction

Data stream updates in Data Factory can fail for several reasons. This article covers the following cases:

  • Cartesian Product
  • Output data volume exceeding the maximum
  • Incorrect Join Configuration

Cartesian Product

Description

When Trigger History shows an error like the one in the figure below, a Cartesian Product has occurred and caused the data stream update to fail.

Causes

A Cartesian Product may occur when the data stream contains a Join node.

1. How to determine whether there is Cartesian Product?

A Cartesian Product is detected when the following condition holds:

Data volume after joining > (n + m) * 2, where n is the data volume of form A and m is the data volume of form B.
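The condition can be written as a short check. This is a minimal sketch, not Data Factory's own code; the function name triggers_cartesian_product and the sample volumes are made up for illustration.

```python
def triggers_cartesian_product(joined_rows: int, n: int, m: int) -> bool:
    """Return True when the joined data volume exceeds (n + m) * 2.

    n and m are the data volumes of form A and form B.
    """
    return joined_rows > (n + m) * 2


# Illustrative volumes: two forms of 9 records each, so the threshold is 36.
print(triggers_cartesian_product(41, 9, 9))  # 41 > 36, prints True
print(triggers_cartesian_product(30, 9, 9))  # 30 <= 36, prints False
```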

2. Example

The following calculation uses two forms with 9 records each (all three join types can lead to update failure):

  • Data volume after joining = 5 * 5 + 4 * 4 = 41
  • (n + m) * 2 = (9 + 9) * 2 = 36
  • 41 > 36
  • The result satisfies the formula, so Cartesian Product is triggered.

Since the data volume after joining is larger than twice the sum of the data volumes of form A and form B, Cartesian Product is triggered and the data stream update fails.

Note: Whether the join type is Left Join, Right Join, or Inner Join, Cartesian Product is triggered as long as the calculated result satisfies the formula above.
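The terms 5 * 5 and 4 * 4 are what you would get if one join-field value appeared 5 times in each form and another appeared 4 times in each form. The sketch below reproduces the arithmetic under that assumption. The key values "X" and "Y" and the helper inner_join_row_count are hypothetical, since the article does not show the form data.

```python
from collections import Counter


def inner_join_row_count(keys_a, keys_b):
    """Count the rows an inner join on a single field would produce."""
    counts_a = Counter(keys_a)
    counts_b = Counter(keys_b)
    # Each key value contributes (occurrences in A) * (occurrences in B) rows.
    return sum(count * counts_b[key] for key, count in counts_a.items())


# Hypothetical join-field values: 5 records of "X" and 4 records of "Y" per form.
form_a_keys = ["X"] * 5 + ["Y"] * 4
form_b_keys = ["X"] * 5 + ["Y"] * 4

joined = inner_join_row_count(form_a_keys, form_b_keys)
threshold = (len(form_a_keys) + len(form_b_keys)) * 2
print(joined, threshold, joined > threshold)  # prints: 41 36 True
```

In this sample every key value exists in both forms, so a Left Join or Right Join would return the same 41 rows as the Inner Join.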

Solutions

When Cartesian Product is triggered, adjust the configuration of Data Factory or modify the original data (by deleting or adding data so that the formula above no longer holds).

For the example above, you can solve the problem by modifying the join configuration.

For example, adding Specifications as a join field combines Quantity and Price into one form without triggering Cartesian Product.
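To see why an extra join field helps, compare the join size for one field against two. This is a sketch under assumptions: the Product field, the sample values, and the helper join_row_count are hypothetical. Only the field names Specifications, Quantity, and Price come from the example.

```python
from collections import Counter


def join_row_count(rows_a, rows_b, fields):
    """Count inner-join rows when records are matched on the given fields."""
    def key(row):
        return tuple(row[f] for f in fields)

    counts_a = Counter(key(r) for r in rows_a)
    counts_b = Counter(key(r) for r in rows_b)
    return sum(c * counts_b[k] for k, c in counts_a.items())


# Hypothetical forms of 9 records each: 5 for product "X", 4 for product "Y",
# with a distinct Specifications value per record within each product.
specs = [("X", f"S{i}") for i in range(5)] + [("Y", f"S{i}") for i in range(4)]
quantity_form = [{"Product": p, "Specifications": s, "Quantity": 10} for p, s in specs]
price_form = [{"Product": p, "Specifications": s, "Price": 2.5} for p, s in specs]

by_product = join_row_count(quantity_form, price_form, ["Product"])
by_both = join_row_count(quantity_form, price_form, ["Product", "Specifications"])
print(by_product, by_both)  # prints: 41 9
```

With both fields, each record matches exactly one record in the other form, so the joined volume (9) stays well under the threshold of 36.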

Output Volume Exceeding One Million

Description

When Trigger History shows an error like the one in the figure below, the output volume has exceeded the one million limit and caused the data stream update to fail.

Causes

The final output data volume calculated by Data Factory is limited to one million. If the output volume exceeds the limit, data streams will fail to update.
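As a rough pre-check, you can compare an estimated output volume with the limit before running the data stream. The sketch assumes that an output of exactly one million is still allowed, which the article does not state. The names MAX_OUTPUT_VOLUME and exceeds_output_limit are hypothetical.

```python
MAX_OUTPUT_VOLUME = 1_000_000  # limit stated in this article


def exceeds_output_limit(output_volume: int, limit: int = MAX_OUTPUT_VOLUME) -> bool:
    """Return True when the estimated output volume is above the limit."""
    # Assumption: only volumes strictly greater than the limit fail.
    return output_volume > limit


print(exceeds_output_limit(1_200_000))  # prints True
print(exceeds_output_limit(800_000))    # prints False
```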

Solutions

When output volume exceeds the upper limit, you can:

  • Reduce the number of forms in Input nodes or delete some data from input forms to decrease the input volume.
  • Adjust the configuration of data streams to reduce the output volume.

Output Volume Larger than Twice Input Volume

Description

When Trigger History shows an error like the one in the figure below, the output volume is more than twice the input volume, causing the data stream update to fail.

Causes

In the example shown in the figure below, the output volume is three times the input volume. Because this is more than twice the input volume, the output exceeds the maximum and the data stream update fails.
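The rule in this section can be sketched as a comparison between the output volume and twice the total input volume. Summing the volumes of all input forms is an assumption here, and the function name and sample numbers are hypothetical.

```python
def output_exceeds_twice_input(input_form_volumes, output_volume):
    """Return True when the output volume is more than twice the total input."""
    # Assumption: input volume is the sum over all forms in the Input nodes.
    total_input = sum(input_form_volumes)
    return output_volume > 2 * total_input


# Hypothetical volumes: one input form of 1,000 records.
print(output_exceeds_twice_input([1000], 3000))  # three times the input, prints True
print(output_exceeds_twice_input([1000], 2000))  # exactly twice, prints False
```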

Solutions

When output volume exceeds the upper limit, you can:

  • Reduce the number of forms in Input nodes or delete some data from input forms to decrease the input volume.
  • Adjust the configuration of data streams to reduce the output volume.

Incorrect Join Configuration

Description

When there is an error in Trigger History as shown in the figure below, it means that the Join connection is not configured correctly.

Causes

If a form field that is configured as a Join field is deleted from its data source, the data stream fails to update.
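The failure condition amounts to a Join field that no longer exists in the source form. A minimal sketch of that check follows. The function missing_join_fields and the field lists are hypothetical and do not reflect a Data Factory API.

```python
def missing_join_fields(join_fields, form_fields):
    """Return the configured Join fields that no longer exist in the form."""
    available = set(form_fields)
    return [field for field in join_fields if field not in available]


# Hypothetical configuration: "Specifications" was deleted from the source form.
configured = ["Product", "Specifications"]
current_form_fields = ["Product", "Quantity"]
print(missing_join_fields(configured, current_form_fields))  # prints ['Specifications']
```

A non-empty result means the Join node still refers to a deleted field and needs to be reconfigured.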

Solutions

Open the data stream that failed to update and reconfigure its Join nodes.
