Join

../../_images/joinNode.png

The Clario Join node is used to combine two data streams on key attribute(s) via one of four different join types: inner, left, right, and full. The functionality of this node is similar to an SQL Join.

Connecting Sources

When connecting a data source to a Join node, the Select Connectors dialog will appear and require the user to specify Left Source or Right Source. This distinction matters when using a Left or Right Join.

To reassign the sources, terminate the connected links and reconnect the data streams.

Configuration

The Join node has two tabs: Configuration and Summary.

Configuration Tab

../../_images/join_configurationTab.png

The attributes listed in the Left Attribute and Right Attribute list boxes display all the incoming attributes from the respective sources. The Joins list box displays the attribute pairings that define the join keys. At least one join key must be defined.

../../_images/join_joinConnectors.png

To define a join key, select an attribute from both the Left Attribute list box and the Right Attribute list box on which you would like to join. These selected attributes will become highlighted and a line will appear connecting the two with a green plus sign in the middle. Clicking on the plus sign will confirm the connection between the two selections, thus removing the attributes from their respective source list boxes and moving this pairing to the Joins list box. If a red slash appears, the two attributes cannot be joined because of a type mismatch.

To remove a connection between two attributes, click on the connection in the Joins list box and then click the [-].

Each attribute can be joined to at most one other attribute.

The attributes in the Joins list box are listed in the order that they appear on the Left Source.

Settings

Join Type

Select the desired Join Type. The default Join Type is Inner. Descriptions of each Join Type are as follows:

  • Inner: Returns all rows with exact matches* on the join key attributes from both the Left and Right Sources.
  • Left: Returns all rows from the Left Source along with just the rows from the Right Source containing exact matches with the join key attributes.
  • Right: Returns all rows from the Right Source along with just the rows from the Left Source containing exact matches on the join key attributes.
  • Full: Returns all rows from the Left and the Right Source where the join key attributes match, as well as rows from each data set that do not match. Attribute values for the rows from the Left or Right Sources that do not match will have a value of null.

*Join is case sensitive.

Merge Keys

To merge the join key attributes, click the Merge Keys checkbox. If Merge Keys is checked:

  • Inner Join: drops the Right Join Key Attributes from the Output Stream
  • Left Join: drops the Right Join Key Attributes from the Output Stream
  • Right Join: merges Right Join Key values into the Left Join Key Attributes and drops the Right Join Key Attributes from the Output Stream
  • Full Join: merges non-null Right Join Key values into the Left Join Key Attributes and drops the Right Join Key Attributes from the Output Stream

If Merge Keys is unchecked, the Output Stream retains both sets of Join Key attribute fields.

Memory Map

The Memory Map feature is a performance enhancement for joining a large data set (Left Source) to a small data set (Right Source). When using Memory Map, incoming sources do not need to be sorted. The Memory Map feature is only available for Inner or Left joins.

When the Memory Map setting is checked, the data from the Right Source is loaded into memory.

If the Memory Map is unchecked, both incoming data sets must be sorted by each attribute that is used to create the join key.

Summary Tab

../../_images/join_summaryTab.png

The Summary tab lists each outgoing attribute, its source connector, its source attribute, and the outgoing attribute type.

To change the names of the outgoing attributes, click [Rename] and edit the text in the resulting dialog. Attribute names may only consist of the following characters: A-Z, a-z, 0-9, ‘-‘, and ‘_’.

To export the Join configuration, click [Export] and enter a filename before downloading the spreadsheet.

Output Stream

The data stream sent to the Join node’s outgoing connector contains the attributes and types displayed in the Summary tab.