Linear Regression


The Clario Linear Regression node fits a linear regression model to either a discrete or continuous dependent attribute. The resulting model equation can be used to create a predictive score based on one or more independent attributes.


The Linear Regression node has three configuration tabs: Dependent Attribute, Weight Attribute, and Predictor Attributes.

Dependent Attribute Tab


The Dependent Attribute tab contains an Available Attributes list box, a Dependent Attribute field, and a Settings area with the Attribute Selection Method drop-down.

Drag the dependent attribute from the Available Attributes list box and drop it into the Dependent Attribute field (required). The dependent attribute must be numeric.

Next, choose the Attribute Selection Method: None or Stepwise. If Stepwise is chosen, two additional parameters appear and must be defined: ‘Maximum p to Enter’ and ‘Minimum p to Remove’.
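The role of the two stepwise parameters can be illustrated with a minimal sketch of a single stepwise decision. This is not Clario's implementation: the function name, the default thresholds, the order of the checks (removal before entry), and the input p-values are all assumptions made for illustration.

```python
def stepwise_decision(in_model_p, candidate_p,
                      max_p_to_enter=0.05, min_p_to_remove=0.10):
    """Decide one stepwise action from p-values (hypothetical inputs).

    in_model_p:  {attribute: p-value} for removable attributes already in the model
    candidate_p: {attribute: p-value it would have if added to the model}
    The default thresholds are illustrative, not Clario defaults.
    """
    # Assumed order: first drop the weakest attribute if it exceeds the removal threshold.
    if in_model_p:
        worst = max(in_model_p, key=in_model_p.get)
        if in_model_p[worst] > min_p_to_remove:
            return ("remove", worst)
    # Otherwise add the strongest candidate if it meets the entry threshold.
    if candidate_p:
        best = min(candidate_p, key=candidate_p.get)
        if candidate_p[best] <= max_p_to_enter:
            return ("enter", best)
    # Nothing qualifies for entry or removal, so selection ends.
    return ("stop", None)
```

In this sketch, keeping ‘Maximum p to Enter’ no larger than ‘Minimum p to Remove’ prevents an attribute from being entered and then immediately removed.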

Weight Attribute Tab


If the incoming data set is weighted, drag and drop the weight attribute into the Weight Attribute field. The weight attribute must contain non-zero integer values.
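One common reading of an integer weight is a frequency weight, where a record with weight 2 counts as two identical records. The sketch below assumes that interpretation, which this page does not confirm for Clario, and fits a one-predictor weighted regression. The function name and data are hypothetical.

```python
def weighted_fit(x, y, w):
    """Weighted least squares for one predictor; returns (intercept, slope).

    Assumes w holds positive integer frequency weights (an assumption,
    not a documented Clario behavior).
    """
    total_w = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Weighted cross-products around the weighted means.
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# A weight of 2 gives the same fit as repeating the record twice.
weighted = weighted_fit([1, 2, 3], [2, 3, 5], [1, 2, 1])
expanded = weighted_fit([1, 2, 2, 3], [2, 3, 3, 5], [1, 1, 1, 1])
```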

Predictor Attributes Tab


Select the desired predictor attribute(s) by dragging and dropping them from the Available Attributes box to the Force-Entry Attributes list box.

If the Attribute Selection Method is set to ‘None’ on the Dependent Attribute tab, every predictor must be force-entered into the model. At least one attribute must be placed into the Force-Entry Attributes box.

If the Attribute Selection Method is set to ‘Stepwise’, attributes may be chosen as Candidates or be selected for Force-Entry into the model. At least one attribute must be placed into either the Force-Entry Attributes or Candidate Attributes box.



The Linear Regression node produces one results set with three tabs: Detailed Results, Step History, and Model Equation. When the Attribute Selection Method is set to None, the Step History tab is omitted from the results set.

Detailed Results Tab



If the Stepwise selection method is chosen, the model steps appear in this box. Choose a step to see its detailed results to the right.

Model Summary

This box contains the following statistics for each model step: R² and Adjusted R² (Coefficient of Determination), Standard Error of Estimate, and Dependent Mean.
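As a rough illustration of how these summary statistics relate to one another, the sketch below computes them from observed and fitted values using the usual textbook formulas. Clario's exact computation is not documented here, and the function name and sample data are assumptions.

```python
import math

def model_summary(y, y_hat, n_predictors):
    """Textbook model summary for a least-squares fit with an intercept."""
    n = len(y)
    dependent_mean = sum(y) / n
    ss_total = sum((obs - dependent_mean) ** 2 for obs in y)
    ss_error = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))
    df_error = n - n_predictors - 1
    r2 = 1.0 - ss_error / ss_total
    # Adjusted R-squared penalizes R-squared for the number of predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / df_error
    return {
        "r2": r2,
        "adjusted_r2": adj_r2,
        "std_error_of_estimate": math.sqrt(ss_error / df_error),
        "dependent_mean": dependent_mean,
    }

# Hypothetical data: y fitted on x = 1..5 by least squares (y_hat = 2.2 + 0.6 * x).
summary = model_summary([2, 4, 5, 4, 5], [2.8, 3.4, 4.0, 4.6, 5.2], n_predictors=1)
```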

Analysis of Variance

The Analysis of Variance (ANOVA) table contains the following statistics for each model step: Source of Variance, Degrees of Freedom, Sum of Squares, Mean Squares, F-statistic, and corresponding p-value.
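The columns of an ANOVA table follow from the observed and fitted values, as the sketch below shows with the standard formulas. The row labels (Regression, Error, Total), the function name, and the sample data are assumptions rather than Clario's actual output. The p-value is left out because Python's standard library has no F distribution.

```python
def anova_table(y, y_hat, n_predictors):
    """Rows of (source, df, sum of squares, mean square, F) for a least-squares fit."""
    n = len(y)
    mean_y = sum(y) / n
    ss_regression = sum((fit - mean_y) ** 2 for fit in y_hat)
    ss_error = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))
    df_regression = n_predictors
    df_error = n - n_predictors - 1
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error
    # F compares explained variance per predictor with residual variance.
    f_statistic = ms_regression / ms_error
    return [
        ("Regression", df_regression, ss_regression, ms_regression, f_statistic),
        ("Error", df_error, ss_error, ms_error, None),
        ("Total", n - 1, ss_regression + ss_error, None, None),
    ]

# Same hypothetical fit as a one-predictor model on five records.
table = anova_table([2, 4, 5, 4, 5], [2.8, 3.4, 4.0, 4.6, 5.2], n_predictors=1)
```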


For each model attribute, the following statistics are displayed: Degrees of Freedom, Regression Coefficient, Standard Error, Standardized Coefficient, Model Contribution, t-value, p-value, and Tolerance.
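Several of these per-attribute statistics are simple functions of the others. The sketch below uses the usual textbook definitions of the t-value, standardized coefficient, and tolerance, which may differ from Clario's. Model Contribution is omitted because its definition is not given here. Function names and inputs are hypothetical.

```python
import statistics

def coefficient_stats(coef, std_error, x, y):
    """Return (t-value, standardized coefficient) for one attribute."""
    # t-value: the coefficient measured in units of its own standard error.
    t_value = coef / std_error
    # Standardized coefficient: the coefficient rescaled to standard-deviation units.
    standardized = coef * statistics.stdev(x) / statistics.stdev(y)
    return t_value, standardized

def tolerance(r2_against_other_predictors):
    """1 minus the R-squared from regressing this predictor on all the others.

    Values near 0 indicate the predictor is nearly a linear combination of the rest.
    """
    return 1.0 - r2_against_other_predictors

# Hypothetical one-predictor fit: slope 0.6 with standard error 0.282842712.
t_value, beta = coefficient_stats(0.6, 0.282842712, [1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```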

Step History Tab


This tab (stepwise method only) contains one row of data for each step in the model building process. Each row lists the attribute entered or removed, the step at which it was entered or removed, and the resulting model R² for that step.

Model Equation Tab


This tab contains the model equation for the linear regression, expressed as code that can be copied and pasted into the Code Editor of a Transform node.
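A linear regression model equation is an intercept plus a weighted sum of the predictor values. The Python sketch below shows that structure only. It is not the code Clario generates, it is not written in the Transform node's own syntax, and the attribute names and coefficients are made up.

```python
def score(record, intercept, coefficients):
    """Apply a linear model equation to one record (a dict of attribute values)."""
    return intercept + sum(coef * record[name] for name, coef in coefficients.items())

# Hypothetical model: score = 1.5 + 0.02 * age + 0.00001 * income
model_coefficients = {"age": 0.02, "income": 0.00001}
predicted = score({"age": 40, "income": 50000.0}, 1.5, model_coefficients)
```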

Output Stream

The output stream contains three attributes: Component, Description, and Value. If the Linear Regression results are written to a file to be used in a scoring application, make sure ‘Full Precision’ is selected as the number format to avoid truncation of model coefficients.
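The effect of truncating coefficients can be seen in a short sketch. Here a four-decimal format stands in for a number format other than ‘Full Precision’. That format, the function name, and the coefficient value are assumptions for illustration.

```python
def format_value(value, full_precision=True):
    """Format a coefficient for a text file.

    repr() keeps enough digits for the float to round-trip exactly;
    the four-decimal format is a hypothetical truncating alternative.
    """
    return repr(value) if full_precision else f"{value:.4f}"

coefficient = 0.0000123456   # hypothetical small coefficient on a large-valued attribute
income = 50000.0

full = float(format_value(coefficient, full_precision=True))
truncated = float(format_value(coefficient, full_precision=False))

# With full precision the attribute contributes about 0.617 to the score;
# after truncation the coefficient becomes 0.0 and the contribution vanishes.
contribution_full = full * income
contribution_truncated = truncated * income
```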

Below is an example of an output stream from Linear Regression.