The following node is available in the Open Source KNIME predictive analytics and data mining platform version 2.7.1. Discover over 1000 other nodes, as well as enterprise functionality at http://knime.com.

PCA Compute

This node performs a principal component analysis (PCA) on the given input data. The directions of maximal variance (the principal components) are extracted and can be used in the PCA Apply node to project the input into a space of lower dimension while preserving a maximum of information.

Dialog Options

Fail if missing values are encountered
If checked, execution fails when the selected columns contain missing values. By default (unchecked), rows containing missing values are ignored and do not contribute to the computation of the principal components.
Columns
Select the columns to include in the principal component analysis, i.e. the numerical features of the data.
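The effect of the missing-value option can be illustrated outside KNIME with a minimal Python sketch. This is not the node's implementation: the function name select_complete_rows, the fail_on_missing flag, and the use of None or NaN to represent a missing cell are assumptions made for illustration only.

```python
import math


def select_complete_rows(rows, fail_on_missing=False):
    """Return only the rows without missing cells (hypothetical helper).

    A cell counts as missing if it is None or NaN. With fail_on_missing=True
    the function raises instead of skipping, mirroring the checked option.
    """
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    complete = []
    for index, row in enumerate(rows):
        if any(is_missing(value) for value in row):
            if fail_on_missing:
                raise ValueError(f"missing value in row {index}")
            continue  # default behavior: the row is ignored
        complete.append(row)
    return complete


rows = [[1.0, 2.0], [None, 3.0], [4.0, float("nan")], [5.0, 6.0]]
kept = select_complete_rows(rows)  # only the first and last row remain
```

Calling select_complete_rows(rows, fail_on_missing=True) on the same data raises a ValueError instead of silently dropping the two incomplete rows.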

Ports

Input Ports
0 Input data for the PCA
Output Ports
0 Covariance matrix of the input columns
1 Table containing parameters extracted from the PCA. Each row in the table represents one principal component; the rows are sorted by decreasing eigenvalue, i.e. by decreasing variance along the corresponding principal axis. The first column in the table contains the component's eigenvalue. A high value indicates a high variance (in other words, the respective component dominates the orientation of the input data).
Each subsequent column (labeled with the name of the selected input column) contains a coefficient representing the influence of the respective input dimension on the principal component. The higher the absolute value, the stronger the influence of that input dimension on the principal component.
The mapping of an input row onto, e.g., the first principal axis is computed as follows (this is done in the PCA Apply node): subtract from each dimension in the original space that dimension's mean value, then take the dot product of the resulting vector with the coefficient vector given by this table (the first row of the spectral decomposition table yields the value on the first PC, the second row the value on the second PC, and so on).
2 Model with projection to principal components, used in the PCA Apply node to apply the transformation to other data, e.g. a validation set.
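The relationship between the three outputs can be sketched in plain Python. This is an illustration under stated assumptions, not the node's actual implementation: the covariance is normalized by n - 1 (the node's normalization is not stated here), only the first principal component is extracted, and power iteration is used as a simple stand-in for a full eigendecomposition. All function names are hypothetical.

```python
import math


def column_means(rows):
    """Mean of every column."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def covariance_matrix(rows):
    """Sample covariance matrix (divides by n - 1, an assumption)."""
    means = column_means(rows)
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    dims = range(len(means))
    return [[sum(r[i] * r[j] for r in centered) / (len(rows) - 1)
             for j in dims] for i in dims]


def dominant_eigenpair(matrix, iterations=100):
    """Largest eigenvalue and its unit eigenvector via power iteration.

    The start vector must not be orthogonal to the dominant eigenvector;
    a uniform start is adequate for this example only.
    """
    dims = range(len(matrix))
    vec = [1.0 / math.sqrt(len(matrix))] * len(matrix)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in dims) for i in dims]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vec = [x / norm for x in product]
    # Rayleigh quotient v^T A v for the unit vector v
    eigenvalue = sum(vec[i] * matrix[i][j] * vec[j] for i in dims for j in dims)
    return eigenvalue, vec


def project(row, means, component):
    """Subtract the column means, then take the dot product with the component."""
    return sum((v - m) * c for v, m, c in zip(row, means, component))


rows = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]  # points on the line y = x
means = column_means(rows)
cov = covariance_matrix(rows)                # analogous to output port 0
eigenvalue, component = dominant_eigenpair(cov)  # one row of the table at port 1
scores = [project(row, means, component) for row in rows]
```

For this toy data the covariance matrix has all entries equal to 1, the dominant eigenvalue is approximately 2, and the component points along the diagonal, so the three rows map to roughly -1.414, 0 and 1.414 on the first principal axis.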
This node is contained in KNIME Base Nodes provided by KNIME GmbH, Konstanz, Germany.