The following node is available in the Open Source KNIME predictive analytics and data mining platform version 2.7.1. Discover over 1000 other nodes, as well as enterprise functionality at http://knime.com.
This node takes a list of user-defined rules and tries to match them to each row in the input table. If a rule matches, its outcome value is added into a new column. The rules follow a limited SQL-like syntax:
RULE := BEXPR '=>' STRING | NUMBER | COL
BEXPR := '(' BEXPR ')' | 'NOT' BEXPR | 'MISSING' COL | AEXPR (BOP BEXPR)?
AEXPR := COL OP COL | NUMBER OP COL | COL OP NUMBER | STRING OP COL | COL OP STRING | COL LOP STRINGLIST
BOP := 'AND' | 'OR' | 'XOR'
OP := '>' | '<' | '>=' | '<=' | '=' | 'LIKE'
LOP := 'IN'
STRING := '"' [^"]* '"'
NUMBER := [1-9][0-9]*(\.[0-9]+)?
COL := '$' [^$]+ '$'
STRINGLIST := '(' STRING (',' STRING)* ')'
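To make the terminal symbols of the grammar concrete, here is a minimal tokenizer sketch in Python. It is not the node's implementation: the function name `tokenize`, the token labels, and the decision to lump all keywords (AND, OR, XOR, NOT, MISSING, LIKE, IN) into a single WORD token are assumptions made for illustration. Only the STRING, NUMBER and COL patterns are copied from the grammar.

```python
import re

# Terminal patterns; STRING, NUMBER and COL follow the grammar above.
# ARROW is listed before OP so that '=>' is not split into '=' and '>'.
TOKEN_RE = re.compile(r'''
    (?P<STRING>"[^"]*")
  | (?P<COL>\$[^$]+\$)
  | (?P<NUMBER>[1-9][0-9]*(?:\.[0-9]+)?)
  | (?P<ARROW>=>)
  | (?P<OP>>=|<=|>|<|=)
  | (?P<WORD>[A-Za-z]+)
  | (?P<PUNCT>[(),])
  | (?P<WS>\s+)
''', re.VERBOSE)


def tokenize(rule):
    """Split a rule string into (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(rule):
        match = TOKEN_RE.match(rule, pos)
        if match is None:
            raise ValueError(
                f"unexpected character at position {pos}: {rule[pos]!r}"
            )
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


example_tokens = tokenize('$Col0$ > 5 => "Positive"')
```

Read literally, the NUMBER pattern requires a leading digit from 1 to 9, so this sketch rejects `0` and values such as `0.5`. Whether the node itself is that strict is not stated here.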
Rules consist of a condition part (antecedent), which must evaluate to true or false, and an outcome (consequent), which is put into the new column if the rule matches. The simplest rule is a comparison between a column and another column, a fixed number, or a fixed string. The LIKE operator treats the fixed string as a wildcard pattern, with * and ? as wildcards. The IN operator compares the column value to a list of strings and evaluates to true if at least one value in the list is equal to the column's value.
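The semantics of LIKE and IN described above can be sketched in Python as follows. The helper names `like` and `is_in` are hypothetical, and the sketch assumes that * matches any run of characters (including none), that ? matches exactly one character, and that matching is case-sensitive. The description does not spell out any of these details.

```python
import re


def like(value, pattern):
    """Wildcard match: '*' is any run of characters, '?' is one character."""
    # Translate the wildcard pattern to a regular expression, escaping
    # every other character so it is matched literally.
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def is_in(value, candidates):
    """True if at least one candidate string equals the value."""
    return any(value == candidate for candidate in candidates)


street_matches = like("Market Street 12", "Market Street*")
status_matches = is_in("married", ["married", "divorced"])
```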
The outcome of a rule can either be a fixed string, a fixed number, or a reference to another column. The type of the outcome column is the common super type of all possible outcomes including the default label. If the outcome of a single rule or the default label is a reference to a column, please check the corresponding option below the text field.
Columns are given by their name surrounded by $; numbers are given in the usual decimal representation. Note that strings must not contain double quotes.
Rules can (and should) be grouped with brackets because there is no predefined operator precedence among the boolean operators (comparison operators always take precedence over boolean operators).
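To see why explicit brackets matter, consider a condition of the form A AND B OR C. The short Python sketch below (plain Python booleans, not the node's own evaluation) lists the truth assignments for which the two possible groupings disagree. The function names are illustrative only.

```python
from itertools import product


def left_grouped(a, b, c):
    # Corresponds to (A AND B) OR C
    return (a and b) or c


def right_grouped(a, b, c):
    # Corresponds to A AND (B OR C)
    return a and (b or c)


# Collect every assignment where the two groupings give different results.
differing = [
    (a, b, c)
    for a, b, c in product([False, True], repeat=3)
    if left_grouped(a, b, c) != right_grouped(a, b, c)
]
```

The two readings differ whenever A is false and C is true, so an unbracketed rule of this shape can label such rows differently depending on how it is grouped.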
Some example rules:
$Col0$ > 5 => "Positive"
$Col0$ = "Active" AND $Col1$ <= 5 => "Outlier"
$Col0$ LIKE "Market Street*" AND ($Col1$ IN ("married", "divorced") OR $Col2$ > 40) => "Strange"
$Col0$ > 5 => $Col1$
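As a worked illustration, the third example rule can be translated by hand into Python and applied to a few rows. Everything beyond the rule itself is an assumption: the sample rows, the output column name `Outcome`, and the default label "Normal" are made up for this sketch. Python's `fnmatch.fnmatchcase` stands in for LIKE, although it additionally treats `[...]` as a character set, which the node description does not mention.

```python
from fnmatch import fnmatchcase

DEFAULT_LABEL = "Normal"  # hypothetical default label


def strange_rule(row):
    # $Col0$ LIKE "Market Street*" AND
    #   ($Col1$ IN ("married", "divorced") OR $Col2$ > 40)
    return fnmatchcase(row["Col0"], "Market Street*") and (
        row["Col1"] in ("married", "divorced") or row["Col2"] > 40
    )


def apply_rule(rows):
    """Return copies of the rows with an extra 'Outcome' column."""
    return [
        dict(row, Outcome="Strange" if strange_rule(row) else DEFAULT_LABEL)
        for row in rows
    ]


# Hypothetical input rows.
rows = [
    {"Col0": "Market Street 5", "Col1": "single", "Col2": 52},
    {"Col0": "Market Street 9", "Col1": "single", "Col2": 30},
    {"Col0": "Main Street 1", "Col1": "married", "Col2": 61},
]
labelled = apply_rule(rows)
```

Only the first row satisfies both the LIKE condition and the bracketed OR group. The other two rows receive the default label.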
Input ports:
0 | Any data table |
Output ports:
0 | The input table with an additional column containing the outcome of the matching rule for each row. |