The following node is available in the Open Source KNIME predictive analytics and data mining platform version 2.7.1. Discover over 1000 other nodes, as well as enterprise functionality at http://knime.com.

Bitvector Generator

Generates bitvectors either from the numeric columns of a table or from a single string column that contains hexadecimal strings, binary strings, or the bit positions to set.

Numeric input (many columns)

For numeric input, each column corresponds to one bit position in the resulting bitvector, so the length of the bitvector equals the number of numeric columns (with a single numeric column, every bitvector has length 1). All numeric columns in the table are considered. Whether the bit for a value is set is decided in one of two ways: by comparing the value against a global threshold, or by comparing it against a percentage of the mean of its column (see Dialog Options).
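As a rough illustration of the numeric case, the following Python sketch turns each row of numbers into one bitvector, once with a global threshold and once with a per-column threshold derived from the column mean. It is not the node's implementation: the function names are hypothetical, bitvectors are modelled as plain lists of 0s and 1s, and the use of "greater than or equal" for the mean-based variant is an assumption (the description only states it for the global threshold).

```python
from statistics import mean


def bits_by_threshold(rows, threshold):
    """One bit per numeric column: 1 if the value is >= the global threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in rows]


def bits_by_mean_percentage(rows, percentage):
    """One bit per numeric column, using a per-column threshold.

    The threshold of a column is `percentage` percent of that column's mean.
    Assumption: values equal to the threshold also set the bit.
    """
    columns = list(zip(*rows))  # transpose rows into columns
    thresholds = [mean(col) * percentage / 100.0 for col in columns]
    return [
        [1 if value >= t else 0 for value, t in zip(row, thresholds)]
        for row in rows
    ]


if __name__ == "__main__":
    table = [[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]]
    print(bits_by_threshold(table, 3.0))
    print(bits_by_mean_percentage(table, 100))
```

Each inner list has as many entries as there are numeric columns, which mirrors the statement that the columns correspond to the bit positions.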

Strings (one column)

For string input, only the selected string column is used to generate the bitvectors. Each string is parsed and converted into a bitvector. Three input formats are supported: hexadecimal strings (HEX), bit positions to set (ID), and binary strings (BIT).
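To make the three string representations concrete, here is a minimal Python sketch of one possible parser for each. Several details are assumptions rather than documented behaviour of the node: bit positions are taken to be non-negative integers separated by whitespace, the length of an ID bitvector is taken to be the highest position plus one, and both hexadecimal and binary strings are read left to right, with each hexadecimal digit expanding to four bits. All function names are hypothetical.

```python
import string


def parse_bit(text):
    """BIT: a string of '0' and '1' characters, read left to right (assumed)."""
    if not text or any(ch not in "01" for ch in text):
        raise ValueError(f"not a binary string: {text!r}")
    return [int(ch) for ch in text]


def parse_id(text):
    """ID: bit positions to set, assumed to be whitespace-separated integers."""
    positions = [int(token) for token in text.split()]
    if not positions or min(positions) < 0:
        raise ValueError(f"no valid bit positions in: {text!r}")
    bits = [0] * (max(positions) + 1)  # assumed length: highest position + 1
    for pos in positions:
        bits[pos] = 1
    return bits


def parse_hex(text):
    """HEX: each hexadecimal digit expands to four bits (assumed bit order)."""
    if not text or any(ch not in string.hexdigits for ch in text):
        raise ValueError(f"not a hexadecimal string: {text!r}")
    binary = bin(int(text, 16))[2:].zfill(4 * len(text))
    return [int(ch) for ch in binary]


PARSERS = {"HEX": parse_hex, "ID": parse_id, "BIT": parse_bit}


def parse_bitvector(text, kind):
    """Dispatch on the kind of string representation: HEX, ID or BIT."""
    return PARSERS[kind](text)


if __name__ == "__main__":
    print(parse_bitvector("0110", "BIT"))
    print(parse_bitvector("0 3 5", "ID"))
    print(parse_bitvector("A1", "HEX"))
```

The dispatch table mirrors the "Kind of string representation" choice in the dialog; the actual bit ordering used by the node should be checked against its output before relying on this sketch.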

Missing values

For numeric data, incoming missing values result in 0s. For string input, a missing value results in a missing cell in the output table, as does a string that cannot be parsed.
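The missing-value rules can be sketched in Python as follows, with `None` standing in for a missing cell. This is only an illustration under stated assumptions: the function names are hypothetical, the numeric case uses a global threshold, and the string case uses a simple binary-string parser as a stand-in for whichever format is selected.

```python
def numeric_row_to_bits(row, threshold):
    """Numeric input: a missing value (None) always yields a 0 bit."""
    return [
        1 if value is not None and value >= threshold else 0
        for value in row
    ]


def string_cell_to_bits(cell):
    """String input: missing or unparseable cells yield a missing cell (None).

    Assumption: the cell is expected to hold a binary string of 0s and 1s.
    """
    if cell is None:
        return None  # missing in, missing out
    if any(ch not in "01" for ch in cell):
        return None  # could not be parsed
    return [int(ch) for ch in cell]


if __name__ == "__main__":
    print(numeric_row_to_bits([2.0, None, 0.5], 1.0))
    print(string_cell_to_bits(None))
    print(string_cell_to_bits("10x"))
    print(string_cell_to_bits("101"))
```

The asymmetry is the point to notice: numeric input never produces a missing output cell, whereas string input can.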

Dialog Options

Numeric input
Select if several numeric columns should be converted into a bitvector.
Threshold
If "Numeric input" is checked, specify the global threshold. All values greater than or equal to this threshold result in a 1 in the bitvector.
Use percentage of the mean
Check if a percentage of the mean of each column should serve as the threshold above which the bits are set.
Percentage
Specify the percentage of the column mean that a value must reach for its bit to be set.
Parse bitvectors from string column
Check if the input for the bitvectors is a string column that should be converted into bitvectors (see the description above for valid input formats). Uncheck if the data is a table with numerical data that should be converted into bitvectors. In that case all numerical columns are considered; all others are ignored.
String column to be parsed
If "Parse bitvectors from string column" is checked, select the column containing the strings.
Kind of string representation
Select one of the three valid input formats: HEX (hexadecimal), ID (bit positions) or BIT (binary strings). See description above.
Remove column(s) used for bit vector creation:
If checked, the columns used to generate the bitvectors (the included columns for numeric input, or the selected string column) are removed. If unchecked, the generated bitvectors are appended to the input table.

Ports

Input Ports
0 Data table with numerical data or a string column to be parsed.
Output Ports
0 Data table with the generated bitvectors.

Views

Statistics View
Provides information about the generation of the bitvectors from the data: the number of processed rows, the total number of generated zeros and ones, and the resulting ratio of 1s to 0s.
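The figures reported by the view can be reproduced with a few lines of Python, shown here as a hedged sketch rather than the node's own code. The function name and the dictionary keys are hypothetical, bitvectors are modelled as lists of 0s and 1s, and returning `None` for the ratio when no zeros were generated is an assumption.

```python
def bitvector_statistics(bitvectors):
    """Count processed rows, generated ones and zeros, and the 1s-to-0s ratio."""
    rows = len(bitvectors)
    ones = sum(sum(bv) for bv in bitvectors)
    zeros = sum(len(bv) for bv in bitvectors) - ones
    # Assumption: report no ratio (None) when there are no zeros at all.
    ratio = ones / zeros if zeros else None
    return {"rows": rows, "ones": ones, "zeros": zeros, "ratio": ratio}


if __name__ == "__main__":
    print(bitvector_statistics([[1, 0, 0], [1, 1, 0]]))
```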
This node is contained in KNIME Base Nodes provided by KNIME GmbH, Konstanz, Germany.