The following node is available in the Open Source KNIME predictive analytics and data mining platform version 2.7.1. Discover over 1000 other nodes, as well as enterprise functionality at
http://knime.com.
Row Sampling
This node extracts a sample (a subset of rows) from the input data. The dialog lets you specify the
sample size and how the rows are selected. The following options are available in the dialog:
Dialog Options
- Absolute
- Specify the absolute number of rows in the sample. If there are fewer rows than specified here, all rows are
used.
- Relative
- The percentage of input rows to include in the sample. Must be between 0 and 100, inclusive.
- Take from top
- This mode selects the topmost rows of the table.
- Linear sampling
- This mode always includes the first and the last row and selects the remaining rows at regular intervals over
the whole table (e.g. every third row). This is useful for downsampling a sorted column while retaining its
minimum and maximum values.
- Draw randomly
- Random sampling of all rows. You may optionally specify a fixed seed (see below).
- Stratified sampling
- Check this button if you want stratified sampling, i.e. the distribution of values in the selected column is
(approximately) retained in the output table. You may optionally specify a fixed seed (see below).
- Use random seed
- If either random or stratified sampling is selected, you may enter a fixed seed here
in order to get reproducible results upon re-execution. If you do not specify a seed,
a new random seed is used for each execution.
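To make the sampling modes above concrete, the following is a minimal Python sketch of how they might behave on a
plain list of rows. It is not KNIME's implementation: all function names and parameters (resolve_size,
linear_sample, random_sample, stratified_sample, key, seed) are hypothetical, and details such as rounding and
keeping the original row order in the output are assumptions made for illustration.

```python
import random
from collections import defaultdict


def resolve_size(n_rows, absolute=None, relative=None):
    """Turn an Absolute or Relative setting into a row count (hypothetical helper)."""
    if absolute is not None:
        return min(absolute, n_rows)  # fewer rows than requested: use all rows
    if not 0 <= relative <= 100:
        raise ValueError("relative must be between 0 and 100, inclusive")
    return round(n_rows * relative / 100)  # rounding rule is an assumption


def take_from_top(rows, k):
    """Return the topmost k rows."""
    return rows[:k]


def linear_sample(rows, k):
    """Return k evenly spaced rows, always including the first and the last row."""
    n = len(rows)
    if k >= n:
        return list(rows)
    if k <= 0:
        return []
    if k == 1:
        return [rows[0]]
    step = (n - 1) / (k - 1)  # spacing between selected row indices
    return [rows[round(i * step)] for i in range(k)]


def random_sample(rows, k, seed=None):
    """Draw k rows at random; a fixed seed makes the draw reproducible."""
    rng = random.Random(seed)  # seed=None gives a fresh seed on every call
    picked = sorted(rng.sample(range(len(rows)), k))  # keep input order (assumption)
    return [rows[i] for i in picked]


def stratified_sample(rows, k, key, seed=None):
    """Draw about k rows so that the distribution of key(row) is roughly retained."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[key(row)].append(i)
    picked = []
    for members in groups.values():
        # Each group contributes in proportion to its share of the table.
        quota = min(len(members), round(k * len(members) / len(rows)))
        picked.extend(rng.sample(members, quota))
    return [rows[i] for i in sorted(picked)]


if __name__ == "__main__":
    table = [("a", i) for i in range(80)] + [("b", i) for i in range(20)]
    k = resolve_size(len(table), relative=10)
    print(linear_sample(list(range(10)), 4))
    print(random_sample(table, k, seed=42) == random_sample(table, k, seed=42))
    print(len(stratified_sample(table, k, key=lambda row: row[0], seed=42)))
```

Because each group's quota is rounded independently, the stratified sample in this sketch can differ from the
requested size by a few rows, which is one way to read the "(approximately)" in the option description.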
Ports
This node is contained in KNIME Base Nodes
provided by KNIME GmbH, Konstanz, Germany.