The following node is available in the Open Source KNIME predictive analytics and data mining platform version 2.7.1. Discover over 1000 other nodes, as well as enterprise functionality at
http://knime.com.
Variable Based File Reader
This node can be used to read data from an ASCII file or URL location.
The location is read from a scope variable, provided by the input port.
It can be configured to read in various formats.
When you open the node's configuration dialog and provide a variable name
that contains a valid file location, it tries to guess
the reader's settings by analyzing the content of the file.
Check the results of these settings in the preview table. If the
data shown is not correct or an error is reported, you can adjust the
settings manually (see below).
In order for the available variables to have values assigned to them, the
predecessor node should be executed before the settings are adjusted.
In most cases a node will be connected to the port that assigns
different values to variables in a loop.
If a big file is selected (or the analysis of the file takes longer) the
dialog shows a "Start Analysis" button. The analysis is then
not triggered every time you change one setting, but only when you
click this button. You can also cut the analysis short by clicking
the "Stop Analysis" button. Then the analyzer doesn't examine
the entire file to check the column types but stops after determining
the types. Execution of the node could then fail, if an incompatible
data item occurs in the data file.
It is recommended to trigger file analysis before applying the dialog
settings to the node. If you click OK (or Apply) and didn't analyze the
data file before, an analysis is launched that can't be canceled and
that will examine the entire file. Make sure a preview table is visible
before you apply the settings.
Dialog Options
- Location variable name
- Select the name of the variable that
holds the location of the file. When
you press ENTER, the file is analyzed and the settings pre-set.
- Preserve user settings
- If checked, the checkmarks and
column names/types you explictly entered are preserved even if
you select a new file. By default, the analyzer starts with fresh
default settings for each new file location.
- Read row IDs
- If checked, the first column in the file
is used as row IDs. If not checked, default row headers are
created.
- Read column headers
- If checked, the items in the first
line of the file are used as column names.
Otherwise default column names are created.
- Column delimiter
- Enter the character(s) that separate
the data tokens in the file, or select a delimiter from the list.
- Ignore spaces and tabs
- If checked, spaces and the TAB
characters are ignored (not in quoted strings though).
- Java style comment
- Everything between '/*' and '*/' is
ignored. Also everything after '//' until the end of the line.
- Single line comment
- Enter one or more characters that
will indicate the start of a comment (ended by a new line).
- Advanced...
- Opens a new dialog with advanced settings.
There is support for quotes, different decimal separators in
floating point numbers, and character encoding. Also, for
ignoring whitespaces, for allowing rows with too few data items,
for making row IDs unique (not recommended for huge files),
for missing value pattern,
and for limiting the number of rows read in.
- Click on the table header
- If the column header in the
preview table is clicked, a new dialog
opens where column properties can be set: name
and type can be changed (and will be fixed then).
A pattern can be entered that will cause a "missing
cell" to be created when it's read for this column. Additionally,
possible values of the column domain can be updated by selecting
"Domain". And, you can choose to skip this column entirely, i.e. it
will not be included in the output table then.
Ports
Output Ports
0 |
Datatable just read from the file |
This node is contained in KNIME Base Nodes
provided by KNIME GmbH, Konstanz, Germany.