The following node is available in the Open Source KNIME predictive analytics and data mining platform version 2.7.1. Discover over 1000 other nodes, as well as enterprise functionality at
http://knime.com.
Kuhlen Stemmer
This node reduces terms to their stems using the Kuhlen stemming
algorithm. The stemmed terms are stored in the outgoing DataTable,
together with the documents containing these terms. Be aware that the
Kuhlen stemmer only stems English words correctly.
Dialog Options
Deep preprocessing options
- Deep preprocessing
If deep preprocessing is checked, the terms contained inside
the documents are preprocessed too. This means that the documents
themselves are changed as well, which is more time consuming.
- Document column
Specifies the column containing the documents to preprocess.
- Append unchanged documents
If checked, the documents contained in the specified "Original
Document column" are appended unchanged, even if deep preprocessing
is checked. This keeps the original documents in the output data
table without requiring a separate join afterwards.
- Original Document column
Specifies the column containing the original documents, which
can be appended unchanged.
- Ignore unmodifiable tag
If checked, terms tagged as unmodifiable are preprocessed too.
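The interaction of the options above can be illustrated with a small Python sketch. It is a simplified model for illustration only, not the node's implementation: the names `Term`, `toy_stem`, and `preprocess` are hypothetical, rows are modelled as (term, document) pairs, and `toy_stem` is a trivial stand-in that strips a trailing "s" rather than the actual Kuhlen rules.

```python
from dataclasses import dataclass


def toy_stem(word: str) -> str:
    """Stand-in stemmer (NOT the Kuhlen algorithm): strips one trailing 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


@dataclass(frozen=True)
class Term:
    """Hypothetical term model: text plus an 'unmodifiable' tag."""
    text: str
    unmodifiable: bool = False


def preprocess(rows, deep=False, append_unchanged=False, ignore_unmodifiable=False):
    """Apply the stand-in stemmer to a list of (Term, document) rows.

    A document is modelled as a list of Term objects.
    """
    def stem_term(term):
        # Unmodifiable terms are skipped unless the tag is ignored.
        if term.unmodifiable and not ignore_unmodifiable:
            return term
        return Term(toy_stem(term.text), term.unmodifiable)

    output = []
    for term, document in rows:
        # Deep preprocessing also rewrites the terms inside the document.
        new_document = [stem_term(t) for t in document] if deep else document
        row = {"term": stem_term(term), "document": new_document}
        if append_unchanged:
            # Keep the original document next to the preprocessed one.
            row["original_document"] = document
        output.append(row)
    return output


document = [Term("cats"), Term("Athens", unmodifiable=True), Term("sleep")]
rows = [(term, document) for term in document]

deep_rows = preprocess(rows, deep=True, append_unchanged=True)
forced_rows = preprocess(rows, ignore_unmodifiable=True)
```

In this model, `deep_rows` carries both a stemmed document and the untouched original in every row, while `forced_rows` stems the term column only but no longer protects the term tagged as unmodifiable.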
Ports
Input Ports
0 | The input table which contains the terms to stem.
Output Ports
0 | The output table which contains the stemmed terms.
This node is contained in the KNIME Textprocessing Plug-in
provided by KNIME GmbH, Konstanz, Germany.