The following node is available in the Open Source KNIME predictive analytics and data mining platform version 2.7.1. Discover over 1000 other nodes, as well as enterprise functionality at http://knime.com.

Stop word Filter

Filters all terms contained in the specified stop word file. The stop words must be listed one below the other in a single column, so that each line contains exactly one stop word.
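As an illustration of the expected file layout, the following Python sketch writes a small stop word file with one word per line and reads it back. It is not KNIME code; the file name `stopwords.txt` and the helper `load_stop_words` are hypothetical names chosen for this example, and skipping empty lines is an assumption rather than documented node behavior.

```python
import os
import tempfile


def load_stop_words(path):
    """Read a stop word file that holds exactly one stop word per line."""
    with open(path, encoding="utf-8") as handle:
        # Strip surrounding whitespace and skip empty lines (assumption).
        return {line.strip() for line in handle if line.strip()}


# Hypothetical example file: one stop word per line, nothing else.
example_dir = tempfile.mkdtemp()
example_path = os.path.join(example_dir, "stopwords.txt")
with open(example_path, "w", encoding="utf-8") as handle:
    handle.write("the\nand\nof\n")

stop_words = load_stop_words(example_path)
print(sorted(stop_words))
```

A file that puts several words on one line, for example separated by commas, would not match this layout.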

Dialog Options

Preprocessing options
Case sensitive
If checked, stop words are matched case-sensitively; otherwise case is ignored.
Use build in list
Specifies whether a built-in stop word list is used. If checked, the path specified under "Selected file" is ignored.
Stopword lists
Available built-in stop word lists. If "Use build in list" is checked, the built-in stop word list to use can be selected here.
Selected file
The location of the stop word file. The stop words must be listed one below the other in a single column, so that each line contains exactly one stop word.
Deep preprocessing options
Deep preprocessing
If checked, the terms contained inside the documents are preprocessed as well. This means that the documents themselves are changed, which is more time-consuming.
Document column
Specifies the column containing the documents to preprocess.
Append unchanged documents
If checked, the documents contained in the specified "Original Document column" are appended unchanged even if deep preprocessing is checked. This keeps the original documents in the output data table without requiring a separate join.
Original Document column
Specifies the column containing the original documents which can be attached unchanged.
Ignore unmodifiable tag
If checked, terms tagged as unmodifiable are preprocessed too.
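The interplay of the "Case sensitive" and "Ignore unmodifiable tag" options can be sketched in Python as follows. This is an illustration under assumptions, not the node's implementation: terms are modeled as hypothetical `(text, unmodifiable)` pairs, `filter_terms` is a made-up helper, and it is assumed that unmodifiable terms are left untouched unless the tag is ignored.

```python
def filter_terms(terms, stop_words, case_sensitive=False, ignore_unmodifiable=False):
    """Return the terms that survive stop word filtering.

    terms: list of (text, unmodifiable) pairs (hypothetical data model).
    """
    if case_sensitive:
        lookup = set(stop_words)
    else:
        # Compare everything in lower case when case is ignored.
        lookup = {word.lower() for word in stop_words}

    kept = []
    for text, unmodifiable in terms:
        # Assumption: unmodifiable terms are skipped unless the tag is ignored.
        if unmodifiable and not ignore_unmodifiable:
            kept.append((text, unmodifiable))
            continue
        key = text if case_sensitive else text.lower()
        if key not in lookup:
            kept.append((text, unmodifiable))
    return kept


terms = [("The", False), ("quick", False), ("the", True), ("fox", False)]
stop_words = {"the"}

print(filter_terms(terms, stop_words))
print(filter_terms(terms, stop_words, case_sensitive=True))
print(filter_terms(terms, stop_words, ignore_unmodifiable=True))
```

With case-insensitive matching, "The" is filtered while the unmodifiable "the" is kept; with case-sensitive matching, "The" no longer matches the stop word "the"; with the unmodifiable tag ignored, both are filtered.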

Ports

Input Ports
0 The input table which contains the terms to filter.
Output Ports
0 The output table which contains the preprocessed terms.
This node is contained in KNIME Textprocessing Plug-in provided by KNIME GmbH, Konstanz, Germany.