The following node is available in the Open Source KNIME predictive analytics and data mining platform version 2.7.1. Discover over 1000 other nodes, as well as enterprise functionality at
http://knime.com.
Term co-occurrence counter
The node counts the number of co-occurrences for the given list of
terms within the selected parts (e.g. sentence, paragraph, section, and
title) of the corresponding document.
The order in which two terms occur is not considered. Thus an occurrence
of T1 followed by T2 is counted the same as an occurrence of T2 followed
by T1. The output table lists the term pairs in alphabetical
order.
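The order-independent counting described above can be sketched in a few lines of Python. This is only an illustration, not the node's implementation. The function name `count_cooccurrences`, the sample sentences, and the choice to count each pair at most once per unit are assumptions, as is reading "alphabetical order" as the order of the two terms within a pair.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(units, terms):
    """Count unordered term pairs that appear together in the same unit.

    units: iterable of token lists, e.g. one list per sentence (assumed input shape)
    terms: the terms whose co-occurrences are of interest
    """
    wanted = set(terms)
    counts = Counter()
    for tokens in units:
        # Keep only the wanted terms present in this unit, sorted alphabetically.
        present = sorted(wanted.intersection(tokens))
        # combinations() over a sorted list yields every pair exactly once,
        # smaller term first, so (T1, T2) and (T2, T1) share one key.
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Hypothetical sample data: three tokenized sentences.
sentences = [
    ["knime", "counts", "term", "pairs"],
    ["pairs", "of", "term", "occurrences"],
    ["knime", "is", "open", "source"],
]
result = count_cooccurrences(sentences, ["term", "pairs", "knime"])
for (first, second), n in sorted(result.items()):
    print(first, second, n)
```

Sorting the terms before pairing is what makes the result independent of the order in which they appear in the text.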
Dialog Options
- Document column
-
The column that contains the documents to search for
term co-occurrences.
- Term column
-
The column that contains the terms to compute the co-occurrence for.
- Co-occurrence level
-
Select the co-occurrence level to be calculated.
The levels are ordered from more general (document co-occurrence)
to more specific (neighbors). The more general levels include
the more specific levels, e.g. the sentence level includes the
neighbor and title co-occurrence calculation.
Note: Calculating the more general statistics, especially
the document-level statistics, might result in a very large data
table.
- Check term tags
-
If this option is selected, the tags of a term (e.g. POS tags) are
considered when matching terms. If it is not selected, only the
textual representation of the terms is compared.
- Sort input table
-
Deselect this option if the input table is already sorted by the
document column.
- Maximum number of parallel processes
-
Decrease the number of parallel processes in case of memory problems.
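
The effect of the "Check term tags" option can be illustrated with a small Python sketch. The `Term` type, the `terms_match` function, and the rule that tags must be exactly equal are hypothetical and chosen for illustration only. The node's actual comparison logic is not described here.

```python
from typing import FrozenSet, NamedTuple


class Term(NamedTuple):
    """Hypothetical stand-in for a tagged term: its text plus a set of tags."""
    text: str
    tags: FrozenSet[str]


def terms_match(a: Term, b: Term, check_tags: bool) -> bool:
    """Compare two terms, optionally taking their tags into account."""
    if a.text != b.text:
        return False
    # Assumption: with tag checking enabled, the tag sets must be identical.
    return a.tags == b.tags if check_tags else True


book_noun = Term("book", frozenset({"NN"}))
book_verb = Term("book", frozenset({"VB"}))

print(terms_match(book_noun, book_verb, check_tags=False))
print(terms_match(book_noun, book_verb, check_tags=True))
```

With tag checking off, the two terms are treated as the same term because their text is identical. With it on, the differing POS tags keep them apart.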
Ports
Input Ports
0 | Input table with a document and term column
Output Ports
0 | Table with the co-occurrence statistics for the input table
This node is contained in KNIME Textprocessing Plug-in
provided by KNIME GmbH, Konstanz, Germany.