The following node is available in the Open Source KNIME predictive analytics and data mining platform version 2.7.1. Discover over 1000 other nodes, as well as enterprise functionality at http://knime.com.

Score Erosion

This node uses the Score Erosion algorithm to select subsets of items (rows) that are both high-scoring and diverse.

It is essentially an iterative process: it first selects the item with the highest score, then reduces the scores of all remaining items based on their distance to the selected item, then selects the item that now has the highest score, and so on. With the erosion factor you can adjust whether activity should be preferred over diversity or the other way round. Details about the algorithm are available in

Maximum-Score Diversity Selection for Early Drug Discovery, Journal of Chemical Information and Modeling, vol. 51, no. 2, pp. 237-247, 2011; DOI: 10.1021/ci100426r.
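
The iterative process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the node's actual implementation: the function name score_erosion_select, its parameters, and the exact update formula are hypothetical. The sketch assumes that higher scores are better, that scores are non-negative, that distances are normalized to the range [0, 1], and that the erosion factor acts as an exponent on the distance in a product-style update.

```python
from typing import List, Sequence


def score_erosion_select(
    scores: Sequence[float],
    distances: Sequence[Sequence[float]],
    n_select: int,
    erosion_factor: float = 1.0,
) -> List[int]:
    """Greedy selection sketch; returns row indices in the order they were picked.

    Assumptions (not taken from the node documentation): higher scores are
    better, scores are non-negative, distances lie in [0, 1], and the
    erosion factor is used as an exponent on the distance.
    """
    eroded = list(scores)
    remaining = set(range(len(eroded)))
    selected: List[int] = []

    while remaining and len(selected) < n_select:
        # Pick the remaining item with the highest eroded score
        # (ties are resolved in favor of the lowest index).
        best = max(sorted(remaining), key=lambda i: eroded[i])
        selected.append(best)
        remaining.remove(best)

        # Erode the scores of all unselected items: items close to the
        # pick (small distance) lose most of their score.
        for i in remaining:
            eroded[i] *= distances[best][i] ** erosion_factor

    return selected


# Items 0 and 1 are near-duplicates; item 2 is far from both.
scores = [0.9, 0.85, 0.5]
distances = [
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
    [0.9, 0.9, 0.0],
]
print(score_erosion_select(scores, distances, 2))
print(score_erosion_select(scores, distances, 2, erosion_factor=0.0))
```

With an erosion factor of 1, the sketch picks items 0 and 2: after item 0 is selected, the score of its near-duplicate (item 1) drops below that of the more distant item 2. With an erosion factor of 0, no erosion takes place in this sketch and the two highest-scoring items (0 and 1) are returned.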

An example of how to use this node can be found on the example workflow server.

Dialog Options

Number of rows to select
Enter the number of rows that should be selected here (the subset size).
Score column
Select the column containing the scores and specify whether low or high scores are preferred.
Distance column
Select the column containing the distances between the items here.
Erosion factor
Select a value for the erosion factor here. High values favor diverse subsets, low values favor more active subsets.
Score update mode
The difference mode subtracts the distance to the selected item from all scores, whereas the product mode multiplies the scores by the distance.

Ports

Input Ports
0 The input table, containing at least one numeric column with scores for each row, and one distance column.
Output Ports
0 A table containing the selected rows together with their eroded scores.
1 A table containing information about the overall activity and diversity of the selected subset in each internal iteration.
This node is contained in the KNIME Optimization extension provided by KNIME GmbH, Konstanz, Germany.