public class IncrementalSparseClusterer extends Object implements SparseMatrixClusterer<IndexClusters>
An incremental clusterer that holds SparseMatrix instances internally,
forgetting rows only once they have been clustered and are relatively stable.
The criterion for row removal is cluster stability.
A cluster is defined as stable when the maximum F1-score between it and the clusters
of the previous round reaches a threshold. Once one round of stability is achieved,
the cluster is considered stable and its elements are removed.

Modifier and Type | Field and Description |
---|---|
protected double | threshold |
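
The stability criterion described above can be illustrated with a minimal sketch. It is not the library's implementation: `ClusterStabilitySketch`, `f1`, `maxF1` and `isStable` are hypothetical names, clusters are represented as plain `int[]` arrays of distinct row indices rather than `IndexClusters`, and treating "achieving a threshold" as `>=` is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the library. Each cluster is an int[] of
// distinct row indices (an assumption made for this sketch).
final class ClusterStabilitySketch {

    /** F1-score of `current` measured against `previous` as the reference cluster. */
    static double f1(int[] previous, int[] current) {
        if (previous.length == 0 || current.length == 0) return 0.0;
        Set<Integer> reference = new HashSet<>();
        for (int row : previous) reference.add(row);
        int overlap = 0;
        for (int row : current) {
            if (reference.contains(row)) overlap++;
        }
        if (overlap == 0) return 0.0;
        double precision = (double) overlap / current.length;
        double recall = (double) overlap / previous.length;
        return 2 * precision * recall / (precision + recall);
    }

    /** Highest F1-score of `cluster` against any cluster of the previous round. */
    static double maxF1(int[] cluster, int[][] previousRound) {
        double best = 0.0;
        for (int[] previous : previousRound) {
            best = Math.max(best, f1(previous, cluster));
        }
        return best;
    }

    /** Assumption: a cluster counts as stable when its best score is >= threshold. */
    static boolean isStable(int[] cluster, int[][] previousRound, double threshold) {
        return maxF1(cluster, previousRound) >= threshold;
    }

    public static void main(String[] args) {
        int[][] previousRound = { {0, 1, 2}, {3, 4} };
        int[] unchanged = {3, 4};        // identical to a previous cluster
        int[] grown = {0, 1, 2, 5, 6};   // gained two rows since the last round
        System.out.println(isStable(unchanged, previousRound, 0.9)); // true
        System.out.println(isStable(grown, previousRound, 0.9));     // false
    }
}
```

In this sketch a cluster that keeps the same members across two rounds scores 1.0 and would be retired, while one that is still gaining rows scores lower and stays active.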
Constructor and Description |
---|
IncrementalSparseClusterer(SparseMatrixClusterer<? extends IndexClusters> clusterer, int window) |
IncrementalSparseClusterer(SparseMatrixClusterer<? extends IndexClusters> clusterer, int window, double threshold) |
Modifier and Type | Method and Description |
---|---|
protected Map<Integer,IntDoublePair> | calculateStability(IndexClusters c1, IndexClusters c2, gnu.trove.set.TIntSet inactiveRows) |
IndexClusters | cluster(SparseMatrix data) |
protected void | detectInactive(IndexClusters oldClusters, IndexClusters newClusters, gnu.trove.set.TIntSet inactiveRows, List<int[]> completedClusters): Given the old and new clusters, decide which rows are now inactive and therefore which clusters are now completed. |
int[][] | performClustering(SparseMatrix data) |
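
As a rough illustration of the decision that detectInactive is described as making, the sketch below retires every cluster whose stability score reaches the threshold. It is an assumption-laden stand-in, not the library code: `InactiveRowsSketch` and `markInactive` are hypothetical names, a `Map<Integer, Double>` keyed by cluster index replaces `Map<Integer,IntDoublePair>`, a `Set<Integer>` replaces `gnu.trove.set.TIntSet`, and which round the map keys refer to is not stated in the source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not part of the library.
final class InactiveRowsSketch {

    /**
     * For every cluster whose best score reaches the threshold, record its rows
     * as inactive and the cluster itself as completed.
     */
    static void markInactive(int[][] clusters,
                             Map<Integer, Double> bestScore,
                             double threshold,
                             Set<Integer> inactiveRows,
                             List<int[]> completedClusters) {
        for (Map.Entry<Integer, Double> entry : bestScore.entrySet()) {
            if (entry.getValue() < threshold) continue; // still changing, keep active
            int[] stable = clusters[entry.getKey()];
            for (int row : stable) inactiveRows.add(row);
            completedClusters.add(stable);
        }
    }

    public static void main(String[] args) {
        int[][] clusters = { {0, 1}, {2, 3, 4} };
        Map<Integer, Double> bestScore = new HashMap<>();
        bestScore.put(0, 1.0); // cluster 0 matched a previous cluster exactly
        bestScore.put(1, 0.5); // cluster 1 changed substantially

        Set<Integer> inactiveRows = new TreeSet<>();
        List<int[]> completedClusters = new ArrayList<>();
        markInactive(clusters, bestScore, 0.8, inactiveRows, completedClusters);

        System.out.println(inactiveRows);             // [0, 1]
        System.out.println(completedClusters.size()); // 1
    }
}
```

Passing the inactive-row set and the completed-cluster list in as mutable arguments mirrors the shape of the detectInactive signature, where both are parameters rather than return values.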
protected double threshold

public IncrementalSparseClusterer(SparseMatrixClusterer<? extends IndexClusters> clusterer, int window)
clusterer - the underlying clusterer
window

public IncrementalSparseClusterer(SparseMatrixClusterer<? extends IndexClusters> clusterer, int window, double threshold)
clusterer - the underlying clusterer
window
threshold

public IndexClusters cluster(SparseMatrix data)
cluster in interface DataClusterer<SparseMatrix,IndexClusters>
data - the data to be clustered

protected void detectInactive(IndexClusters oldClusters, IndexClusters newClusters, gnu.trove.set.TIntSet inactiveRows, List<int[]> completedClusters)
oldClusters
newClusters
inactiveRows
completedClusters

protected Map<Integer,IntDoublePair> calculateStability(IndexClusters c1, IndexClusters c2, gnu.trove.set.TIntSet inactiveRows)

public int[][] performClustering(SparseMatrix data)
performClustering in interface Clusterer<SparseMatrix>