NN - The type of NearestNeighbours to use
DATA - The type of data

public class KMeansConfiguration<NN extends NearestNeighbours<DATA,?,?>,DATA> extends Object implements Cloneable
Modifier and Type | Field and Description |
---|---|
protected int | blockSize - The size of processing blocks for each thread. |
static int | DEFAULT_BLOCK_SIZE - The default number of samples per parallel assignment instance. |
static int | DEFAULT_NUMBER_ITERATIONS - The default number of iterations. |
protected NearestNeighboursFactory<? extends NN,DATA> | factory - The factory for producing the NearestNeighbours objects used in assignment. |
protected int | K - The number of clusters. |
protected int | niters - The maximum number of iterations. |
protected ExecutorService | threadpool - The threadpool for parallel processing. |
Constructor and Description |
---|
KMeansConfiguration() - A completely default configuration used primarily as a convenience function for reading. |
KMeansConfiguration(int K, NearestNeighboursFactory<? extends NN,DATA> nnFactory) - Create a configuration for clustering data into K clusters. |
KMeansConfiguration(int K, NearestNeighboursFactory<? extends NN,DATA> nnFactory, int niters) - Create a configuration for clustering data into K clusters. |
KMeansConfiguration(int K, NearestNeighboursFactory<? extends NN,DATA> nnFactory, int niters, ExecutorService threadpool) - Create a configuration for clustering data into K clusters. |
KMeansConfiguration(int K, NearestNeighboursFactory<? extends NN,DATA> nnFactory, int niters, int blockSize, ExecutorService threadpool) - Create a configuration for clustering data into K clusters. |
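The constructors above differ only in which settings they leave at their defaults. The following is a minimal sketch of that pattern, assuming each shorter constructor delegates to the longest one. KMeansConfigSketch, its type parameter F, the placeholder default values and defaultPool() are hypothetical stand-ins for illustration, not the library's code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Simplified stand-in for illustration only; NOT the library class.
// F stands in for NearestNeighboursFactory<? extends NN,DATA>.
final class KMeansConfigSketch<F> {
    // Placeholder values: the real defaults are not stated in this document.
    static final int DEFAULT_BLOCK_SIZE = 1000;
    static final int DEFAULT_NUMBER_ITERATIONS = 10;

    final int k;
    final F factory;
    final int niters;
    final int blockSize;
    final ExecutorService threadpool;

    // Default iterations, default block size, one thread per available processor.
    KMeansConfigSketch(int k, F factory) {
        this(k, factory, DEFAULT_NUMBER_ITERATIONS);
    }

    // Caller chooses the iteration limit; everything else is defaulted.
    KMeansConfigSketch(int k, F factory, int niters) {
        this(k, factory, niters, defaultPool());
    }

    // Caller also supplies the thread pool; the block size is defaulted.
    KMeansConfigSketch(int k, F factory, int niters, ExecutorService threadpool) {
        this(k, factory, niters, DEFAULT_BLOCK_SIZE, threadpool);
    }

    // Fully explicit form that all other constructors delegate to.
    KMeansConfigSketch(int k, F factory, int niters, int blockSize, ExecutorService threadpool) {
        this.k = k;
        this.factory = factory;
        this.niters = niters;
        this.blockSize = blockSize;
        this.threadpool = threadpool;
    }

    // Assumption: "all available processors" is modelled as a fixed-size pool.
    private static ExecutorService defaultPool() {
        return Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    }
}
```

Delegating to a single fully explicit constructor keeps the defaults in one place, so a change to a default value affects every shorter form consistently.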
Modifier and Type | Method and Description |
---|---|
KMeansConfiguration<NN,DATA> | clone() |
int | getBlockSize() - Get the number of samples processed in a batch by a thread. |
int | getK() - Get the number of clusters. |
int | getMaxIterations() - Get the maximum allowed number of iterations. |
NearestNeighboursFactory<? extends NN,DATA> | getNearestNeighbourFactory() - Get the factory that produces the NearestNeighbours during clustering. |
int | numClusters() - Get the number of clusters. |
void | setBlockSize(int blockSize) - Set the number of samples processed in a batch by a thread. |
void | setK(int k) - Set the number of clusters. |
void | setMaxIterations(int niters) - Set the maximum allowed number of iterations. |
void | setNearestNeighbourFactory(NearestNeighboursFactory<? extends NN,DATA> factory) - Set the factory that produces the NearestNeighbours during clustering. |
void | setNumClusters(int k) - Set the number of clusters. |
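public static final int DEFAULT_BLOCK_SIZE
The default number of samples per parallel assignment instance.

public static final int DEFAULT_NUMBER_ITERATIONS
The default number of iterations.

protected int K
The number of clusters.

protected NearestNeighboursFactory<? extends NN,DATA> factory
The factory for producing the NearestNeighbours objects used in assignment.

protected int blockSize
The size of processing blocks for each thread.

protected int niters
The maximum number of iterations.

protected ExecutorService threadpool
The threadpool for parallel processing.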
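
As a rough illustration of how blockSize and threadpool could interact during the assignment step, the sketch below splits the samples into blocks and submits one task per block to an ExecutorService. It assumes exact brute-force nearest-centroid search on double[] vectors. BlockAssignmentSketch, nearest and assign are hypothetical names, and the library's actual implementation may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical illustration of block-wise parallel assignment; NOT the library's code.
final class BlockAssignmentSketch {
    /** Returns the index of the centroid with the smallest squared Euclidean distance. */
    static int nearest(double[] sample, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double d = 0;
            for (int j = 0; j < sample.length; j++) {
                double diff = sample[j] - centroids[c][j];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    /** Assigns every sample to its nearest centroid, one task per block of blockSize samples. */
    static int[] assign(double[][] data, double[][] centroids, int blockSize, ExecutorService pool) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        final int[] assignments = new int[data.length];
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int start = 0; start < data.length; start += blockSize) {
            final int from = start;
            final int to = Math.min(start + blockSize, data.length);
            // Each task writes only to its own index range, so no locking is needed.
            tasks.add(() -> {
                for (int i = from; i < to; i++) {
                    assignments[i] = nearest(data[i], centroids);
                }
                return null;
            });
        }
        try {
            // invokeAll blocks until every block has been processed.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces any exception thrown inside a task
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("assignment interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("assignment task failed", e.getCause());
        }
        return assignments;
    }
}
```

In this sketch a larger block size means fewer, longer tasks with less scheduling overhead, while a smaller block size spreads work more evenly across threads.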

public KMeansConfiguration(int K, NearestNeighboursFactory<? extends NN,DATA> nnFactory)
Create a configuration for clustering data into K clusters. The algorithm will run for a maximum of DEFAULT_NUMBER_ITERATIONS iterations, and will make use of all available processors, processing with blocks of DEFAULT_BLOCK_SIZE vectors.

The specified NearestNeighboursFactory determines the actual type of k-means that will be performed; it could be exact nearest-neighbours, or it could be an approximate method, for example based on an ensemble of kd-trees.

K - number of clusters to be found
nnFactory - the factory for producing the NearestNeighbours

public KMeansConfiguration(int K, NearestNeighboursFactory<? extends NN,DATA> nnFactory, int niters)
Create a configuration for clustering data into K clusters. The algorithm will run for a maximum of the given number of iterations, and will make use of all available processors, processing with blocks of DEFAULT_BLOCK_SIZE vectors.

The specified NearestNeighboursFactory determines the actual type of k-means that will be performed; it could be exact nearest-neighbours, or it could be an approximate method, for example based on an ensemble of kd-trees.

K - number of clusters to be found
nnFactory - the factory for producing the NearestNeighbours
niters - maximum number of iterations

public KMeansConfiguration(int K, NearestNeighboursFactory<? extends NN,DATA> nnFactory, int niters, ExecutorService threadpool)
Create a configuration for clustering data into K clusters. The algorithm will run for a maximum of the given number of iterations, and will make use of the provided threadpool, processing with blocks of DEFAULT_BLOCK_SIZE vectors.

The specified NearestNeighboursFactory determines the actual type of k-means that will be performed; it could be exact nearest-neighbours, or it could be an approximate method, for example based on an ensemble of kd-trees.

K - number of clusters to be found
nnFactory - the factory for producing the NearestNeighbours
niters - maximum number of training iterations
threadpool - threadpool to use for parallel processing

public KMeansConfiguration(int K, NearestNeighboursFactory<? extends NN,DATA> nnFactory, int niters, int blockSize, ExecutorService threadpool)
Create a configuration for clustering data into K clusters. The algorithm will run for a maximum of the given number of iterations, and will make use of the provided threadpool, processing with blocks of blockSize vectors.

The specified NearestNeighboursFactory determines the actual type of k-means that will be performed; it could be exact nearest-neighbours, or it could be an approximate method, for example based on an ensemble of kd-trees.

K - number of clusters to be found
nnFactory - the factory for producing the NearestNeighbours
niters - maximum number of training iterations
blockSize - number of samples per parallel thread
threadpool - threadpool to use for parallel processing

public KMeansConfiguration()
A completely default configuration used primarily as a convenience function for reading.
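
public int getK()
Get the number of clusters.

public void setK(int k)
Set the number of clusters.
k - the number of clusters

public int numClusters()
Get the number of clusters.

public void setNumClusters(int k)
Set the number of clusters.
k - the number of clusters

public int getBlockSize()
Get the number of samples processed in a batch by a thread.

public void setBlockSize(int blockSize)
Set the number of samples processed in a batch by a thread.
blockSize - the number of samples processed in a batch by a thread

public int getMaxIterations()
Get the maximum allowed number of iterations.

public void setMaxIterations(int niters)
Set the maximum allowed number of iterations.
niters - the maximum allowed number of iterations

public NearestNeighboursFactory<? extends NN,DATA> getNearestNeighbourFactory()
Get the factory that produces the NearestNeighbours during clustering.

public void setNearestNeighbourFactory(NearestNeighboursFactory<? extends NN,DATA> factory)
Set the factory that produces the NearestNeighbours during clustering.
factory - the factory to set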