@Reference(type=Inproceedings, author={"Amirthalingam Ramanan","Mahesan Niranjan"}, title="Resource-Allocating Codebook for Patch-based Face Recognition", year="2009", booktitle="IIS", url="http://eprints.ecs.soton.ac.uk/21401/")
public class IntRAC extends Object implements SpatialClusters<int[]>, SpatialClusterer<IntRAC,int[]>, CentroidsProvider<int[]>, HardAssigner<int[],float[],IntFloatPair>
During training, data points are selected at random. The first data point is chosen as a centroid. Each subsequent data point becomes a new centroid if it lies outside the threshold of all current centroids. Because this makes the number of clusters difficult to guarantee, a minimisation function is provided to closely estimate the threshold required for a given K.
This implementation supports int[] cluster centroids.
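The training procedure described above can be sketched as follows. This is a minimal illustration, not the IntRAC source: the class `RacSketch`, its method names, the `seed` parameter and the use of squared Euclidean distance are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; names and distance measure are assumptions,
// not the OpenIMAJ IntRAC implementation.
class RacSketch {
    // Squared Euclidean distance between two equal-length int vectors.
    static double distanceSquared(int[] a, int[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Visit the points in random order. A point becomes a new centroid when it
    // lies farther than the threshold from every centroid found so far, so the
    // first point visited always becomes a centroid.
    static List<int[]> train(int[][] data, double thresholdSquared, long seed) {
        List<int[]> order = new ArrayList<>();
        for (int[] point : data) {
            order.add(point);
        }
        Collections.shuffle(order, new Random(seed));

        List<int[]> codebook = new ArrayList<>();
        for (int[] point : order) {
            boolean covered = false;
            for (int[] centroid : codebook) {
                if (distanceSquared(point, centroid) <= thresholdSquared) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                codebook.add(point.clone());
            }
        }
        return codebook;
    }
}
```

The size of the returned codebook depends on the threshold rather than on a requested K, which is why the number of clusters cannot be fixed in advance.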
In terms of implementation, this class is at once a clusterer, an assigner and
the result of the clustering. This is because the RAC algorithm never ends:
if a new point is assigned through the HardAssigner interface and that point
is more than the threshold distance from every existing centroid, a new
centroid is created for the point. If this behaviour is undesirable, the
results of clustering can be "frozen" by manually constructing an assigner
that takes a CentroidsProvider (or the centroids returned by calling
getCentroids()) as an argument.
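The idea of "freezing" can be illustrated with a small stand-alone class. `FrozenAssignerSketch` is hypothetical and is not part of the library; it only shows what an assigner built from a snapshot of the centroids does differently: it always returns the nearest existing centroid and never grows. Squared Euclidean distance is assumed.

```java
// Hypothetical sketch: a fixed nearest-centroid assigner built from a snapshot
// of centroids (for example, the array returned by getCentroids()).
class FrozenAssignerSketch {
    private final int[][] centroids;

    FrozenAssignerSketch(int[][] centroids) {
        // Deep copy so later changes to the source array cannot affect this assigner.
        this.centroids = new int[centroids.length][];
        for (int i = 0; i < centroids.length; i++) {
            this.centroids[i] = centroids[i].clone();
        }
    }

    // Index of the nearest centroid; no centroid is ever added,
    // however far away the point is.
    int assign(int[] point) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centroids.length; i++) {
            double sum = 0;
            for (int j = 0; j < point.length; j++) {
                double d = (double) point[j] - centroids[i][j];
                sum += d * d;
            }
            if (sum < bestDistance) {
                bestDistance = sum;
                best = i;
            }
        }
        return best;
    }

    // The number of centroids, fixed at construction time.
    int size() {
        return centroids.length;
    }
}
```

With such an assigner a far-away point is mapped to its closest existing cluster, whereas assigning the same point through IntRAC itself would create a new centroid.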
| Modifier and Type | Field and Description |
|---|---|
| `protected ArrayList<int[]>` | `codebook` |
| `protected static int[][]` | `distances` |
| `protected int` | `nDims` |
| `protected double` | `threshold` |
| `protected long` | `totalSamples` |
CLUSTER_HEADER

| Constructor | Description |
|---|---|
| `IntRAC()` | Sets the threshold to 128. |
| `IntRAC(double radiusSquared)` | Define the threshold at which point a new cluster will be made. |
| `IntRAC(int[][] bKeys, int subSamples, int nClusters)` | Iteratively select subSamples from bKeys and try to choose a threshold which results in nClusters. |
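One way to picture the search for a threshold that yields a target number of clusters is a simple bisection. This is a sketch under stated assumptions, not how IntRAC does it: the documented signature of `calculateThreshold` suggests an Apache Commons Math routine is used, and bisection is swapped in here only because it is easy to follow. The names `ThresholdSearchSketch`, `countClusters`, `estimateThreshold` and the `upperBound` parameter are hypothetical, and samples are visited in array order to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; bisection stands in for whatever optimiser
// the real class uses.
class ThresholdSearchSketch {
    static double distanceSquared(int[] a, int[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Number of centroids a single pass creates, visiting samples in array order.
    static int countClusters(int[][] samples, double thresholdSquared) {
        List<int[]> codebook = new ArrayList<>();
        for (int[] point : samples) {
            boolean covered = false;
            for (int[] centroid : codebook) {
                if (distanceSquared(point, centroid) <= thresholdSquared) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                codebook.add(point);
            }
        }
        return codebook.size();
    }

    // Bisect the interval [0, upperBound] until the pass produces nClusters
    // centroids or the iteration budget runs out.
    static double estimateThreshold(int[][] samples, int nClusters,
                                    double upperBound, int maxIterations) {
        double lo = 0;
        double hi = upperBound;
        double mid = upperBound / 2;
        for (int i = 0; i < maxIterations; i++) {
            mid = (lo + hi) / 2;
            int found = countClusters(samples, mid);
            if (found == nClusters) {
                return mid;
            }
            if (found > nClusters) {
                lo = mid; // too many clusters: the threshold is too small
            } else {
                hi = mid; // too few clusters: the threshold is too large
            }
        }
        return mid; // best effort if no exact match was found
    }
}
```

The cluster count generally falls as the threshold grows, but not strictly monotonically, so a search like this can only return a close estimate and may stop without an exact match.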
| Modifier and Type | Method | Description |
|---|---|---|
| `String` | `asciiHeader()` | Header for ascii input. |
| `int` | `assign(int[] data)` | Assign a single point to a cluster. |
| `int[]` | `assign(int[][] data)` | Assign data to a cluster. |
| `IntFloatPair` | `assignDistance(int[] data)` | Assign a single point to a cluster. |
| `void` | `assignDistance(int[][] data, int[] indices, float[] distances)` | Assign data to clusters. |
| `byte[]` | `binaryHeader()` | Header for binary input. |
| `protected static double` | `calculateThreshold(int[][] samples, int nClusters)` | |
| `IntRAC` | `cluster(DataSource<int[]> data)` | Perform clustering with data from a data source. |
| `IntRAC` | `cluster(int[][] data)` | Perform clustering on the given data. |
| `HardAssigner<int[],?,?>` | `defaultHardAssigner()` | Get the default hard assigner for this clusterer. |
| `int[][]` | `getCentroids()` | |
| `int` | `numClusters()` | Get the number of clusters. |
| `int` | `numDimensions()` | Get the data dimensionality. |
| `int[][]` | `performClustering(int[][] data)` | |
| `void` | `readASCII(Scanner in)` | Read internal state from in. |
| `void` | `readBinary(DataInput dis)` | Read internal state from in. |
| `int` | `size()` | The number of centroids; this potentially grows as assignments are made. |
| `void` | `writeASCII(PrintWriter writer)` | Write the content of this as ascii to out. |
| `void` | `writeBinary(DataOutput dos)` | Write the content of this as binary to out. |
protected double threshold
protected int nDims
protected static int[][] distances
protected long totalSamples
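public IntRAC()
Sets the threshold to 128.

public IntRAC(double radiusSquared)
Define the threshold at which point a new cluster will be made.
Parameters: radiusSquared

public IntRAC(int[][] bKeys, int subSamples, int nClusters)
Iteratively select subSamples from bKeys and try to choose a threshold which results in nClusters.
Parameters:
bKeys - All keys to be trained against
subSamples - number of subsamples to select from bKeys each iteration
nClusters - number of clusters to aim for

protected static double calculateThreshold(int[][] samples, int nClusters) throws org.apache.commons.math.MaxIterationsExceededException, org.apache.commons.math.FunctionEvaluationException
Throws:
org.apache.commons.math.MaxIterationsExceededException
org.apache.commons.math.FunctionEvaluationException

public IntRAC cluster(int[][] data)
Description copied from interface: SpatialClusterer
Perform clustering on the given data.
Specified by: cluster in interface SpatialClusterer<IntRAC,int[]>
Parameters: data - the data.

public IntRAC cluster(DataSource<int[]> data)
Description copied from interface: SpatialClusterer
Perform clustering with data from a data source. The DataSource could potentially be backed by disk rather than in memory.
Specified by: cluster in interface SpatialClusterer<IntRAC,int[]>
Parameters: data - the data.

public int numClusters()
Description copied from interface: SpatialClusters
Get the number of clusters.
Specified by: numClusters in interface SpatialClusters<int[]>

public int numDimensions()
Description copied from interface: SpatialClusters
Get the data dimensionality.
Specified by: numDimensions in interface Assigner<int[]>; numDimensions in interface SpatialClusters<int[]>

public int[] assign(int[][] data)
Description copied from interface: HardAssigner
Assign data to a cluster.
Specified by: assign in interface HardAssigner<int[],float[],IntFloatPair>
Parameters: data - the data.

public int assign(int[] data)
Description copied from interface: HardAssigner
Assign a single point to a cluster.
Specified by: assign in interface HardAssigner<int[],float[],IntFloatPair>
Parameters: data - datum to assign.

public String asciiHeader()
Description copied from interface: ReadableASCII
Header for ascii input.
Specified by: asciiHeader in interface ReadableASCII; asciiHeader in interface WriteableASCII

public byte[] binaryHeader()
Description copied from interface: ReadableBinary
Header for binary input.
Specified by: binaryHeader in interface ReadableBinary; binaryHeader in interface WriteableBinary

public void readASCII(Scanner in) throws IOException
Description copied from interface: ReadableASCII
Read internal state from in.
Specified by: readASCII in interface ReadableASCII
Parameters: in - source to read from.
Throws: IOException - an error reading input

public void readBinary(DataInput dis) throws IOException
Description copied from interface: ReadableBinary
Read internal state from in.
Specified by: readBinary in interface ReadableBinary
Parameters: dis - source to read from.
Throws: IOException - an error reading input

public void writeASCII(PrintWriter writer) throws IOException
Description copied from interface: WriteableASCII
Write the content of this as ascii to out.
Specified by: writeASCII in interface WriteableASCII
Parameters: writer - sink to write to
Throws: IOException - an error writing to out

public void writeBinary(DataOutput dos) throws IOException
Description copied from interface: WriteableBinary
Write the content of this as binary to out.
Specified by: writeBinary in interface WriteableBinary
Parameters: dos - sink to write to
Throws: IOException - an error writing to out

public int[][] getCentroids()
Specified by: getCentroids in interface CentroidsProvider<int[]>

public void assignDistance(int[][] data, int[] indices, float[] distances)
Description copied from interface: HardAssigner
Assign data to clusters.
Specified by: assignDistance in interface HardAssigner<int[],float[],IntFloatPair>
Parameters:
data - the data.
indices - the cluster index for each data point.
distances - the distance to the closest cluster for each data point.

public IntFloatPair assignDistance(int[] data)
Description copied from interface: HardAssigner
Assign a single point to a cluster.
Specified by: assignDistance in interface HardAssigner<int[],float[],IntFloatPair>
Parameters: data - point to assign.

public HardAssigner<int[],?,?> defaultHardAssigner()
Description copied from interface: SpatialClusters
Get the default hard assigner for this clusterer.
Specified by: defaultHardAssigner in interface SpatialClusters<int[]>

public int size()
The number of centroids; this potentially grows as assignments are made.
Specified by: size in interface HardAssigner<int[],float[],IntFloatPair>
See Also: HardAssigner.size()

public int[][] performClustering(int[][] data)
Specified by: performClustering in interface Clusterer<int[][]>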