@Reference(type=Article, author={"Frank Moosmann","Eric Nowak","Fr{\'e}d{\'e}ric Jurie"}, title="Randomized Clustering Forests for Image Classification", year="2008", journal="IEEE PAMI", url="http://dx.doi.org/10.1109/TPAMI.2007.70822")
public class IntRandomForest
extends Object
implements SpatialClusters<int[]>, SpatialClusterer<IntRandomForest,int[]>, HardAssigner<int[],float[],IntFloatPair>
In this implementation the training phase is used to identify the limits of
the data (for which a very small subset may be provided). Once these limits
are known, N decision trees are constructed, each with M decisions (see
RandomDecisionTree). In the clustering phase, each projected feature is
assigned a letter by each decision tree.
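The two phases described above can be illustrated with a small, self-contained sketch. It is not the OpenIMAJ implementation: the names RandomForestSketch, Tree, buildForest and word are hypothetical, and it assumes that each decision compares one randomly chosen dimension against a random threshold inside the data limits, and that a tree's letter is the bit pattern of its decision outcomes. The real RandomDecisionTree may work differently.

```java
import java.util.Random;

// Hypothetical sketch only; not the OpenIMAJ classes.
class RandomForestSketch {

    /** One tree: a fixed list of threshold tests, each on one dimension. */
    static final class Tree {
        final int[] dims;        // dimension tested by each decision
        final int[] thresholds;  // threshold used by each decision

        Tree(int nDecisions, int[] min, int[] max, Random rng) {
            dims = new int[nDecisions];
            thresholds = new int[nDecisions];
            for (int i = 0; i < nDecisions; i++) {
                int d = rng.nextInt(min.length);
                dims[i] = d;
                // Threshold drawn uniformly from [min[d], max[d]] (assumes max[d] >= min[d]).
                thresholds[i] = min[d] + rng.nextInt(max[d] - min[d] + 1);
            }
        }

        /** Packs the decision outcomes into one int, one bit per decision (at most 32 decisions). */
        int letter(int[] point) {
            int bits = 0;
            for (int i = 0; i < dims.length; i++) {
                if (point[dims[i]] > thresholds[i]) {
                    bits |= 1 << i;
                }
            }
            return bits;
        }
    }

    /** Builds nTrees trees from the data limits; the same seed yields the same forest. */
    static Tree[] buildForest(int nTrees, int nDecisions, int[] min, int[] max, long seed) {
        Random rng = new Random(seed);
        Tree[] trees = new Tree[nTrees];
        for (int t = 0; t < nTrees; t++) {
            trees[t] = new Tree(nDecisions, min, max, rng);
        }
        return trees;
    }

    /** The word for a point: one letter per tree, in tree order. */
    static int[] word(Tree[] forest, int[] point) {
        int[] letters = new int[forest.length];
        for (int t = 0; t < forest.length; t++) {
            letters[t] = forest[t].letter(point);
        }
        return letters;
    }
}
```

Because the sketch draws every threshold from one seeded java.util.Random, two forests built with the same seed and limits produce the same word for the same point.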
CLUSTER_HEADER
Constructor | Description |
---|---|
IntRandomForest() | Makes a default random forest with 32 trees, each with 32 decisions. |
IntRandomForest(int nTrees, int nDecisions) | Makes a random forest with nTrees trees, each with nDecisions decisions. |
Modifier and Type | Method | Description |
---|---|---|
String | asciiHeader() | Header for ascii input. |
int | assign(int[] data) | Uses the assignWord(int[]) function to construct the word representing this data point. |
int[] | assign(int[][] data) | Assign data to a cluster. |
IntFloatPair | assignDistance(int[] data) | Assign a single point to a cluster. |
void | assignDistance(int[][] data, int[] indices, float[] distances) | Assign data to clusters. |
org.openimaj.ml.clustering.rforest.IntRandomForest.Word[] | assignLetters(int[][] data) | Push each data point provided to a set of letters. |
org.openimaj.ml.clustering.rforest.IntRandomForest.Word | assignWord(int[] data) | Push a single data point to a set of letters and return the letters as a word. |
byte[] | binaryHeader() | Header for binary input. |
IntRandomForest | cluster(DataSource<int[]> data) | Perform clustering with data from a data source. |
IntRandomForest | cluster(int[][] data) | Perform clustering on the given data. |
HardAssigner<int[],?,?> | defaultHardAssigner() | Get the default hard assigner for this clusterer. |
boolean | equals(Object r) | |
int | getNDecisions() | |
int | getNTrees() | |
List<RandomDecisionTree> | getTrees() | |
int | numClusters() | Get the number of clusters. |
int | numDimensions() | Get the data dimensionality. |
int[][] | performClustering(int[][] data) | |
void | readASCII(Scanner br) | Read internal state from the given source. |
void | readBinary(DataInput dis) | Read internal state from the given source. |
void | setMinMax(int[] min, int[] max) | The maximum and minimum values for the various dimensions against which random decisions will be based. |
void | setRandomSeed(int random) | |
int | size() | The number of centroids or unique ids that can be generated. |
void | writeASCII(PrintWriter writer) | Write the content of this object as ascii to the given sink. |
void | writeBinary(DataOutput o) | Write the content of this object as binary to the given sink. |
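public IntRandomForest()
Makes a default random forest with 32 trees, each with 32 decisions.

public IntRandomForest(int nTrees, int nDecisions)
Makes a random forest with nTrees trees, each with nDecisions decisions.
Parameters:
nTrees - number of trees
nDecisions - number of decisions per tree

public void setMinMax(int[] min, int[] max)
The maximum and minimum values for the various dimensions against which random decisions will be based.
Parameters:
min - the minimum value for each dimension
max - the maximum value for each dimension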
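
As a hedged illustration of the limits that setMinMax(int[], int[]) accepts, the sketch below computes per-dimension minima and maxima from a small sample. The class DataLimitsSketch and its limits method are hypothetical helpers, not part of the library, and whether the training phase derives its limits in exactly this way is an assumption.

```java
import java.util.Arrays;

// Hypothetical helper; not part of OpenIMAJ.
class DataLimitsSketch {

    /**
     * Returns a two-row array: row 0 holds the per-dimension minima,
     * row 1 the per-dimension maxima of the (non-empty) sample.
     */
    static int[][] limits(int[][] sample) {
        int dims = sample[0].length;
        int[] min = new int[dims];
        int[] max = new int[dims];
        Arrays.fill(min, Integer.MAX_VALUE);
        Arrays.fill(max, Integer.MIN_VALUE);
        for (int[] point : sample) {
            for (int d = 0; d < dims; d++) {
                min[d] = Math.min(min[d], point[d]);
                max[d] = Math.max(max[d], point[d]);
            }
        }
        return new int[][] {min, max};
    }
}
```

The two rows could then be handed to setMinMax as its min and max arguments.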

public IntRandomForest cluster(int[][] data)
Description copied from interface: SpatialClusterer
Perform clustering on the given data.
Specified by:
cluster in interface SpatialClusterer<IntRandomForest,int[]>
Parameters:
data - the data.

public IntRandomForest cluster(DataSource<int[]> data)
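Description copied from interface: SpatialClusterer
Perform clustering with data from a data source. The DataSource could potentially be backed by disk rather than in memory.
Specified by:
cluster in interface SpatialClusterer<IntRandomForest,int[]>
Parameters:
data - the data.

public int numClusters()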
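Description copied from interface: SpatialClusters
Get the number of clusters.
Specified by:
numClusters in interface SpatialClusters<int[]>

public int numDimensions()
Description copied from interface: SpatialClusters
Get the data dimensionality.
Specified by:
numDimensions in interface Assigner<int[]>
numDimensions in interface SpatialClusters<int[]>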

public int[] assign(int[][] data)
Description copied from interface: HardAssigner
Assign data to a cluster.
Specified by:
assign in interface HardAssigner<int[],float[],IntFloatPair>
Parameters:
data - the data.

public org.openimaj.ml.clustering.rforest.IntRandomForest.Word[] assignLetters(int[][] data)
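Push each data point provided to a set of letters.
Parameters:
data - the data points

public org.openimaj.ml.clustering.rforest.IntRandomForest.Word assignWord(int[] data)
Push a single data point to a set of letters and return the letters as a word.
Parameters:
data - to be projected

public int assign(int[] data)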
Uses the assignWord(int[]) function to construct the word representing this data point. If this exact word has been seen before (i.e. these letters in this order), the same int is used. If not, a new int is assigned for this word.
Specified by:
assign in interface HardAssigner<int[],float[],IntFloatPair>
Parameters:
data - a data point to be clustered to a word

public int getNTrees()
public int getNDecisions()
public List<RandomDecisionTree> getTrees()
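
The word-to-int behaviour documented for assign(int[]) above (a previously seen word keeps its int, an unseen word receives a new one) can be sketched with a map keyed on the ordered letters. This is an illustration only: WordIndexSketch and idFor are hypothetical names, a word is modelled as a plain int[] of letters rather than the library's Word type, and numbering the ids from 0 in order of first appearance is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the OpenIMAJ implementation.
class WordIndexSketch {
    // Keyed on the ordered list of letters, so {1,2,3} and {3,2,1} are different words.
    private final Map<List<Integer>, Integer> ids = new HashMap<>();

    /** Returns the existing id for this word, or assigns the next unused id. */
    int idFor(int[] letters) {
        List<Integer> key = new ArrayList<>(letters.length);
        for (int letter : letters) {
            key.add(letter);
        }
        Integer id = ids.get(key);
        if (id == null) {
            id = ids.size(); // assumed numbering: 0, 1, 2, ... in order of first appearance
            ids.put(key, id);
        }
        return id;
    }

    /** Number of distinct words seen so far. */
    int size() {
        return ids.size();
    }
}
```

An int[] cannot be used directly as a HashMap key because arrays compare by identity, which is why the sketch copies the letters into a List first.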
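
public String asciiHeader()
Description copied from interface: ReadableASCII
Header for ascii input.
Specified by:
asciiHeader in interface ReadableASCII
asciiHeader in interface WriteableASCII

public byte[] binaryHeader()
Description copied from interface: ReadableBinary
Header for binary input.
Specified by:
binaryHeader in interface ReadableBinary
binaryHeader in interface WriteableBinary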
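
public void readASCII(Scanner br) throws IOException
Description copied from interface: ReadableASCII
Read internal state from the given source.
Specified by:
readASCII in interface ReadableASCII
Parameters:
br - source to read from.
Throws:
IOException - an error reading input

public void readBinary(DataInput dis) throws IOException
Description copied from interface: ReadableBinary
Read internal state from the given source.
Specified by:
readBinary in interface ReadableBinary
Parameters:
dis - source to read from.
Throws:
IOException - an error reading input

public void writeASCII(PrintWriter writer) throws IOException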
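Description copied from interface: WriteableASCII
Write the content of this object as ascii to the given sink.
Specified by:
writeASCII in interface WriteableASCII
Parameters:
writer - sink to write to
Throws:
IOException - an error writing to the sink

public void writeBinary(DataOutput o) throws IOException
Description copied from interface: WriteableBinary
Write the content of this object as binary to the given sink.
Specified by:
writeBinary in interface WriteableBinary
Parameters:
o - sink to write to
Throws:
IOException - an error writing to the sink

public void setRandomSeed(int random)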
Parameters:
random - the seed of the Java Random instance used by the decision trees

public void assignDistance(int[][] data, int[] indices, float[] distances)
Description copied from interface: HardAssigner
Assign data to clusters.
Specified by:
assignDistance in interface HardAssigner<int[],float[],IntFloatPair>
Parameters:
data - the data.
indices - the cluster index for each data point.
distances - the distance to the closest cluster for each data point.

public IntFloatPair assignDistance(int[] data)
Description copied from interface: HardAssigner
Assign a single point to a cluster.
Specified by:
assignDistance in interface HardAssigner<int[],float[],IntFloatPair>
Parameters:
data - point to assign.

public HardAssigner<int[],?,?> defaultHardAssigner()
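Description copied from interface: SpatialClusters
Get the default hard assigner for this clusterer.
Specified by:
defaultHardAssigner in interface SpatialClusters<int[]>

public int size()
Description copied from interface: HardAssigner
The number of centroids or unique ids that can be generated.
Specified by:
size in interface HardAssigner<int[],float[],IntFloatPair>

public int[][] performClustering(int[][] data)
Specified by:
performClustering in interface Clusterer<int[][]>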