ShortProductQuantiser (OpenIMAJ master project 1.3.10 API)

java.lang.Object
- org.openimaj.knn.pq.ShortProductQuantiser

```
@Reference(type=Article,
           author={"Jegou, Herve","Douze, Matthijs","Schmid, Cordelia"},
           title="Product Quantization for Nearest Neighbor Search",
           year="2011",
           journal="IEEE Trans. Pattern Anal. Mach. Intell.",
           pages={"117","","128"},
           url="http://dx.doi.org/10.1109/TPAMI.2010.57",
           month="January",
           number="1",
           publisher="IEEE Computer Society",
           volume="33",
           customData={"issn","0162-8828","numpages","12","doi","10.1109/TPAMI.2010.57","acmid","1916695","address","Washington, DC, USA","keywords","High-dimensional indexing, High-dimensional indexing, image indexing, very large databases, approximate search., approximate search., image indexing, very large databases"})
public class ShortProductQuantiser
extends Object
```
Implementation of a Product Quantiser for vectors/arrays of shorts. Product Quantisers quantise data into a very large number of clusters (large enough that the centroids could not possibly fit into memory, e.g. 2^64 centroids). The Product Quantiser can be used to create compressed representations of high-dimensional vectors, and also to perform efficient nearest-neighbour search over large collections of vectors that have been compressed using the product quantiser.
This is achieved by breaking down the input vectors into non-overlapping sub-vectors, and applying quantisation to these sub-vectors individually. The number of bins (cluster centroids) for each sub-vector is small (up to 256 in this implementation), but when combined over all sub-vectors, the number of bins is much larger, as it accounts for all combinations of bins across sub-vectors. As only a small set of centroids needs to be held for each sub-vector, the memory requirements are quite modest. The output of the quantisation action in this implementation is an array of bytes corresponding to the index of the matching centroid for each sub-vector (index numbers are offset by -128 so that 256 centroid indexes can fit in a single byte). The bit-pattern of this byte array could be interpreted as the numeric value of a global cluster index, but in practice this is not useful.
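The arithmetic behind the combined bin count can be sketched in a few lines. The class and method names below (`CodebookSize`, `combinedBins`, `storedValues`) are hypothetical and not part of OpenIMAJ, and the configuration of 8 sub-vectors of 16 dimensions each is only an illustrative assumption.

```java
import java.math.BigInteger;

// Illustrative only: not part of the OpenIMAJ API.
final class CodebookSize {
    /** Number of distinct global clusters: bins per sub-vector raised to the number of sub-vectors. */
    static BigInteger combinedBins(int subVectors, int binsPerSubVector) {
        return BigInteger.valueOf(binsPerSubVector).pow(subVectors);
    }

    /** Number of short values actually held in memory: one small centroid table per sub-vector. */
    static long storedValues(int subVectors, int binsPerSubVector, int dimsPerSubVector) {
        return (long) subVectors * binsPerSubVector * dimsPerSubVector;
    }

    public static void main(String[] args) {
        // Assumed configuration: 128-dimensional vectors split into 8 sub-vectors of 16 dimensions.
        System.out.println(combinedBins(8, 256));     // 256^8, which equals 2^64
        System.out.println(storedValues(8, 256, 16)); // 8 * 256 * 16 = 32768
    }
}
```

Under these assumptions, 2^64 implicit clusters are represented by only 32768 stored values, and each quantised vector occupies 8 bytes.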
Typically the product quantiser is "trained" so that it adapts to the data that it is being applied to. The standard approach is to use K-Means, but this is not required. As far as this implementation is concerned, any set of compatible NearestNeighbours implementations can be provided to the constructor. Each of the NearestNeighbours could even have a different number of dimensions (corresponding to the sub-vector lengths).
In the standard case, where you just want to use K-Means to train the Product Quantiser, a set of utility methods is provided by the org.openimaj.knn.pq.ShortProductQuantiserUtilities class, which lives in the clustering sub-project (due to the dependence on the K-Means algorithm).

Author:

Jonathon Hare (jsh2@ecs.soton.ac.uk)

Field Summary

Fields
Modifier and Type	Field and Description
`protected ShortNearestNeighboursExact[]`	`assigners`
`protected int`	`ndims`

Constructor Summary

Constructors
Constructor and Description
`ShortProductQuantiser(ShortNearestNeighboursExact[] assigners)` Construct a `ShortProductQuantiser` with the given nearest-neighbour assigners.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`short[]`	`decompress(byte[] qdata)` Decompress the quantised data by replacing each encoded index with the actual centroid subvector.
`byte[]`	`quantise(short[] data)` Quantise the given data using this Product Quantiser.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - assigners
```
protected ShortNearestNeighboursExact[] assigners
```
  - ndims
```
protected int ndims
```
- Constructor Detail
  - ShortProductQuantiser
```
public ShortProductQuantiser(ShortNearestNeighboursExact[] assigners)
```
    Construct a ShortProductQuantiser with the given nearest-neighbour assigners. The number of dimensions of each assigner determines how long the corresponding sub-vector is. The assigners map one-to-one, in order, onto the sub-vectors.
    
    Parameters:
    
    assigners - the nearest-neighbour assigners.
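    The mapping from assigners to sub-vectors can be illustrated with a small standalone sketch. It assumes the sub-vectors are laid out contiguously in assigner order, so that the expected input length is the sum of the assigner dimensionalities. `SubVectorLayout` and its methods are hypothetical names, not OpenIMAJ classes.

```java
import java.util.Arrays;

// Illustrative only: not part of the OpenIMAJ API.
final class SubVectorLayout {
    /** Start offset of each sub-vector, given the dimensionality of each assigner in order. */
    static int[] offsets(int[] assignerDims) {
        final int[] starts = new int[assignerDims.length];
        int from = 0;
        for (int i = 0; i < assignerDims.length; i++) {
            starts[i] = from;
            from += assignerDims[i];
        }
        return starts;
    }

    /** Total input length: the sum of all assigner dimensionalities. */
    static int totalDims(int[] assignerDims) {
        int sum = 0;
        for (final int d : assignerDims) {
            sum += d;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed, deliberately unequal sub-vector lengths.
        final int[] dims = {4, 2, 6};
        System.out.println(Arrays.toString(offsets(dims))); // [0, 4, 6]
        System.out.println(totalDims(dims));                // 12
    }
}
```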
- Method Detail
  - quantise
```
public byte[] quantise(short[] data)
```
    Quantise the given data using this Product Quantiser. The output is an array of bytes corresponding to the index of the matching centroid for each sub-vector (index numbers are offset by -128 so that 256 centroid indexes can fit in a single byte).
    
    Parameters:
    
    data - the data to quantise
    
    Returns:
    
    the quantised data.
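    A minimal standalone sketch of this encoding step follows. It is not the library's source: it replaces the nearest-neighbour assigners with plain centroid arrays (`codebooks`, a hypothetical parameter) and assumes squared Euclidean distance with a brute-force search.

```java
import java.util.Arrays;

// Illustrative only: not part of the OpenIMAJ API.
final class QuantiseSketch {
    /**
     * codebooks[i] holds the centroids (at most 256) for sub-vector i;
     * every centroid in codebooks[i] has the same length.
     */
    static byte[] quantise(short[] data, short[][][] codebooks) {
        final byte[] codes = new byte[codebooks.length];
        int from = 0;
        for (int i = 0; i < codebooks.length; i++) {
            final int len = codebooks[i][0].length;
            int best = 0;
            long bestDist = Long.MAX_VALUE;
            // Brute-force search for the closest centroid to this sub-vector.
            for (int c = 0; c < codebooks[i].length; c++) {
                long dist = 0;
                for (int d = 0; d < len; d++) {
                    final long diff = data[from + d] - codebooks[i][c][d];
                    dist += diff * diff;
                }
                if (dist < bestDist) {
                    bestDist = dist;
                    best = c;
                }
            }
            codes[i] = (byte) (best - 128); // index 0..255 is stored as -128..127
            from += len;
        }
        return codes;
    }

    public static void main(String[] args) {
        final short[][][] codebooks = {
            { {0, 0}, {10, 10} },
            { {5, 5}, {-5, -5}, {100, 100} }
        };
        final short[] data = {9, 11, 98, 101};
        System.out.println(Arrays.toString(quantise(data, codebooks))); // [-127, -126]
    }
}
```

    Here the first sub-vector matches centroid index 1 and the second matches index 2, which become -127 and -126 after the -128 offset.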
  - decompress
```
public short[] decompress(byte[] qdata)
```
    Decompress the quantised data by replacing each encoded index with the actual centroid subvector.
    
    Parameters:
    
    qdata - the quantised data
    
    Returns:
    
    the (approximate) decompressed feature
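    The decoding step can be sketched in the same standalone style. As before, `DecompressSketch` and the `codebooks` parameter are hypothetical stand-ins for the assigners' centroids, not OpenIMAJ code.

```java
import java.util.Arrays;

// Illustrative only: not part of the OpenIMAJ API.
final class DecompressSketch {
    /** Rebuild an approximate vector by concatenating the centroid selected for each sub-vector. */
    static short[] decompress(byte[] codes, short[][][] codebooks) {
        int ndims = 0;
        for (final short[][] codebook : codebooks) {
            ndims += codebook[0].length;
        }
        final short[] out = new short[ndims];
        int from = 0;
        for (int i = 0; i < codebooks.length; i++) {
            final int index = codes[i] + 128; // undo the -128 offset to recover 0..255
            final short[] centroid = codebooks[i][index];
            System.arraycopy(centroid, 0, out, from, centroid.length);
            from += centroid.length;
        }
        return out;
    }

    public static void main(String[] args) {
        final short[][][] codebooks = {
            { {0, 0}, {10, 10} },
            { {5, 5}, {-5, -5}, {100, 100} }
        };
        final byte[] codes = {-127, -126};
        System.out.println(Arrays.toString(decompress(codes, codebooks))); // [10, 10, 100, 100]
    }
}
```

    The result is only an approximation of the original input: each sub-vector is replaced by its nearest centroid, so any detail finer than the codebook is lost.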
