public class RetrievalEvaluator extends Object
A retrieval evaluator object computes a variety of standard information retrieval metrics commonly used in TREC, including binary preference (BPREF), geometric mean average precision (GMAP), mean average precision (MAP), and standard precision and recall. In addition, the object gives access to the relevant documents that were found, and the relevant documents that were missed.
BPREF is defined in Buckley and Voorhees, "Retrieval Evaluation with Incomplete Information", SIGIR 2004.
Modifier and Type | Class and Description |
---|---|
static class | RetrievalEvaluator.Document - This class represents a document returned by a retrieval system. |
static class | RetrievalEvaluator.Judgment - This class represents a relevance judgment of a particular document for a specific query. |
Constructor and Description |
---|
RetrievalEvaluator(String queryName, List<RetrievalEvaluator.Document> retrieved, Collection<RetrievalEvaluator.Judgment> judgments) - Creates a new instance of RetrievalEvaluator. |
Modifier and Type | Method and Description |
---|---|
double | averagePrecision() - Returns the average precision of the query. |
double | binaryPreference() - The binary preference measure, as presented in Buckley and Voorhees, "Retrieval Evaluation with Incomplete Information", SIGIR 2004. |
static int[] | getFixedPoints() |
double[] | interpolatedPrecision() |
ArrayList<RetrievalEvaluator.Document> | irrelevantRetrievedDocuments() - Returns a list of all documents that were retrieved but assumed to be irrelevant. |
ArrayList<RetrievalEvaluator.Document> | judgedIrrelevantRetrievedDocuments() |
protected double | normalizationTermNDCG(int documentsRetrieved) |
double | normalizedDiscountedCumulativeGain() - Normalized Discounted Cumulative Gain |
double | normalizedDiscountedCumulativeGain(int documentsRetrieved) - Normalized Discounted Cumulative Gain |
double | precision(int documentsRetrieved) - Returns the precision of the retrieval at a given number of documents retrieved. |
double[] | precisionAtFixedPoints() |
String | queryName() |
double | recall(int documentsRetrieved) - Returns the recall of the retrieval at a given number of documents retrieved. |
double | reciprocalRank() - Returns the reciprocal of the rank of the first relevant document retrieved, or zero if no relevant documents were retrieved. |
ArrayList<RetrievalEvaluator.Document> | relevantDocuments() - Returns a list of all documents judged relevant, whether they were retrieved or not. |
ArrayList<RetrievalEvaluator.Document> | relevantMissedDocuments() - Returns a list of documents that were judged relevant but were not retrieved. |
int | relevantRetrieved(int documentsRetrieved) - The number of relevant documents retrieved at a particular rank. |
ArrayList<RetrievalEvaluator.Document> | relevantRetrievedDocuments() - Returns a list of retrieved documents that were judged relevant, in the order that they were retrieved. |
ArrayList<RetrievalEvaluator.Document> | retrievedDocuments() |
double | rPrecision() - Returns the precision at rank R, where R is the total number of relevant documents for the query. |
public RetrievalEvaluator(String queryName, List<RetrievalEvaluator.Document> retrieved, Collection<RetrievalEvaluator.Judgment> judgments)
Parameters:
queryName -
retrieved - A ranked list of retrieved documents.
judgments - A collection of relevance judgments.
public static int[] getFixedPoints()
public double[] precisionAtFixedPoints()
See getFixedPoints().
public double[] interpolatedPrecision()
public double precision(int documentsRetrieved)
Parameters:
documentsRetrieved - The evaluation rank.
public double recall(int documentsRetrieved)
Parameters:
documentsRetrieved - The evaluation rank.
public double rPrecision()
public double reciprocalRank()
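As an illustration of the rank-based measures declared above (precision, recall, and reciprocalRank), here is a minimal sketch. It is not the RetrievalEvaluator source: the class name RankMetricsSketch and the use of plain string IDs in place of RetrievalEvaluator.Document are assumptions made for the example, as is the convention of dividing precision by the requested rank even when fewer documents were retrieved.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the RetrievalEvaluator source: documents are plain IDs.
final class RankMetricsSketch {
    // Counts the relevant documents among the first k ranked results.
    static int relevantRetrieved(List<String> ranked, Set<String> relevant, int k) {
        int limit = Math.min(k, ranked.size());
        int count = 0;
        for (int i = 0; i < limit; i++) {
            if (relevant.contains(ranked.get(i))) {
                count++;
            }
        }
        return count;
    }

    // Precision at rank k. Assumption: the denominator is k even when
    // fewer than k documents were retrieved.
    static double precision(List<String> ranked, Set<String> relevant, int k) {
        return k <= 0 ? 0.0 : (double) relevantRetrieved(ranked, relevant, k) / k;
    }

    // Recall at rank k: relevant documents found so far over all relevant documents.
    static double recall(List<String> ranked, Set<String> relevant, int k) {
        return relevant.isEmpty()
                ? 0.0
                : (double) relevantRetrieved(ranked, relevant, k) / relevant.size();
    }

    // Reciprocal of the (1-based) rank of the first relevant document, or zero.
    static double reciprocalRank(List<String> ranked, Set<String> relevant) {
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                return 1.0 / (i + 1);
            }
        }
        return 0.0;
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("d3", "d1", "d7", "d2");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d2", "d9"));
        System.out.println("P@2 = " + precision(ranked, relevant, 2));
        System.out.println("R@4 = " + recall(ranked, relevant, 4));
        System.out.println("RR  = " + reciprocalRank(ranked, relevant));
    }
}
```

In this example the first relevant document appears at rank 2, so the reciprocal rank is 1/2, and two of the three relevant documents are found within the first four ranks.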
public double averagePrecision()
Suppose the precision is evaluated once at the rank of each relevant document in the retrieval. If a relevant document is not retrieved, we assume that it was retrieved at rank infinity, so it contributes a precision of zero. The mean of all these precision values, taken over every relevant document, is the average precision.
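The definition above can be sketched as follows. This is an illustrative reading of the description, not the class's actual code: the name AveragePrecisionSketch and the string-ID representation of documents are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of average precision; not the RetrievalEvaluator source.
final class AveragePrecisionSketch {
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0; // assumption: no relevant documents yields a score of zero
        }
        double sumOfPrecisions = 0.0;
        int relevantSoFar = 0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                relevantSoFar++;
                // Precision evaluated at the rank (i + 1) of this relevant document.
                sumOfPrecisions += (double) relevantSoFar / (i + 1);
            }
        }
        // Relevant documents that were never retrieved add zero to the sum
        // but still count in the denominator ("rank infinity").
        return sumOfPrecisions / relevant.size();
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("d3", "d1", "d7", "d2");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d2", "d9"));
        System.out.println("AP = " + averagePrecision(ranked, relevant));
    }
}
```

Here d1 is found at rank 2 (precision 1/2), d2 at rank 4 (precision 2/4), and d9 is missed (precision 0), so the average over the three relevant documents is 1/3.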
public double binaryPreference()
The binary preference measure, as presented in Buckley and Voorhees, "Retrieval Evaluation with Incomplete Information", SIGIR 2004. This implementation is the 'pure' version, which is the one used in Buckley's trec_eval (v 8 with bpref bugfix).
The formula is: bpref = 1/R \sum_{r} (1 - |n ranked higher than r| / \min(R, N)), where R is the number of relevant documents for this topic, N is the number of irrelevant documents judged for this topic, r is a retrieved relevant document, and n is a member of the set of the first R judged irrelevant documents retrieved.
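The formula above can be sketched as below. This is one reading of the stated formula rather than the code in this class or in trec_eval: the name BprefSketch, the string-ID inputs, and the choice to give each retrieved relevant document a full contribution of 1 when no irrelevant documents were judged are all assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of bpref; not the RetrievalEvaluator source.
final class BprefSketch {
    static double bpref(List<String> ranked, Set<String> relevant, Set<String> judgedIrrelevant) {
        int totalRelevant = relevant.size();            // R
        int totalIrrelevant = judgedIrrelevant.size();  // N
        if (totalRelevant == 0) {
            return 0.0; // assumption: undefined case reported as zero
        }
        int denominator = Math.min(totalRelevant, totalIrrelevant);
        int irrelevantSeen = 0;
        double sum = 0.0;
        for (String doc : ranked) {
            if (relevant.contains(doc)) {
                // Only the first R judged irrelevant documents are counted.
                int rankedHigher = Math.min(irrelevantSeen, totalRelevant);
                // Assumption: with no judged irrelevant documents the term is 1.
                sum += denominator == 0 ? 1.0 : 1.0 - (double) rankedHigher / denominator;
            } else if (judgedIrrelevant.contains(doc)) {
                irrelevantSeen++;
            }
            // Unjudged documents are ignored entirely.
        }
        return sum / totalRelevant;
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("d1", "n1", "d2", "u1", "n2", "d3");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d2", "d3"));
        Set<String> judgedIrrelevant = new HashSet<>(Arrays.asList("n1", "n2"));
        System.out.println("bpref = " + bpref(ranked, relevant, judgedIrrelevant));
    }
}
```

With R = 3 and N = 2 in this example, the three relevant documents contribute 1, 1/2, and 0 respectively, giving a bpref of 0.5. The unjudged document u1 does not affect the score, which is the property that makes the measure suitable for incomplete judgments.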
public double normalizedDiscountedCumulativeGain()
Normalized Discounted Cumulative Gain
This measure was introduced in Järvelin and Kekäläinen, "IR Evaluation Methods for Retrieving Highly Relevant Documents", SIGIR 2000. The formula is taken from Vassilvitskii, "Using Web-Graph Distance for Relevance Feedback in Web Search", SIGIR 2006:
Score = N \sum_i (2^{r(i)} - 1) / \log(1 + i)
where r(i) is the relevance grade of the document at rank i, and N is chosen so that the score cannot be greater than 1. N is obtained by computing the (unnormalized) DCG of a perfect ranking.
public double normalizedDiscountedCumulativeGain(int documentsRetrieved)
Normalized Discounted Cumulative Gain
This measure was introduced in Järvelin and Kekäläinen, "IR Evaluation Methods for Retrieving Highly Relevant Documents", SIGIR 2000. The formula is taken from Vassilvitskii, "Using Web-Graph Distance for Relevance Feedback in Web Search", SIGIR 2006:
Score = N \sum_i (2^{r(i)} - 1) / \log(1 + i)
where r(i) is the relevance grade of the document at rank i, and N is chosen so that the score cannot be greater than 1. N is obtained by computing the (unnormalized) DCG of a perfect ranking.
Parameters:
documentsRetrieved -
protected double normalizationTermNDCG(int documentsRetrieved)
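The NDCG score and its normalization term, as described above, can be sketched as follows. This is a sketch under stated assumptions and not the class's implementation: the name NdcgSketch is hypothetical, relevance grades are passed as plain int arrays, the logarithm is natural (its base cancels in the ratio), and an all-zero ideal ranking is reported as a score of zero.

```java
import java.util.Arrays;

// Hypothetical sketch of NDCG; not the RetrievalEvaluator source.
final class NdcgSketch {
    // Unnormalized DCG: sum over ranks i = 1..n of (2^{r(i)} - 1) / log(1 + i).
    static double dcg(int[] gains) {
        double sum = 0.0;
        for (int i = 0; i < gains.length; i++) {
            int rank = i + 1;
            sum += (Math.pow(2.0, gains[i]) - 1.0) / Math.log(1.0 + rank);
        }
        return sum;
    }

    // DCG of a perfect ranking of the judged grades, cut off at rank k.
    // Its reciprocal plays the role of the normalization term N.
    static double idealDcg(int[] judgedGains, int k) {
        int[] sorted = judgedGains.clone();
        Arrays.sort(sorted); // ascending
        int n = Math.max(0, Math.min(k, sorted.length));
        int[] best = new int[n];
        for (int i = 0; i < n; i++) {
            best[i] = sorted[sorted.length - 1 - i]; // read from the high end
        }
        return dcg(best);
    }

    // rankedGains: grade of each retrieved document in rank order (0 if unjudged).
    // judgedGains: grades of all judged documents for the query.
    static double ndcg(int[] rankedGains, int[] judgedGains, int k) {
        int n = Math.max(0, Math.min(k, rankedGains.length));
        double ideal = idealDcg(judgedGains, k);
        // Assumption: with no positive grades the score is defined as zero.
        return ideal == 0.0 ? 0.0 : dcg(Arrays.copyOf(rankedGains, n)) / ideal;
    }

    public static void main(String[] args) {
        int[] judged = {2, 1, 0};
        System.out.println("perfect = " + ndcg(new int[] {2, 1, 0}, judged, 3));
        System.out.println("swapped = " + ndcg(new int[] {1, 0, 2}, judged, 3));
    }
}
```

A ranking that already lists documents in decreasing order of grade scores exactly 1, and any other ordering of the same documents scores lower.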
public int relevantRetrieved(int documentsRetrieved)
Parameters:
documentsRetrieved - the rank
public ArrayList<RetrievalEvaluator.Document> retrievedDocuments()
public ArrayList<RetrievalEvaluator.Document> judgedIrrelevantRetrievedDocuments()
public ArrayList<RetrievalEvaluator.Document> irrelevantRetrievedDocuments()
public ArrayList<RetrievalEvaluator.Document> relevantRetrievedDocuments()
public ArrayList<RetrievalEvaluator.Document> relevantDocuments()
public ArrayList<RetrievalEvaluator.Document> relevantMissedDocuments()
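The relationship between the document lists above can be illustrated with a small sketch. It reflects one plausible reading of the method summaries and is not the class's code: the name DocumentSetsSketch, the map from document ID to relevance grade, the rule that a grade above zero means relevant, and the treatment of unjudged retrieved documents as "assumed irrelevant" are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the document lists; not the RetrievalEvaluator source.
final class DocumentSetsSketch {
    // Assumption: a grade greater than zero means "judged relevant".
    static boolean isRelevant(Map<String, Integer> judgments, String doc) {
        Integer grade = judgments.get(doc);
        return grade != null && grade > 0;
    }

    // Retrieved documents judged relevant, in retrieval order.
    static List<String> relevantRetrieved(List<String> ranked, Map<String, Integer> judgments) {
        List<String> result = new ArrayList<>();
        for (String doc : ranked) {
            if (isRelevant(judgments, doc)) result.add(doc);
        }
        return result;
    }

    // Retrieved documents with an explicit non-relevant judgment.
    static List<String> judgedIrrelevantRetrieved(List<String> ranked, Map<String, Integer> judgments) {
        List<String> result = new ArrayList<>();
        for (String doc : ranked) {
            if (judgments.containsKey(doc) && !isRelevant(judgments, doc)) result.add(doc);
        }
        return result;
    }

    // Retrieved documents assumed irrelevant: judged non-relevant or unjudged.
    static List<String> irrelevantRetrieved(List<String> ranked, Map<String, Integer> judgments) {
        List<String> result = new ArrayList<>();
        for (String doc : ranked) {
            if (!isRelevant(judgments, doc)) result.add(doc);
        }
        return result;
    }

    // Documents judged relevant that do not appear in the ranking.
    static List<String> relevantMissed(List<String> ranked, Map<String, Integer> judgments) {
        List<String> result = new ArrayList<>();
        for (String doc : judgments.keySet()) {
            if (isRelevant(judgments, doc) && !ranked.contains(doc)) result.add(doc);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("d1", "u1", "n1", "d2");
        Map<String, Integer> judgments = new LinkedHashMap<>();
        judgments.put("d1", 1);
        judgments.put("d2", 2);
        judgments.put("d3", 1);
        judgments.put("n1", 0);
        System.out.println(relevantRetrieved(ranked, judgments));
        System.out.println(judgedIrrelevantRetrieved(ranked, judgments));
        System.out.println(irrelevantRetrieved(ranked, judgments));
        System.out.println(relevantMissed(ranked, judgments));
    }
}
```

Under these assumptions every retrieved document falls into exactly one of the relevant-retrieved and irrelevant-retrieved lists, and the judged-irrelevant list is a subset of the latter.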