Text similarity re-ranker retriever
Applies to: Elastic Stack, Serverless
The text_similarity_reranker retriever uses an NLP model to improve search results by reordering the top-k documents based on their semantic similarity to the query.
Refer to Semantic re-ranking for a high-level overview of semantic re-ranking.
To use text_similarity_reranker, you can rely on the preconfigured .rerank-v1-elasticsearch inference endpoint, which uses the Elastic Rerank model and serves as the default if no inference_id is provided. This model is optimized for reranking based on text similarity. If you'd like to use a different model, you can set up a custom inference endpoint for the rerank task using the Create inference API. The endpoint should be configured with a machine learning model capable of computing text similarity. Refer to the Elastic NLP model reference for a list of third-party text similarity models supported by Elasticsearch.
You have the following options:

- Use the built-in Elastic Rerank cross-encoder model via the inference API's Elasticsearch service. See this example for creating an endpoint using the Elastic Rerank model.
- Use the Cohere Rerank inference endpoint with the rerank task type.
- Use the Google Vertex AI inference endpoint with the rerank task type.
- Upload a model to Elasticsearch with Eland using the text_similarity NLP task type. Then set up an Elasticsearch service inference endpoint with the rerank task type. Refer to the example on this page for a step-by-step guide.
Scores from the re-ranking process are normalized using the following formula before being returned to the user, to avoid negative scores:

score = max(score, 0) + min(exp(score), 1)

Using the above, any originally negative score is projected into (0, 1) and any non-negative score into [1, infinity). For example, an original score of -0.7 becomes exp(-0.7) ≈ 0.50, while an original score of 2.0 becomes 3.0. To revert back if needed, one can use:

score = score - 1, if score >= 1
score = ln(score), if score < 1
retriever
- (Required, retriever) The child retriever that generates the initial set of top documents to be re-ranked.

field
- (Required, string) The document field to be used for text similarity comparisons. This field should contain the text that will be evaluated against the inference_text.

inference_id
- (Optional, string) Unique identifier of the inference endpoint created using the inference API. If you don't specify an inference endpoint, the inference_id field defaults to .rerank-v1-elasticsearch, a preconfigured endpoint for the elasticsearch .rerank-v1 model.

inference_text
- (Required, string) The text snippet used as the basis for similarity comparison.

rank_window_size
- (Optional, int) The number of top documents to consider in the re-ranking process. Defaults to 10.

min_score
- (Optional, float) Sets a minimum threshold score for including documents in the re-ranked results. Documents with similarity scores below this threshold will be excluded. Note that score calculations vary depending on the model used.

filter
- (Optional, query object or list of query objects) Applies the specified boolean query filter to the child retriever. If the child retriever already specifies any filters, then this top-level filter is applied in conjunction with the filter defined in the child retriever.
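For example, the sketch below restricts re-ranking to English documents by filtering on a hypothetical language field (the index and field names are illustrative), while leaving inference_id unset so the default .rerank-v1-elasticsearch endpoint is used:

GET /index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "text": "landmark in Paris"
            }
          }
        }
      },
      "filter": {
        "term": {
          "language": "en"
        }
      },
      "field": "text",
      "inference_text": "Most famous landmark in Paris",
      "rank_window_size": 100
    }
  }
}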
Refer to this Python notebook for an end-to-end example using Elastic Rerank.
This example demonstrates how to deploy the Elastic Rerank model and use it to re-rank search results using the text_similarity_reranker retriever.
Follow these steps:
Create an inference endpoint for the rerank task using the Create inference API:

PUT _inference/rerank/my-elastic-rerank
{
  "service": "elasticsearch",
  "service_settings": {
    "model_id": ".rerank-v1",
    "num_threads": 1,
    "adaptive_allocations": {
      "enabled": true,
      "min_number_of_allocations": 1,
      "max_number_of_allocations": 10
    }
  }
}

Adaptive allocations are enabled with a minimum of 1 and a maximum of 10 allocations.
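You can optionally confirm that the endpoint was created and inspect its configuration with the Get inference API, using the same endpoint name as in the request above:

GET _inference/rerank/my-elastic-rerank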
Define a text_similarity_reranker retriever:

POST _search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "text": "How often does the moon hide the sun?"
            }
          }
        }
      },
      "field": "text",
      "inference_id": "my-elastic-rerank",
      "inference_text": "How often does the moon hide the sun?",
      "rank_window_size": 100,
      "min_score": 0.5
    }
  }
}
This example enables out-of-the-box semantic search by re-ranking top documents using the Cohere Rerank API. This approach eliminates the need to generate and store embeddings for all indexed documents. This requires a Cohere Rerank inference endpoint that is set up for the rerank task type.
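If you don't have such an endpoint yet, one way to create it is with the Create inference API's cohere service. The endpoint name and model_id below are illustrative placeholders; substitute your own Cohere API key and preferred rerank model:

PUT _inference/rerank/my-cohere-rerank-model
{
  "service": "cohere",
  "service_settings": {
    "api_key": "<cohere-api-key>",
    "model_id": "rerank-english-v3.0"
  }
}

With the endpoint in place, the following request re-ranks the top results for the query: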
GET /index/_search
{
"retriever": {
"text_similarity_reranker": {
"retriever": {
"standard": {
"query": {
"match_phrase": {
"text": "landmark in Paris"
}
}
}
},
"field": "text",
"inference_id": "my-cohere-rerank-model",
"inference_text": "Most famous landmark in Paris",
"rank_window_size": 100,
"min_score": 0.5
}
}
}
The following example uses the cross-encoder/ms-marco-MiniLM-L-6-v2 model from Hugging Face to rerank search results based on semantic similarity. The model must be uploaded to Elasticsearch using Eland.
Refer to the Elastic NLP model reference for a list of third-party text similarity models supported by Elasticsearch.
Follow these steps to load the model and create a semantic re-ranker.
Install Eland using pip:

python -m pip install eland[pytorch]
Upload the model to Elasticsearch using Eland. This example assumes you have an Elastic Cloud deployment and an API key. Refer to the Eland documentation for more authentication options.
eland_import_hub_model \
  --cloud-id $CLOUD_ID \
  --es-api-key $ES_API_KEY \
  --hub-model-id cross-encoder/ms-marco-MiniLM-L-6-v2 \
  --task-type text_similarity \
  --clear-previous \
  --start
Create an inference endpoint for the rerank task:

PUT _inference/rerank/my-msmarco-minilm-model
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": "cross-encoder__ms-marco-minilm-l-6-v2"
  }
}
Define a text_similarity_reranker retriever:

POST movies/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "genre": "drama"
            }
          }
        }
      },
      "field": "plot",
      "inference_id": "my-msmarco-minilm-model",
      "inference_text": "films that explore psychological depths"
    }
  }
}
This retriever uses a standard match query to search the movies index for films tagged with the genre "drama". It then re-ranks the results based on semantic similarity to the text in the inference_text parameter, using the model we uploaded to Elasticsearch.