Elasticsearch Relevance Scoring

When multiple documents match a search query, Elasticsearch ranks them by relevance score. The document with the highest score appears first. Understanding how scores work lets you tune search results to match your users' expectations.

The Relevance Score

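Every document returned by a query has a _score field, a floating-point number. Higher means more relevant. Clauses that run in query context contribute a positive score. Clauses that run in filter context only include or exclude documents and contribute nothing, so a bool query containing only filter clauses returns a _score of 0.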

Search: "organic coffee"

Results:
  Doc A _score: 3.42  "Premium Organic Coffee Beans"
  Doc B _score: 2.18  "Organic Green Coffee Extract"
  Doc C _score: 1.05  "Coffee Mug for Organic Lovers"
  Doc D _score: 0.91  "Organic Herbal Tea"

→ Sorted highest score first

BM25 — The Scoring Algorithm

Elasticsearch uses BM25 (Best Match 25) as its default relevance formula. It improves on the older TF-IDF algorithm. Alongside field length normalization, BM25 considers two main factors:

Factor 1: Term Frequency (TF)
  How many times does the search word appear in the document?
  More occurrences = higher score
  BUT: BM25 applies diminishing returns — the 10th occurrence
       adds less score than the 1st

Factor 2: Inverse Document Frequency (IDF)
  How rare is the search word across all documents?
  Rare words score higher than common words

  "organic" appears in 5,000 of 10,000 products → low IDF (common)
  "arabica" appears in 50 of 10,000 products → high IDF (rare, specific)

  A match on "arabica" is more meaningful than a match on "organic"
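The two factors can be sketched numerically. This is a minimal Python sketch, assuming the BM25 form that recent Lucene versions report in explain output: idf = ln(1 + (N - n + 0.5) / (n + 0.5)) and tf = freq / (freq + k1 * norm), with the default k1 = 1.2. The function names are illustrative, and field length is held at the average (norm = 1) so that only frequency varies.

```python
import math

K1 = 1.2  # default BM25 term-frequency saturation parameter


def idf(doc_count: int, docs_with_term: int) -> float:
    """Inverse document frequency: rarer terms get a larger weight."""
    return math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))


def tf_saturated(freq: int, k1: float = K1) -> float:
    """Term-frequency component for a field of exactly average length."""
    return freq / (freq + k1)


# Document counts taken from the "organic" / "arabica" example.
print(f"idf(organic) = {idf(10_000, 5_000):.2f}")
print(f"idf(arabica) = {idf(10_000, 50):.2f}")

# Each extra occurrence adds less than the one before it.
for freq in (1, 2, 9, 10):
    print(f"freq={freq:2d} -> tf component {tf_saturated(freq):.3f}")
```

With these inputs the weight for "arabica" comes out several times larger than the weight for "organic", and the step from 9 to 10 occurrences is far smaller than the step from 0 to 1.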

BM25 Field Length Normalization

Doc A: "Coffee" (1 word, your search word = 100% of content)
Doc B: "Coffee beans sourced from Ethiopia by hand-picking ..." (50 words)

Both contain "coffee" once.
BM25: Doc A scores higher because a single match counts for more
      in a field shorter than the index average than in a longer one.
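The effect of field length can be sketched the same way. The snippet below assumes the usual BM25 length term, norm = 1 - b + b * (field length / average field length), with the defaults k1 = 1.2 and b = 0.75. The average field length of 10 words is an assumption made for this illustration, since the example does not state one.

```python
K1 = 1.2   # default saturation parameter
B = 0.75   # default length-normalization strength (0 disables it)


def tf_component(freq: int, field_len: int, avg_field_len: float,
                 k1: float = K1, b: float = B) -> float:
    """BM25 term-frequency component with field length normalization."""
    length_norm = 1 - b + b * field_len / avg_field_len
    return freq / (freq + k1 * length_norm)


AVG_LEN = 10.0  # assumed average field length, not given in the example

doc_a = tf_component(freq=1, field_len=1, avg_field_len=AVG_LEN)
doc_b = tf_component(freq=1, field_len=50, avg_field_len=AVG_LEN)
print(f"Doc A (1 word):   {doc_a:.3f}")
print(f"Doc B (50 words): {doc_b:.3f}")
```

Setting b to 0 in this sketch makes both documents score the same, which is one way to see that b controls how strongly length matters.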

Viewing the Explain Output

Ask Elasticsearch to explain exactly how a document's score was calculated:

GET /products/_explain/101
{
  "query": {
    "match": { "name": "organic coffee" }
  }
}

The response breaks down the score into sub-scores for each search term and shows TF, IDF, and field length values used in the calculation.
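The explanation in the response is a nested tree in which each node carries a value, a description, and a list of details. A small helper can flatten that tree into indented lines for reading. This is a sketch only: the helper name is made up, and the sample input is hand-written with placeholder numbers and shortened descriptions rather than real cluster output.

```python
from typing import Any


def flatten_explanation(node: dict[str, Any], depth: int = 0) -> list[str]:
    """Turn a nested explanation node into indented 'value  description' lines."""
    lines = [f"{'  ' * depth}{node['value']:.2f}  {node['description']}"]
    for child in node.get("details", []):
        lines.extend(flatten_explanation(child, depth + 1))
    return lines


# Hand-written sample in the shape of the "explanation" object.
# Values and descriptions are placeholders, not real output.
sample = {
    "value": 3.0,
    "description": "sum of:",
    "details": [
        {"value": 1.0, "description": "weight(name:organic)", "details": []},
        {"value": 2.0, "description": "weight(name:coffee)", "details": []},
    ],
}

print("\n".join(flatten_explanation(sample)))
```

In practice the input would be the "explanation" object parsed from the _explain response body.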

Boosting to Control Relevance

Boost specific fields or specific query clauses to push matching documents higher in results.

Field Boost in multi_match

"fields": ["title^5", "description^2", "tags^1"]

Title match scores are multiplied by 5
Description match scores are multiplied by 2
Tag match scores are multiplied by 1 (the default, same as no boost)

→ A title match usually dominates the ranking
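For context, the fields fragment sits inside a multi_match query. The sketch below builds a complete request body as a Python dict and prints it as JSON. The search text is a hypothetical value; the field names and boosts are the ones from the fragment.

```python
import json

# Hypothetical search text; field names and boosts come from the fragment.
search_text = "organic coffee"

body = {
    "query": {
        "multi_match": {
            "query": search_text,
            "fields": ["title^5", "description^2", "tags^1"],
        }
    }
}

# The printed JSON can be sent as the body of GET /products/_search.
print(json.dumps(body, indent=2))
```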

Boost inside a Query

GET /products/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "name": { "query": "coffee", "boost": 3 } } },
        { "match": { "tags": { "query": "coffee", "boost": 1 } } }
      ]
    }
  }
}

function_score — Custom Scoring Logic

function_score lets you inject custom signals into the relevance calculation — for example, ranking newer products higher or boosting popular items:

GET /products/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "name": "coffee" } },
      "functions": [
        {
          "field_value_factor": {
            "field":    "popularity_score",
            "modifier": "log1p",
            "factor":   1.5
          }
        }
      ],
      "boost_mode": "sum"
    }
  }
}

Base score from text match:                            2.10
Popularity boost, log10(1 + 1.5 × popularity_score):   0.87
Final combined score (sum of the two):                 2.97
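The arithmetic can be checked with a few lines of Python. The log1p modifier takes the common (base-10) logarithm of 1 plus the field value, after the value has been multiplied by factor, and boost_mode "sum" adds the result to the query score. The popularity value of 4.28 is an assumption chosen so that the numbers line up with the example; the function names are illustrative.

```python
import math


def field_value_factor_log1p(field_value: float, factor: float = 1.0) -> float:
    """log1p modifier: common logarithm of (1 + factor * field_value)."""
    return math.log10(1 + factor * field_value)


def combine_sum(query_score: float, function_result: float) -> float:
    """boost_mode "sum": add the function result to the query score."""
    return query_score + function_result


popularity = 4.28   # assumed popularity_score, chosen for illustration
base_score = 2.10   # text-match score from the example

boost = field_value_factor_log1p(popularity, factor=1.5)
final = combine_sum(base_score, boost)
print(f"boost={boost:.2f} final={final:.2f}")
```

Because the logarithm flattens large values, a product with ten times the popularity gains well under one extra point here, which keeps popularity from swamping the text match.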

Decay Functions — Distance-Based Scoring

Decay functions reduce a document's score as a numeric or date value moves away from a target. A job listing posted 3 days ago should score higher than one posted 90 days ago:

"functions": [
  {
    "gauss": {
      "posted_date": {
        "origin": "now",
        "scale":  "7d",
        "decay":  0.5
      }
    }
  }
]

Decay diagram:

Score multiplier
1.0 |*
    | *
0.5 |   *
    |      *
0.0 |         *  *  *
    |---7d---14d---21d--- (days old)

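Jobs posted 7 days ago (one scale away from the origin) get a 0.5× score multiplier
Jobs posted 14 days ago get a ~0.06× score multiplier (0.5^4, because gauss falls off with the square of the distance)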
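The shape of the curve follows from the Gaussian decay formula: the multiplier is exp(-d^2 / (2 * sigma^2)), where d is the distance from the origin (less any offset) and sigma^2 is chosen so that the multiplier equals decay when d equals scale. A minimal Python sketch, with distances given in days and an illustrative function name:

```python
import math


def gauss_decay(distance: float, scale: float, decay: float = 0.5,
                offset: float = 0.0) -> float:
    """Gaussian decay multiplier for a value `distance` away from the origin."""
    # sigma^2 is chosen so the multiplier equals `decay` at distance == scale.
    sigma_sq = -(scale ** 2) / (2 * math.log(decay))
    effective = max(0.0, abs(distance) - offset)
    return math.exp(-(effective ** 2) / (2 * sigma_sq))


for days_old in (0, 3, 7, 14, 90):
    multiplier = gauss_decay(days_old, scale=7)
    print(f"{days_old:2d} days old -> multiplier {multiplier:.4f}")
```

At twice the scale the multiplier is decay raised to the fourth power, so a Gaussian curve drops off much faster past the scale than a simple halving per week would.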

rescore — Two-Stage Ranking

For expensive scoring models, run a cheap first pass to get the top 100 candidates, then apply a more precise scoring model on only those 100:

GET /products/_search
{
  "query": { "match": { "name": "coffee" } },
  "rescore": {
    "window_size": 100,
    "query": {
      "rescore_query": {
        "function_score": { ... }
      }
    }
  }
}

The simple match runs against every matching document in the index. The expensive function runs only on the top 100 hits from each shard (window_size applies per shard), which is far faster than applying it to millions of documents.
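The two-stage idea can be illustrated outside Elasticsearch with a small in-memory simulation. This is a sketch under simplifying assumptions: a single shard, documents as plain dicts, the two scores simply added together, and hypothetical scoring callbacks. It is not how Elasticsearch implements rescoring internally.

```python
from typing import Callable

Doc = dict  # each document is a plain dict in this sketch


def two_stage_rank(docs: list[Doc],
                   cheap_score: Callable[[Doc], float],
                   expensive_score: Callable[[Doc], float],
                   window_size: int) -> list[tuple[float, Doc]]:
    """Rank by cheap_score, then re-score only the top window_size documents."""
    first_pass = sorted(((cheap_score(d), d) for d in docs),
                        key=lambda pair: pair[0], reverse=True)
    window, rest = first_pass[:window_size], first_pass[window_size:]
    # Add the expensive score to the first-pass score inside the window only.
    rescored = sorted(((s + expensive_score(d), d) for s, d in window),
                      key=lambda pair: pair[0], reverse=True)
    return rescored + rest


expensive_calls = []  # records which documents the expensive scorer saw


def popularity_boost(doc: Doc) -> float:
    expensive_calls.append(doc["id"])
    return doc["popularity"]


# Synthetic documents with made-up scores, for illustration only.
docs = [{"id": i, "text_score": float(i % 7), "popularity": float(i % 3)}
        for i in range(1000)]
ranked = two_stage_rank(docs, lambda d: d["text_score"], popularity_boost,
                        window_size=100)
print(f"expensive scorer ran {len(expensive_calls)} times for {len(docs)} docs")
```

The expensive scorer is invoked once per document in the window, regardless of how many documents matched the first pass.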
