Elasticsearch Cluster Architecture

A single Elasticsearch instance handles small workloads well. Production systems need multiple nodes working together as a cluster to handle high traffic, large data volumes, and hardware failures.

What a Cluster Is

A cluster is a group of Elasticsearch nodes that share data and coordinate to serve requests. From outside, a cluster looks like one system — you send a query to any node and get the full answer.

         Your Application
               |
               | sends query
               v
    +----------+----------+
    |      CLUSTER        |
    |  +-------+          |
    |  | Node1 |  Master  |  <-- coordinates the cluster
    |  +-------+          |
    |  +-------+          |
    |  | Node2 |  Data    |  <-- stores and searches data
    |  +-------+          |
    |  +-------+          |
    |  | Node3 |  Data    |  <-- stores and searches data
    |  +-------+          |
    +---------------------+
               |
               | returns results
               v
         Your Application

Node Roles

Each node plays one or more roles. In small clusters, every node typically holds all roles. In large production clusters, dedicated nodes handle specific jobs.

Role            | Job                                                                 | Config Setting
----------------+---------------------------------------------------------------------+------------------------
Master-eligible | Can be elected to manage cluster state (index creation, node joins) | node.roles: [master]
Data            | Stores shards, runs searches and aggregations                       | node.roles: [data]
Ingest          | Processes documents before indexing (transform, enrich)             | node.roles: [ingest]
Coordinating    | Receives client requests, distributes to data nodes, merges results | node.roles: [] (empty)
ML              | Runs machine learning jobs                                          | node.roles: [ml]
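
To make the role table concrete, here is a minimal Python sketch that maps a node.roles value to the jobs that node performs. The ROLE_JOBS mapping, the describe_node helper, and the three-node layout are illustrative names and not part of Elasticsearch. Real clusters also support roles that this article does not cover.

```python
# Map each role name from the table above to the job it performs.
ROLE_JOBS = {
    "master": "eligible to be elected master and manage cluster state",
    "data": "stores shards, runs searches and aggregations",
    "ingest": "processes documents before indexing",
    "ml": "runs machine learning jobs",
}


def describe_node(roles):
    """Return the list of jobs a node performs, given its node.roles value."""
    if not roles:
        # An empty node.roles list leaves only the coordinating function.
        return ["coordinating only: routes requests and merges results"]
    return [ROLE_JOBS.get(role, f"unknown role: {role}") for role in roles]


# Hypothetical layout matching the three-node diagram earlier in the article.
cluster_layout = {
    "Node1": ["master"],
    "Node2": ["data"],
    "Node3": ["data"],
}

for name, roles in cluster_layout.items():
    print(name, "->", describe_node(roles))
```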

Master Node — The Cluster Manager

The master node keeps track of what exists in the cluster — which indexes exist, how many shards each has, which nodes are alive. This is called the cluster state.

Master node knows:
  - Index: products  → 3 shards on Node2, 3 replicas on Node3
  - Index: logs      → 5 shards spread across Node2, Node3
  - Node2 last seen: 0.5 seconds ago (healthy)
  - Node3 last seen: 0.3 seconds ago (healthy)

If the master dies, the remaining master-eligible nodes hold an election and pick a new master automatically, as long as a majority of them can still reach each other. No human action is required.

Master Election with Quorum

To prevent "split-brain" (two nodes each thinking they are master), Elasticsearch requires a majority vote to elect a master:

3 master-eligible nodes → quorum = 2
  Node1 votes for Node2
  Node2 votes for itself
  Node2 gets 2/3 votes → elected master ✔

5 master-eligible nodes → quorum = 3
  Must get 3 votes to win

Split-brain scenario (prevented):
  Network splits cluster into two halves: 2 nodes | 1 node
  2 nodes have quorum → elect a master
  1 node cannot reach quorum → cannot elect a master, rejects writes until it rejoins
  Only ONE master ever exists ✔
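
The majority rule above comes down to simple arithmetic. The sketch below is a simplified model: the quorum and can_elect_master functions are hypothetical helpers, and Elasticsearch's real election tracks a voting configuration rather than a bare node count.

```python
def quorum(master_eligible: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return master_eligible // 2 + 1


def can_elect_master(partition_size: int, master_eligible: int) -> bool:
    """A network partition can elect a master only if it holds a quorum."""
    return partition_size >= quorum(master_eligible)


print(quorum(3))  # 2
print(quorum(5))  # 3

# Split-brain scenario from above: 3 master-eligible nodes split 2 | 1.
print([can_elect_master(side, 3) for side in (2, 1)])  # [True, False]
```

Because two disjoint partitions cannot both hold a strict majority, at most one side can elect a master.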

Coordinating Node — The Request Router

Search request from client:
      |
      v
  Coordinating Node
      |
      | sends query to one copy of each shard
      +-----+-----+
      v     v     v
   Shard1 Shard2 Shard3  (on different data nodes)
      |     |     |
      | local results
      +-----+-----+
      v
  Coordinating Node
  (merges + sorts all results)
      |
      v
  Final response to client
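
The scatter-gather flow in the diagram can be sketched in a few lines of Python. This is a toy model under stated assumptions: each shard is an in-memory list, the scores are made up, and search_shard and coordinate are hypothetical names. Real Elasticsearch computes relevance scores itself and fetches documents in a separate phase.

```python
import heapq


def search_shard(shard_docs, term, size):
    """Local phase: a shard returns its own top `size` hits, best first."""
    hits = [(doc["score"], doc["id"]) for doc in shard_docs if term in doc["text"]]
    return heapq.nlargest(size, hits)


def coordinate(shards, term, size):
    """Coordinating phase: fan out, then merge per-shard hits into a global top `size`."""
    per_shard = [search_shard(docs, term, size) for docs in shards]
    merged = heapq.nlargest(size, (hit for hits in per_shard for hit in hits))
    return [doc_id for _score, doc_id in merged]


# Three hypothetical shards with precomputed (made-up) scores.
shards = [
    [{"id": "a", "text": "dark roast coffee", "score": 2.1},
     {"id": "b", "text": "green tea", "score": 1.0}],
    [{"id": "c", "text": "coffee filter", "score": 3.4}],
    [{"id": "d", "text": "iced coffee", "score": 0.7},
     {"id": "e", "text": "coffee beans", "score": 2.9}],
]

print(coordinate(shards, "coffee", 2))  # ['c', 'e']
```

Each shard only needs to return its own top `size` hits, because no document outside a shard's local top `size` can appear in the global top `size`.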

Cluster Health

Every cluster has one of three health states:

Status | Meaning
-------+-----------------------------------------------------------------
Green  | All primary shards and replica shards are assigned and active
Yellow | All primary shards are active, but some replica shards are unassigned
Red    | Some primary shards are unassigned, so some data is unavailable

GET /_cluster/health

{
  "status": "green",
  "number_of_nodes": 3,
  "number_of_data_nodes": 2,
  "active_primary_shards": 8,
  "active_shards": 16,
  "unassigned_shards": 0
}
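
The three colors follow from shard assignment alone. The sketch below is a simplified model of that rule: the cluster_health function and the shard dictionaries are illustrative, and the real health calculation also accounts for states such as initializing or relocating shards.

```python
def cluster_health(shards):
    """Derive a health color from shards given as {'primary': bool, 'assigned': bool}."""
    if any(s["primary"] and not s["assigned"] for s in shards):
        return "red"      # at least one primary is unassigned
    if any(not s["primary"] and not s["assigned"] for s in shards):
        return "yellow"   # all primaries assigned, some replica is not
    return "green"


# 8 primaries plus 8 replicas, matching the sample response above.
shards = [{"primary": True, "assigned": True} for _ in range(8)]
shards += [{"primary": False, "assigned": True} for _ in range(8)]
print(cluster_health(shards))  # green

shards[-1]["assigned"] = False  # one replica loses its node
print(cluster_health(shards))  # yellow

shards[0]["assigned"] = False   # one primary becomes unassigned
print(cluster_health(shards))  # red
```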

Node Discovery

New nodes find the cluster through the discovery.seed_hosts setting, a list of addresses of master-eligible nodes to contact at startup:

In elasticsearch.yml:
  cluster.name: my-production-cluster
  discovery.seed_hosts:
    - 10.0.0.1:9300
    - 10.0.0.2:9300
    - 10.0.0.3:9300

When a new node starts, it connects to one of these IPs, joins the cluster, and receives the full cluster state from the master.
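
The join sequence described above can be modeled as a loop over the seed list. This is a toy illustration and not how Elasticsearch implements discovery: join_cluster and fake_probe are hypothetical, the network is faked with a dictionary, and the real node probes seed hosts concurrently over the transport protocol.

```python
def join_cluster(seed_hosts, probe):
    """Try each seed address in turn; return (address, cluster_state) from the first that answers."""
    for address in seed_hosts:
        state = probe(address)
        if state is not None:
            return address, state
    raise ConnectionError("no seed host reachable; node cannot join the cluster")


def fake_probe(address):
    """Stand-in for a network call: only 10.0.0.2 is reachable in this example."""
    reachable = {
        "10.0.0.2:9300": {"cluster_name": "my-production-cluster", "master": "Node1"},
    }
    return reachable.get(address)


seeds = ["10.0.0.1:9300", "10.0.0.2:9300", "10.0.0.3:9300"]
address, state = join_cluster(seeds, fake_probe)
print(address, state["cluster_name"])  # 10.0.0.2:9300 my-production-cluster
```

Listing several seed hosts means the new node can still join when some of them are down.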

Cross-Cluster Search

Large organizations run multiple clusters — one per region, for example. Cross-cluster search lets you query all of them from one request:

GET /india-cluster:products,us-cluster:products/_search
{
  "query": { "match": { "name": "coffee" } }
}

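Elasticsearch fans the query out to both clusters, collects results, and merges them before responding. The names india-cluster and us-cluster are remote cluster aliases, which must be registered on the cluster that receives the request.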
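
The cluster:index syntax in the request path can be unpacked with a short Python sketch. The split_index_expression helper is hypothetical and only shows how targets group by cluster. It does not handle wildcards or other index-expression features.

```python
def split_index_expression(expression):
    """Group 'cluster:index' targets by cluster alias; unqualified names go under None (the local cluster)."""
    targets = {}
    for item in expression.split(","):
        cluster, sep, index = item.partition(":")
        if not sep:
            # No colon: the index lives on the cluster that received the request.
            cluster, index = None, item
        targets.setdefault(cluster, []).append(index)
    return targets


print(split_index_expression("india-cluster:products,us-cluster:products"))
# {'india-cluster': ['products'], 'us-cluster': ['products']}
```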
