Elasticsearch Cluster Architecture
A single Elasticsearch instance handles small workloads well. Production systems need multiple nodes working together as a cluster to handle high traffic, large data volumes, and hardware failures.
What a Cluster Is
A cluster is a group of Elasticsearch nodes that share data and coordinate to serve requests. From outside, a cluster looks like one system — you send a query to any node and get the full answer.
   Your Application
           |
           | sends query
           v
+----------+----------+
|       CLUSTER       |
|  +-------+          |
|  | Node1 | Master   |  <-- coordinates the cluster
|  +-------+          |
|  +-------+          |
|  | Node2 | Data     |  <-- stores and searches data
|  +-------+          |
|  +-------+          |
|  | Node3 | Data     |  <-- stores and searches data
|  +-------+          |
+---------------------+
           |
           | returns results
           v
   Your Application
Node Roles
Each node plays one or more roles. In a small cluster, every node typically handles all roles. In large production clusters, dedicated nodes handle specific jobs.
| Role | Job | Config Setting |
|---|---|---|
| Master-eligible | Can be elected to manage cluster state (index creation, node joins) | node.roles: [master] |
| Data | Stores shards, runs searches and aggregations | node.roles: [data] |
| Ingest | Processes documents before indexing (transform, enrich) | node.roles: [ingest] |
| Coordinating | Receives client requests, distributes to data nodes, merges results | node.roles: [] (empty) |
| ML | Runs machine learning jobs | node.roles: [ml] |
Master Node — The Cluster Manager
The master node keeps track of what exists in the cluster — which indexes exist, how many shards each has, which nodes are alive. This is called the cluster state.
Master node knows:
- Index: products → 3 shards on Node2, 3 replicas on Node3
- Index: logs → 5 shards spread across Node2, Node3
- Node2 last seen: 0.5 seconds ago (healthy)
- Node3 last seen: 0.3 seconds ago (healthy)
If the master dies, the remaining master-eligible nodes hold an election and pick a new master automatically — no human action required.
Master Election with Quorum
To prevent "split-brain" (two nodes each thinking they are master), Elasticsearch requires a majority vote to elect a master:
3 master-eligible nodes → quorum = 2
  Node1 votes for Node2
  Node2 votes for itself
  Node2 gets 2/3 votes → elected master ✔
5 master-eligible nodes → quorum = 3
  Must get 3 votes to win
Split-brain scenario (prevented):
  Network splits cluster into two parts: 2 nodes | 1 node
  2 nodes have quorum → elect a master
  1 node cannot reach quorum → elects no master and rejects writes
  Only ONE master ever exists ✔
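The majority rule above comes down to simple arithmetic. The sketch below only illustrates that arithmetic; the function names `quorum` and `can_elect_master` are hypothetical and are not part of Elasticsearch, which manages its voting configuration internally.

```python
def quorum(master_eligible: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return master_eligible // 2 + 1


def can_elect_master(partition_size: int, master_eligible: int) -> bool:
    """A network partition can elect a master only if it holds a majority."""
    return partition_size >= quorum(master_eligible)


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} master-eligible nodes -> quorum = {quorum(n)}")
    # A 3-node cluster split into partitions of 2 nodes and 1 node:
    print("2-node side can elect:", can_elect_master(2, 3))
    print("1-node side can elect:", can_elect_master(1, 3))
```

Because two disjoint partitions can never both hold a strict majority, at most one side of any split can elect a master.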
Coordinating Node — The Request Router
Search request from client:
         |
         v
 Coordinating Node
         |
         | broadcasts to all shards
  +------+------+
  v      v      v
Shard1 Shard2 Shard3   (on different data nodes)
  |      |      |
  |      |      |   local results
  +------+------+
         |
         v
 Coordinating Node
 (merges + sorts all results)
         |
         v
 Final response to client
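The scatter-gather flow in the diagram can be sketched in a few lines. This is a simplified illustration, not how Elasticsearch implements search: each shard is assumed to be a plain dictionary of document id to a precomputed relevance score, and `search_shard` and `coordinate` are hypothetical names.

```python
import heapq
import itertools
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (score, document id)


def search_shard(shard_docs: Dict[str, float], size: int) -> List[Hit]:
    """Return the top `size` hits of one shard, best score first."""
    hits = ((score, doc_id) for doc_id, score in shard_docs.items())
    return heapq.nlargest(size, hits)


def coordinate(shards: List[Dict[str, float]], size: int) -> List[Hit]:
    """Query every shard, then merge the local results into one global top list."""
    local_results = [search_shard(docs, size) for docs in shards]
    # Each local list is already sorted best-first, so a k-way merge is enough.
    merged = heapq.merge(*local_results, reverse=True)
    return list(itertools.islice(merged, size))


if __name__ == "__main__":
    shards = [
        {"doc1": 0.9, "doc4": 0.2},  # Shard1
        {"doc2": 0.7, "doc5": 0.4},  # Shard2
        {"doc3": 0.8},               # Shard3
    ]
    print(coordinate(shards, size=3))
```

Each shard returns only its own top hits, so the coordinating node merges a few short sorted lists instead of every matching document.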
Cluster Health
Every cluster has one of three health states:
| Status | Meaning |
|---|---|
| Green | All primary shards and replica shards are assigned and active |
| Yellow | All primary shards are active, but one or more replica shards are unassigned |
| Red | One or more primary shards are unassigned, so some data is unavailable |
Check the current status with the cluster health API:
GET /_cluster/health
Example response (abbreviated):
{
  "status": "green",
  "number_of_nodes": 3,
  "number_of_data_nodes": 2,
  "active_primary_shards": 8,
  "active_shards": 16,
  "unassigned_shards": 0
}
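The three health states follow directly from which shards are unassigned. Here is a minimal sketch of that rule, assuming a hypothetical `Shard` record (it is not a type from any Elasticsearch client library):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shard:
    index: str
    primary: bool   # True for a primary shard, False for a replica
    assigned: bool  # True if the shard is allocated to a node and active


def cluster_status(shards: List[Shard]) -> str:
    """Derive green / yellow / red from the shard list."""
    if any(s.primary and not s.assigned for s in shards):
        return "red"     # at least one primary is missing
    if any(not s.primary and not s.assigned for s in shards):
        return "yellow"  # all primaries active, some replica missing
    return "green"       # every primary and replica is active


if __name__ == "__main__":
    shards = [
        Shard("products", primary=True, assigned=True),
        Shard("products", primary=False, assigned=False),
    ]
    print(cluster_status(shards))
```

The red check comes first because a missing primary is more serious than a missing replica.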
Node Discovery
New nodes find the cluster through the discovery.seed_hosts setting — a list of existing nodes to contact when starting up:
In elasticsearch.yml:
cluster.name: my-production-cluster
discovery.seed_hosts:
  - 10.0.0.1:9300
  - 10.0.0.2:9300
  - 10.0.0.3:9300
When a new node starts, it contacts these addresses to find the cluster, joins it, and receives the full cluster state from the master.
Cross-Cluster Search
Large organizations often run multiple clusters, for example one per region. Cross-cluster search lets you query several of them in one request by prefixing each index with the name of its cluster:
GET /india-cluster:products,us-cluster:products/_search
{
"query": { "match": { "name": "coffee" } }
}
Elasticsearch fans the query out to both clusters, collects results, and merges them before responding.
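To see how the `cluster:index` syntax drives the fan-out, the sketch below groups the targets of an index expression by cluster. It is only an illustration: `group_by_cluster` and the `LOCAL` label are hypothetical, and it ignores wildcards and other syntax that real index expressions support.

```python
from collections import defaultdict
from typing import Dict, List

LOCAL = "(local)"  # hypothetical label for targets without a cluster prefix


def group_by_cluster(index_expression: str) -> Dict[str, List[str]]:
    """Split 'clusterA:idx1,clusterB:idx2,idx3' into per-cluster index lists."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for target in index_expression.split(","):
        cluster, separator, index = target.partition(":")
        if separator:
            groups[cluster].append(index)  # remote target: cluster:index
        else:
            groups[LOCAL].append(target)   # no prefix: index on the local cluster
    return dict(groups)


if __name__ == "__main__":
    expression = "india-cluster:products,us-cluster:products"
    for cluster, indices in group_by_cluster(expression).items():
        print(f"send query to {cluster} for indices {indices}")
```

Each group corresponds to one sub-request. The results of those sub-requests are then merged into a single response.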
