MongoDB Replication
Replication is the process of keeping copies of the same data on multiple servers. When data exists on multiple machines, the system can continue working even if one server goes down. This makes the database more reliable and prevents data loss during hardware failures or network issues.
In MongoDB, replication is implemented through a feature called a Replica Set.
What is a Replica Set?
A Replica Set is a group of MongoDB servers (called nodes or members) that all hold the same data. Among these members, one is designated as the Primary and the rest are Secondaries.
- Primary — The only member that receives and processes all write operations. Reads can also go to the primary by default.
- Secondary — Copies data from the primary and stays synchronized. Secondaries can serve read operations if configured to do so.
A typical Replica Set has 3 members — one primary and two secondaries. Three members ensure fault tolerance: if one goes down, the remaining two still form a majority and the system continues operating.
How Replication Works — The Oplog
Every write operation on the primary is recorded in a special collection called the oplog (operation log). The oplog is a capped collection — it has a fixed size and older entries are overwritten as new ones come in.
Secondaries continuously read from the primary's oplog and apply the same operations to their own data. This process keeps all members in sync. The time difference between a write on the primary and its application on a secondary is called replication lag.
Automatic Failover
If the primary becomes unavailable (server crash, network failure, or planned maintenance), the remaining members automatically elect a new primary. This election happens within seconds and requires a majority vote among the active members.
Example Scenario
- Replica Set has 3 members: Primary, Secondary1, Secondary2
- Primary crashes
- Secondary1 and Secondary2 hold an election
- One of them becomes the new primary
- Reads and writes continue with minimal interruption
- When the original primary recovers, it rejoins as a secondary
This entire process happens automatically without any human intervention.
Setting Up a Replica Set (Local Example)
In a production environment, replica set members run on separate servers. For local testing and learning, multiple MongoDB instances can run on the same machine using different ports.
Step 1: Start Three MongoDB Instances
# Terminal 1
mongod --port 27017 --dbpath /data/rs0-0 --replSet myReplicaSet
# Terminal 2
mongod --port 27018 --dbpath /data/rs0-1 --replSet myReplicaSet
# Terminal 3
mongod --port 27019 --dbpath /data/rs0-2 --replSet myReplicaSet
Create the data directories first:
mkdir -p /data/rs0-0 /data/rs0-1 /data/rs0-2
Step 2: Connect to One Instance and Initialize the Replica Set
mongosh --port 27017
rs.initiate({
_id: "myReplicaSet",
members: [
{ _id: 0, host: "localhost:27017" },
{ _id: 1, host: "localhost:27018" },
{ _id: 2, host: "localhost:27019" }
]
})
Step 3: Check the Replica Set Status
rs.status()
This shows all members, their current state (PRIMARY or SECONDARY), and replication health information.
Useful Replica Set Commands
| Command | Purpose |
|---|---|
rs.status() | Shows the current state of all replica set members |
rs.isMaster() | Shows which member is the current primary |
rs.conf() | Shows the replica set configuration |
rs.add("host:port") | Adds a new member to the replica set |
rs.remove("host:port") | Removes a member from the replica set |
rs.stepDown() | Forces the current primary to step down and trigger an election |
Read Preferences
By default, all reads go to the primary. Read preference settings allow applications to read from secondary members, distributing the read load across the replica set.
| Read Preference | Behavior |
|---|---|
primary | All reads go to the primary (default) |
primaryPreferred | Reads go to the primary; falls back to secondary if primary is unavailable |
secondary | All reads go to a secondary |
secondaryPreferred | Reads go to a secondary; falls back to primary if no secondary is available |
nearest | Reads go to the member with the lowest network latency |
Setting read preference in a connection string:
mongodb://localhost:27017,localhost:27018,localhost:27019/myDB?replicaSet=myReplicaSet&readPreference=secondaryPreferred
Note on Reading from Secondaries
Reading from secondaries may return slightly stale data if replication lag exists. For applications where data must always be current, reads should go to the primary. For read-heavy analytics workloads where slight staleness is acceptable, secondaries reduce load on the primary.
Write Concern
Write concern controls how many replica set members must acknowledge a write before MongoDB reports it as successful. Higher write concern means more durability but slightly slower writes.
Write Concern Options
// Write is confirmed only by the primary
db.orders.insertOne({ product: "Laptop" }, { writeConcern: { w: 1 } })
// Write is confirmed by the primary and at least one secondary
db.orders.insertOne({ product: "Laptop" }, { writeConcern: { w: 2 } })
// Write is confirmed by the majority of replica set members
db.orders.insertOne({ product: "Laptop" }, { writeConcern: { w: "majority" } })
Using w: "majority" is the recommended setting for important data. It ensures the write is replicated to more than half the members before confirming success — preventing data loss even if the primary fails immediately after the write.
Arbiter Members
An Arbiter is a special replica set member that participates in elections but holds no data. Arbiters are useful when a third full server is not available or too expensive, but a third vote is needed to break election ties.
rs.addArb("localhost:27020")
Arbiters do not store data and cannot become primaries. They only vote in elections.
Replication vs Backup
Replication is not a substitute for backups. Replication copies all data — including accidental deletions. If a collection is accidentally dropped on the primary, it gets dropped on all secondaries too (via the oplog). Backups take a point-in-time snapshot that can be restored to undo such mistakes.
Benefits of Replication
| Benefit | Description |
|---|---|
| High Availability | Automatic failover keeps the database running if a node goes down |
| Data Redundancy | Multiple copies of data prevent single points of failure |
| Read Scalability | Secondary reads distribute query load across members |
| Zero-Downtime Maintenance | Nodes can be upgraded one at a time without service interruption |
Summary
MongoDB replication through Replica Sets maintains synchronized copies of data on multiple servers. One member serves as the primary for all writes, while secondaries replicate changes via the oplog. Automatic elections promote a new primary when the current one fails, ensuring high availability. Read preferences distribute read traffic across members. Write concern settings control how many members must acknowledge a write before it is confirmed. Arbiters participate in elections without storing data. Replication ensures data durability and system resilience in production deployments.
