Microservices Data Management

One of the most important rules in microservices is that each service owns its own data. Each service has its own database, and other services cannot access it directly; they must go through the owning service's API or the events it publishes. This pattern is called Database per Service. It looks simple on paper, but it creates real design challenges, especially when multiple services need related data.

Why Each Service Needs Its Own Database

When services share a database, any change to that database can affect every service connected to it. Adding a column, renaming a table, or changing a data type ripples through all services at once. One team's database change can break another team's service.

SHARED DATABASE PROBLEM
========================
[Order Service]  ---SQL query--->  +---------------+
[User Service]   ---SQL query--->  | SHARED        |
[Payment Service]---SQL query--->  | DATABASE      |
[Report Service] ---SQL query--->  +---------------+

DBA renames column "user_id" to "customer_id"
  --> Order Service breaks
  --> Payment Service breaks
  --> Every service that reads this column must be updated and redeployed together


SEPARATE DATABASES SOLUTION
============================
[Order Service]   --> [Orders DB]
[User Service]    --> [Users DB]
[Payment Service] --> [Payments DB]
[Report Service]  --> [Reports DB]

Each team changes its own DB freely.
Other services never notice, as long as the service's API stays the same.
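
The isolation in the diagram holds only while a service's public interface stays stable. The following is a minimal sketch of that idea, not a real implementation: the class name UserService, the method get_user, and the internal field names are all hypothetical. It shows a private column rename from "user_id" to "customer_id" that callers cannot see.

```python
class UserService:
    """Owns its storage; other services only ever call get_user()."""

    def __init__(self):
        # Private schema (hypothetical): the column was renamed from
        # "user_id" to "customer_id" inside this service's own database.
        self._rows = [{"customer_id": 99, "full_name": "John"}]

    def get_user(self, user_id: int) -> dict:
        # The public contract still uses "user_id" and "name",
        # so the internal rename is invisible to callers.
        for row in self._rows:
            if row["customer_id"] == user_id:
                return {"user_id": row["customer_id"], "name": row["full_name"]}
        raise KeyError(user_id)


users = UserService()
print(users.get_user(99))
```

The mapping inside get_user is where the team absorbs its own schema changes. If the response shape itself changed, other services would notice after all.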

Choosing the Right Database per Service

A key advantage of separate databases is that each service picks the database that fits its workload best. Not every service needs the same type of storage.

+------------------------+------------------+------------------------+
| Service                | DB Type          | Why                    |
+------------------------+------------------+------------------------+
| User Service           | PostgreSQL       | Structured data,       |
|                        | (Relational SQL) | relationships          |
+------------------------+------------------+------------------------+
| Product Catalog        | MongoDB          | Flexible product       |
|                        | (Document)       | attributes per item    |
+------------------------+------------------+------------------------+
| Session / Auth         | Redis            | Fast read/write for    |
|                        | (Key-Value)      | tokens                 |
+------------------------+------------------+------------------------+
| Search                 | Elasticsearch    | Full-text search       |
|                        | (Search Engine)  | across millions of docs|
+------------------------+------------------+------------------------+
| Transaction History    | PostgreSQL       | ACID compliance for    |
|                        | (Relational SQL) | financial data         |
+------------------------+------------------+------------------------+
| Event Log / Analytics  | InfluxDB         | High-volume writes of  |
|                        | (Time-Series)    | timestamped events     |
+------------------------+------------------+------------------------+

The Challenge: Joining Data Across Services

In a single database, you join tables. You run one SQL query and get a combined result. With separate databases, joins are impossible at the database level.

Suppose you want to show an order alongside the customer's name. The Order Service has the order. The User Service has the customer name. They live in different databases.

Solution 1: API Calls

Report Service needs "Order + Customer Name"

Step 1: Report Service calls Order Service --> gets order_id, user_id, amount
Step 2: Report Service calls User Service  --> gets name for user_id
Step 3: Report Service combines both       --> builds the report

This works for small data sets. For millions of records, it becomes slow, because every extra lookup is another network round trip.
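
The three steps above can be sketched as follows. This is an illustration under assumptions: fetch_order and fetch_user are hypothetical stand-ins for remote calls (in a real system they would be HTTP or gRPC requests), and the sample data is made up. A counter tracks how many calls the report needs.

```python
# In-memory stand-ins for the two services' data (sample values are made up).
ORDERS = {
    1: {"order_id": 1, "user_id": 99, "amount": 250.0},
    2: {"order_id": 2, "user_id": 99, "amount": 75.5},
}
USERS = {99: {"user_id": 99, "name": "John"}}

call_count = 0  # counts simulated network round trips


def fetch_order(order_id: int) -> dict:
    """Hypothetical call to the Order Service."""
    global call_count
    call_count += 1
    return ORDERS[order_id]


def fetch_user(user_id: int) -> dict:
    """Hypothetical call to the User Service."""
    global call_count
    call_count += 1
    return USERS[user_id]


def build_report(order_ids: list[int]) -> list[dict]:
    rows = []
    for order_id in order_ids:
        order = fetch_order(order_id)        # Step 1: get the order
        user = fetch_user(order["user_id"])  # Step 2: get the customer name
        rows.append({                        # Step 3: combine both
            "order_id": order["order_id"],
            "amount": order["amount"],
            "customer_name": user["name"],
        })
    return rows


report = build_report([1, 2])
print(report)
print("round trips:", call_count)
```

Two orders already cost four round trips here. Batching or caching the user lookups would reduce that, but the cost still grows with the number of records.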

Solution 2: Data Replication via Events

When a user updates their name in the User Service, it publishes a UserUpdated event. The Order Service listens to this event and stores a copy of the user's name in its own database.

[User Service]
User changes name "John" --> "Jonathan"
Publishes: UserUpdated { user_id: 99, name: "Jonathan" }

[Order Service] (listens to UserUpdated events)
Updates its local copy: user_id 99 name = "Jonathan"

Now Order Service has the name it needs without calling User Service.

The trade-off: the Order Service's copy may be out of date for a short time, usually milliseconds to seconds, until the event has been processed. This is called eventual consistency.
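
A minimal sketch of this flow, assuming a tiny in-process publish/subscribe helper: EventBus, its subscribe and publish methods, and the rename method are hypothetical names for illustration, not a real broker API. Delivery here is synchronous for simplicity; a real message broker delivers asynchronously, which is where the delay comes from.

```python
from collections import defaultdict


class EventBus:
    """Toy in-process event bus (a stand-in for a real message broker)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)


class UserService:
    def __init__(self, bus):
        self._bus = bus
        self._names = {}  # the User Service's own data

    def rename(self, user_id, name):
        self._names[user_id] = name
        # Tell interested services that the user changed.
        self._bus.publish("UserUpdated", {"user_id": user_id, "name": name})


class OrderService:
    def __init__(self, bus):
        self.user_names = {}  # local copy, kept in the Order Service's own DB
        bus.subscribe("UserUpdated", self.on_user_updated)

    def on_user_updated(self, event):
        self.user_names[event["user_id"]] = event["name"]


bus = EventBus()
orders = OrderService(bus)
users = UserService(bus)

users.rename(99, "John")
users.rename(99, "Jonathan")
print(orders.user_names[99])
```

The Order Service never queries the User Service here; it only reacts to events and reads its own copy.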

Eventual Consistency Explained

In a microservices system, data across services is not always synchronized at the same moment. It becomes consistent eventually, usually within milliseconds or seconds, once all pending updates have been delivered.

Think of a bank with two branches. You deposit money at Branch A. The other branch learns about it a few seconds later after the systems sync. During those few seconds, Branch B sees the old balance. Then it updates. The final state is consistent — it just takes a moment to get there.
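
The bank analogy can be simulated in a few lines. This is only a sketch: the two dictionaries stand in for the branches' separate databases, and the pending queue stands in for sync messages that are still in flight. All names are made up for illustration.

```python
from collections import deque

branch_a = {"balance": 100}
branch_b = {"balance": 100}
pending = deque()  # sync messages sent by Branch A, not yet applied at Branch B


def deposit_at_a(amount):
    branch_a["balance"] += amount
    pending.append(amount)  # Branch B will hear about this later


def sync_branch_b():
    # Apply every pending message in the order it was sent.
    while pending:
        branch_b["balance"] += pending.popleft()


deposit_at_a(50)
stale = branch_b["balance"]  # read before the sync: still the old balance
sync_branch_b()
fresh = branch_b["balance"]  # read after the sync: matches Branch A

print(stale, fresh, branch_a["balance"])
```

Between the deposit and the sync, the two branches disagree. After the sync they agree again, which is the whole guarantee eventual consistency makes.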

The CQRS Pattern

CQRS stands for Command Query Responsibility Segregation. It separates reading data from writing data, using different models for each.

WRITE SIDE (Command)         READ SIDE (Query)
=================            =================
User places order            Dashboard shows orders
Order Service writes         Read Service reads from
to Orders DB                 a separate Read DB

Orders DB updates            Events update Read DB
in real time                 asynchronously

Optimized for fast writes    Optimized for fast reads
and data integrity           and complex queries

CQRS shines when your system handles many reads but fewer writes, or when read and write operations need very different data formats.
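
The split can be sketched in a few lines, under stated assumptions: OrderWriteModel, OrderReadModel, and the OrderPlaced event shape are hypothetical names, and an in-memory queue stands in for the asynchronous event channel. The write side validates and stores orders; the read side keeps a precomputed total per user for a dashboard.

```python
from collections import deque


class OrderWriteModel:
    """Command side: validates input and records each order."""

    def __init__(self, events):
        self._orders = {}      # stands in for the Orders DB
        self._events = events  # outgoing events for the read side

    def place_order(self, order_id, user_id, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._orders[order_id] = {"user_id": user_id, "amount": amount}
        self._events.append(
            {"type": "OrderPlaced", "user_id": user_id, "amount": amount}
        )


class OrderReadModel:
    """Query side: a denormalized view shaped for the dashboard."""

    def __init__(self):
        self.total_by_user = {}  # stands in for the Read DB

    def apply(self, event):
        if event["type"] == "OrderPlaced":
            uid = event["user_id"]
            self.total_by_user[uid] = self.total_by_user.get(uid, 0) + event["amount"]


events = deque()
write_side = OrderWriteModel(events)
read_side = OrderReadModel()

write_side.place_order(1, 99, 40)
write_side.place_order(2, 99, 60)

# Later, asynchronously in a real system: project events into the read model.
while events:
    read_side.apply(events.popleft())

print(read_side.total_by_user)
```

The dashboard query becomes a single dictionary lookup instead of a scan over all orders, at the cost of the read model lagging slightly behind the write model.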

Handling Distributed Transactions

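A transaction that spans multiple services is hard to manage. What happens if the Order Service creates an order but the Payment Service fails to charge the card? You have an order with no payment. Because no single database transaction covers both services, the system has to detect the failure and cancel or undo the order itself.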
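
One common way to handle this is to pair each step with a compensating action that undoes it, an approach often called the Saga pattern. The sketch below is an assumption-laden illustration, not a production design: create_order, charge_card, cancel_order, and the status values are hypothetical, and both services are simulated in one process.

```python
class PaymentFailed(Exception):
    pass


orders = {}  # stands in for the Order Service's own database


def create_order(order_id, amount):
    orders[order_id] = {"amount": amount, "status": "PENDING"}


def cancel_order(order_id):
    # Compensating action: undo the effect of create_order.
    orders[order_id]["status"] = "CANCELLED"


def charge_card(amount, card_ok):
    # Stand-in for the Payment Service; card_ok simulates success or failure.
    if not card_ok:
        raise PaymentFailed("card declined")


def place_order(order_id, amount, card_ok):
    create_order(order_id, amount)    # step 1: local transaction in Order Service
    try:
        charge_card(amount, card_ok)  # step 2: local transaction in Payment Service
    except PaymentFailed:
        cancel_order(order_id)        # step 2 failed, so compensate step 1
        return orders[order_id]["status"]
    orders[order_id]["status"] = "CONFIRMED"
    return orders[order_id]["status"]


print(place_order(1, 20.0, card_ok=True))
print(place_order(2, 20.0, card_ok=False))
```

The order is never left in a paid-or-unpaid limbo: it ends as either CONFIRMED or CANCELLED. In a real system the steps would usually be coordinated through events or an orchestrator rather than a direct function call.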

Data Management Summary

Every service owns its own database — no sharing.
Each service picks the database type that fits its workload.
Cross-service data needs are met through API calls or event-driven replication.
Eventual consistency is normal and acceptable for most use cases.
CQRS separates the read model from the write model so each can be optimized on its own.
