GraphQL N+1 Problem and DataLoader

The N+1 problem is the most common performance trap in GraphQL. It causes your server to fire dozens or hundreds of database queries for a single client request. DataLoader is the standard solution — it batches many small lookups into one efficient database call.

What the N+1 Problem Looks Like

  Schema:
  ────────
  type Post    { id: ID!  title: String!  author: User! }
  type User    { id: ID!  name:  String!               }
  type Query   { posts: [Post!]!                       }

  Naive resolvers:
  ─────────────────
  const resolvers = {
    Query: {
      posts: () => db.query('SELECT * FROM posts')     ← 1 query
    },
    Post: {
      author: (post) =>
        db.query(                                      ← 1 query PER POST
          'SELECT * FROM users WHERE id = $1',
          [post.authorId]
        )
    }
  };

  What happens when 10 posts are returned:
  ──────────────────────────────────────────
  Query 1:  SELECT * FROM posts          → returns 10 posts
  Query 2:  SELECT * FROM users WHERE id = 1
  Query 3:  SELECT * FROM users WHERE id = 2
  Query 4:  SELECT * FROM users WHERE id = 3
  Query 5:  SELECT * FROM users WHERE id = 4
  Query 6:  SELECT * FROM users WHERE id = 5
  Query 7:  SELECT * FROM users WHERE id = 6
  Query 8:  SELECT * FROM users WHERE id = 7
  Query 9:  SELECT * FROM users WHERE id = 8
  Query 10: SELECT * FROM users WHERE id = 9
  Query 11: SELECT * FROM users WHERE id = 10

  Total: 11 queries for 1 client request (1 + N)
  For 100 posts: 101 queries. For 1000 posts: 1001 queries.

DataLoader — Batch and Cache

DataLoader collects all the individual ID lookups that happen during one request tick, then fires a single batch query for all of them at once. It also caches results within the same request, so the same ID is never fetched twice.

  With DataLoader:
  ─────────────────
  Query 1:  SELECT * FROM posts               → returns 10 posts
  Query 2:  SELECT * FROM users WHERE id IN   → 1 query for all authors
            (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

  Total: 2 queries regardless of how many posts there are.

Setting Up DataLoader

  npm install dataloader

  import DataLoader from 'dataloader';

  // Batch function — receives array of IDs,
  // returns array of results in the SAME ORDER
  const userLoader = new DataLoader(async (userIds) => {
    const users = await db.query(
      'SELECT * FROM users WHERE id = ANY($1)',
      [userIds]
    );
    // DataLoader requires results ordered to match input IDs
    return userIds.map(id => users.find(u => u.id === id));
  });

  // Use in resolver:
  Post: {
    author: (post) => userLoader.load(post.authorId)
    //                            ↑ Schedules a load, does not fire yet
  }

  // DataLoader waits until all .load() calls in this tick
  // are registered, then fires ONE batch query for all IDs

DataLoader Must Live in Context

Create a new DataLoader for each request by putting it in context. This ensures the cache is request-scoped — one user's data never leaks into another user's response.

  await startStandaloneServer(server, {
    context: async () => ({
      db: dbPool,
      loaders: {
        user:    new DataLoader(batchUsers),
        product: new DataLoader(batchProducts),
      }
    })
  });

  // Resolver:
  Post: {
    author: (post, _, context) =>
      context.loaders.user.load(post.authorId)
  }

DataLoader Caching Within a Request

  Request: { posts { author { name } comments { author { name } } } }

  Without DataLoader:
  User 5 loaded for post.author         → DB query
  User 5 loaded again for comment.author → DB query (duplicate)

  With DataLoader:
  User 5 loaded for post.author         → scheduled
  User 5 loaded again for comment.author → returns from cache
  Only ONE DB query fires for User 5

Key Points

  • The N+1 problem fires one database query per item in a list instead of one query for all items.
  • DataLoader batches individual .load(id) calls into one batch query per tick.
  • DataLoader also caches results within the same request, eliminating duplicate lookups.
  • Always create DataLoader instances inside the request context, never as global singletons.
  • The batch function must return results in the same order as the input IDs array.

Leave a Comment