GraphQL Rate Limiting

Rate limiting caps how many requests a client can make in a given time window. Without rate limiting, a single user or bot can bombard your GraphQL server with thousands of queries per second, starving other users and potentially crashing your database.

Why Rate Limiting Is Critical for GraphQL

REST APIs can rate-limit per endpoint because each operation has its own URL. GraphQL has one URL, so a client can send 1,000 different queries to /graphql per second, each one valid and each one expensive. Rate limiting must therefore account for what each query asks for, not just how many requests arrive.

  REST rate limiting (simple):      GraphQL rate limiting (nuanced):
  ─────────────────────────────     ────────────────────────────────
  GET /products → max 100/min       POST /graphql
  POST /users   → max 10/min        All requests same URL
  Each endpoint counted separately  Must look inside request to judge cost

Strategy 1 – Request Count Rate Limiting

The simplest approach counts HTTP requests per client (by IP or API key). Reject requests above the threshold.

  npm install express-rate-limit

  import express from 'express';
  import rateLimit from 'express-rate-limit';

  const app = express();

  const limiter = rateLimit({
    windowMs:  60 * 1000,     // 1-minute window
    max:        100,           // 100 requests per window per client key
    message:   'Too many requests, please slow down',
    keyGenerator: (req) =>
      req.headers['x-api-key'] || req.ip,  // Identify by API key, falling back to IP
  });

  app.use('/graphql', limiter);

Strategy 2 – Complexity-Based Rate Limiting

Counting requests is naive: a simple { hello } query and a deeply nested query count the same. Complexity-based rate limiting instead gives each client a "token bucket" and deducts tokens in proportion to each query's estimated complexity, so heavy queries deplete the bucket faster.
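
One crude way to score a query is to count the fields it selects. The sketch below assumes the query has already been parsed into a graphql-js style AST (for example with that library's parse function). The function names estimateCost and countFields are illustrative, not part of any library. Under this scheme { user { name } } scores 2. Real estimators usually also weight list fields by their expected size, which is why a deeply nested list query can cost far more than its field count suggests.

```javascript
// Sketch: every selected field costs 1 token. Assumes a graphql-js style AST
// where nodes expose `kind` and an optional `selectionSet.selections` array.
function countFields(selectionSet) {
  if (!selectionSet) return 0;
  let cost = 0;
  for (const node of selectionSet.selections) {
    if (node.kind === 'Field') cost += 1;
    // Recurse into nested fields and inline fragments.
    // Named fragment spreads are not resolved in this sketch.
    cost += countFields(node.selectionSet);
  }
  return cost;
}

// Sum the cost of every operation in a parsed document.
function estimateCost(document) {
  return document.definitions
    .filter((def) => def.kind === 'OperationDefinition')
    .reduce((sum, def) => sum + countFields(def.selectionSet), 0);
}
```

The resulting number is what gets deducted from the client's bucket before the query executes.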

  Token bucket per client:
  ──────────────────────────────────────────────
  Bucket capacity:  1000 tokens
  Refill rate:      100 tokens per minute

  Simple query  { user { name } }           → costs 2 tokens
  Complex query { users { posts { comments → costs 150 tokens
    { author { friends { name } } } } } }

  Client sends 7 complex queries in a burst (full bucket, no refill yet):
  6 × 150 = 900 tokens ≤ 1000 → first 6 allowed
  7 × 150 = 1050 tokens > 1000 → 7th query rejected

  Client sends 400 simple queries in a minute:
  400 × 2 = 800 tokens < 1000 → all allowed
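
The bucket arithmetic above can be sketched as a small in-memory class. This is a minimal illustration using the capacity and refill rate from the diagram. The names TokenBucket and allowQuery are hypothetical, and a multi-server deployment would need a shared store rather than a local Map.

```javascript
// Minimal in-memory token bucket (sketch; names are illustrative).
class TokenBucket {
  constructor(capacity, refillPerMinute, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerMs = refillPerMinute / 60000;
    this.tokens = capacity; // a new bucket starts full
    this.lastRefill = now;
  }

  // Add the tokens earned since the last call, capped at capacity.
  refill(now) {
    const elapsed = Math.max(0, now - this.lastRefill);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
  }

  // Deduct `cost` tokens if available and report whether the query may run.
  tryConsume(cost, now = Date.now()) {
    this.refill(now);
    if (cost > this.tokens) return false;
    this.tokens -= cost;
    return true;
  }
}

// One bucket per client key (user ID, API key, or IP).
const buckets = new Map();

function allowQuery(clientKey, cost, now = Date.now()) {
  if (!buckets.has(clientKey)) {
    buckets.set(clientKey, new TokenBucket(1000, 100, now));
  }
  return buckets.get(clientKey).tryConsume(cost, now);
}
```

Passing the timestamp explicitly keeps the class easy to test. In a request handler the default of Date.now() would normally be used.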

Strategy 3 – Per-User Limits with Redis

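IP-based limits are easy to bypass with multiple IPs, and they penalize legitimate users who share one address. Use authenticated user IDs as rate limit keys for more accurate control, and keep the counters in Redis so every server instance enforces the same limit.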

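  import { GraphQLError } from 'graphql';
  import { RateLimiterRedis } from 'rate-limiter-flexible';
  import { createClient } from 'redis';

  const redis = createClient();
  await redis.connect();         // node-redis v4+ needs an explicit connect
  const rateLimiter = new RateLimiterRedis({
    storeClient: redis,
    useRedisPackage: true,       // set when storeClient comes from the `redis` package (v4+)
    keyPrefix:   'gql_limit',
    points:      100,          // 100 requests
    duration:    60,           // per 60 seconds
  });

  // In Apollo Server context:
  context: async ({ req }) => {
    const userId = getUserIdFromToken(req.headers.authorization);
    const key = userId || req.ip;

    try {
      await rateLimiter.consume(key);
    } catch (rejection) {
      // Store failures reject with an Error; only a limit hit should report RATE_LIMITED
      if (rejection instanceof Error) throw rejection;
      throw new GraphQLError('Rate limit exceeded. Try again in a minute.',
        { extensions: { code: 'RATE_LIMITED' } });
    }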

    return { user: ..., db: ... };
  }

Setting Different Limits by Role

  Role       Requests per minute    Notes
  ────       ───────────────────    ─────
  Guest      20                     Unauthenticated users
  Free user  100                    Standard limit
  Pro user   1000                   Paid tier
  Admin      Unlimited              Internal use
  Bot/API    Custom                 Negotiated per client
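
The table can be expressed as a lookup that the rate limiter consults before counting a request. This is a sketch: the role names, the user object shape ({ id, role }), and the limitFor helper are assumptions made for illustration.

```javascript
// Requests per minute by role, mirroring the table above.
const ROLE_LIMITS = new Map([
  ['guest', 20],
  ['free', 100],
  ['pro', 1000],
  ['admin', Infinity], // unlimited
]);

// Hypothetical helper: resolve the per-minute limit for a request's user.
// `customLimits` maps client IDs to individually negotiated limits.
function limitFor(user, customLimits = new Map()) {
  if (!user) return ROLE_LIMITS.get('guest'); // unauthenticated
  if (customLimits.has(user.id)) return customLimits.get(user.id);
  // Unknown roles fall back to the most restrictive limit.
  return ROLE_LIMITS.get(user.role) ?? ROLE_LIMITS.get('guest');
}
```

Falling back to the guest limit for unknown roles is a deliberate choice here: a typo in a role name then tightens the limit instead of removing it.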

Rate Limit Response Headers

Return headers that tell clients how many requests they have left and when the window resets. This allows clients to back off automatically.

  X-RateLimit-Limit:     100
  X-RateLimit-Remaining: 43
  X-RateLimit-Reset:     1717200060   ← Unix timestamp
  Retry-After:           37           ← Seconds until reset (when limited)
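
A small helper can derive these headers from the limiter's result. The sketch below assumes a result object with remainingPoints and msBeforeNext fields, which is assumed to mirror what rate-limiter-flexible reports. The helper name rateLimitHeaders is hypothetical.

```javascript
// Build rate limit headers from a limiter result (sketch).
// `result.remainingPoints`: requests left in the current window.
// `result.msBeforeNext`: milliseconds until the window resets.
function rateLimitHeaders(limit, result, limited, nowMs = Date.now()) {
  const secondsToReset = Math.ceil(result.msBeforeNext / 1000);
  const headers = {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, result.remainingPoints)),
    // Unix timestamp (seconds) at which the window resets.
    'X-RateLimit-Reset': String(Math.ceil(nowMs / 1000) + secondsToReset),
  };
  // Retry-After is only meaningful when the request was rejected.
  if (limited) headers['Retry-After'] = String(secondsToReset);
  return headers;
}
```

In an Express handler the returned object could be passed to res.set(...) before sending the response, typically together with a 429 status when the request was rejected.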

Key Points

Rate limiting protects your server from abusive clients, bots, and accidental infinite loops.
Simple request-count limiting is easy to set up but does not account for query complexity.
Complexity-based limiting with a token bucket more accurately reflects the true cost of each request.
Use Redis to store rate limit counters so limits work correctly across multiple server instances.
Return rate limit headers so clients can adapt their request rate automatically.
