GraphQL Rate Limiting
Rate limiting caps how many requests a client can make in a given time window. Without rate limiting, a single user or bot can bombard your GraphQL server with thousands of queries per second, starving other users and potentially crashing your database.
Why Rate Limiting Is Critical for GraphQL
REST APIs typically rate-limit per endpoint, and each endpoint has its own URL. GraphQL exposes a single URL, so a client can send 1,000 different queries to /graphql per second, each one valid and each one expensive. A GraphQL rate limiter therefore has to look inside the request, not just at the URL.
REST rate limiting (simple):        GraphQL rate limiting (nuanced):
─────────────────────────────       ────────────────────────────────
GET /products → max 100/min         POST /graphql
POST /users → max 10/min            All requests same URL
Each endpoint counted separately    Must look inside request to judge cost
Strategy 1 – Request Count Rate Limiting
The simplest approach counts HTTP requests per client (by IP or API key). Reject requests above the threshold.
npm install express-rate-limit

import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 60 * 1000, // 1-minute window
  max: 100,            // 100 requests per minute per client
  message: 'Too many requests, please slow down',
  // Identify the client by API key when present, otherwise by IP
  keyGenerator: (req) => req.headers['x-api-key'] || req.ip,
});

// `app` is assumed to be your existing Express application
app.use('/graphql', limiter);
Strategy 2 – Complexity-Based Rate Limiting
Counting requests is naive: a simple { hello } query and a deeply nested query count the same. Complexity-based rate limiting instead deducts tokens from a per-client "token bucket" in proportion to each query's complexity, so heavy queries deplete the bucket faster.
Token bucket per client:
──────────────────────────────────────────────
Bucket capacity: 1000 tokens
Refill rate:     100 tokens per minute

Simple query   { user { name } }                        → costs 2 tokens
Complex query  { users { posts { comments
                 { author { friends { name } } } } } }  → costs 150 tokens

Client sends 7 complex queries in a burst (no time to refill):
  6 × 150 = 900 tokens ≤ 1000 → first 6 allowed
  7 × 150 = 1050 tokens > 1000 → 7th query rejected

Client sends 400 simple queries in a minute:
  400 × 2 = 800 tokens < 1000 → all allowed
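The bucket arithmetic above can be sketched in plain JavaScript. This is a minimal in-memory illustration, not a production limiter: the TokenBucket class, the allowQuery helper, and the buckets map are hypothetical names, the capacity and refill rate are the figures from the diagram, and the cost of each query is assumed to be computed elsewhere (for example by a query complexity analyzer).

```javascript
// Hypothetical sketch: one in-memory token bucket per client.
class TokenBucket {
  constructor(capacity, refillPerMinute, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerMs = refillPerMinute / 60000; // tokens earned per millisecond
    this.tokens = capacity;                     // a new bucket starts full
    this.lastRefill = now;
  }

  // Add the tokens earned since the last call, capped at capacity.
  refill(now) {
    const elapsed = Math.max(0, now - this.lastRefill);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
  }

  // Try to spend `cost` tokens. Returns true if the query may run.
  tryConsume(cost, now = Date.now()) {
    this.refill(now);
    if (cost > this.tokens) return false; // not enough tokens: reject
    this.tokens -= cost;
    return true;
  }
}

const buckets = new Map(); // client key -> TokenBucket

// Assumed entry point: `cost` is the precomputed complexity of the query.
function allowQuery(clientKey, cost, now = Date.now()) {
  let bucket = buckets.get(clientKey);
  if (!bucket) {
    bucket = new TokenBucket(1000, 100, now); // figures from the diagram
    buckets.set(clientKey, bucket);
  }
  return bucket.tryConsume(cost, now);
}
```

The `now` parameter exists only to make the timing explicit and testable. With several server instances, the bucket state would have to live in a shared store such as Redis instead of a process-local Map.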
Strategy 3 – Per-User Limits with Redis
IP-based limits are easy to bypass with multiple IPs. Use authenticated user IDs as rate limit keys for more accurate control.
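import { GraphQLError } from 'graphql';
import { RateLimiterRedis } from 'rate-limiter-flexible';
import { createClient } from 'redis';

const redis = createClient();
await redis.connect(); // node-redis v4+ clients must connect before use

const rateLimiter = new RateLimiterRedis({
  storeClient: redis,
  useRedisPackage: true, // needed when storeClient comes from the `redis` package (v4+)
  keyPrefix: 'gql_limit',
  points: 100,  // 100 requests
  duration: 60, // per 60 seconds
});

// In Apollo Server context:
context: async ({ req }) => {
  // getUserIdFromToken is your own auth helper (not shown here)
  const userId = getUserIdFromToken(req.headers.authorization);
  const key = userId || req.ip; // fall back to IP for unauthenticated requests
  try {
    await rateLimiter.consume(key); // rejects when the key has no points left
  } catch {
    // Note: consume() also rejects on Redis errors; production code should tell the two apart
    throw new GraphQLError('Rate limit exceeded. Try again in a minute.', {
      extensions: { code: 'RATE_LIMITED' },
    });
  }
  return { user: ..., db: ... };
}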
Setting Different Limits by Role
Role        Requests per minute   Notes
────        ───────────────────   ─────
Guest       20                    Unauthenticated users
Free user   100                   Standard limit
Pro user    1000                  Paid tier
Admin       Unlimited             Internal use
Bot/API     Custom                Negotiated per client
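One possible way to turn the role table into code is a small lookup function. This is a sketch under assumptions: the ROLE_LIMITS map and limitForUser function are hypothetical names, and the user object is assumed to carry a `role` string and, for negotiated Bot/API clients, an optional numeric `customLimit`. Neither field comes from a specific library.

```javascript
// Requests per minute for each role, taken from the table above.
const ROLE_LIMITS = {
  guest: 20,
  free: 100,
  pro: 1000,
  admin: Infinity, // "Unlimited"
};

// Hypothetical helper: pick the per-minute limit for a (possibly missing) user.
function limitForUser(user) {
  if (!user) return ROLE_LIMITS.guest; // unauthenticated request
  if (typeof user.customLimit === 'number') {
    return user.customLimit; // Bot/API: limit negotiated per client
  }
  // Unknown roles fall back to the most restrictive limit.
  return ROLE_LIMITS[user.role] ?? ROLE_LIMITS.guest;
}
```

Falling back to the guest limit for unknown roles is a deliberately conservative choice; a misconfigured account then gets less capacity instead of more.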
Rate Limit Response Headers
Return headers that tell clients how many requests they have left and when the window resets. This allows clients to back off automatically.
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 43
X-RateLimit-Reset: 1717200060   ← Unix timestamp
Retry-After: 37                 ← Seconds until reset (when limited)
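The headers above could be assembled by a helper like the following. It is an illustrative sketch: rateLimitHeaders and its parameter names are hypothetical, and the caller is assumed to already know the limit, the remaining count, the reset time in Unix seconds, and whether the current request was rejected.

```javascript
// Hypothetical helper: build rate limit headers as a plain object.
function rateLimitHeaders({ limit, remaining, resetEpochSeconds, limited }, nowMs = Date.now()) {
  const headers = {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, remaining)),
    'X-RateLimit-Reset': String(resetEpochSeconds), // Unix timestamp in seconds
  };
  if (limited) {
    // Retry-After is only sent on rejected requests: seconds until the window resets.
    const secondsLeft = Math.ceil(resetEpochSeconds - nowMs / 1000);
    headers['Retry-After'] = String(Math.max(0, secondsLeft));
  }
  return headers;
}
```

In an Express handler the result could be applied with `res.set(headers)` before sending the response.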
Key Points
- Rate limiting protects your server from abusive clients, bots, and accidental infinite loops.
- Simple request-count limiting is easy to set up but does not account for query complexity.
- Complexity-based limiting with a token bucket more accurately reflects the true cost of each request.
- Use Redis to store rate limit counters so limits work correctly across multiple server instances.
- Return rate limit headers so clients can adapt their request rate automatically.
