API Security: API Logging, Monitoring, and Anomaly Detection

Rate limiting and authentication stop many attacks at the gate. But some attacks are subtle, slow-moving, or exploit legitimate features in unexpected ways. Logging and monitoring give you visibility into what is happening inside your API — so you can detect attacks that bypass preventive controls and respond before serious damage occurs.

Why Logging Matters for API Security

Without logging:
  An attacker harvests 50,000 user records over three weeks.
  The API behaves normally — no crashes, no visible errors.
  You discover the breach six months later when the data appears for sale.
  You have no idea what happened, when it happened, or which accounts were affected.

With logging:
  Day 1: Monitoring alerts — single IP accessing 500 different user IDs.
  Day 1: Automated response blocks the IP. Alerts security team.
  Day 1: Investigation begins. Attack stopped with about 500 records exposed.
  Full audit trail shows exactly what was accessed, by whom, and when.

Logging does not prevent attacks — it enables detection and response.

What Every API Request Should Log

Minimum Required Log Fields:

Timestamp:
  ISO 8601 format with milliseconds and timezone.
  "2025-03-15T14:23:01.456Z"

Request Identification:
  Unique request ID (generated per request, returned in response header).
  Allows correlating client-reported errors with server-side logs.

Caller Identity:
  IP address and port of the caller.
  User ID (from verified token — if authenticated).
  API key identifier (first 8 chars, not the full key).
  Client application identifier.

Request Details:
  HTTP method (GET, POST, DELETE, etc.)
  URL path and query parameters (after PII removal).
  Content-Type header.
  User-Agent header.
  Referer header.

Response Details:
  HTTP status code.
  Response time in milliseconds.
  Response body size in bytes.

Security Context:
  Authentication result (success, failure, reason).
  Authorization result (allowed, denied, which resource).
  TLS version used.
  Geolocation of IP (country, region).
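
The field list above can be sketched as one structured JSON log line per request. This is a minimal illustration, not a complete logger: the function name build_log_entry and the field names are assumptions chosen for this example, and several listed fields (User-Agent, TLS version, geolocation) are left out for brevity.

```python
import json
import uuid
from datetime import datetime, timezone


def build_log_entry(method, path, status, duration_ms, response_bytes,
                    client_ip, user_id=None, api_key=None, auth_result="success"):
    """Return one JSON log line with a subset of the minimum fields.

    Hypothetical helper: names and field layout are illustrative only.
    """
    now = datetime.now(timezone.utc)
    entry = {
        # ISO 8601 with milliseconds and an explicit UTC designator
        "timestamp": now.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        # Unique per request; the same value would be returned in a response header
        "request_id": str(uuid.uuid4()),
        "client_ip": client_ip,
        "user_id": user_id,
        # Only the first 8 characters of the key are recorded, never the full value
        "api_key_prefix": api_key[:8] if api_key else None,
        "method": method,
        "path": path,
        "status": status,
        "duration_ms": duration_ms,
        "response_bytes": response_bytes,
        "auth_result": auth_result,
    }
    return json.dumps(entry)
```

Emitting one JSON object per line keeps the entries easy to parse and filter once they reach a central log system.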

What Must NOT Be Logged

Sensitive data that must be excluded from logs:

Never log:
  Passwords (even failed attempts — log "password_provided: true/false")
  Full JWT tokens (log only the user_id extracted from them)
  API keys (log only the first 8 characters)
  Credit card numbers (log only masked version: ****1111)
  CVV codes (never log even in masked form)
  SSN/Aadhaar numbers (never in full)
  Password reset tokens
  OTP codes
  Session cookies
  Authorization header values

Log sanitization example:
  Instead of logging: "Authorization: Bearer eyJhbGciOiJIUzI1NiJ9..."
  Log: "auth_method: bearer_token, user_id: 101, token_valid: true"

  Instead of: "Request body: {password: 'MySecret123'}"
  Log: "Request body: {password: '[REDACTED]'}"
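
The sanitization rules above could be applied in code before a request body is written to the log. The sketch below is one possible approach; the key list SENSITIVE_KEYS and the helper names redact and mask_card are assumptions for illustration and would need to match the field names your API actually uses.

```python
# Illustrative key list (an assumption): extend it to match your own payloads.
SENSITIVE_KEYS = {"password", "token", "otp", "cvv", "authorization", "cookie"}


def redact(value):
    """Recursively replace the values of sensitive keys with '[REDACTED]'."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


def mask_card(number):
    """Keep only the last four digits, in the ****1111 style shown above."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return "****" + digits[-4:]
```

For example, redact({"password": "MySecret123"}) returns {"password": "[REDACTED]"}. Matching on key names alone will miss secrets embedded in free-text fields, so treat this as a first layer only.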

Log Storage and Protection

Logs are only useful if they are:

Centralized:
  All API servers write to a central log system.
  Logs scattered across multiple servers are impractical to search and correlate.
  Tools: ELK Stack (Elasticsearch+Logstash+Kibana), Splunk, Datadog,
         AWS CloudWatch, Azure Monitor, GCP Cloud Logging.

Tamper-Protected:
  Logs must not be modifiable by the API application itself.
  If an attacker compromises the API server, they should not be able
  to delete or modify logs covering their tracks.
  Solution: Write logs to a separate system the API cannot modify.
            Use append-only log storage with separate access credentials.

Retained Appropriately:
  Security logs: 90 days minimum (regulatory requirements vary).
  Authentication logs: 1-2 years for forensics.
  Access logs: 30-90 days.
  Compliance requirement (PCI DSS): at least 12 months of retention, with the most recent 3 months immediately available.

Access-Controlled:
  Logs may contain user IDs, IP addresses, and behavioral data.
  Access to logs should be restricted to authorized personnel.
  Log access should itself be audited.

Security-Relevant Events to Monitor

Authentication Events:
  ✓ Successful logins (user, timestamp, IP, device)
  ✓ Failed login attempts (user, reason, IP)
  ✓ Password changes
  ✓ MFA enabled/disabled
  ✓ API key created, revoked, or used
  ✓ Token refresh events

Authorization Events:
  ✓ All 403 Forbidden responses (with endpoint and user)
  ✓ Access to admin endpoints (all access, not just failures)
  ✓ Privilege escalation attempts

Data Access Events:
  ✓ Access to sensitive endpoints (health data, financial data, PII)
  ✓ Bulk data access (large result sets, export operations)
  ✓ Data modification events (CREATE, UPDATE, DELETE)

Input Validation Events:
  ✓ Malformed request rejections (400 errors)
  ✓ Schema validation failures
  ✓ Requests containing known attack signatures

Infrastructure Events:
  ✓ Rate limit exceeded events
  ✓ Server errors (500 responses)
  ✓ Dependency failures (database unreachable, third-party API errors)
  ✓ Certificate expiry warnings

Anomaly Detection Patterns

Pattern 1: BOLA/IDOR Enumeration
  Signature: User accessing many different object IDs in rapid succession.
  Normal:    User 101 accesses /api/orders/8801 (their own order).
  Attack:    User 101 accesses /api/orders/8801, 8802, 8803... 9500 in 5 minutes.
  Alert rule: User accesses more than 50 different resource IDs in 10 minutes.
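
The alert rule for Pattern 1 can be sketched as a sliding-window counter of distinct resource IDs per user. The class name EnumerationDetector and the in-memory storage are assumptions for illustration; a production system would more likely evaluate this in the log pipeline or SIEM.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 600     # 10 minutes, as in the alert rule
MAX_DISTINCT_IDS = 50    # threshold from the alert rule


class EnumerationDetector:
    """Hypothetical in-memory detector for BOLA/IDOR enumeration."""

    def __init__(self):
        # user_id -> deque of (timestamp, resource_id), oldest first
        self._events = defaultdict(deque)

    def record(self, user_id, resource_id, ts):
        """Record one access; return True if the user is over the threshold."""
        events = self._events[user_id]
        events.append((ts, resource_id))
        # Drop accesses that have fallen out of the 10-minute window
        while events and events[0][0] <= ts - WINDOW_SECONDS:
            events.popleft()
        distinct = {rid for _, rid in events}
        return len(distinct) > MAX_DISTINCT_IDS
```

Counting distinct IDs rather than raw requests keeps a user who refreshes the same order many times from triggering the alert.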

Pattern 2: Credential Stuffing
  Signature: High login failure rate from one IP or across many accounts.
  Normal:    1 failed login, then 1 success.
  Attack:    200 failed logins per minute from IP 198.51.100.42.
  Alert rule: More than 20 failed logins per minute from single IP.
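
A minimal sketch of the per-IP alert rule for Pattern 2 follows. It covers only the single-IP case; detecting a distributed attack spread across many IPs would need a separate rule keyed on accounts. The class name FailedLoginMonitor is an assumption for this example.

```python
from collections import defaultdict, deque

FAILED_LOGIN_LIMIT = 20   # failed logins per IP, from the alert rule
WINDOW_SECONDS = 60       # per minute


class FailedLoginMonitor:
    """Hypothetical in-memory counter of failed logins per source IP."""

    def __init__(self):
        # ip -> deque of failure timestamps (seconds), oldest first
        self._failures = defaultdict(deque)

    def record_failure(self, ip, ts):
        """Record one failed login; return True if the IP exceeds the limit."""
        recent = self._failures[ip]
        recent.append(ts)
        # Keep only failures from the last 60 seconds
        while recent and recent[0] <= ts - WINDOW_SECONDS:
            recent.popleft()
        return len(recent) > FAILED_LOGIN_LIMIT
```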

Pattern 3: Data Exfiltration
  Signature: Unusually large data volume returned compared to baseline.
  Normal:    User downloads 1-2 reports per session.
  Attack:    User downloads 500 reports in 30 minutes.
  Alert rule: User downloads more data in 1 hour than their average daily volume over the past 30 days.
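
One way to read the Pattern 3 rule is to compare the last hour's download volume with the user's mean daily volume over the previous 30 days. That interpretation of the baseline is an assumption, as is the function name exceeds_baseline.

```python
from statistics import mean


def exceeds_baseline(bytes_last_hour, daily_bytes_history):
    """True if the last hour's volume exceeds the user's mean daily volume.

    daily_bytes_history: per-day byte totals, oldest first; only the most
    recent 30 entries are used. Hypothetical helper for illustration.
    """
    if not daily_bytes_history:
        # No baseline yet (new user); this sketch does not alert in that case
        return False
    baseline = mean(daily_bytes_history[-30:])
    return bytes_last_hour > baseline
```

A per-user baseline avoids penalizing legitimately heavy users, at the cost of needing enough history before the rule becomes meaningful.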

Pattern 4: Impossible Travel
  Signature: Same account authenticated from two locations too far apart to travel between in the elapsed time.
  Example:   User logs in from Mumbai at 10:00 AM.
             Same user logs in from Berlin at 10:05 AM.
             (Impossible — 5 minutes apart, different continents.)
  Alert rule: Same user_id authenticated from locations > 500 km apart
              within 60 minutes.
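
The Pattern 4 rule needs a distance between two login locations. A common choice is the haversine great-circle formula, sketched below. The function names and the (lat, lon, unix_seconds) tuple layout are assumptions for illustration, and the coordinates would come from IP geolocation, which is itself approximate.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_impossible_travel(prev, curr, max_km=500, window_minutes=60):
    """prev and curr are (lat, lon, unix_seconds) for two logins by one user."""
    distance = haversine_km(prev[0], prev[1], curr[0], curr[1])
    elapsed_minutes = abs(curr[2] - prev[2]) / 60
    return distance > max_km and elapsed_minutes <= window_minutes
```

With approximate coordinates for Mumbai (19.076, 72.8777) and Berlin (52.52, 13.405) five minutes apart, the check returns True. VPNs and mobile carriers can produce false positives, so this signal is usually combined with others.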

Pattern 5: After-Hours Admin Access
  Signature: Admin endpoints accessed outside normal business hours.
  Normal:    Admin team uses admin panel 9 AM - 6 PM IST.
  Suspicious: Admin endpoint called at 2:30 AM from unusual IP.
  Alert rule: Admin endpoints accessed between 11 PM and 6 AM IST.
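
The Pattern 5 window wraps past midnight, which is easy to get wrong in code. The sketch below assumes the hours are in IST (matching the business hours in the example) and uses a fixed UTC+05:30 offset, which is safe because IST has no daylight saving time. The function name is_after_hours is an assumption.

```python
from datetime import datetime, time, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30))  # fixed offset; IST has no DST
QUIET_START = time(23, 0)   # 11 PM
QUIET_END = time(6, 0)      # 6 AM


def is_after_hours(ts):
    """ts: timezone-aware datetime of an admin endpoint request."""
    local = ts.astimezone(IST).time()
    # The window spans midnight, so the two comparisons are joined with "or"
    return local >= QUIET_START or local < QUIET_END
```

Logs store timestamps in UTC, so the conversion to local time happens at evaluation time rather than at write time.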

Pattern 6: Scraping Behavior
  Signature: Requests at machine speed, sequential IDs, no JavaScript.
  Normal:    Human user — variable timing, random navigation, loads images.
  Attack:    500 requests per minute, sequential product IDs, API-only calls.
  Alert rule: More than 200 requests per minute with near-constant spacing between requests.
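
"Machine speed" in Pattern 6 can be made measurable by combining request rate with how regular the gaps between requests are. The sketch below uses the coefficient of variation of the gaps (standard deviation divided by mean). The 0.1 cutoff and the function name looks_automated are assumptions that would need tuning against real traffic.

```python
from statistics import mean, pstdev


def looks_automated(timestamps, min_requests=200, max_cv=0.1):
    """timestamps: sorted request times in seconds for one client in one minute.

    Flags clients that are both fast and suspiciously regular. The coefficient
    of variation of the gaps is close to 0 for fixed-interval bots.
    """
    if len(timestamps) <= min_requests:
        return False
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        return True  # every request shares one timestamp: not human pacing
    return pstdev(gaps) / average_gap <= max_cv
```

Scrapers that add random jitter will evade the regularity check, so the rate threshold and sequential-ID detection remain useful alongside it.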

SIEM Integration

SIEM = Security Information and Event Management.
A SIEM collects logs from all systems, correlates events, and generates alerts.

API Logs → SIEM Platform → Correlation Rules → Alerts → Security Team

Popular SIEMs:
  Splunk, IBM QRadar, Microsoft Sentinel, Elastic SIEM, Datadog Security

Example SIEM correlation rule:
  Name: "Potential API Data Exfiltration"
  Trigger when:
    Same user_id appears in 100+ GET /api/orders/{id} events
    AND all events within 10 minutes
    AND HTTP status = 200 for all
  Alert: HIGH SEVERITY → Notify security team immediately
  Action: Auto-block user_id in API gateway
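
Real SIEMs express rules in their own query languages, so the following Python is only a sketch of the logic behind the rule above, not any product's syntax. The event dictionary keys (user_id, method, path, status, ts) and the function name users_matching_rule are assumptions. The sketch counts only events with status 200 rather than requiring every event in the window to be a 200.

```python
import re
from collections import defaultdict

# Matches /api/orders/{id} with a single path segment as the id
ORDER_PATH = re.compile(r"^/api/orders/[^/]+$")


def users_matching_rule(events, threshold=100, window_seconds=600):
    """Return user_ids with `threshold` or more successful order reads
    inside any span of `window_seconds`.

    events: iterable of dicts with user_id, method, path, status, ts
    (ts in unix seconds). Hypothetical event shape for illustration.
    """
    per_user = defaultdict(list)
    for event in events:
        if (event["method"] == "GET" and event["status"] == 200
                and ORDER_PATH.match(event["path"])):
            per_user[event["user_id"]].append(event["ts"])

    flagged = set()
    for user_id, times in per_user.items():
        times.sort()
        # Compare each event with the one (threshold - 1) positions earlier
        for i in range(threshold - 1, len(times)):
            if times[i] - times[i - threshold + 1] <= window_seconds:
                flagged.add(user_id)
                break
    return flagged
```

The returned set is what an automated action, such as the gateway block in the rule, would act on.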

SIEM also enables:
  Historical forensics after an incident.
  Compliance reporting (who accessed what, when).
  Baseline establishment (what is "normal" for your API).

Alerting and Incident Response

Alert Severity Levels:

Critical (Immediate response — PagerDuty, SMS):
  Active exploitation detected.
  Authentication bypass confirmed.
  Admin account accessed from unknown location.
  Significant data volume exfiltrated.

High (Response within 1 hour):
  Multiple failed authentication attempts across accounts.
  Unusual admin activity after hours.
  Rate limit repeatedly exceeded from same source.

Medium (Response within 4 hours):
  Scanning behavior detected (vulnerability scanning patterns).
  Unusual endpoint access patterns.
  Spike in 500 errors.

Low (Review next business day):
  Minor anomalies in request patterns.
  Single unusual access event.
  Informational security events.

Incident Response Steps:
  1. Detect — alert fires.
  2. Triage — confirm it is a real attack, not a false positive.
  3. Contain — block the attacker (IP, user, API key).
  4. Investigate — determine scope: what was accessed, when, how much.
  5. Remediate — fix the vulnerability that enabled the attack.
  6. Recover — restore normal operations.
  7. Document — post-incident report with lessons learned.

Key Points

Logging enables detection of attacks that bypass preventive controls. Without logs, attacks can go undetected for months.
Log authentication results, authorization decisions, request metadata, and response codes — but never log passwords, full tokens, card numbers, or government IDs.
Store logs in a centralized, tamper-protected system with access controls. Retain security logs for at least 90 days.
Define anomaly detection rules for BOLA enumeration, credential stuffing, impossible travel, bulk data access, and scraping behavior.
Integrate with a SIEM to correlate events across systems and enable automated alerting and response.
Classify alerts by severity and define response time expectations. Critical security events need immediate human response.
