What Is Rate Limiting?

What is rate limiting?

Rate limiting is a control that restricts how many requests, actions, or events a client can perform within a defined period. The client might be identified by IP address, account, session, API key, device, or token, and a limit is often scoped further to a specific route or to a combination of these signals. When the limit is exceeded, the system may delay, reject, challenge, queue, or downgrade the request.

Rate limiting is used for security, reliability, fairness, and cost control. It can slow login guessing, reduce scraping, protect expensive search endpoints, prevent checkout abuse, stop spam submissions, control public API use, and limit accidental loops from poorly written clients. The goal is not simply to block high volume. The goal is to keep a workflow usable for legitimate users while making abusive repetition ineffective.

How rate limits are structured

A rate limit usually has four parts: scope, threshold, time window, and action. The scope defines what is being counted, such as requests from one IP address, password reset attempts for one account, or write calls from one API key. The threshold defines how many events are allowed. The time window defines the period, such as per minute, per hour, or per day. The action defines what happens after the threshold is crossed.
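As a rough illustration, the four parts can be written down as a small record. This is only a sketch: the class name RateLimitRule, the Action enum, the scope strings, and every number below are hypothetical and chosen for the example, not recommended values.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DELAY = "delay"
    REJECT = "reject"
    CHALLENGE = "challenge"
    QUEUE = "queue"


@dataclass(frozen=True)
class RateLimitRule:
    scope: str           # what is counted, e.g. "password_reset:account"
    threshold: int       # how many events are allowed per window
    window_seconds: int  # length of the time window
    action: Action       # what happens once the threshold is crossed

    def counter_key(self, identity: str) -> str:
        """Build the key under which events for one identity are counted."""
        return f"{self.scope}:{identity}"


# Illustrative values only; real thresholds should come from measured traffic.
RULES = [
    RateLimitRule("requests:ip", 300, 60, Action.DELAY),
    RateLimitRule("password_reset:account", 5, 3600, Action.CHALLENGE),
    RateLimitRule("write_calls:api_key", 1000, 86400, Action.REJECT),
]
```

Keeping the scope in the counter key means the same identity can be counted separately under different rules.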

Different algorithms handle bursts differently. A fixed window is simple, but because its counter resets at each boundary, a client can send close to twice the threshold in a short span that straddles two windows. A sliding window smooths that behavior by counting events over the trailing period instead of resetting at fixed boundaries. A token bucket allows short bursts while enforcing an average rate. A leaky bucket processes requests at a steady rate and queues or drops excess traffic. The best choice depends on whether the workflow should tolerate short bursts, how precise enforcement must be, and how much implementation complexity is acceptable.
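The difference at a window boundary is easier to see with a small in-memory sketch. The three classes below are minimal, single-process illustrations with hypothetical names; timestamps are passed in explicitly so the behavior is deterministic, and a leaky bucket is left out for brevity. A production limiter would normally need shared storage and concurrency control, which this sketch ignores.

```python
from collections import deque


class FixedWindowLimiter:
    """Counts events in aligned windows; the count resets at each boundary."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit, self.window = limit, window
        self.window_index = None
        self.count = 0

    def allow(self, now: float) -> bool:
        index = int(now // self.window)
        if index != self.window_index:      # a new window has started
            self.window_index, self.count = index, 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLimiter:
    """Keeps timestamps of accepted events from the trailing window."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit, self.window = limit, window
        self.accepted = deque()

    def allow(self, now: float) -> bool:
        while self.accepted and self.accepted[0] <= now - self.window:
            self.accepted.popleft()         # forget events older than the window
        if len(self.accepted) < self.limit:
            self.accepted.append(now)
            return True
        return False


class TokenBucket:
    """Refills at `rate` tokens per second up to `capacity`; one token per event."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate, self.capacity = rate, capacity
        self.tokens, self.updated = capacity, now

    def allow(self, now: float) -> bool:
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Ten requests, one per second, straddling a window boundary at t=60.
burst = list(range(55, 65))
fixed = FixedWindowLimiter(limit=5, window=60)
sliding = SlidingWindowLimiter(limit=5, window=60)
bucket = TokenBucket(rate=5 / 60, capacity=5, now=55)

fixed_allowed = sum(fixed.allow(t) for t in burst)      # 10: five on each side of the boundary
sliding_allowed = sum(sliding.allow(t) for t in burst)  # 5: the trailing minute is already full
bucket_allowed = sum(bucket.allow(t) for t in burst)    # 5: the initial burst, then slow refill
```

With a limit of five per minute, the fixed window admits all ten requests because the counter resets at t=60, while the sliding window and the token bucket each admit only five.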

Where rate limiting helps

Login and password reset flows often need account-aware limits because attackers can distribute traffic across many networks. Search, availability, pricing, and inventory endpoints may need route-specific limits because each request can be expensive or commercially sensitive. Public APIs need client-specific limits so one integration cannot consume capacity meant for others. Forms and content submission paths may need limits to reduce spam and fake account creation.

Rate limiting can also protect against non-malicious failures. A partner integration may retry too aggressively after an error. A mobile app bug may poll continuously. A crawler may ignore crawl-delay guidance. In those cases, clear response behavior and good logging help the client owner fix the problem without turning the event into a security incident.
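On the client side, one common remedy for overly aggressive retries is capped exponential backoff with random jitter. The helper below is a sketch of that technique, not something prescribed by this article; the function name backoff_delay and its default values are assumptions made for the example.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 60.0,
                  retry_after: Optional[float] = None,
                  rng: Optional[random.Random] = None) -> float:
    """Return seconds to wait before retry number `attempt` (starting at 0)."""
    rng = rng or random
    ceiling = min(cap, base * (2 ** attempt))   # exponential growth, capped
    delay = rng.uniform(0, ceiling)             # jitter spreads clients apart
    if retry_after is not None:
        delay = max(delay, retry_after)         # never retry sooner than the server asked
    return delay
```

The jitter matters because many clients that fail at the same moment would otherwise retry at the same moment too.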

Design risks and failure modes

The simplest rate limit is often based on IP address, but IP-only rules can be unfair and easy to bypass. Many legitimate users may share an IP address behind a carrier-grade NAT, corporate proxy, school network, or public Wi-Fi. Attackers, meanwhile, can rotate through proxies or residential networks. IP can be useful, but it should not be the only identity for sensitive workflows.

Account-only limits have their own weakness. They may protect one account from many guesses but miss an attack that tries one password against many accounts, often called password spraying. API-key limits help with authenticated clients but do nothing for anonymous abuse before a key is issued. Device or fingerprint-based limits can add context, but they require careful review of privacy impact, accuracy, and false-positive rates.
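A sketch of counting along two dimensions at once shows why this matters. The class LoginAttemptTracker and its thresholds are hypothetical, and time windows are omitted to keep the example short; a real tracker would expire old entries. The "source" could be an IP address, a network, or a device signal.

```python
from collections import defaultdict


class LoginAttemptTracker:
    """Tracks failed logins per account and distinct accounts tried per source."""

    def __init__(self, per_account_limit: int, accounts_per_source_limit: int) -> None:
        self.per_account_limit = per_account_limit
        self.accounts_per_source_limit = accounts_per_source_limit
        self.failures_by_account = defaultdict(int)
        self.accounts_by_source = defaultdict(set)

    def record_failure(self, source: str, account: str) -> None:
        self.failures_by_account[account] += 1
        self.accounts_by_source[source].add(account)

    def should_limit(self, source: str, account: str) -> bool:
        too_many_guesses = self.failures_by_account[account] >= self.per_account_limit
        too_many_targets = (
            len(self.accounts_by_source[source]) >= self.accounts_per_source_limit
        )
        return too_many_guesses or too_many_targets


tracker = LoginAttemptTracker(per_account_limit=5, accounts_per_source_limit=3)

# One source tries a single password against three different accounts.
for account in ["alice", "bob", "carol"]:
    tracker.record_failure("203.0.113.7", account)

# No account is near its own limit, but the source has touched too many accounts.
sprayer_limited = tracker.should_limit("203.0.113.7", "dave")
other_source_limited = tracker.should_limit("198.51.100.9", "alice")
```

Here the account-only counter never fires, because each account has seen a single failure; the per-source count of distinct accounts is what catches the pattern.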

Response design matters too. Returning HTTP 429 (Too Many Requests) with a clear Retry-After header tells legitimate API clients how long to back off. A vague denial may be better for abusive login attempts, where too much detail helps attackers tune their request rate to stay just under the limit. Immediate hard blocking can frustrate users, while silent throttling can hide problems from client owners. The action should match the route and risk.
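For an API route, a rejection that carries Retry-After might be assembled roughly as follows. The function name too_many_requests and the (status, headers, body) return shape are assumptions for this sketch rather than any particular framework's API; the header value uses the delay-in-seconds form of Retry-After.

```python
import math
from typing import Dict, Tuple


def too_many_requests(window_resets_at: float, now: float) -> Tuple[int, Dict[str, str], str]:
    """Build a (status, headers, body) triple for a rejected request."""
    # Round up so the client never retries before the window has actually reset.
    wait_seconds = max(1, math.ceil(window_resets_at - now))
    headers = {"Retry-After": str(wait_seconds), "Content-Type": "text/plain"}
    return 429, headers, "Too many requests. Please retry later.\n"


status, headers, body = too_many_requests(window_resets_at=100.0, now=87.5)
```

For sensitive routes such as login, the same helper could return a generic body and omit the header, in line with the trade-off described above.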

Evidence to review before setting limits

Teams should measure normal behavior before enforcing strict thresholds. Useful evidence includes request rates by route, account, API key, network, user agent, device, response status, and time of day. Also review business outcomes: completed logins, password reset success, search conversion, checkout completion, API error rates, and support tickets.
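One way to turn request logs into a starting threshold is to look at the distribution of per-client request counts per minute. The sketch below assumes log events are available as (client_id, unix_timestamp) pairs; the function names and the sample data are made up for illustration, and the percentile uses the simple nearest-rank method.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple


def per_minute_counts(events: Iterable[Tuple[str, float]]) -> List[int]:
    """Count requests per (client, minute) and return the counts in ascending order."""
    buckets = Counter((client, int(ts // 60)) for client, ts in events)
    return sorted(buckets.values())


def percentile(sorted_values: List[int], pct: float) -> int:
    """Nearest-rank percentile of an ascending list (pct between 0 and 100)."""
    if not sorted_values:
        raise ValueError("no observations")
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


# Made-up sample: clients "a" and "b" are quiet, client "c" sends ten requests in one minute.
events = [("a", 0), ("a", 10), ("a", 20), ("b", 30), ("a", 60), ("a", 70)]
events += [("c", 60 + i) for i in range(10)]

counts = per_minute_counts(events)   # [1, 2, 3, 10]
typical = percentile(counts, 50)     # 2
heavy = percentile(counts, 99)       # 10
```

A threshold set somewhat above the heavy-but-legitimate range leaves headroom for normal use; where that line sits is a judgment call that the business metrics above should inform.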

Look for expensive endpoints and sensitive actions. A homepage can usually tolerate more repetition than a password reset form. A read-only product API may need different limits from a payment, export, or admin endpoint. Login, signup, search, inventory, checkout, and account recovery should usually be evaluated separately rather than placed under one global site-wide rule.

After rollout, monitor both abuse reduction and user harm. Repeated 429 responses, abandoned journeys, elevated support contacts, retry storms, and origin load changes all matter. A limit that lowers traffic but causes legitimate clients to retry harder may make the system less stable.

Operational governance

Every important limit should have an owner, a reason, and a rollback path. Document what the limit protects, which signals it counts, what action it takes, and which metrics prove it is working. Temporary limits added during an incident should expire or be reviewed, because emergency thresholds often become too strict for normal traffic.

Policy changes should be tested in observe-only mode where possible. This shows who would have been limited without affecting users. For APIs, communicate limits in developer documentation and return consistent status codes. For customer-facing workflows, coordinate with support so they can recognize false positives and explain remediation steps.
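Observe-only mode can be as simple as a flag that records the decision without acting on it. The class ObservedLimit below is a hypothetical sketch with a plain counter and no time window; it shows only the split between deciding and enforcing.

```python
from collections import defaultdict
from typing import List


class ObservedLimit:
    """A simple counter-based limit that can run in observe-only mode."""

    def __init__(self, threshold: int, enforce: bool = False) -> None:
        self.threshold = threshold
        self.enforce = enforce
        self.counts = defaultdict(int)
        self.would_have_limited: List[str] = []

    def allow(self, key: str) -> bool:
        self.counts[key] += 1
        if self.counts[key] <= self.threshold:
            return True
        self.would_have_limited.append(key)   # record the decision either way
        return not self.enforce               # only block when enforcing


observing = ObservedLimit(threshold=2)
observed_results = [observing.allow("key-1") for _ in range(4)]

enforcing = ObservedLimit(threshold=2, enforce=True)
enforced_results = [enforcing.allow("key-1") for _ in range(4)]
```

In observe-only mode every request is allowed, but the would_have_limited list shows which keys the policy would have affected, which is the evidence needed before switching enforcement on.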

Rate limiting should be paired with other controls. Authentication, bot detection, caching, queueing, input validation, fraud checks, and capacity planning all reduce pressure in different ways. If rate limiting is the only control, attackers may simply distribute activity until they sit below the threshold.

Practical design checklist

Define the protected route or action before choosing the threshold.
Count by more than one dimension when abuse can rotate through IP addresses.
Use different limits for login, password reset, search, checkout, forms, and APIs.
Decide whether the right action is delay, challenge, reject, queue, or alert.
Monitor false positives, retries, support impact, and downstream business outcomes.

Rate limiting is most effective when it is specific, observable, and adjustable. A good limit reflects how a real workflow behaves, not just how many requests a server can receive.
