HTTP rate limiting service

Published: 27/05/2024 12:00


The very first article of the Net7 blog was about a quick in-between HTTP proxy that could perform some given action with the request.

We mostly used it to limit heavy API users: APIs make a prime abuse target since their clients are almost always automated and primed to easily chain a lot of requests.

The original proxy was written in Node.js, a pretty good choice considering how effectively and easily it handles streams.

However, the maximum throughput of the proxy wasn't that great: somewhere below a quarter of the typical throughput of an Nginx or Apache in event-driven mode.

It wouldn't matter for many applications, which can't come close to maxing out the throughput of a front web server anyway, but we thought we'd pick something more efficient the next time around.

We can provide rate limiting as a service to any remote destination for a flat monthly cost using the software described in this post (do note that we then need to handle the SSL termination ourselves).

Use cases

The initial use case was to serve a fixed static page (along with possibly other static assets) to website visitors for a certain amount of time when the request rate or the average response time goes above a configurable threshold.

The application has to be able to work with different websites (we could call these virtual hosts) independently with their own configurable limits and active counters.

When the limits aren't reached or have been reset, we proxy the requests to a configurable target: a default target can be set globally, and a specific target can be specified per virtual host.
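The per-host counting described above can be sketched as a fixed-window counter with a lockout period. This is a minimal illustration, not the service's actual code; the type and method names are made up, while max_requests, the window length and block_duration mirror the configuration options discussed below:

```rust
use std::time::{Duration, Instant};

/// Fixed-window request counter for one virtual host (illustrative sketch).
struct HostLimiter {
    max_requests: u32,
    window: Duration,
    window_start: Instant,
    count: u32,
    block_duration: Duration,
    blocked_until: Option<Instant>,
}

impl HostLimiter {
    fn new(max_requests: u32, window_secs: u64, block_secs: u64) -> Self {
        HostLimiter {
            max_requests,
            window: Duration::from_secs(window_secs),
            window_start: Instant::now(),
            count: 0,
            block_duration: Duration::from_secs(block_secs),
            blocked_until: None,
        }
    }

    /// Returns true if the request may be proxied, false if the
    /// static "limiter active" page should be served instead.
    fn allow(&mut self, now: Instant) -> bool {
        if let Some(until) = self.blocked_until {
            if now < until {
                return false; // still inside the block_duration lockout
            }
            self.blocked_until = None;
        }
        if now.duration_since(self.window_start) >= self.window {
            // Window elapsed: reset the counter.
            self.window_start = now;
            self.count = 0;
        }
        self.count += 1;
        if self.count > self.max_requests {
            self.blocked_until = Some(now + self.block_duration);
            return false;
        }
        true
    }
}

fn main() {
    // 3 requests allowed per 60 s window, 60 s lockout once exceeded.
    let mut limiter = HostLimiter::new(3, 60, 60);
    let now = Instant::now();
    let results: Vec<bool> = (0..5).map(|_| limiter.allow(now)).collect();
    println!("{:?}", results); // [true, true, true, false, false]
}
```

Each virtual host would hold its own instance of such a counter, alongside the shared default one.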

Some extras

Alongside the base requirements we also have:

  • An option to let past visitors (seen within a configurable timeframe) through; in other words, only blocking new visitors when the website overload conditions are met;
  • Logging to standard output, now an industry standard for easy integration with containers or systemd;
  • Some way to know when a website is in overload mode.
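The "past visitors" option boils down to remembering when each client IP was last seen. A minimal sketch of that bookkeeping, with hypothetical names (the real service's data structures may differ):

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Remembers when each client IP was last seen; while the limiter is
/// active, IPs seen within `timeframe` are still let through.
struct PastVisitors {
    last_seen: HashMap<IpAddr, Instant>,
    timeframe: Duration,
}

impl PastVisitors {
    fn new(timeframe_secs: u64) -> Self {
        PastVisitors {
            last_seen: HashMap::new(),
            timeframe: Duration::from_secs(timeframe_secs),
        }
    }

    /// Record a sighting of this client IP.
    fn note(&mut self, ip: IpAddr, now: Instant) {
        self.last_seen.insert(ip, now);
    }

    /// Was this IP seen recently enough to bypass the active limiter?
    fn is_known(&self, ip: IpAddr, now: Instant) -> bool {
        match self.last_seen.get(&ip) {
            Some(seen) => now.duration_since(*seen) <= self.timeframe,
            None => false,
        }
    }
}

fn main() {
    let mut pv = PastVisitors::new(3600); // one-hour timeframe
    let ip: IpAddr = "192.0.2.1".parse().unwrap();
    let now = Instant::now();
    pv.note(ip, now);
    println!("{}", pv.is_known(ip, now)); // true: seen within the hour
}
```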


The limiter proxy is an HTTP server as we need access to information at the application layer.

The easiest setup is to put it between the "front" web server (the one that does SSL termination) and the actual application.

In some cases, as pictured here below, we're looping back to the front server to go to the application (for instance for PHP):

[Schematic showing the flow of requests described above]

In this specific case the proxy can be bound to localhost only.

With the "return to front server" setup, we don't need a different return port per website: the Host header is passed through by the proxy, so virtual hosting works behind the proxy as well.
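Routing on the Host header amounts to a lookup in a per-host table with a fallback to the default target. A small sketch, assuming a hypothetical HostConfig type (the field and function names are illustrative, not from the actual service):

```rust
use std::collections::HashMap;

/// Hypothetical per-virtual-host settings.
#[derive(Clone)]
struct HostConfig {
    proxy_to: String,
}

/// Pick the upstream for a request from its Host header, falling back
/// to the default target when the hostname is unknown.
fn upstream_for<'a>(
    hosts: &'a HashMap<String, HostConfig>,
    default: &'a HostConfig,
    host_header: &str,
) -> &'a HostConfig {
    // Ignore an optional ":port" suffix in the Host header.
    let name = host_header.split(':').next().unwrap_or(host_header);
    hosts.get(name).unwrap_or(default)
}

fn main() {
    let mut hosts = HashMap::new();
    hosts.insert(
        "example.org".to_string(),
        HostConfig { proxy_to: "127.0.0.1:8081".to_string() },
    );
    let default = HostConfig { proxy_to: "127.0.0.1:8080".to_string() };

    // Known host goes to its own target, anything else to the default.
    println!("{}", upstream_for(&hosts, &default, "example.org:443").proxy_to);
    println!("{}", upstream_for(&hosts, &default, "unknown.test").proxy_to);
}
```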

Choice of technology

We chose Rust, out of familiarity with the language, together with the Warp web framework.

In practice the proxy is faster than Apache or Nginx in terms of request throughput and could probably serve many sites with no issues.

Using Go would probably have been a good choice as well, mostly because its HTTP libraries are the industry standard and are maintained by the Go developers themselves.


For the moment we use a configuration file loaded when the service starts or restarts.

The Rust crate "config" is really good and can parse and combine multiple formats.

We usually use the toml format, here's an example:

port = 8888
# IP addresses to always let through.
# Careful with proxies and the
# X-Forwarded-For header being absent.
# Leave empty to disable.
ip_whitelist = []

# Looks for this page in "proxy-static-content":
limiter_active_page = "limiter-active.html"

# Default value for letting past visitors in when
# limiter is active:
allow_past_visitors = true

# Proxies to this host by default:
proxy_to = ""

# Options for the default limiter: either use
# max_response_time (which is an average) or
# the duo of options max_requests over
# max_requests_time (in seconds), giving a max
# request rate to allow.
# These are all optional; the default is to use
# the classic limiter with values 120/60.
max_requests = 1200
max_requests_time = 60

# How long do we lock access (seconds):
block_duration = 60
# Uncomment to use the response time limiter.
# Value is in milliseconds.
# max_response_time = 8000

[hosts.site1]
hostnames = []
proxy_to = ""

[hosts.site2]
hostnames = []
proxy_to = ""
allow_past_visitors = false

The service then builds a list of the virtual hosts described in the [hosts.<NAME>] sections which can each have their own parameters for limiting requests.

When limiting is active, we immediately respond with the static page configured for that virtual host.

The proxy can serve static assets directly from a special path, for instance "/proxy-static-assets-dir". Requests to that path are never passed through to applications even when limiting is not active.

In the example above, we do not define different limits per virtual host, so the default config of a max of 1200 requests over 60 seconds applies.

That value averages to 20 requests per second, which is very low but many complex PHP apps have trouble doing much better and benefit the most from having such a proxy up front.

In many cases the allowed request rate would be much higher, especially when there are a lot of requests for static assets because these get counted in the requests per second metric but they're much faster than requests involving the application, database, etc.

The max average response time mode was implemented to solve that issue. The proxy actually ignores requests that take less than 50 ms when that mode is active.
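The averaging with the 50 ms cutoff can be sketched as follows; this is an illustrative rolling average, not the service's exact bookkeeping, and the names are made up:

```rust
/// Average of response times, skipping requests under 50 ms so that
/// fast static-asset hits don't drag the average down (sketch).
struct ResponseTimeTracker {
    total_ms: u64,
    samples: u64,
}

impl ResponseTimeTracker {
    fn new() -> Self {
        ResponseTimeTracker { total_ms: 0, samples: 0 }
    }

    fn record(&mut self, response_ms: u64) {
        if response_ms < 50 {
            return; // ignore fast requests (static assets, cached pages)
        }
        self.total_ms += response_ms;
        self.samples += 1;
    }

    fn average_ms(&self) -> u64 {
        if self.samples == 0 { 0 } else { self.total_ms / self.samples }
    }

    /// True when the average exceeds the configured threshold
    /// (e.g. 8000 ms in the config example above).
    fn over_limit(&self, max_avg_ms: u64) -> bool {
        self.average_ms() > max_avg_ms
    }
}

fn main() {
    let mut t = ResponseTimeTracker::new();
    for ms in [10u64, 20, 400, 600] {
        t.record(ms);
    }
    println!("average: {} ms", t.average_ms()); // 500: the 10/20 ms hits are ignored
}
```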

Status pages

A protected special endpoint can show the status of the different virtual hosts, including the "default" one.

It currently answers in JSON and could look something like this:

{
  "host_id": 1,
  "hostnames": [],
  "visitors": [
    { "response_time": 392, "remote_addr": "" },
    { "response_time": 1287, "remote_addr": "" },
    { "response_time": 204, "remote_addr": "" }
  ],
  "is_limiter_active": false,
  "request_count": 13,
  "average_response_time_ms": 627,
  "secs_since_last_update": 30
}

Possible improvements

We've since had issues with bots producing huge volumes of requests to pretty costly pages.

It surprisingly happened with the Facebook bot, but the usual culprits nowadays are AI bots.

We have many ways of limiting such bots, some of which could be the subject of future articles.

However, it would be nice to be able to rate limit them with the proxy and just respond with a simple HTTP 429 error (Too Many Requests).

Most bots understand that status as "try again later".

They can be recognized by their User-Agent header, so the filter would be more akin to rate limiting a list of User-Agents than the IP addresses we use now.
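Such a filter could start as a simple substring match on the User-Agent header, since bot UAs usually embed a stable token. A hypothetical sketch (the function, the token list and the matching strategy are assumptions, not the planned implementation):

```rust
/// Is this User-Agent on the rate-limited bot list?
/// Case-insensitive substring matching against known bot tokens.
fn is_limited_bot(user_agent: &str, limited: &[&str]) -> bool {
    let ua = user_agent.to_ascii_lowercase();
    limited
        .iter()
        .any(|token| ua.contains(&token.to_ascii_lowercase()))
}

fn main() {
    // Example tokens; a real list would be configurable per host.
    let limited = ["facebookexternalhit", "GPTBot"];
    let ua = "Mozilla/5.0 (compatible; GPTBot/1.0)";
    if is_limited_bot(ua, &limited) {
        // In the proxy this would become an HTTP 429 Too Many Requests
        // response once that bot's own counter is exceeded.
        println!("429 Too Many Requests");
    }
}
```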

Another improvement would be to only rate limit some URLs, e.g. the most costly ones that tend to be abused, or the login endpoint to add brute-force protection.