perf: scale workers + per-tablet rate limiting for 20 concurrent users

The default 2-worker gunicorn could only serve 2 concurrent tablet requests, queueing the rest, and the rate limiter saw every tablet as the same Nginx container IP, so 20 users would have collectively burned through the 100 req/min general bucket.

- gunicorn: 5 workers x 4 gthread, --forwarded-allow-ips=*, access log
- uvicorn: 4 workers, --proxy-headers, --forwarded-allow-ips=*
- RateLimitMiddleware: resolve real client IP from X-Forwarded-For -> X-Real-IP -> request.client.host
- Bump rate_limit_general 100 -> 300 req/min/IP (per tablet now)
- Flask: ProxyFix(x_for=1, x_proto=1, x_host=1) so request.remote_addr is the tablet IP, not the Nginx IP
- APIClient: forward X-Forwarded-For + X-Real-IP to FastAPI for both JSON and multipart/files calls; safe no-op outside request context
- 12 new tests (7 server + 5 client) covering header precedence, forwarding behavior and ProxyFix install

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
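
The shared-bucket problem described above can be illustrated with a small standalone sketch. It is not the project's RateLimitMiddleware: the SlidingWindowLimiter class, the count_rejected helper, the example IP addresses, and the burst of 6 requests per tablet are all hypothetical, chosen only to show why keying on the proxy IP rejects traffic that per-tablet keys would allow.

```python
from collections import defaultdict

WINDOW_SECONDS = 60.0  # assumed one-minute window, matching "req/min"


class SlidingWindowLimiter:
    """Hypothetical stand-in for a per-key sliding-window limiter."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._hits = defaultdict(list)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: float) -> bool:
        # Keep only timestamps that are still inside the window.
        cutoff = now - WINDOW_SECONDS
        recent = [t for t in self._hits[key] if t > cutoff]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self._hits[key] = recent
        return allowed


def count_rejected(limiter, keys, requests_per_key):
    """Send requests round-robin over the keys and count the rejections."""
    rejected = 0
    now = 0.0
    for _ in range(requests_per_key):
        for key in keys:
            if not limiter.allow(key, now):
                rejected += 1
            now += 0.01  # requests arrive 10 ms apart
    return rejected


# Before: every tablet appears as the same (made-up) Nginx container IP,
# so 20 tablets x 6 requests all land in one 100 req/min bucket.
shared = count_rejected(SlidingWindowLimiter(100), ["172.18.0.5"] * 20, 6)

# After: each tablet is keyed by its own (made-up) IP with a 300 req/min bucket.
tablets = [f"10.0.0.{i}" for i in range(1, 21)]
per_tablet = count_rejected(SlidingWindowLimiter(300), tablets, 6)

print(shared, per_tablet)
```

In this sketch the shared key rejects the requests beyond the first 100, while the per-tablet keys reject none, because each tablet only spends 6 of its own 300.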
@@ -32,6 +32,25 @@ class RateLimitMiddleware(BaseHTTPMiddleware):
         self._general_requests: dict[str, list[float]] = defaultdict(list)
         self._request_count = 0 # Counter for triggering eviction

+    @staticmethod
+    def _client_ip(request: Request) -> str:
+        """Resolve the originating client IP, honoring proxy headers.
+
+        Order of precedence: ``X-Forwarded-For`` (first hop), ``X-Real-IP``,
+        ``request.client.host``. Required because Nginx and the Flask client
+        sit between the tablet and the API; without parsing these headers
+        every tablet shares one bucket.
+        """
+        xff = request.headers.get("x-forwarded-for")
+        if xff:
+            first = xff.split(",")[0].strip()
+            if first:
+                return first
+        real = request.headers.get("x-real-ip")
+        if real:
+            return real.strip()
+        return request.client.host if request.client else "unknown"
+
     def _clean_window(self, timestamps: list[float], now: float) -> list[float]:
         """Remove timestamps outside the current sliding window."""
         cutoff = now - self.WINDOW_SECONDS
@@ -68,7 +87,7 @@ class RateLimitMiddleware(BaseHTTPMiddleware):
         return True, 0

     async def dispatch(self, request: Request, call_next: Callable) -> Response:
-        client_ip = request.client.host if request.client else "unknown"
+        client_ip = self._client_ip(request)
         now = time.time()
         path = request.url.path
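
The precedence that _client_ip implements can be exercised without Starlette using a plain-dict stand-in. This is a sketch only: resolve_client_ip is a hypothetical name, it takes a lowercase-keyed mapping instead of a real Request (whose headers are case-insensitive), and the addresses are made up. It is not one of the 12 tests mentioned in the commit message.

```python
from typing import Mapping, Optional


def resolve_client_ip(headers: Mapping[str, str], peer_host: Optional[str]) -> str:
    """Mirror the X-Forwarded-For -> X-Real-IP -> peer address precedence."""
    xff = headers.get("x-forwarded-for")
    if xff:
        # The first entry is the address closest to the original client.
        first = xff.split(",")[0].strip()
        if first:
            return first
    real = headers.get("x-real-ip")
    if real:
        return real.strip()
    return peer_host if peer_host else "unknown"


# X-Forwarded-For wins, and only its first hop is used.
both = resolve_client_ip(
    {"x-forwarded-for": "10.0.0.7, 172.18.0.5", "x-real-ip": "10.0.0.9"},
    "172.18.0.5",
)

# An empty first hop falls through to X-Real-IP.
empty_first = resolve_client_ip(
    {"x-forwarded-for": " , 172.18.0.5", "x-real-ip": " 10.0.0.9 "},
    "172.18.0.5",
)

# No proxy headers: use the socket peer, or "unknown" if there is none.
peer_only = resolve_client_ip({}, "172.18.0.5")
no_peer = resolve_client_ip({}, None)

print(both, empty_first, peer_only, no_peer)
```

Because the first X-Forwarded-For entry is supplied by whoever sent the request, this precedence is only trustworthy when the proxy in front overwrites or sanitizes the header rather than appending to a client-provided value.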