perf: scale workers + per-tablet rate limiting for 20 concurrent users

The default 2-worker gunicorn could only serve 2 concurrent tablet requests,
queueing the rest, and the rate limiter saw every tablet as the same Nginx
container IP, so 20 users would have collectively burned through the
100 req/min general bucket.

- gunicorn: 5 workers x 4 gthread, --forwarded-allow-ips=*, access log
- uvicorn: 4 workers, --proxy-headers, --forwarded-allow-ips=*
- RateLimitMiddleware: resolve real client IP from
  X-Forwarded-For -> X-Real-IP -> request.client.host
- Bump rate_limit_general 100 -> 300 req/min/IP (per tablet now)
- Flask: ProxyFix(x_for=1, x_proto=1, x_host=1) so request.remote_addr
  is the tablet IP, not the Nginx IP
- APIClient: forward X-Forwarded-For + X-Real-IP to FastAPI for both
  JSON and multipart/files calls; safe no-op outside request context
- 12 new tests (7 server + 5 client) covering header precedence,
  forwarding behavior and ProxyFix install

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-04-25 12:07:43 +02:00
parent ea8e4687b5
commit 86df67f2e5
8 changed files with 195 additions and 7 deletions
+1 -1
View File
@@ -23,4 +23,4 @@ RUN mkdir -p uploads/images uploads/pdfs uploads/logos uploads/reports
EXPOSE 8000
# Entry point: Alembic upgrade + Uvicorn
CMD ["sh", "-c", "alembic -c migrations/alembic.ini upgrade head && uvicorn main:app --host 0.0.0.0 --port 8000 --workers 2"]
CMD ["sh", "-c", "alembic -c migrations/alembic.ini upgrade head && uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4 --proxy-headers --forwarded-allow-ips='*'"]