Gunicorn and Uvicorn in production — the worker tuning we actually apply

May 12, 2026 · 1 min read · by Sudhanshu K.

Python web services in 2026 split into two camps: traditional WSGI (Django, Flask) served via Gunicorn sync workers, and modern ASGI (FastAPI, Starlette, Django 5 async) served via Uvicorn workers under a Gunicorn master. Both can be production-grade. Neither works well at defaults.

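The single most common misconfiguration we see: (2 × CPU) + 1 workers running against an async ASGI app that should have one Uvicorn worker per core with no concurrency multiplier. The multiplier exists for sync workers, which block on I/O and leave the CPU idle while they wait. A Uvicorn worker is single-threaded too, but its event loop already overlaps those waits, so the extra workers add memory use and context switching rather than throughput.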
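
As a minimal sketch of the two sizing rules, the helper below returns a starting worker count for each worker type. The function name `recommended_workers` and its `kind` labels are hypothetical, chosen only for this illustration, and the result is a starting point to measure against rather than a fixed rule.

```python
import os
from typing import Optional


def recommended_workers(kind: str, cpu_count: Optional[int] = None) -> int:
    """Return a starting worker count for a Gunicorn deployment.

    kind is "sync" for WSGI sync workers or "async" for Uvicorn workers.
    Both labels are hypothetical names used only in this sketch.
    """
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    if kind == "sync":
        # Sync workers block on I/O, so oversubscribe: (2 x CPU) + 1.
        return 2 * cores + 1
    if kind == "async":
        # Each Uvicorn worker overlaps I/O on one event loop: one per core.
        return cores
    raise ValueError(f"unknown worker kind: {kind!r}")


if __name__ == "__main__":
    for kind in ("sync", "async"):
        print(kind, recommended_workers(kind, cpu_count=4))
```

On an assumed 4-core host this gives 9 workers for the sync case and 4 for the async case.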

The Gunicorn + Uvicorn pattern for FastAPI

gunicorn app.main:app \
  --workers "$(nproc)" \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000 \
  --timeout 30 \
  --graceful-timeout 30 \
  --keep-alive 5 \
  --max-requests 10000 \
  --max-requests-jitter 1000 \
  --access-logfile -
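
The same flags can also live in a Gunicorn config file, which is plain Python. The sketch below mirrors the command above setting for setting. The filename `gunicorn.conf.py` is an assumption, and `multiprocessing.cpu_count()` stands in for `nproc`; the two can disagree under CPU affinity masks or container CPU limits, so check the resulting count on the target host.

```python
# gunicorn.conf.py: config-file sketch of the command-line flags above
import multiprocessing

# One Uvicorn worker per core; no (2 x CPU) + 1 multiplier for ASGI apps.
workers = multiprocessing.cpu_count()
worker_class = "uvicorn.workers.UvicornWorker"

bind = "0.0.0.0:8000"

# Seconds before a silent worker is killed, and seconds allowed for
# in-flight requests to finish after a restart signal.
timeout = 30
graceful_timeout = 30

# Seconds to hold an idle keep-alive connection open.
keepalive = 5

# Recycle each worker after roughly this many requests, with jitter so
# the workers do not all restart at once.
max_requests = 10000
max_requests_jitter = 1000

# "-" sends the access log to stdout.
accesslog = "-"
```

With that file in place, the launch command shrinks to `gunicorn -c gunicorn.conf.py app.main:app`.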

One Uvicorn worker per core. Each worker runs a single-threaded event loop. Async concurrency comes from awaiting I/O, not from threading.
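
To make the last point concrete, here is a small standard-library sketch, independent of Gunicorn and Uvicorn. `fake_io` and `handle_many` are hypothetical names, and `asyncio.sleep` stands in for a real network or database call. Many simulated requests finish in roughly the time of one, all on a single thread.

```python
import asyncio
import threading
import time


async def fake_io(delay: float) -> int:
    """Stand-in for an I/O call; returns the id of the thread it ran on."""
    await asyncio.sleep(delay)  # yields to the event loop while "waiting"
    return threading.get_ident()


async def handle_many(n: int, delay: float):
    """Run n simulated requests concurrently on the current event loop."""
    start = time.perf_counter()
    thread_ids = await asyncio.gather(*(fake_io(delay) for _ in range(n)))
    elapsed = time.perf_counter() - start
    return elapsed, set(thread_ids)


if __name__ == "__main__":
    elapsed, threads = asyncio.run(handle_many(n=50, delay=0.1))
    print(f"50 requests in {elapsed:.2f}s on {len(threads)} thread(s)")
```

Run back to back, the 50 waits would take about 5 seconds. Overlapped on one event loop they take close to 0.1 seconds, which is why an async worker does not need a thread or process per in-flight request.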

The full write-up covers:

  • The sync-worker formula (2 × CPU) + 1 and where it's wrong
  • Why --max-requests matters (slow memory creep, GC fragmentation)
  • --timeout vs --graceful-timeout — and the order they fire
  • When gthread (sync with threads) is the right pick
  • ProxyHeadersMiddleware for the X-Forwarded-For chain
  • Reading gunicorn --print-config to verify what's actually loaded
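
As a rough illustration of the `--max-requests` item, and of why the command pairs it with `--max-requests-jitter`: to the best of our understanding, each Gunicorn worker adds a random offset between 0 and the jitter value to its restart threshold, so the workers do not all recycle on the same request count. The sketch below models that rule outside Gunicorn. `restart_thresholds` is a hypothetical helper, not part of any library.

```python
import random
from typing import List, Optional


def restart_thresholds(workers: int, max_requests: int, jitter: int,
                       seed: Optional[int] = None) -> List[int]:
    """Model one restart threshold per worker: max_requests + randint(0, jitter)."""
    rng = random.Random(seed)
    return [max_requests + rng.randint(0, jitter) for _ in range(workers)]


if __name__ == "__main__":
    # 10000 and 1000 come from the command; 4 workers is an assumed core count.
    for i, limit in enumerate(restart_thresholds(4, 10000, 1000, seed=1)):
        print(f"worker {i}: recycles after {limit} requests")
```

With a jitter of 0, every worker would hit its limit at about the same time under evenly spread load, and the service would briefly lose most of its capacity while they all restart.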

We ship this configuration on every managed Python service.
