Gunicorn and Uvicorn in production — the worker tuning we actually apply
May 12, 2026 · 1 min read · by Sudhanshu K.
Python web services in 2026 split into two camps: traditional WSGI (Django, Flask) served via Gunicorn sync workers, and modern ASGI (FastAPI, Starlette, Django 5 async) served via Uvicorn workers under a Gunicorn master. Both can be production-grade. Neither works well at defaults.
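The single most common misconfiguration we see: (2 × CPU) + 1 workers running an async ASGI app that should have one Uvicorn worker per core, with no concurrency multiplier. The multiplier exists so that sync workers blocked on I/O do not leave the CPU idle. A Uvicorn worker is single-threaded, but it already overlaps I/O waits inside its event loop, so the extra workers add memory and context switching without adding throughput.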
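To put the two sizing rules side by side, here is a minimal sketch. The function name `recommended_workers` and the `"sync"` / `"async"` labels are made up for illustration and are not part of Gunicorn or Uvicorn; treat the result as a starting point to measure against, not a guarantee.

```python
import os
from typing import Optional


def recommended_workers(app_kind: str, cores: Optional[int] = None) -> int:
    """Return a starting worker count for a Gunicorn deployment.

    app_kind is a hypothetical label: "sync" for WSGI sync workers,
    "async" for Uvicorn workers serving an ASGI app.
    """
    n = cores if cores is not None else (os.cpu_count() or 1)
    if app_kind == "sync":
        # Sync workers block on I/O, so the usual rule oversubscribes cores.
        return 2 * n + 1
    if app_kind == "async":
        # Each Uvicorn worker overlaps I/O in its own event loop: one per core.
        return n
    raise ValueError(f"unknown app_kind: {app_kind!r}")


print(recommended_workers("sync", 4))   # 9
print(recommended_workers("async", 4))  # 4
```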
The Gunicorn + Uvicorn pattern for FastAPI
    gunicorn app.main:app \
      --workers "$(nproc)" \
      --worker-class uvicorn.workers.UvicornWorker \
      --bind 0.0.0.0:8000 \
      --timeout 30 \
      --graceful-timeout 30 \
      --keep-alive 5 \
      --max-requests 10000 \
      --max-requests-jitter 1000 \
      --access-logfile -

One Uvicorn worker per core. Each worker runs a single-threaded event loop. Async concurrency comes from awaiting I/O, not from threading.
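The same flags can also live in a Gunicorn config file, which is plain Python. The sketch below mirrors the command above one setting at a time. The file name `gunicorn.conf.py` is an assumption, and `os.cpu_count()` stands in for `nproc`: it counts every CPU on the host, so the two may disagree where CPU affinity is restricted.

```python
# gunicorn.conf.py: a sketch mirroring the command-line flags above.
import os

# One Uvicorn worker per core, no concurrency multiplier.
workers = os.cpu_count() or 1
worker_class = "uvicorn.workers.UvicornWorker"

bind = "0.0.0.0:8000"

# Seconds; same values as --timeout and --graceful-timeout.
timeout = 30
graceful_timeout = 30

# Seconds to hold an idle keep-alive connection open (--keep-alive).
keepalive = 5

# Recycle each worker after roughly this many requests, with jitter.
max_requests = 10000
max_requests_jitter = 1000

# "-" sends the access log to stdout (--access-logfile -).
accesslog = "-"
```

With this file in place, the launch command shrinks to `gunicorn -c gunicorn.conf.py app.main:app`.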
The full write-up covers:
- The sync-worker formula (2 × CPU) + 1 and where it's wrong
- Why --max-requests matters (slow memory creep, GC fragmentation)
- --timeout vs --graceful-timeout, and the order they fire
- When gthread (sync with threads) is the right pick
- ProxyHeadersMiddleware for the X-Forwarded-For chain
- Reading gunicorn --print-config to verify what's actually loaded
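As a rough illustration of how --max-requests and --max-requests-jitter work together, the sketch below gives each worker its own restart threshold. It assumes each worker adds a random integer between 0 and the jitter value to the base limit, which is our understanding of Gunicorn's behaviour rather than a copy of its code; `restart_thresholds` and its `seed` parameter are hypothetical names used only here.

```python
import random
from typing import List, Optional


def restart_thresholds(
    workers: int, max_requests: int, jitter: int, seed: Optional[int] = None
) -> List[int]:
    """Return one request limit per worker, each offset by random jitter."""
    rng = random.Random(seed)
    # Assumption: the offset is drawn uniformly from 0..jitter inclusive.
    return [max_requests + rng.randint(0, jitter) for _ in range(workers)]


# Four workers with the values from the command above.
limits = restart_thresholds(workers=4, max_requests=10000, jitter=1000, seed=0)
print(limits)
```

Because the limits differ, workers that start together and receive similar traffic do not all recycle at the same moment.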
We ship this configuration on every managed Python service.