Celery in production — broker choice, retry semantics, and what Flower actually tells you
May 13, 2026 · 1 min read · by Sudhanshu K.
Celery is one of those tools whose defaults are almost right. Almost. The default retry policy has no backoff: a retried task waits a fixed three minutes and gives up after three attempts, while max_retries=None retries forever. acks_late defaults to False, so a worker crash mid-task drops the job. The usual "quick start" broker is Redis with no persistence configured, so a Redis restart loses every in-flight task.
This is the production Celery setup we run for managed Python customers — broker choice, worker config, retry policy, and the monitoring that actually surfaces problems.
A baseline task with the safe defaults
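    from celery import Celery, shared_task

    app = Celery('app', broker='redis://redis:6379/0', backend='redis://redis:6379/1')

    app.conf.update(
        task_acks_late=True,                      # ack after the task has run, not on receipt
        task_reject_on_worker_lost=True,          # requeue if the worker process dies mid-task
        task_track_started=True,                  # report a STARTED state
        worker_prefetch_multiplier=1,             # reserve one task per worker process
        broker_connection_retry_on_startup=True,  # keep retrying the broker at boot
    )

    @shared_task(bind=True, autoretry_for=(IOError,), retry_backoff=True,
                 retry_jitter=True, retry_kwargs={'max_retries': 5})
    def send_email(self, to, subject):
        ...

task_acks_late=True and worker_prefetch_multiplier=1 together mean: take one task at a time, and ack it only after it has run. A task that was running when its worker died is redelivered rather than lost. Combined with retry_backoff and a max_retries cap, a failing job waits longer between attempts and stops after five retries instead of disappearing into a retry loop.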
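To make the backoff concrete, here is a minimal sketch of the delay schedule that retry_backoff=True with max_retries=5 should produce. It assumes Celery's documented behaviour: a backoff factor of 1 second, doubling per retry, a cap of 600 seconds (the retry_backoff_max default), and jitter that picks a random whole number of seconds between 0 and the computed delay. The function names backoff_delay and schedule are hypothetical and not part of Celery.

```python
import random


def backoff_delay(retries, factor=1, maximum=600, jitter=True):
    """Seconds to wait before retry number `retries` (0-based)."""
    # Exponential growth, capped at `maximum`.
    delay = min(maximum, factor * (2 ** retries))
    if jitter:
        # Full jitter: a random whole number of seconds in [0, delay].
        delay = random.randrange(delay + 1)
    return max(0, delay)


def schedule(max_retries=5, **kwargs):
    """Delays for each retry attempt, in order."""
    return [backoff_delay(n, **kwargs) for n in range(max_retries)]


if __name__ == "__main__":
    print(schedule(jitter=False))  # [1, 2, 4, 8, 16]
    print(schedule())              # jittered: each value is at most the one above
```

Without jitter the five retries span about half a minute in total. With jitter the waits are shorter on average, but retries from many failing tasks are spread out instead of hitting the downstream service at the same instant.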
The full write-up covers:
- Redis vs RabbitMQ: when each is the right broker
- Idempotency — the property tasks should have but rarely do
- Routing tasks to dedicated queues for prioritization and isolation
- Flower: the metrics that matter (queue depth, task latency, retry rate)
- Celery beat for scheduled tasks — and the leader-election story that nobody warns you about
- Worker lifecycle: max-tasks-per-child, memory leaks, the OOM-killer interaction
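Idempotency matters here because late acks and retries both mean a task can run more than once. The following is a rough sketch of one common approach, a dedupe key claimed before the side effect, and not necessarily what the full write-up recommends. InMemoryDedupe, send_email_once, and the message_id argument are all hypothetical names; in a real deployment the store would be something shared between workers (for example a Redis key set only if absent), not a Python set.

```python
class InMemoryDedupe:
    """Hypothetical stand-in for a store shared by all workers."""

    def __init__(self):
        self._seen = set()

    def claim(self, key):
        """Return True the first time `key` is claimed, False afterwards."""
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


sent = []  # stand-in for the real side effect (an SMTP call, an API request)


def send_email_once(store, message_id, to, subject):
    """Perform the side effect at most once per message_id."""
    if not store.claim(message_id):
        return "skipped"  # a redelivery or retry of work already claimed
    sent.append((to, subject))
    return "sent"
```

Calling send_email_once twice with the same message_id performs the send once and skips the second call. Claiming before the side effect gives at-most-once behaviour: if the worker dies between the claim and the send, the email is never sent. Recording the key after the side effect flips that into at-least-once with a small window for duplicates. Which trade-off is right depends on the task.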
We ship this Celery pattern on every managed Python stack with background work.