PyCon Israel 2021

Leaving Celery in the Dust: How to truly scale in production
2021-05-03, 10:30–10:55, General Track 1

When we first developed our system, we picked Celery because of its wide community adoption. When we started scaling, we realized Celery was holding us back in many ways. We decided to replace Celery with our own technology.

In the early days at Intsights, we architected our platform around a distributed task queue. Looking for an existing library and tools to support this approach, we found Celery. According to Celery's documentation, Celery is a simple, flexible, and reliable distributed system to process vast amounts of messages while providing operations with the tools required to maintain such a system. It's a task queue with a focus on real-time processing while also supporting task scheduling.

For Intsights, Celery did not live up to its promise. It did not scale and was heavily bloated with metrics and communication overhead. Moreover, Celery did not provide enough thread/process safety to handle problematic workloads that might fail, crash, or get stuck in special cases, such as a GIL held indefinitely by a runaway regex.
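
To make the stuck-regex case concrete, here is a minimal sketch of process-level isolation. It is not how Celery or our library is implemented; the names `RUNAWAY_TASK` and `run_with_hard_timeout` are hypothetical and exist only for illustration. The idea is that a regex with catastrophic backtracking holds the GIL inside the interpreter, so a thread-based watchdog in the same process may never get a chance to run, while a parent process can still kill the child from the outside.

```python
import subprocess
import sys

# Hypothetical task body: a regex with catastrophic backtracking.
# Matching never finishes in practice and holds the GIL while it runs.
RUNAWAY_TASK = r"""
import re
re.match(r'(a+)+$', 'a' * 64 + 'b')
"""


def run_with_hard_timeout(source: str, timeout_seconds: float) -> str:
    """Run Python source in a child process and kill it on timeout.

    Returns "ok", "failed" (non-zero exit code), or "killed" (timed out).
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", source],
            timeout=timeout_seconds,  # the child is killed when this expires
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "killed"
    return "ok" if completed.returncode == 0 else "failed"


if __name__ == "__main__":
    print(run_with_hard_timeout("pass", 30.0))
    print(run_with_hard_timeout(RUNAWAY_TASK, 1.0))
```

Because the timeout is enforced by the parent, the worker itself stays responsive no matter what the task does inside the child interpreter.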

At some point, we realized that our only option was to develop our own solution. We decided to stop chasing Celery bugs and focus on what fits Intsights best. At first, we were heavily inspired by Celery's design. We implemented result backends, timeouts, and pipelines. We stuck to Celery's terminology to make the migration easier. Later, we ditched most of these practices and introduced our own.

Today, we have a high-performance, highly stable, and safe library that fits our use case perfectly: Sergeant. It is meant to be very simple, very fast, very stable, and safe. Still, many features are missing or were deliberately left out. We only support Mongo and Redis as backends. We do not guarantee consistent task ordering or consumption. These compromises keep the library simple to maintain and let us focus on stability and performance. The library supports only Python 3.6 and above and provides full type annotations and test coverage.

Session language – English
Target audience – Developers, DevOps, R&D