Why I Finally Pulled the Plug on Polars and Moved to DuckDB
I finally hit the point every engineer eventually reaches with a tool they once loved: frustration builds quietly over time and then suddenly flips into a decision, not because of one catastrophic failure but because too many small ones have piled up. That was me with Polars. After years of using it, promoting it, and putting it into production workloads early on, I removed it entirely from a set of critical pipelines and replaced it with DuckDB without much hesitation.
Before anyone jumps in to defend Polars or call this an overreaction, it is worth understanding that this was not a spur-of-the-moment decision. This was the result of years of real-world usage, repeated friction, and a growing realization that what matters most in a production data platform is not theoretical performance or benchmark wins, but consistency, predictability, and trust that the tool behaves the same way every time it runs.
If you rewind to 2022 and 2023, I was one of the early adopters pushing Polars pretty hard. I liked what it represented. It was fast, written in Rust, and it gave data engineers a way to process large datasets on a single machine without immediately defaulting to Spark. That alone made it incredibly compelling. At the time, it felt like the future, especially for workloads that did not need distributed compute but still required serious performance.
I did not just experiment with it casually either. I used it across a wide range of workloads, including production systems, and wrote quite a bit about it publicly. I was all in. And to be fair, it delivered in many ways. The performance was impressive, the API was clean, and the ability to stream data through transformations on a single node opened up many interesting architectural possibilities.
But over time, something started to shift. The issues I encountered were not always dramatic, but they were persistent. Memory problems would appear in certain scenarios. Behavior would change depending on the environment or dependency versions. Some issues that clearly affected multiple users would appear in public threads, only to be dismissed or closed in ways that did not inspire confidence. That alone is not enough to abandon a tool, but it plants a seed.
The real turning point came more recently when I rebuilt a set of AWS Lambda functions that relied on Polars to read data from S3, transform it, and write it back. This was not a major refactor. I was not changing infrastructure or introducing new dependencies. I was simply updating some logic, using a pinned version of Polars inside a standard AWS Python base image.
It looked something like this in spirit:
COPY . ./
RUN pip3 install polars==1.31.0
Nothing unusual, nothing experimental, just a straightforward Lambda setup.
And yet, the next day, things were broken.
This is the kind of failure that drives engineers crazy because it violates expectations. When you change one small piece of logic and something unrelated stops working, you immediately lose confidence in the system. Yes, you can argue that this is part of the reality of software engineering, that Python environments are fragile, that dependency chains are complex, and that these things happen. All of that is true.
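One way to narrow down this class of failure is to record exactly what ended up in each image and compare builds. The sketch below is only an illustration, and it assumes the breakage comes from something drifting in the environment rather than from the pinned package itself, which I never confirmed. The function names snapshot_environment and diff_snapshots are hypothetical; the only library used is importlib.metadata from the Python standard library.

```python
import json
import sys
from importlib import metadata


def snapshot_environment() -> dict:
    """Return the interpreter version and the version of every installed distribution."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with missing or broken metadata
            packages[name.lower()] = dist.version
    return {
        "python": sys.version.split()[0],
        "packages": dict(sorted(packages.items())),
    }


def diff_snapshots(old: dict, new: dict) -> dict:
    """Map each package whose version differs to (old_version, new_version)."""
    old_pkgs, new_pkgs = old["packages"], new["packages"]
    changed = {}
    for name in sorted(set(old_pkgs) | set(new_pkgs)):
        before, after = old_pkgs.get(name), new_pkgs.get(name)
        if before != after:
            changed[name] = (before, after)
    return changed


if __name__ == "__main__":
    # Print the snapshot so it can be captured in a build log and diffed later.
    print(json.dumps(snapshot_environment(), indent=2))
```

Dumping this at build time, or on a cold start, turns "what changed since yesterday" into a diff instead of a guess. It does not fix anything on its own, and it says nothing about changes in the base image outside of Python packages.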
But there is also a practical side to this. In the real world, we are dealing with limited time, competing priorities, and systems that need to run reliably without constant babysitting. When a tool repeatedly introduces uncertainty into that equation, it becomes harder to justify keeping it in the stack.
I spent some time trying to debug the issue, even pulling in help to trace through what might be happening under the hood, but I found myself asking a more important question. Why am I doing this again?
This was not the first time I had run into issues with Polars, especially when dealing with S3. There have been multiple reports and examples of inconsistent behavior when reading CSV files from S3, including differences between read_csv and scan_csv, failures related to HEAD requests, credential resolution issues, and even crashes during execution. Some of these issues are still open, some have been closed, and others resurface in slightly different forms.
At a certain point, the details of each individual bug stop mattering. What matters is the pattern.
That pattern, at least in my experience, is that Polars can be incredibly fast and powerful under the right conditions, but those conditions are not always easy to guarantee in real production environments, especially when cloud storage, authentication layers, and containerized runtimes are involved.
So I made the call.
I removed Polars from those pipelines and replaced it with DuckDB.
And almost immediately, things felt simpler.
DuckDB is not perfect, and it is not trying to be everything, but it has a different kind of stability to it. The S3 integrations feel more predictable. The execution model is easier to reason about. When something works, it tends to keep working unless you intentionally change it. That level of consistency matters more to me than squeezing out marginal performance gains in ideal scenarios.
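To make the replacement concrete, here is a minimal sketch of the read-from-S3, transform, write-back shape in DuckDB. It is not my production code. The helper names sql_literal, build_copy_sql, and run are hypothetical, the SELECT * is a placeholder for whatever transformation the pipeline actually needs, and the S3 URIs in the usage line are made up. It also assumes a DuckDB version that includes the secrets manager and that credentials resolve through the standard AWS chain. Inside Lambda, extension installation may need extra setup, such as a writable home directory, which is not shown here.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_copy_sql(source_uri: str, target_uri: str) -> str:
    """Build one statement that reads a CSV and writes the result as Parquet."""
    # SELECT * stands in for the real transformation logic.
    return (
        f"COPY (SELECT * FROM read_csv({sql_literal(source_uri)})) "
        f"TO {sql_literal(target_uri)} (FORMAT PARQUET)"
    )


def run(source_uri: str, target_uri: str) -> None:
    """Execute the copy against S3 using an in-memory DuckDB connection."""
    # Imported here so the SQL builder above works without DuckDB installed.
    import duckdb

    con = duckdb.connect()  # in-memory database
    try:
        con.execute("INSTALL httpfs")
        con.execute("LOAD httpfs")
        # Assumption: credentials come from the default AWS provider chain.
        con.execute("CREATE SECRET (TYPE S3, PROVIDER CREDENTIAL_CHAIN)")
        con.execute(build_copy_sql(source_uri, target_uri))
    finally:
        con.close()
```

Calling run("s3://example-bucket/in.csv", "s3://example-bucket/out.parquet") would perform the whole round trip in a single statement. Having the read, the transform, and the write expressed as one piece of SQL is a large part of why the execution model feels easier to reason about.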
This decision also forced me to step back and think about what I actually value in the tools I choose for building data platforms. It is easy to get caught up in debates about speed, architecture, or language design, but at the end of the day, the job is to deliver reliable systems that produce correct results consistently. Anything that gets in the way of that, no matter how impressive it looks on paper, becomes a liability.
That does not mean I am completely abandoning Polars. There are still scenarios where it makes sense, especially for local data exploration, single-node analytics, or tightly controlled environments where you can manage dependencies and inputs carefully. In those contexts, it can still shine.
But for production workloads where reliability is non-negotiable, I am far more cautious now.
Once you have been burned a few times by the same class of issues, you start to change how you evaluate tools. You stop asking what is fastest and start asking what will fail the least in the environments you actually operate in.
For me, right now, that answer is DuckDB.


