, , ,

The Case of the Mysterious Recursive CTE

I still remember that day. A day that shall live on in infamy in my mind. Well over a decade ago, in the days when SQL Server roamed the land devouring souls on the Altar of Stored Procedures. There was only one tool available at the time. SQL. That’s it. There was one problem that had to be solved.

The answer? A recursive CTE.

At the same time … both a demon of the dark and a shining angel from the heavens. Just depends on your view.

CTEs, and recursive ones at that.

All these years later, in my own age of enlightenment, I look back on such atrocities and ask “Why?” Who would do such a thing, what an unruly evil to unleash on the world. Who knows. That exact CTE is probably still running and breaking the brain of some analyst 15 years later. We can only hope.

What?

A CTE, or Common Table Expression, is a feature in SQL that allows you to create a temporary result set that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement. A CTE is useful for breaking down complex queries into simpler parts or for reusing a calculation multiple times within a query.

CTEs can make your queries more readable and can also be a means to write more maintainable SQL code.

What does a basic CTE look like?

CTE Benefits

From my perspective SQL Common Table Expressions (CTEs) provide multiple advantages, not all of them necessarily technical in nature.

  • Improved Readability
    • CTEs can break down complex queries into simpler parts. This makes the SQL code easier to understand, especially when you’re dealing with large queries.
  • Modularity
    • You can define a CTE once and then reference it multiple times in the main query. This modular approach can lead to cleaner, DRY (Don’t Repeat Yourself) code.
  • Recursive Queries
    • One of the standout features of CTEs is the ability to perform recursive operations. This is useful when dealing with hierarchical data, like organizational structures or tree-like data models.
  • Temporary Computation Storage
    • You can use CTEs to store intermediate results. This can be handy if a particular computation is going to be used multiple times in the main query. Instead of computing it every time, you can compute it once in the CTE and then reference it.
  • Sequential Processing
    • In scenarios where you need to perform operations in a specific sequence, CTEs can be invaluable. You can define multiple CTEs in a single `WITH` clause and reference them in sequence or use one CTE in the definition of another.
  • Maintainability
    • Because CTEs can make complex queries more readable, they can also make the codebase easier to maintain. If something changes in the logic, you might only need to adjust a specific CTE instead of digging through a lengthy query.
  • Debugging
    • When developing or troubleshooting SQL code, you can isolate specific parts of a query within a CTE, making it easier to test, debug, or optimize.
  • Performance Tuning
    • While CTEs in and of themselves don’t guarantee performance improvements, they can sometimes make it easier to write efficient queries by allowing the developer to structure the query in a way that’s optimal for the database’s query optimizer.
  • Substitution for Subqueries
    • In many cases, CTEs can be used as a more readable alternative to derived tables or subqueries.

Now, time to do the fun stuff. Recursive. I’m always torn. Anytime I see a recursive problem being solved in SQL I get a slight chill running up my back. It just seems more like a “programming”  problem, not a SQL problem. Why? Because SQL is made for working with data in sets. That’s its power.

When we do things row by row, it can get slow and doesn’t really “fit” well with what SQL is good at. Working on groups and sets of data. But, it is powerful for sure. Recursive SQL, sounds like something Gandalf would use. Let’s see an example, one from an old job of mine, 10+ years ago in a manufacturing context.

The Problem to Solve.

What if we have Bill Of Materials (BOM) that is of unknown depth to us?

A Bill Of Materials (BOM) typically lists parts and components and how many of each you need to make a product. For our example, let’s consider a scenario where we’re building a computer. The BOM might break down components like the motherboard, CPU, RAM, etc. Some of these components might further break down into sub-components.

First, let’s define a simple table structure for the BOM:

Now, let’s insert some data:

Now, we can create a recursive CTE to fetch the entire hierarchy:

The above recursive CTE starts with top-level items and then repeatedly joins the BOM table with itself, navigating down the hierarchy level by level, until all parts and sub-parts are covered. The Level column indicates the depth in the hierarchy, with 1 being the top level.
The key to note is how in the CTE, it joins on itself. 
And there you have it, in all its glory, a recursive CTE. Utterly wonderful, and utterly useless. Don’t do it.