Exploring Graphs in Rust. Yikes.

I’ve been a dog licking my wounds for some time now. Over on my Substack newsletter, I’ve been doing a small series on DSA (Data Structures and Algorithms). I tackled some of the easier stuff first, like Linked Lists, Binary Search, and the like. What’s more, I actually did most of it in Rust, since I’ve possibly, maybe slightly, ever so slightly, fallen in love with Rust.

Like most relationships, it vacillates between pure adoration and utter hatred, depending on the problem at hand. When I did a recent article on Graphs, Queues, and BFS, I attempted it in Rust and was struck a mighty blow; that borrow checker had me down. It seemed doable, but under time pressure to get the Newsletter out, I reverted to Python and moved on.

Alas, I’m back again, a glutton for punishment. This time I thought I should take another crack at parsing a graph with Rust, but in a real-life situation, no more made-up stuff. Actual data, actual graph, here we go. All code is on GitHub.


Conceptual Introduction to Delta Lake.

Old Dog Learn New Tricks? Starburst (Trino) Galaxy and other thoughts.

Sometimes I think Data Engineering is the same as it was 10+ years ago when I started doing it, and sometimes I think everything has changed. It’s probably both. In some ways, the underlying concepts have not moved an inch; certain truths and axioms still rule over us all like some distant landlord, requiring us to pay the piper at a moment’s notice. Still, for all the things that haven’t changed, the size, velocity, and types of data have exploded. Data sources have run wild, along with multiple cloud providers and a plethora of tooling.

So yes, maybe in a lot of ways Data Engineering has changed, or at least how we do things is a new and wild frontier, with beasts around every corner waiting to devour us in our ignorance. Never mind the wild groups of zealots roaming around seeking converts to their cause and spitting on those unwilling to bend.

Probably like many of you, I’ve had a healthy skepticism of all things new, at least until they have proved themselves out over some time. This is both a good and a bad habit. It can protect you from undue harm and foolishness, but it can also mean a lost opportunity when you pass over the diamond in the rough. I, for one, think that if something is worth its salt, it is usually clear, and its obvious value can be discerned quite readily.


4 Ways To Setup Your Data Engineering Game.

One of my greatest pleasures in life is watching the r/dataengineering Reddit board. I find it very entertaining and enlightening on many levels. It gives a fairly unique view into the wide range of Data Engineering companies, jobs, projects people are working on, tech stacks, and problems that are being faced.

One thing I’ve come to realize over the years, working on many different Data Teams, and backed up by a casual observation of discussions on Reddit and other places, is that despite us living in the age of ChatGPT, Data Engineering teams generally seem to lag far behind in most areas of the Development Lifecycle.

So, to fix all the problems in the entire world and save humanity and Data Engineers from themselves, I give you the gift of telling you how to do your job. You’re welcome.
