Golang – Useful for everyday Data Engineering?

I periodically try to pick up a new programming language on my journey through Data Engineering life. There are many reasons to do that, personal growth, boredom, seeing what others like, and helping me think differently about my code. Golang has been on my list for at least a year. I don’t hear much about it in the Data Engineering world myself, at least in the places I haunt like r/dataengineering and Linkedin.

I know tools like Kubernetes and Docker are written with Go, so it must be powerful and wonderful. But, what about Data Engineering work … and everyday Data Engineering work at that, is Go useful as an everyday tool for everyday simple Data Engineering tasks? Read on my friend.

Read more

Data Quality – Great Expectations for Data Engineers

Mmmm … Data Quality … it is a thing these days. I look forlornly back to the ancient days of SQL Server when nobody cared about such things. Alas, we live in a different world, where hundreds of terabytes of data are the norm, and Data Quality becomes a thing. I’ve been meaning to give Great Expectations a poke for like a year, but just haven’t had the time or inclination to do so, but times are changing, and so should I.

I’m not really planning on giving an in-depth guide to Data Quality with Great Expectations, what I’m more interested in are topics like, how easy is it to set up and use, what’s the overhead, what are the main features and concepts and are they easy to understand. I find this sort of review of Data Engineering tools to be more helpful than simply a regurgitation of the documentation.

Read more

Databricks Access Control – The 3 Most Important Steps

It’s not often I yearn for the good old days of SQL Server, but I’ve had a few of those moments lately. Some things I miss, some I don’t, and it’s probably because I’m getting old and crusty, stuck in my ways, by permissioning is one of those topics where I think about the good old days. Data access control and permissions are topics that we all kinda ignore as not that important … until we actually are trying to do something with them. Then all of sudden we start complaining about complexity and why this isn’t easier. That’s a good way to describe my reaction to having to work on Databricks Access control.

But, I learned a few things and I think they will be helpful for someone. Read on for the basics of handling permissions and access control in Databricks and Delta Lake.

Read more