Introduction to Delta Lake on Apache Spark … for Data Engineers

If you’re anything like me, when someone says Delta Lake you think Databricks. But the mythical Delta Lake is an open source project, available to anyone running Apache Spark. It also seems too good to be true: ACID transactions at Spark scale? Incredible. This is the future, it has to be. The lines around what counts as a data warehouse have been blurring for a long time, and I have a feeling Delta Lake will be the death blow to the traditional DW … or its rebirth?


Why Data Engineers Should Care About the Databricks IPO

Some poor Data Engineer is sweating and typing away in a dark closet … moving data, solving bugs, just trying to get through the day. Why should the ol’ Data Engineer care about the huff-a-luff around the billion-dollar funding round Databricks recently raised? I mean, what possible relevance could it have to the day-to-day life of a Data Engineer, and why should they care at all? Ever heard that the proverbial light at the end of the tunnel is actually a train steaming your way, ready to pulverize you? That’s why.
