I don’t know what it is about Apache Arrow, that GOAT of data engineering. It snuck in like a weasel through the back door, and we all woke up one day to find that Arrow is the Atlas of the data world, holding up the systems we depend on and take for granted. One name you might or might not have heard rattling around is “Apache Arrow Flight,” or just “Arrow Flight.”

It pops up a little in blogs, talks, and the READMEs of some GitHub repos… I imagine all the vibe-coding Chads just nodding furiously on Zoom calls, like wise old owls who know exactly what’s being talked about. Admit it, you don’t know what Arrow Flight is.

  • Apache Arrow Flight is a high-performance data transport framework built on top of Apache Arrow and gRPC. It lets applications move large datasets across networks much faster than traditional technologies such as JDBC, ODBC, CSV exports, or REST APIs.


I had not been thinking much about Kafka lately, but depending on who you ask, Kafka is either sitting comfortably at the top of the streaming world or beginning a slow decline into abstraction. The truth is probably somewhere in the middle. I’ve seen a number of newish streaming tools, many Rust-based, come in with a splash and leave nothing but a few bubbles.

Kafka is entrenched in the streaming world, and this has created a problem for the Lakehouse architecture. The integration isn’t “as easy as pie” … far from it.
