Databricks launched a new open source product, Delta Lake, at the Spark + AI Summit 2019. Delta Lake's headline claim is that it brings ACID transactions to Apache Spark and big data workloads.

Delta Lake sits on top of your existing Data Lake and provides ACID transactions, scalable metadata handling (by treating the metadata as data and distributing it across clusters), and unified streaming and batch data processing.
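The ACID guarantees come from Delta Lake's transaction log: every change to a table is recorded as a numbered JSON commit file in a `_delta_log` directory, and a commit only succeeds if the writer wins the race for the next version number. The toy sketch below illustrates that idea with a put-if-absent file rename; it is an illustration of the concept only, not Delta Lake's actual implementation (class and method names here are invented for the example).

```python
import json
import os
import tempfile


class ToyCommitLog:
    """Toy append-only commit log, loosely modeled on Delta Lake's
    _delta_log directory of numbered JSON commit files.
    Illustration only -- not Delta Lake's real implementation."""

    def __init__(self, log_dir):
        self.log_dir = log_dir
        os.makedirs(log_dir, exist_ok=True)

    def _path(self, version):
        # Delta names commit files with zero-padded versions,
        # e.g. 00000000000000000000.json
        return os.path.join(self.log_dir, f"{version:020d}.json")

    def latest_version(self):
        versions = [int(name.split(".")[0])
                    for name in os.listdir(self.log_dir)
                    if name.endswith(".json")]
        return max(versions) if versions else -1

    def commit(self, actions):
        # Write to a temp file first, then link it into place.
        # os.link fails if the destination already exists, so if two
        # writers race for the same version only one succeeds; the
        # loser retries at the next version. That put-if-absent step
        # is the atomicity/isolation trick.
        while True:
            version = self.latest_version() + 1
            fd, tmp = tempfile.mkstemp(dir=self.log_dir)
            with os.fdopen(fd, "w") as f:
                json.dump(actions, f)
            try:
                os.link(tmp, self._path(version))
                os.unlink(tmp)
                return version
            except FileExistsError:
                os.unlink(tmp)  # lost the race; retry

    def snapshot(self):
        # Readers replay commits 0..latest to get a consistent view
        # of the table, never a half-applied write.
        state = []
        for v in range(self.latest_version() + 1):
            with open(self._path(v)) as f:
                state.append(json.load(f))
        return state
```

For example, two sequential commits land as versions 0 and 1, and `snapshot()` replays them in order, which is the same pattern Delta Lake readers use to reconstruct a table's state at a given version.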

By employing Delta Lake correctly, you should see improved reliability in your Data Lake, as well as more efficient data processing.

Beyond that, Delta Lake appears to be essentially Databricks Delta layered on top of your existing data lake.

I have half a decade of experience working in data science and data engineering across a variety of fields, both professionally and in academia. I have demonstrated advanced skills in developing machine learning algorithms, econometric models, intuitive visualizations, and reporting dashboards, communicating data and technical terminology in an easy-to-understand manner for clients of varying backgrounds.
