Unity Catalog is an open source governance catalog for data and AI, developed by Databricks and open sourced earlier this week at the company’s Data + AI Summit.
Companies can use it to govern structured and unstructured data as well as machine learning models, notebooks, dashboards, and files.
It interoperates with any data format and compute engine and supports all of the major cloud platforms. With Unity Catalog, companies can manage data from many sources in one place, including MySQL, PostgreSQL, Amazon Redshift, Snowflake, Azure SQL, Azure Synapse, and Google BigQuery.
Companies can define access policies once and then apply them across different clouds and platforms, simplifying access management and governance.
Unity Catalog was first created at Databricks in 2021 as an offering for its customers, and it is being open sourced so that more companies can benefit from it.
“We’re excited to open source Unity Catalog and release the code,” said Ali Ghodsi, co-founder and CEO of Databricks. “We’ll continue to evolve the open standard in close collaboration with our partners.”
Matt Dugan, VP Data Platforms, AT&T, added: “With the announcement of Unity Catalog’s open sourcing, we are encouraged by Databricks’ step to make lakehouse governance and metadata management possible through open standards. The flexibility to utilize interoperable tools with our data and AI assets, with consistent governance, is core to the AT&T data platform strategy.”