Ibis 8.0 lets data teams write code once across different engines


Voltron Data has announced the release of Ibis 8.0, an update to its popular Python dataframe API, which has been downloaded over 10 million times. Ibis enables developers to run code across various data platforms by choosing the most suitable query engine for specific tasks. 

The latest version introduces the first dedicated streaming backends, for Apache Flink and RisingWave, alongside the existing batch execution engines. As a result, batch and streaming workloads can now be expressed through a single Python dataframe API.

“Finally developers can write code once and use it across local, batch, CPU, GPU, and now real-time query engines. Ibis is leading the charge to break down the barriers between batch and stream processing execution engines. This is a big step toward a modular and composable data ecosystem across all paradigms,” said Josh Patterson, co-founder and CEO of Voltron Data. 

Ibis is an independently governed open-source project. It is supported by Voltron Data and receives contributions from organizations across the data platform ecosystem, including Google, Starburst Data, and RisingWave.

With the release of version 8.0, Ibis supports 20 query engines, covering needs that range from small-scale local queries with DuckDB to large, distributed preprocessing and ETL jobs with engines such as BigQuery, Spark, and Theseus. The same code can also target the two streaming engines, Apache Flink and RisingWave, without changes on the user's side.
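The "write once, run on many engines" idea can be illustrated with a small, self-contained Python sketch. This is not the Ibis API: the `TotalBy`, `PythonBackend`, and `SQLiteBackend` names and the `ORDERS` data are hypothetical, invented here only to show how one deferred query description might be handed to interchangeable backends. It uses the standard library alone, with SQLite standing in for a real query engine.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class TotalBy:
    """Hypothetical deferred query: sum `value` grouped by `key` in `table`."""
    table: str
    key: str
    value: str


class PythonBackend:
    """Toy backend that evaluates the query with plain Python loops."""

    def __init__(self, tables):
        self.tables = tables  # table name -> list of dict rows

    def execute(self, q):
        totals = {}
        for row in self.tables[q.table]:
            totals[row[q.key]] = totals.get(row[q.key], 0) + row[q.value]
        return sorted(totals.items())


class SQLiteBackend:
    """Toy backend that compiles the same query to SQL and runs it in SQLite."""

    def __init__(self, tables):
        self.con = sqlite3.connect(":memory:")
        for name, rows in tables.items():
            cols = list(rows[0])
            self.con.execute(f"CREATE TABLE {name} ({', '.join(cols)})")
            placeholders = ", ".join("?" for _ in cols)
            self.con.executemany(
                f"INSERT INTO {name} VALUES ({placeholders})",
                [tuple(r[c] for c in cols) for r in rows],
            )

    def execute(self, q):
        sql = (
            f"SELECT {q.key}, SUM({q.value}) FROM {q.table} "
            f"GROUP BY {q.key} ORDER BY {q.key}"
        )
        return self.con.execute(sql).fetchall()


# Made-up sample data, for illustration only.
ORDERS = {
    "orders": [
        {"city": "Austin", "amount": 10},
        {"city": "Boston", "amount": 5},
        {"city": "Austin", "amount": 7},
    ]
}

# The query is written once and never mentions a backend.
query = TotalBy(table="orders", key="city", value="amount")

# Only the object that executes the query changes.
results = {
    type(backend).__name__: backend.execute(query)
    for backend in (PythonBackend(ORDERS), SQLiteBackend(ORDERS))
}
print(results)
```

Both toy backends return the same totals for the same query object, which is the property the article attributes to Ibis on a much larger scale: the user-facing code stays fixed while the execution engine is swapped.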

Development of Ibis focuses on improving user experience and functionality, according to Zhenzhong “Z” Xu, vice president of engineering at Voltron Data. Enhancements to the Ibis API, including new features such as ML preprocessing, benefit every supported backend, so users can work with a single, familiar dataframe API without being tied to any specific backend.

This approach not only allows for a more versatile and efficient data processing environment but also encourages the open-source community to contribute to the Ibis ecosystem, broadening the reach of Python-based data analytics across data platforms.

“As the Ibis API improves and adds new functionality like ML preprocessing, every backend it supports improves with it. Users can learn a single familiar dataframe API without being locked into any backend. The open source community can add Ibis ecosystem integrations to make working with data in Python better on any data platform Ibis supports,” said Xu.

 


