Since version 43.0.0, DataFusion Ballista supports extending its core components and functionality.
This example demonstrates extending DataFusion Ballista to support delta.rs read operations.
Note
This project is part of the DataFusion Ballista showcase series.
Important
This is only a showcase project; it is not meant to be maintained.
Setting up standalone Ballista:
use ballista::prelude::{SessionConfigExt, SessionContextExt};
use ballista_delta::{BallistaDeltaLogicalCodec, BallistaDeltaPhysicalCodec};
use datafusion::{
    common::Result,
    execution::SessionStateBuilder,
    prelude::{SessionConfig, SessionContext},
};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    // Ballista-enabled session config with the Delta extension codecs registered.
    let config = SessionConfig::new_with_ballista()
        .with_ballista_logical_extension_codec(Arc::new(BallistaDeltaLogicalCodec::default()))
        .with_ballista_physical_extension_codec(Arc::new(BallistaDeltaPhysicalCodec::default()));

    let state = SessionStateBuilder::new()
        .with_config(config)
        .with_default_features()
        .build();

    // Open the Delta table from a local path (panics if the table cannot be opened).
    let table = deltalake::open_table("./data/people_countries_delta_dask")
        .await
        .unwrap();

    // Create a standalone Ballista context from the prepared session state.
    let ctx = SessionContext::standalone_with_state(state).await?;

    // Register the Delta table under the name "demo" and query it with SQL.
    ctx.register_table("demo", Arc::new(table)).unwrap();
    ctx.sql("select * from demo").await?.show().await?;

    Ok(())
}
Other examples show how to extend the client, scheduler, and executor for cluster deployment.
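
The logical and physical codecs registered in the standalone example are presumably there so that plans containing Delta-specific nodes can be turned into bytes and rebuilt on another component. The sketch below illustrates that round trip using only the standard library. Everything in it is hypothetical and for illustration only: `PlanNode`, `ExtensionCodec`, `ToyDeltaCodec`, `ship`, and the byte layout are not the actual Ballista, DataFusion, or `ballista_delta` types or wire format.

```rust
// Hypothetical stand-in for a plan node; real plans are trees of operators.
#[derive(Debug, Clone, PartialEq)]
enum PlanNode {
    // A node the core engine already knows how to serialize.
    Builtin(String),
    // A node contributed by an extension, such as a Delta table scan.
    DeltaScan { table_uri: String, version: u64 },
}

// Hypothetical codec trait: turns extension nodes into bytes and back.
trait ExtensionCodec {
    fn try_encode(&self, node: &PlanNode) -> Result<Vec<u8>, String>;
    fn try_decode(&self, bytes: &[u8]) -> Result<PlanNode, String>;
}

struct ToyDeltaCodec;

impl ExtensionCodec for ToyDeltaCodec {
    fn try_encode(&self, node: &PlanNode) -> Result<Vec<u8>, String> {
        match node {
            PlanNode::DeltaScan { table_uri, version } => {
                // Assumed layout: 8-byte big-endian version, then the UTF-8 URI.
                let mut buf = version.to_be_bytes().to_vec();
                buf.extend_from_slice(table_uri.as_bytes());
                Ok(buf)
            }
            other => Err(format!("not an extension node: {:?}", other)),
        }
    }

    fn try_decode(&self, bytes: &[u8]) -> Result<PlanNode, String> {
        if bytes.len() < 8 {
            return Err("buffer too short".to_string());
        }
        let mut head = [0u8; 8];
        head.copy_from_slice(&bytes[..8]);
        let version = u64::from_be_bytes(head);
        let table_uri = String::from_utf8(bytes[8..].to_vec()).map_err(|e| e.to_string())?;
        Ok(PlanNode::DeltaScan { table_uri, version })
    }
}

// Simulates sending a node to another process: only bytes cross the boundary.
fn ship(codec: &dyn ExtensionCodec, node: &PlanNode) -> Result<PlanNode, String> {
    let bytes = codec.try_encode(node)?;
    codec.try_decode(&bytes)
}

fn main() {
    let codec = ToyDeltaCodec;
    let scan = PlanNode::DeltaScan {
        table_uri: "./data/people_countries_delta_dask".to_string(),
        version: 3, // illustrative value only
    };
    let received = ship(&codec, &scan).expect("round trip should succeed");
    assert_eq!(received, scan);
    // Nodes the codec does not own are rejected rather than silently encoded.
    assert!(codec.try_encode(&PlanNode::Builtin("projection".to_string())).is_err());
}
```

In this toy model, both sides of the boundary need the same codec for the round trip to work, which would explain why a cluster deployment extends the client, scheduler, and executor rather than the client alone.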