Datafusion Ballista support for Delta Table (showcase project)

milenkovicm/ballista_delta

Datafusion Ballista Support For Delta Table

Since version 43.0.0, DataFusion Ballista supports extending its core components and functionalities.

This example demonstrates extending DataFusion Ballista to support delta.rs read operations.

Note

This project is part of the DataFusion Ballista showcase series.

Important

This is just a showcase project; it is not meant to be maintained.

Setting up standalone Ballista:

use ballista::prelude::{SessionConfigExt, SessionContextExt};
use ballista_delta::{BallistaDeltaLogicalCodec, BallistaDeltaPhysicalCodec};
use datafusion::{
    common::Result,
    execution::SessionStateBuilder,
    prelude::{SessionConfig, SessionContext},
};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
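    // Register the Delta extension codecs for logical and physical plans.
    let config = SessionConfig::new_with_ballista()
        .with_ballista_logical_extension_codec(Arc::new(BallistaDeltaLogicalCodec::default()))
        .with_ballista_physical_extension_codec(Arc::new(BallistaDeltaPhysicalCodec::default()));

    // Build the session state from the Ballista-aware configuration.
    let state = SessionStateBuilder::new()
        .with_config(config)
        .with_default_features()
        .build();

    // Open the Delta table from the local file system.
    let table = deltalake::open_table("./data/people_countries_delta_dask")
        .await
        .unwrap();

    // Create a standalone Ballista context using the state built above.
    let ctx = SessionContext::standalone_with_state(state).await?;

    // Expose the Delta table to SQL under the name `demo`.
    ctx.register_table("demo", Arc::new(table))?;

    ctx.sql("select * from demo").await?.show().await?;
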
    Ok(())
}

Other examples show how to extend the client, scheduler, and executor for cluster deployment.
