forked from apache/hudi-rs

Xuanwo/hudi-rs

A native Rust library for Apache Hudi, with bindings to Python

The hudi-rs project aims to broaden the use of Apache Hudi for a diverse range of users and projects.

| Source    | Installation Command |
|-----------|----------------------|
| PyPi      | pip install hudi     |
| Crates.io | cargo add hudi       |

Example usage

Note

These examples expect a Hudi table to exist at /tmp/trips_table, created using the quick start guide.

Python

Read a Hudi table into a PyArrow table.

from hudi import HudiTable

hudi_table = HudiTable("/tmp/trips_table")
records = hudi_table.read_snapshot()

import pyarrow as pa
import pyarrow.compute as pc

arrow_table = pa.Table.from_batches(records)
result = arrow_table.select(
    ["rider", "ts", "fare"]).filter(
    pc.field("fare") > 20.0)
print(result)

Rust

Add the hudi crate with the datafusion feature to your application to query a Hudi table.

cargo new my_project --bin && cd my_project
cargo add tokio@1 datafusion@39
cargo add hudi --features datafusion

Update src/main.rs with the code snippet below, then execute cargo run.

use std::sync::Arc;

use datafusion::error::Result;
use datafusion::prelude::{DataFrame, SessionContext};
use hudi::HudiDataSource;

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    let hudi = HudiDataSource::new("/tmp/trips_table").await?;
    ctx.register_table("trips_table", Arc::new(hudi))?;
    let df: DataFrame = ctx.sql("SELECT * from trips_table where fare > 20.0").await?;
    df.show().await?;
    Ok(())
}

Work with cloud storage

Ensure cloud storage credentials are set properly as environment variables, e.g., AWS_*, AZURE_*, or GOOGLE_*. The relevant storage environment variables will then be picked up automatically. The target table's base URI, with a scheme such as s3://, az://, or gs://, will be processed accordingly.
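
A quick pre-flight check can confirm that variables with the expected prefix are actually set before a cloud table is opened. The sketch below is not part of the hudi package: the helper names are hypothetical, the scheme-to-prefix mapping is assumed from the pairing in the paragraph above, and the bucket name and variable values are placeholders.

```python
import os
from urllib.parse import urlparse

# Scheme-to-prefix pairs assumed from the paragraph above (illustrative only).
SCHEME_TO_ENV_PREFIX = {"s3": "AWS_", "az": "AZURE_", "gs": "GOOGLE_"}


def expected_env_prefix(base_uri):
    """Return the env var prefix for a table URI, or None for local paths."""
    scheme = urlparse(base_uri).scheme
    return SCHEME_TO_ENV_PREFIX.get(scheme)


def storage_env_vars(base_uri, environ=None):
    """List the names of set environment variables matching the URI's scheme."""
    environ = os.environ if environ is None else environ
    prefix = expected_env_prefix(base_uri)
    if prefix is None:
        return []
    return sorted(name for name in environ if name.startswith(prefix))


# Placeholder environment; pass nothing to inspect the real os.environ.
sample_env = {"AWS_REGION": "us-west-2", "HOME": "/root"}
print(storage_env_vars("s3://my-bucket/trips_table", sample_env))
```

An empty list for a cloud URI suggests the credentials have not been exported in the current shell. A local path such as /tmp/trips_table has no scheme, so it also yields an empty list.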

Contributing

Check out the contributing guide for all the details about making contributions to the project.
