A native Rust library for Apache Hudi, with bindings to Python
The hudi-rs project aims to broaden the use of Apache Hudi for a diverse range of users and projects.
| Source | Installation Command |
|---|---|
| PyPI | `pip install hudi` |
| Crates.io | `cargo add hudi` |
> **Note**
> These examples expect a Hudi table to exist at `/tmp/trips_table`, created using the quick start guide.
Read a Hudi table into a PyArrow table.
```python
from hudi import HudiTable

hudi_table = HudiTable("/tmp/trips_table")
records = hudi_table.read_snapshot()

import pyarrow as pa
import pyarrow.compute as pc

arrow_table = pa.Table.from_batches(records)
result = arrow_table.select(["rider", "ts", "fare"]).filter(pc.field("fare") > 20.0)
print(result)
```
Add the `hudi` crate with the `datafusion` feature to your application to query a Hudi table.
```shell
cargo new my_project --bin && cd my_project
cargo add tokio@1 datafusion@39
cargo add hudi --features datafusion
```
Update `src/main.rs` with the code snippet below, then run `cargo run`.
```rust
use std::sync::Arc;

use datafusion::error::Result;
use datafusion::prelude::{DataFrame, SessionContext};
use hudi::HudiDataSource;

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    let hudi = HudiDataSource::new("/tmp/trips_table").await?;
    ctx.register_table("trips_table", Arc::new(hudi))?;
    let df: DataFrame = ctx.sql("SELECT * FROM trips_table WHERE fare > 20.0").await?;
    df.show().await?;
    Ok(())
}
```
To work with tables on cloud storage, set the relevant credentials as environment variables, e.g., `AWS_*`, `AZURE_*`, or `GOOGLE_*`. These variables are picked up automatically, and the target table's base URI is resolved according to its scheme, such as `s3://`, `az://`, or `gs://`.
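As a minimal sketch of the AWS case: the credentials can be exported in the shell or set from Python before the table is opened. The placeholder values, region, and the `s3://` bucket path below are all hypothetical, substitute your own:

```python
import os

# Illustrative placeholders only; replace with real credentials,
# or rely on your environment's existing credential provider instead.
os.environ["AWS_ACCESS_KEY_ID"] = "<access-key-id>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<secret-access-key>"
os.environ["AWS_REGION"] = "us-west-2"

# With credentials in place, a table can be addressed by its s3:// base URI:
# hudi_table = HudiTable("s3://my-bucket/trips_table")  # hypothetical bucket/path
print(os.environ["AWS_REGION"])
```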
Check out the contributing guide for all the details about making contributions to the project.