Nessie Airflow Provider

Documentation: https://projectnessie.github.io/nessie_provider

Source Code: https://github.com/projectnessie/nessie_provider


Usage

To use the provider in Airflow, install it with pip:

pip install airflow-provider-nessie

See the Nessie Documentation for instructions on starting and using a Nessie server.

Operators and Hooks

To interact with Nessie from an Airflow DAG, you have the following options:

  • Nessie Hook: register a connection with Airflow that stores your Nessie URL and credentials
  • Create reference operator: create a branch or tag as part of an Airflow DAG
  • Delete reference operator: delete a branch or tag as part of an Airflow DAG
  • Commit operator: commit objects to the Nessie database on a given branch
  • Merge operator: merge one branch into another

You can see these in action in the Example DAGs. The basic_nessie.py DAG exercises each operator, and the spark_nessie_iceberg.py DAG shows a more involved example that performs an Iceberg transaction in Nessie from the Spark operator.
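To picture how the operators listed above fit together, here is a minimal sketch of the branch workflow they support: create a branch, commit to it, merge it, then delete it. It is a toy in-memory model, not the provider's API. The class ToyCatalog, its method names, and the branch name "etl" are all hypothetical and chosen only for illustration. For the real operator classes and their parameters, refer to the Example DAGs.

```python
"""Toy in-memory model of a branch workflow. NOT the provider's API."""
from typing import Dict, List


class ToyCatalog:
    """Hypothetical stand-in for a Nessie server: each branch is a list of commits."""

    def __init__(self) -> None:
        # Assumption for this sketch: the catalog starts with an empty "main" branch.
        self.branches: Dict[str, List[str]] = {"main": []}

    def create_reference(self, name: str, source: str = "main") -> None:
        # A new branch starts from a copy of the source branch's history.
        self.branches[name] = list(self.branches[source])

    def commit(self, branch: str, message: str) -> None:
        # Record a change on the given branch only.
        self.branches[branch].append(message)

    def merge(self, source: str, target: str) -> None:
        # Append the commits on source that target does not already have.
        missing = [c for c in self.branches[source] if c not in self.branches[target]]
        self.branches[target].extend(missing)

    def delete_reference(self, name: str) -> None:
        # Drop the branch once its work has been merged.
        del self.branches[name]


def run_workflow() -> ToyCatalog:
    """Mirror the order a DAG might use: create, commit, merge, delete."""
    catalog = ToyCatalog()
    catalog.create_reference("etl")
    catalog.commit("etl", "load table")
    catalog.merge("etl", "main")
    catalog.delete_reference("etl")
    return catalog
```

In this sketch, work done on the temporary branch stays invisible on main until the merge step, which is the isolation that a branch-per-DAG-run pattern relies on.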

Development

Set up the environment

You should have Pipenv installed. Then, you can install the dependencies with:

pipenv install --dev

After that, activate the virtual environment:

pipenv shell

Run unit tests

You can run all the tests with:

make test

Alternatively, you can run pytest directly:

pytest

Format the code

Execute the following command to apply isort and black formatting:

make format

License

This project is licensed under the terms of the Apache Software License 2.0.