- What is BigFlow?
- Getting started
- Installing BigFlow
- Help me
- BigFlow tutorial
- CLI
- Configuration
- Project structure and build
- Deployment
- Workflow & Job
- Starter
- Technologies
- Development
BigFlow is a Python framework for data processing pipelines on GCP.
The main features are:
- Dockerized deployment environment
- Powerful CLI
- Automated build, deployment, versioning, and configuration
- Unified project structure
- Support for GCP data processing technologies: Dataflow (Apache Beam) and BigQuery
- Project starter
Start by installing BigFlow on your local machine. Next, go through the BigFlow tutorial.
Prerequisites. Before you start, make sure you have the following software installed: Python, the Google Cloud SDK (`gcloud`), and Docker.
You can install the `bigflow` package globally, but we recommend installing it locally with `venv`, in your project's folder:
python -m venv .bigflow_env
source .bigflow_env/bin/activate
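Before installing packages, it can help to confirm that the virtual environment is actually active. The following is a minimal sketch that uses only the Python standard library; the function name `in_virtualenv` is illustrative and not part of BigFlow.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv.

    Inside an environment created by `python -m venv`, sys.prefix points at
    the environment, while sys.base_prefix points at the base installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this reports that no environment is active, rerun the activation command in the current shell before installing anything.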
Install the `bigflow` pip package:
pip install bigflow[bigquery,dataflow]
Test it:
bigflow -h
Read more about the BigFlow CLI.
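As a complement to running the CLI, the installed package version can also be read from Python. This is a sketch using the standard library's `importlib.metadata` (Python 3.8 or newer); the helper name `installed_version` is hypothetical and not part of BigFlow.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "bigflow" is the distribution name used in the pip command above.
    print("bigflow version:", installed_version("bigflow") or "not installed")
```

Run it with the interpreter from the virtual environment, since a different interpreter will not see packages installed there.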
To interact with GCP you need to set a default project and log in:
gcloud config set project <your-gcp-project-id>
gcloud auth application-default login
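To see whether the login step left credentials behind, one option is to look for the application default credentials file. The sketch below assumes the default gcloud location on Linux and macOS (`~/.config/gcloud/application_default_credentials.json`); on Windows the file lives under `%APPDATA%\gcloud`, and a custom gcloud configuration directory would change the path. The helper name `default_adc_path` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def default_adc_path(home: Optional[Path] = None) -> Path:
    """Return the assumed default ADC file location on Linux and macOS."""
    home = Path.home() if home is None else home
    return home / ".config" / "gcloud" / "application_default_credentials.json"


if __name__ == "__main__":
    path = default_adc_path()
    status = "found" if path.is_file() else "not found"
    print(f"application default credentials {status}: {path}")
```

A missing file only means the credentials are not in the assumed location; it does not prove that authentication is misconfigured.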
Finally, check if your Docker is running:
docker info
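The checks in this section can be complemented by a small preflight script that reports which command-line tools are missing from `PATH`. This is a sketch, not part of BigFlow: the tool list is taken from the commands used above, and the names `REQUIRED_TOOLS` and `missing_tools` are illustrative.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Tools invoked by the commands in this section (an assumption of this sketch).
REQUIRED_TOOLS = ("gcloud", "docker", "bigflow")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH, in the given order."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all required tools found")
```

Finding `docker` on `PATH` does not show that the Docker daemon is running, so `docker info` is still the check to rely on for that.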
You can ask questions on our Gitter channel or on Stack Overflow.