allegro/bigflow

BigFlow

Documentation

  1. What is BigFlow?
  2. Getting started
  3. Installing BigFlow
  4. Help me
  5. BigFlow tutorial
  6. CLI
  7. Configuration
  8. Project structure and build
  9. Deployment
  10. Workflow & Job
  11. Starter
  12. Technologies
  13. Development

Cookbook

What is BigFlow?

BigFlow is a Python framework for data processing pipelines on GCP.

The main features are:

Getting started

Start by installing BigFlow on your local machine. Next, go through the BigFlow tutorial.

Installing BigFlow

Prerequisites. Before you start, make sure you have the following software installed:

  1. Python = 3.8
  2. Google Cloud SDK
  3. Docker Engine
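
The list above can be checked from a terminal before going further. The sketch below is not part of BigFlow; the function name `check_prerequisites` is made up for illustration, and it assumes the tools are exposed on your `PATH` as `python3`, `gcloud`, and `docker` (adjust the names if your system uses `python` instead). It only checks that the commands exist, not their versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the given commands are on PATH.
# Prints one line per command and returns the number of missing commands.
check_prerequisites() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Command names are assumptions about how the prerequisites are installed.
check_prerequisites python3 gcloud docker \
    || echo "Install the missing tools before continuing."
```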

You can install the bigflow package globally, but we recommend installing it locally with venv, in your project's folder:

python -m venv .bigflow_env
source .bigflow_env/bin/activate
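
If you are unsure whether the environment is active in the current shell, a small check like the following may help. It is a sketch, not a BigFlow feature: the function name `venv_is_active` is hypothetical, and it relies on the `VIRTUAL_ENV` variable that the venv `activate` script exports.

```shell
# Hypothetical helper: succeeds when a virtual environment is active.
# The venv activate script exports VIRTUAL_ENV; deactivate unsets it.
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if venv_is_active; then
    echo "Active virtual environment: $VIRTUAL_ENV"
else
    echo "No virtual environment is active; run: source .bigflow_env/bin/activate"
fi
```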

Install the bigflow PIP package:

pip install bigflow[bigquery,dataflow]

Test it:

bigflow -h

Read more about BigFlow CLI.

To interact with GCP you need to set a default project and log in:

gcloud config set project <your-gcp-project-id>
gcloud auth application-default login

Finally, check if your Docker is running:

docker info
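
The GCP and Docker steps can also be verified from a script, which is handy when setting up several machines. The following is a minimal sketch under a few assumptions: the function names `current_gcp_project` and `docker_is_running` are invented for this example, `gcloud config get-value project` is used to read back the default project, and the exit status of `docker info` is taken as the signal that the Docker daemon is reachable.

```shell
# Hypothetical helper: print the default GCP project (empty if none is set).
current_gcp_project() {
    gcloud config get-value project 2>/dev/null
}

# Hypothetical helper: succeeds only when `docker info` exits successfully.
docker_is_running() {
    docker info >/dev/null 2>&1
}

project="$(current_gcp_project)" || project=""
if [ -n "$project" ] && [ "$project" != "(unset)" ]; then
    echo "Default GCP project: $project"
else
    echo "No default GCP project; run: gcloud config set project <your-gcp-project-id>"
fi

if docker_is_running; then
    echo "Docker is running."
else
    echo "Docker is not reachable; start Docker Engine and retry."
fi
```

Note that this does not confirm that `gcloud auth application-default login` has been completed; it only covers the default project and the Docker daemon.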

Help me

You can ask questions on our Gitter channel or Stack Overflow.