🔮 A refreshing functional take on deep learning, compatible with your favorite libraries

explosion/thinc

Thinc: A refreshing functional take on deep learning, compatible with your favorite libraries

From the makers of spaCy and Prodigy

Thinc is a lightweight deep learning library that offers an elegant, type-checked, functional-programming API for composing models, with support for layers defined in other frameworks such as PyTorch, TensorFlow and MXNet. You can use Thinc as an interface layer, a standalone toolkit or a flexible way to develop new models. Previous versions of Thinc have been running quietly in production in thousands of companies, via both spaCy and Prodigy. We wrote the new version to let users compose, configure and deploy custom models built with their favorite framework.

🔥 Features

  • Type-check your model definitions with custom types and a mypy plugin.
  • Wrap PyTorch, TensorFlow and MXNet models for use in your network.
  • Concise functional-programming approach to model definition, using composition rather than inheritance.
  • Optional custom infix notation via operator overloading.
  • Integrated config system to describe trees of objects and hyperparameters.
  • Choice of extensible backends.
  • Read more →
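
To make "composition rather than inheritance" and "infix notation via operator overloading" concrete, here is a minimal plain-Python sketch. The `Layer` class and this `chain` function are hypothetical stand-ins written for illustration only. They are not Thinc's implementation, and the real API is documented in the usage guides.

```python
from typing import Callable, Generic, List, TypeVar

InT = TypeVar("InT")
MidT = TypeVar("MidT")
OutT = TypeVar("OutT")


class Layer(Generic[InT, OutT]):
    """Hypothetical stand-in for a model: a named wrapper around a function."""

    def __init__(self, name: str, forward: Callable[[InT], OutT]) -> None:
        self.name = name
        self.forward = forward

    def __call__(self, X: InT) -> OutT:
        return self.forward(X)

    def __rshift__(self, other: "Layer[OutT, MidT]") -> "Layer[InT, MidT]":
        # Operator overloading: `a >> b` is shorthand for chain(a, b).
        return chain(self, other)


def chain(first: Layer[InT, MidT], second: Layer[MidT, OutT]) -> Layer[InT, OutT]:
    """Build a new layer from two existing ones instead of subclassing either."""
    return Layer(f"{first.name}>>{second.name}", lambda X: second(first(X)))


# Two small layers, annotated so a type checker can verify that they fit together.
scale: Layer[List[float], List[float]] = Layer("scale", lambda xs: [2.0 * x for x in xs])
total: Layer[List[float], float] = Layer("total", lambda xs: sum(xs))

model = scale >> total  # same as chain(scale, total)
result = model([1.0, 2.0, 3.0])
```

Because the input and output types are part of each layer's annotation, a static checker such as mypy can flag a composition like `total >> scale`, where a `float` would be fed into a layer expecting a list.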

🚀 Quickstart

Thinc is compatible with Python 3.6+ and runs on Linux, macOS and Windows. The latest releases with binary wheels are available from pip. Before you install Thinc and its dependencies, make sure that your pip, setuptools and wheel are up to date. For the most recent releases, pip 19.3 or newer is recommended.

pip install -U pip setuptools wheel
pip install thinc

See the extended installation docs for details on optional dependencies for different backends and GPU. You might also want to set up static type checking to take advantage of Thinc's type system.

⚠️ If you have installed PyTorch and you are using Python 3.7+, uninstall the package dataclasses with pip uninstall dataclasses, since it may have been installed by PyTorch and is incompatible with Python 3.7+.
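
If you are unsure whether the backport is present, a small check along these lines can tell you. This is a sketch that assumes Python 3.8 or newer, because it relies on the standard-library `importlib.metadata` module. The helper name `has_dataclasses_backport` is made up for this example.

```python
import sys
from importlib import metadata  # standard library since Python 3.8


def has_dataclasses_backport() -> bool:
    """Return True if a pip-installed `dataclasses` distribution is present.

    The built-in `dataclasses` module ships with Python itself and has no
    distribution metadata, so only the PyPI backport is detected here.
    """
    try:
        metadata.distribution("dataclasses")
    except metadata.PackageNotFoundError:
        return False
    return True


needs_uninstall = sys.version_info >= (3, 7) and has_dataclasses_backport()
if needs_uninstall:
    print("Backport found. Run: pip uninstall dataclasses")
```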

📓 Selected examples and notebooks

Also see the /examples directory and usage documentation for more examples. Most examples are Jupyter notebooks. To launch them on Google Colab (with GPU support!), click on the button next to the notebook name.

Notebook                                  Description
intro_to_thinc (Open in Colab)            Everything you need to know to get started. Composing and training a model on the MNIST data, using config files, registering custom functions and wrapping PyTorch, TensorFlow and MXNet models.
transformers_tagger_bert (Open in Colab)  How to use Thinc, transformers and PyTorch to train a part-of-speech tagger. From model definition and config to the training loop.
pos_tagger_basic_cnn (Open in Colab)      Implementing and training a basic CNN for part-of-speech tagging without external dependencies, using different levels of Thinc's config system.
parallel_training_ray (Open in Colab)     How to set up synchronous and asynchronous parameter server training with Thinc and Ray.

View more →

📖 Documentation & usage guides

Documentation                            Description
Introduction                             Everything you need to know.
Concept & Design                         Thinc's conceptual model and how it works.
Defining and using models                How to compose models and update state.
Configuration system                     Thinc's config system and function registry.
Integrating PyTorch, TensorFlow & MXNet  Interoperability with machine learning frameworks.
Layers API                               Weights layers, transforms, combinators and wrappers.
Type Checking                            Type-check your model definitions and more.
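
The pairing of a config system with a function registry can be sketched in a few lines of standard-library Python. Everything below is a hypothetical illustration: the `REGISTRY` dict, the `register` decorator, the `@factory` key and the `constant_rate.v1` name are assumptions made for this example, not Thinc's config format or API. See the Configuration system guide for the real thing.

```python
import configparser
import json
from typing import Any, Callable, Dict

# Hypothetical registry: maps a string name to a factory function.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that stores a function in the registry under `name`."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        REGISTRY[name] = func
        return func
    return decorator


@register("constant_rate.v1")
def constant_rate(rate: float) -> Dict[str, float]:
    return {"learn_rate": rate}


# The config names a registered function and supplies its arguments.
CONFIG = """
[optimizer]
@factory = "constant_rate.v1"
rate = 0.001
"""


def resolve(text: str, section: str) -> Any:
    """Look up the function named in a section and call it with the other keys."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Values are parsed as JSON so numbers and strings keep their types.
    values = {key: json.loads(value) for key, value in parser[section].items()}
    factory = REGISTRY[values.pop("@factory")]
    return factory(**values)


optimizer_settings = resolve(CONFIG, "optimizer")
```

The point of the design is that hyperparameters and the choice of function live in one text file, while the code only needs to register the functions that the file may refer to.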

🗺 What's where

Module            Description
thinc.api         User-facing API. All classes and functions should be imported from here.
thinc.types       Custom types and dataclasses.
thinc.model       The Model class. All Thinc models are an instance (not a subclass) of Model.
thinc.layers      The layers. Each layer is implemented in its own module.
thinc.shims       Interface for external models implemented in PyTorch, TensorFlow etc.
thinc.loss        Functions to calculate losses.
thinc.optimizers  Functions to create optimizers. Currently supports "vanilla" SGD, Adam and RAdam.
thinc.schedules   Generators for different rates, schedules, decays or series.
thinc.backends    Backends for numpy and cupy.
thinc.config      Config parsing and validation and function registry system.
thinc.util        Utilities and helper functions.
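
As an illustration of what "generators for different rates, schedules, decays or series" can look like, here is a minimal sketch of a decaying learning-rate schedule written as a plain Python generator. The function name `decaying_rate` and its formula are assumptions chosen for this example and are not taken from thinc.schedules.

```python
from itertools import islice
from typing import Iterator


def decaying_rate(base_rate: float, decay: float) -> Iterator[float]:
    """Yield base_rate / (1 + decay * step) for step = 0, 1, 2, ..."""
    step = 0
    while True:
        yield base_rate / (1.0 + decay * step)
        step += 1


# The generator is infinite, so take only as many values as needed.
first_rates = list(islice(decaying_rate(0.1, 1.0), 3))
```

A generator keeps its own step counter, so a training loop can simply call `next()` on it once per update to get the current value.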

🐍 Development notes

Thinc uses black for auto-formatting, flake8 for linting and mypy for type checking. All code is written to be compatible with Python 3.6+, with type hints wherever possible. See the type reference for more details on Thinc's custom types.

👷‍♀️ Building Thinc from source

Building Thinc from source requires the full dependencies listed in requirements.txt to be installed. You'll also need a compiler to build the C extensions.

git clone https://github.com/explosion/thinc
cd thinc
python -m venv .env
source .env/bin/activate
pip install -U pip setuptools wheel
pip install -r requirements.txt
pip install --no-build-isolation .

Alternatively, install in editable mode:

pip install -r requirements.txt
pip install --no-build-isolation --editable .

Or by setting PYTHONPATH:

export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py build_ext --inplace

🚦 Running tests

Thinc comes with an extensive test suite. The following should all pass and not report any warnings or errors:

python -m pytest thinc  # test suite
python -m mypy thinc    # type checks
python -m flake8 thinc  # linting

To view test coverage, you can run python -m pytest thinc --cov=thinc. We aim for 100% test coverage. This doesn't mean that we meticulously write tests for every single line: we ignore blocks that are not relevant or difficult to test and make sure that the tests execute all code paths.
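
One common way to ignore a block in a coverage report is coverage.py's default `# pragma: no cover` marker, which the `--cov` option of pytest-cov honours. The sketch below shows the general technique. It is an assumption that Thinc's own code base marks excluded blocks this way, and the `safe_divide` function is invented for the example.

```python
def safe_divide(a: float, b: float) -> float:
    """Divide a by b, returning 0.0 instead of raising on division by zero."""
    if b == 0:
        return 0.0
    return a / b


if __name__ == "__main__":  # pragma: no cover
    # Excluded from the coverage report: this block only runs when the file
    # is executed as a script, never when the test suite imports it.
    print(safe_divide(1.0, 2.0))
```

Both branches of `safe_divide` still need a test to count as covered. Only the script entry point is left out of the report.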