A framework for writing Airbyte Connectors.
The Airbyte Python CDK is a framework for rapidly developing production-grade Airbyte connectors. The CDK currently offers helpers specifically for creating Airbyte source connectors for:
The CDK improves the developer experience by providing a basic implementation structure and abstracting away low-level glue boilerplate.
This document is a general introduction to the CDK. Readers should have basic familiarity with the Airbyte Specification before proceeding.
Generate an empty connector using the code generator. First clone the Airbyte repository, then from the repository root run:
cd airbyte-integrations/connector-templates/generator
./generate.sh
then follow the interactive prompt. Next, find all TODOs in the generated project directory. They're accompanied by lots of comments explaining what you'll need to do in order to implement your connector. Upon completing all TODOs properly, you should have a functioning connector.
Additionally, you can follow this tutorial for a complete walkthrough of creating an HTTP connector using the Airbyte CDK.
See the concepts docs for a tour through what the API offers.
HTTP Connectors:
Singer connectors:
Simple Python connectors using the barebones Source abstraction:
We assume python points to Python >=3.8.
Set up a virtual env:
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]" # [dev] installs development-only dependencies
Run tests with python -m pytest -s unit_tests.
Perform static type checks with mypy airbyte_cdk. MyPy configuration is in mypy.ini.
Run mypy <files to check> to check only specific files. This is useful because the CDK still contains code that is not compliant.
The type_check_and_test.sh script bundles type checking and testing in one convenient command. Feel free to use it!
If the iteration you are working on includes changes to the models, you might want to regenerate them. To do that, run:
./gradlew :airbyte-cdk:python:format
This will generate the files based on the schemas, add the license information, and format the code. If you only want to do the former and rely on pre-commit for the others, you can run the appropriate generation command, i.e. ./gradlew generateComponentManifestClassFiles.
All tests are located in the unit_tests directory. Run python -m pytest --cov=airbyte_cdk unit_tests/ to run them. This also presents a test coverage report.
When developing a new feature in the CDK, you may find it helpful to run a connector that uses that new feature. You can test this in one of two ways:
In order to get a local Python connector running your local CDK, do the following.
First, make sure you have your connector's virtual environment active:
# from the `airbyte/airbyte-integrations/connectors/<connector-directory>` directory
source .venv/bin/activate
# if you haven't installed dependencies for your connector already
pip install -e .
Then, navigate to the CDK and install it in editable mode:
cd ../../../airbyte-cdk/python
pip install -e .
You should see that pip has uninstalled the version of airbyte-cdk defined by your connector's setup.py and installed your local CDK. Any changes you make will be immediately reflected in your editor, so long as your editor's interpreter is set to your connector's virtual environment.
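To confirm that the connector's virtual environment really resolves airbyte_cdk to the local checkout, one option is to ask Python where it would import the package from. The sketch below uses only the standard library. The helper name package_location is hypothetical and not part of the CDK.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_location(name: str) -> Optional[Path]:
    """Return the directory a top-level package would be imported from.

    Returns None when the package cannot be found in the active environment.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # spec.origin is the package's __init__.py; its parent is the package directory.
    return Path(spec.origin).resolve().parent


if __name__ == "__main__":
    location = package_location("airbyte_cdk")
    if location is None:
        print("airbyte_cdk is not installed in this environment")
    else:
        print(f"airbyte_cdk is imported from {location}")
```

If the editable install worked, the printed path should point inside your local airbyte-cdk/python checkout rather than inside the virtual environment's site-packages directory.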
Pre-requisite: Install the airbyte-ci CLI
You can build your connector image with the local CDK using
# from the airbytehq/airbyte base directory
airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
Note that the local CDK is injected at build time, so if you make changes, you will have to run the build command again to see them reflected.
Pre-requisite: Install the airbyte-ci CLI
To run acceptance tests for a single connector using the local CDK, from the connector directory, run
airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test
At times you may not have access to the API, for example because you lack credentials or network access. You will probably still want to do end-to-end testing at least once. To do so, you can emulate the server you would be reaching with a server stubbing tool.
For example, using mockserver, you can set up an expectation file like this:
{
  "httpRequest": {
    "method": "GET",
    "path": "/data"
  },
  "httpResponse": {
    "body": "{\"data\": [{\"record_key\": 1}, {\"record_key\": 2}]}"
  }
}
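The response body in an expectation is a JSON document stored as an escaped string, which is easy to get wrong by hand. As a minimal sketch, the expectation can be generated with Python's json module so that the escaping is handled automatically. The function name build_expectation is hypothetical. Only the fields shown in the example expectation are produced.

```python
import json
from typing import Any, Dict, List


def build_expectation(method: str, path: str, records: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build one expectation with the same fields as the hand-written example."""
    return {
        "httpRequest": {"method": method, "path": path},
        # The body is itself JSON stored as a string, so it is serialized
        # separately; the outer json.dumps call then escapes the quotes.
        "httpResponse": {"body": json.dumps({"data": records})},
    }


if __name__ == "__main__":
    expectation = build_expectation("GET", "/data", [{"record_key": 1}, {"record_key": 2}])
    print(json.dumps(expectation, indent=2))
```

Redirecting the script's output to a file gives an expectation file equivalent to the hand-written one.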
Assuming this file has been created at secrets/mock_server_config/expectations.json, running the following command starts a mock server that answers any GET request on path /data with the response defined in the expectation file:
docker run -d --rm -v $(pwd)/secrets/mock_server_config:/config -p 8113:8113 --env MOCKSERVER_LOG_LEVEL=TRACE --env MOCKSERVER_SERVER_PORT=8113 --env MOCKSERVER_WATCH_INITIALIZATION_JSON=true --env MOCKSERVER_PERSISTED_EXPECTATIONS_PATH=/config/expectations.json --env MOCKSERVER_INITIALIZATION_JSON_PATH=/config/expectations.json mockserver/mockserver:5.15.0
HTTP requests to localhost:8113/data should now return the body defined in the expectations file. To test this, the implementer has to either change the code that defines the base URL for a Python source or update the url_base for a low-code source. With the Connector Builder running in Docker, you will have to use the domain host.docker.internal instead of localhost, as the requests are executed within Docker.
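For quick checks where Docker is not available, the same stubbing idea can be sketched with Python's standard library alone. This is neither mockserver nor part of the CDK. StubHandler, fetch_records and run_against_stub are hypothetical names, and fetch_records stands in for connector code that receives its base URL as a parameter.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, List

# Same payload as the body defined in the example expectation.
RESPONSE_BODY = json.dumps({"data": [{"record_key": 1}, {"record_key": 2}]}).encode("utf-8")


class StubHandler(BaseHTTPRequestHandler):
    """Answer GET /data with the canned body and everything else with 404."""

    def do_GET(self) -> None:
        if self.path == "/data":
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(RESPONSE_BODY)))
            self.end_headers()
            self.wfile.write(RESPONSE_BODY)
        else:
            self.send_error(404)

    def log_message(self, format: str, *args: Any) -> None:
        pass  # suppress per-request logging


def fetch_records(url_base: str) -> List[Dict[str, Any]]:
    """Stand-in for connector code: read records from <url_base>/data."""
    # An empty ProxyHandler bypasses any system proxy for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{url_base}/data") as response:
        return json.loads(response.read())["data"]


def run_against_stub() -> List[Dict[str, Any]]:
    """Start the stub on a free port, fetch once, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)  # port 0 picks any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_records(f"http://127.0.0.1:{server.server_port}")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_against_stub())
```

Passing the base URL in as a parameter is the design point here: the same code can then be aimed at the real API, at the stub, or at host.docker.internal without further changes.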
Run the Publish CDK Manually workflow from master, using release-type=major|minor|patch and setting the changelog message.
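The release-type value selects which part of the version number is incremented. As a rough illustration, and assuming the workflow follows semantic-versioning conventions (the text above does not spell this out), the effect of each choice can be sketched as follows. bump_version is a hypothetical helper, not the workflow's actual implementation.

```python
def bump_version(version: str, release_type: str) -> str:
    """Return the next version under semantic-versioning rules (assumed)."""
    major, minor, patch = (int(part) for part in version.split("."))
    if release_type == "major":
        # Breaking change: reset minor and patch.
        return f"{major + 1}.0.0"
    if release_type == "minor":
        # Backwards-compatible feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if release_type == "patch":
        # Backwards-compatible fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown release type: {release_type!r}")


if __name__ == "__main__":
    for release_type in ("major", "minor", "patch"):
        print(release_type, bump_version("1.2.3", release_type))
```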