Airbyte Notifications and Webhooks: Effortless ETL Jobs Monitoring
Airbyte wants to limit the amount of time you spend managing your data movement pipelines. While we try to automate as much as possible on behalf of the user, user actions are still sometimes required to support reliable pipelines. Notifications are a good way to ensure you provide the minimum level of attention to your pipeline.
I’m the engineering manager of the team building the UX. We received a lot of feedback on our notification system over the last year. The community wanted notifications to be more detailed, more configurable, and finally, they wanted them to be more flexible through the introduction of webhooks.
We’ve been working on those requests during the past quarter. As we are about to release Airbyte 1.0, I wanted to demonstrate what you can currently do with Airbyte’s notification system and how it supports your reliability goals.
What are Airbyte notifications?
Airbyte notifications allow you to set up your connection and forget about it. You can be confident that if anything requires your attention, Airbyte will notify you. You have better things to do than actively check the status of your data pipeline.
Airbyte notification types
Airbyte notifications are designed to alert you to significant events that might require user action. By default, we can send you a notification when a job fails and you need to correct its configuration (credentials can change, for example), or when the schema of a source changes and you need to decide whether to synchronize the new fields or streams. Another important notification type is the connector update. Airbyte frequently updates its connectors to support new functionality and keep up with sources. We make sure those updates are seamless, but sometimes user action is required to validate the upgrade. Airbyte will always notify you when such an update is required.
Finally, Airbyte can also notify you when a job completes successfully. This confirms that everything is running smoothly and lets you track when your data was last refreshed.
Available notification channels: email and Slack
You can set up Airbyte notifications in the settings UI. We offer Slack notifications and, for Airbyte Cloud customers, email notifications. Email notifications will be sent to the email associated with the workspace.
For Slack notifications, you first need to create a Slack app in your Slack account. Slack then provides a webhook URL that you can enter in Airbyte's notification settings. Slack notifications are particularly useful for monitoring sync failures: you’ll know immediately that your data pipeline needs your attention to keep working. Email notifications are particularly useful for events that are important but not urgent, such as an upcoming connector version upgrade.
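Before pasting a Slack webhook URL into Airbyte, it can help to confirm that the URL works. The sketch below is an illustration, not part of Airbyte. It uses only Python's standard library and relies on Slack incoming webhooks accepting a JSON body with a `text` field. The URL is a placeholder, and the helper names `build_slack_request` and `send_test_message` are hypothetical.

```python
import json
import urllib.request

# Placeholder: replace with the webhook URL Slack generated for your app.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_message(webhook_url: str, text: str) -> int:
    """Send the message and return the HTTP status code of the response."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Example (performs a real network call, so it is left commented out):
# send_test_message(SLACK_WEBHOOK_URL, "Test message before wiring up Airbyte")
```

If the test message shows up in the expected channel, the same URL should be usable in Airbyte's notification settings.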
While notifications inform you of what happened to your data pipelines, they can also empower you to automatically trigger an operation to react to those events. You can do that with webhooks.
Custom notification channels with Airbyte webhooks
Airbyte webhooks are one of Airbyte's integration options; if you haven’t already, check out our public API guide. The public API lets you change your Airbyte configuration and pull information out of Airbyte. Webhooks work in the other direction: they are a way for Airbyte to proactively push information to your server.
To use an Airbyte webhook, you need your own service that we’ll call after a significant event. The event types we support are the same as for notifications.
In the notification settings UI, you can provide any webhook URL, not just a Slack one. Airbyte automatically formats webhook messages to look nice in Slack, and it also includes generic information that can be consumed by any webhook server. This can be useful for building your own integration. For example, if your team uses Microsoft Teams instead of Slack, you can send event notifications to Microsoft Teams following this guide.
As another example, you can trigger your own transformation jobs when an Airbyte job completes. You can also integrate webhooks with your own notification infrastructure through Zapier or IFTTT.
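To make the idea of "your own service" concrete, here is a minimal sketch of a webhook receiver built on Python's standard library. This article does not describe the payload shape Airbyte sends, so the `event_type` key, its values, and the returned action names are all hypothetical. Inspect a real payload from your Airbyte instance and adapt the names before relying on anything like this.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_event(payload: dict) -> str:
    """Map a notification payload to an action name.

    The "event_type" key and its values are hypothetical placeholders,
    not a documented Airbyte schema.
    """
    event_type = payload.get("event_type", "unknown")
    if event_type == "sync_succeeded":
        return "trigger_transformation"
    if event_type == "sync_failed":
        return "page_on_call"
    return "log_only"


class WebhookHandler(BaseHTTPRequestHandler):
    """Accept JSON POST requests and answer with the chosen action."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        try:
            payload = json.loads(raw or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            # Reject bodies that are not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        action = handle_event(payload)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"action": action}).encode("utf-8"))


def make_server(host: str = "127.0.0.1", port: int = 8080) -> HTTPServer:
    """Create the server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), WebhookHandler)


# Example: make_server().serve_forever()
```

Keeping the decision logic in a plain function such as `handle_event` makes it easy to test without starting a server. In a real deployment, the returned action would be replaced by a call into your transformation tool or alerting system.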
How to read Airbyte success and failure notifications
When a sync completes, the notification sent by Airbyte contains a pointer to the connection and some statistics from the synchronization: the number of records and bytes extracted and loaded. The loaded values are the most important ones, because they tell you how many new records have been sent to the destination. In most situations, the extracted values will be similar. Do not worry if they are not: it means Airbyte had to retry the read operation on the source to extract all the records.
In case of failure, the notification also includes the error message provided by the source or the destination. The UI has more information to help you debug the connection.
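The reading rules above can be sketched as a small function that turns a notification into a one-line summary. This is only an illustration: every key used here (`success`, `records_extracted`, `records_loaded`, `bytes_loaded`, `error_message`) is a hypothetical placeholder, not a documented Airbyte field name.

```python
def describe_sync(notification: dict) -> str:
    """Summarize a sync notification in one human-readable line.

    All dictionary keys are hypothetical placeholders; adapt them to the
    payload your Airbyte instance actually sends.
    """
    if not notification.get("success", False):
        error = notification.get("error_message", "no error message provided")
        return f"Sync failed: {error}"

    loaded = notification.get("records_loaded", 0)
    extracted = notification.get("records_extracted", 0)
    bytes_loaded = notification.get("bytes_loaded", 0)
    summary = f"Sync succeeded: {loaded} records loaded ({bytes_loaded} bytes)"
    if extracted > loaded:
        # A higher extracted count usually means the source read was retried,
        # so it is reported for information but not treated as an error.
        summary += f"; {extracted} records extracted (source read was likely retried)"
    return summary
```

The loaded count leads the summary because it is the value that tells you what actually reached the destination.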
Conclusion
Across different Airbyte use cases, Airbyte provides you with multiple ways to be informed of, and react to, any event that happens in your data pipeline. Slack and email notifications are perfect for non-technical practitioners who manage small data pipelines. Webhooks are an excellent option for further automating your reaction to any ETL event and for integrating Airbyte with other tools, which makes them a good fit for larger teams that use many different tools as part of their data pipeline.
Notifications and webhooks have the same goal: reduce the amount of time you have to spend configuring and maintaining your data pipeline and make you confident that your data pipelines are running properly.
Other great reads
This article is part of a series about the reliability features we released to bring Airbyte to 1.0 status. Here are some other articles about those reliability features:
- Handling large records
- Checkpointing to ensure uninterrupted data syncs
- Load balancing Airbyte workloads across multiple Kubernetes clusters
- Resumable full refresh, as a way to build resilient systems for syncing data
- Refreshes to reimport historical data with zero downtime
- Supporting very large CDC syncs
- Monitoring sync progress and solving OOM failures
- Automatic detection of dropped records
Don't hesitate to check out what was included in the Airbyte 1.0 release!