Webhook is a powerful way to integrate systems and enable real-time data transfer between applications. They are an HTTP callback, allowing developers to receive data from external systems when specific events occur. In this article, we will explore how webhooks can be used to transfer data from Fivetran to Snowflake, using Postman to test and transfer data in JSON format.
Before we dive into the technical details, let’s briefly discuss what Fivetran and Snowflake are.
Fivetran is a cloud-based data integration platform that enables businesses to transfer data from various sources to a variety of data warehouses, including Snowflake. It offers pre-built connectors for more than 235+ data sources, making it easy to connect and transfer data from multiple sources in real-time.
Snowflake, on the other hand, is a cloud-based data warehouse that provides high-performance analytics and data processing. It allows businesses to store and analyze large volumes of data from multiple sources, enabling data-driven decision-making.
To begin, we need to create a webhook in Fivetran that will receive data from our external system, in this case, Postman.
Click on Connectors tab, then Add New connector. Search for "Webhooks"
2. Select Destination Schema and Table name.
3. Unpacked means if our data is going to be xml or JSON, fivetran will only unpack it upto one layer meaning any nested json objects or arrays will have variant column in snowflake and simpler attributes will be in same table having their respective column/datatype.
4. Copy your Webhook URL, this URL will be required to send our requests,
5. Data will only be trasnferred only when conector sync runs. Until then all data sent to webook URL needs be stored somewhere. If required we can give our S3 Bucket location otherwise we can also use Fivetran's own container service which will be managed by fivetran.
NOTE: For sensitive data it is recommended to use your own private location to store intermediate data since we don't have any control on fivetran's container service for data storage.
6. Next we can specify if we want any authentication in our wehbook URL requests mechanism. We will be using Token Based mechanism.
NOTE : Copy "Secrets" somewhere safe. We will be using this secret along with our data to authenticate ourself to fivetran's webhook server.
7. We can now save our configuration, After testing fivetran will ensure all parameters we added are correct.
In this step fivetran will also make sure that if we have configured our own storage option (AWS, Azure or GCP), fivetran can access those locations. With the webhook created, Fivetran will send data to these locations first whenever the configured event occurs.
8. If you have sensitive data like account details we can hash or block those information at fivetran's end.
Fivetran will make sure that blocked/hashed data can not be seen in clear text. For more information Please look at data retention policy of fivetran.
After this step our Connector is ready for its Initial sync but before that we need to configure our Snowflake Destination.
You can use script below to configure Database, Role, User, warehouse will be used by Fivetran.
begin;
-- create variables for user / password / role / warehouse / database (needs to be uppercase for objects)
set role_name = 'FIVETRAN_ROLE';
set user_name = 'FIVETRAN_USER';
set user_password = 'password123';
set warehouse_name = 'FIVETRAN_WAREHOUSE';
set database_name = 'FIVETRAN_DATABASE';
-- change role to securityadmin for user / role steps
use role securityadmin;
-- create role for fivetran
create role if not exists identifier($role_name);
grant role identifier($role_name) to role SYSADMIN;
-- create a user for fivetran
create user if not exists identifier($user_name)
password = $user_password
default_role = $role_name
default_warehouse = $warehouse_name;
grant role identifier($role_name) to user identifier($user_name);
-- set binary_input_format to BASE64
ALTER USER identifier($user_name) SET BINARY_INPUT_FORMAT = 'BASE64';
-- change role to sysadmin for warehouse / database steps
use role sysadmin;
-- create a warehouse for fivetran
create warehouse if not exists identifier($warehouse_name)
warehouse_size = xsmall
warehouse_type = standard
auto_suspend = 60
auto_resume = true
initially_suspended = true;
-- create database for fivetran
create database if not exists identifier($database_name);
-- grant fivetran role access to warehouse
grant USAGE
on warehouse identifier($warehouse_name)
to role identifier($role_name);
-- grant fivetran access to database
grant CREATE SCHEMA, MONITOR, USAGE
on database identifier($database_name)
to role identifier($role_name);
-- change role to ACCOUNTADMIN for STORAGE INTEGRATION support (only needed for Snowflake on GCP)
use role ACCOUNTADMIN;
grant CREATE INTEGRATION on account to role identifier($role_name);
use role sysadmin;
commit;
In Destination tab select snowflake as destination and then give all necessary parameters.
Make sure that you can login with fivetran's user and should be able to use warehouse and must have access to a database. If you have used above script then, it should be working as is.
2. Fivetran is going to check all filled details and will pass all tests. This will make sure that fivetran has sufficient access to write data to snowflake.
We will be using postman to generate data to our webhook url and using our secret.
In Postman we will be sending POST request on our Webhook URL provided to us earlier by fivetran.
In Header tab, we need to give X-Fivetran-Secret-Token as key and our secret as value.
In Body tab, we will be using JSON data, as it can be seen in JSON body, it consists of nested JSON objects and array, lets see how fivetran deals with this data. After this we can send our request.
We can now start our Initial sync and after sync will be completed we should be able to see several tables created by fivetran in our snowflake database.
After some time we can see fivetran database has a new schema named "Webhooks" and table name is "Test".
Take a close look at columns of our table. Some of them are Variant and some are Number. Let's have a look at our data.
Nested JSON objects and arrays are having Variant column while outermost attributes are in varchar and number datatype.