Transforming Data with Daasity

Get an overview of how to transform the raw data in your data warehouse into analysis-ready models

Introduction

You can use Daasity to transform your raw data into useful analytical models, either with custom transform code written by your team or with pre-built SQL code blocks maintained by Daasity.

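When you transform data with Daasity, you will be using two different types of files: SQL files, which contain your transformation code, and script manifest YAML files, which list the SQL scripts to run.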

You will deploy and manage all of these files within a custom code repository in GitHub, which we set up for you when you sign up for a Daasity account. Read more about code repositories here.

Once you have some SQL files and at least one script manifest YAML file in your repository, you can execute the code using a workflow, which you create and manage within the Daasity app to orchestrate your transformations. You can specify in the app how often a workflow should run and whether any data extractions should run before running the transformation code. You can get all the details on workflows here.

Example of running a simple transform script

We'll walk through the steps of testing your very first transformation script after getting access to your code repository.

Step 1: Create a custom SQL file in your custom repository

From the Code Repository page, navigate to your custom repository.

Add a file to your repository called demo_test.sql with the following contents, and commit the changes:

demo_test.sql
-- make sure the schema doesn't exist (CASCADE also drops any tables left by a previous run)
DROP SCHEMA IF EXISTS daasity_demo CASCADE;

-- create the schema
CREATE SCHEMA daasity_demo;

-- create a table
CREATE TABLE daasity_demo.demo
(
    id      VARCHAR
  , value   INTEGER
)
;

-- insert values
INSERT INTO daasity_demo.demo
VALUES
('1001', 1),
('1002', 2),
('1003', 3)
;

-- select count and sum
SELECT COUNT(value), SUM(value)
FROM daasity_demo.demo
;

This code drops and recreates the daasity_demo schema in your data warehouse, creates the demo table in it, inserts three rows of dummy data, and then selects the row count and the sum of the value column.

Step 2: Create a new script manifest file in your custom repository

Add a script manifest YAML file to your repository called demo_test.yml with the following contents, and commit the changes:

demo_test.yml
version:  '2.0.0'

sections:
  test:
    scripts:
      - "demo_test.sql"

You can read our script manifest file article to understand the components of this code, but the gist is that it will run the demo_test.sql script you created in Step 1.

Step 3: Set up a workflow to run the script manifest file

Navigate to the Workflows section of the Daasity app and create a new workflow:

Name the workflow.

In the Data Transformation section, toggle on "Run transform scripts".

In the Data Transformation section, select the branch you used when creating the files in Steps 1 and 2, and select the demo_test.yml script manifest file.

Click "Create" in the upper right corner of the screen.

Step 4: Run the workflow & check your data

On the Workflows page, find the workflow you just created in the Configured Workflows section.

Then hover over the row and click the "Run" button.

On the next screen, click the "Run" button in the upper right-hand corner.

Once the transformation part of the workflow has completed, check your data warehouse. You should now have a daasity_demo.demo table with 3 rows of data.
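
One way to confirm the result is to run a quick check query against your warehouse. This is a minimal sketch that assumes you have a SQL client connected to the same warehouse the workflow wrote to; the column aliases row_count and value_total are illustrative names, not anything Daasity requires.

```sql
-- Sanity check for the demo table created by demo_test.sql.
-- The script inserts three rows with values 1, 2, and 3,
-- so this should return one row: row_count = 3, value_total = 6.
SELECT
    COUNT(*)    AS row_count
  , SUM(value)  AS value_total
FROM daasity_demo.demo
;
```

If the counts are higher or lower than expected, the script probably differs from the version in Step 1.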

If the table was not created, check that your demo_test.sql and demo_test.yml files exactly match the code from Steps 1 and 2, and try running the workflow once more.
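
When troubleshooting, it can help to see whether the schema and table exist at all before re-running. The query below is a sketch that assumes your warehouse exposes the standard information_schema.tables view and that your user can see the daasity_demo schema; the LOWER() call is there because some warehouses store unquoted identifiers in upper case.

```sql
-- List every table the warehouse currently has in the demo schema.
-- An empty result means the schema or table was never created,
-- which points at the workflow run rather than at the data.
SELECT
    table_schema
  , table_name
FROM information_schema.tables
WHERE LOWER(table_schema) = 'daasity_demo'
;
```

A single row for the demo table means the CREATE TABLE step succeeded, so any remaining problem is more likely in the INSERT step.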

This is an extremely simple example, but it is the basic process for all transformations that occur within Daasity. When you're ready to set up more advanced workflows that incorporate data extractions or that run regularly at set intervals, read our Getting Started with Workflows and Creating Workflows articles.

Running pre-built Daasity SQL blocks

The above example shows how you can run your own custom SQL code. But if you're using our Transform code feature, you also have the option of running pre-built transform code that is maintained by Daasity.

This code lives in the Daasity shared repository, which you get access to when you opt in to our Transform code license. This shared repository gives you access to all of the code you need to model your data using our Daasity Data Models.

You can run the shared code by referencing the shared scripts from your script manifest files. Doing so will always run the most up-to-date version of the Daasity code. Alternatively, you could use the Daasity code as a starting point and customize it to your business needs within your own custom code repository.

Using test warehouses

Our test warehouse feature makes it easy for you to test your transformation scripts without putting your production data in jeopardy. From within the Daasity app, you can clone your production warehouse within minutes and use the clone to test your transform scripts. Learn more about test warehouses here.
