DBT Testing Tools Gap

As explained in a previous post, we had a unique request from our client to physically (i.e. not logically) delete records from a data warehouse. DBT as our tool of choice for advanced data transformation was used to incrementally load the data. Due to the unusual request and implementation we wanted to ensure that the model is well tested. In addition, we felt that post hand over of our work the client will have the most confidence in any future modifications.

Naturally, the first step is to evaluate the tools at our disposal. dbt comes with a set of testing mechanisms which fall into two categories:

While those mechanisms are powerful tools and come especially handy for validation and sanity checks, during development time something is missing. We felt a gap in the testing tools available to us during model development.

There is no tool to let us deterministically re-create a set of conditions, execute our code and validate the output. That is a basic definition of what tests are, yet we struggled to achieve the same with the so-called tests in dbt.

With my software engineer hat ? on, I decided to explore the possibility of employing a beloved agile practice – BDD.

Behaviour Driven Development put quite simply makes you specify in plain (ish) English what your software should do. Thanks to the many testing frameworks we can then turn that spec into an automated test, executed on every build!

BDD prescribes a standardised format for writing down scenarios, this format is known as Gherkin.  Each scenario is broken down into a set of Given, When and Then sentences which serve as building blocks for scenario assembly.

Scenario: I can buy milk if I have enough money
Given £5 in my wallet
When I purchase milk for £2.40
Then I will have £2.60 remaining in my wallet
And milk in the fridge

The charm of the this example is that it is very plainly conveying the functionality of the software and does not require any understanding of software development or programming languages.

Each of the above lines, known as “steps”, are mapped to a method in our test code. This method constitutes the step implementation – what that implementation does is completely up to us. For instance, the step Given £5 in my wallet might be implemented as a JSON request to a server which loads our wallet with £5 or it could also be a database request to our mock/ephemeral database instance. Steps can take parameters to improve their reusability. E.g. we should not have one step that deposits £5 into our wallet and another one which deposits £10, but a single step with an amount being passed in as a parameter!

Note: Same best coding practices apply to any test code we write! So we should aim to minimise the number of unique steps. Parameters and tables should be used to drive reusability and decrease code repetition. 

To write gherkin style tests one needs a good understanding of what the software does and a text editor.

Likewise, to read and understand those scenarios, all that is needed is an understanding of the application domain.

In my previous role, a QA lead and I introduced a workflow that enabled any developer and any QA engineer to work independently on a feature:

    1. QA and Dev get together and come up with the scenarios
    2. QA & Dev get a blessing from the PO confirming that those scenarios truthfully represent the feature
    3. Dev approaches implementation
    4. QA works on the implementation of any test code
    5. Dev & QA merge their code and discover that everything works perfectly ? – Yes that is often not true, usually a small set of iterations of the production and test code are required to achieve correctness.

We saw a significantly reduced number of issues being carried over to the next sprint and therefore velocity improved! In addition, the number of bugs dropped, code started to look better, and developers felt productive again!

For those of us that are familiar with Test Driven Development, this is exactly what I described with my example above. However, TDD is often explained in the context of unit tests and while that is absolutely fine, it is often difficult to write those tests if you have no existing code! Further, writing unit tests is more time consuming and especially difficult when compared to gherkin scenarios!

So why was BDD suddenly required to test our dbt models? Well, frankly speaking, it is not! However, with dbt tests it is not possible to:

  • assert the exact values recorded in each column against the input
  • test a more complicated scenario consisting of multiple steps with incremental dbt runs in between
  • communicate in plain text the specification of your models to somebody else

The particular use case (as explained in this post) we were solving for was:

  • receive a bunch of messages
  • transform using dbt
  • assert on the values recorded in the target
  • schema
  • record a bunch more messages
  • transform using dbt
  • assert that there are fewer records now in the
  • target schema and their values

Here is what this looks like in gherkin:

Pretty close to the summary above, right? ?

Next, we actually had to implement the steps. Thankfully, python comes with a wonderful BDD framework called behave, which provides all the plumbing to run BDD tests.

We started with the implementation of the environment.py – this is a special file that is executed in order to bootstrap your environment. Knowing that we would be running dbt which requires a profiles.yml file anyway, we decided to process that for any database operations that we had to do outside of dbt.

def get_dbt_config() -> (str, str):
    profiles_file = os.environ.get('DBT_PROFILES_FILE',\
os.path.expanduser("~/.dbt/profiles.yml"))
    profile_name = os.environ.get('DBT_PROFILE', 'default')
    return profiles_file, profile_name</pre>


def before_all(context):
profiles_file, profile_name = get_dbt_config()
db_config = get_db_config(profiles_file, profile_name)
context.db_connection = connect_db(db_config)

Next, a good testing practice is to generate your input data as randomly as possible. Therefore we wrote a realistic random message generator using the Faker python module and overwrote some values with the user supplied ones from the above table. Using the db connection, bootstrapped in our environment.py and conveniently made available in the context, we could easily insert the generated records.

Running dbt from python is easy using the subprocess module. We only ensured that we are passing in a couple of variables which we could use in any macros:

dbt_vars = {
    "run_type": "integration",
    "schema_name": schema_name
}</pre>
refresh_param = "--full-refresh" if run_type.lower() == FULL_REFRESH_PARAM else ""

cmd = f"dbt run {refresh_param} --profiles-dir {profiles_dir} --profile {profile_name} --vars '{dbt_vars}'"

Finally, we also needed to assert each value that was persisted. Well that is also easy once you have a database connection in your context!

def the_table_contains_the_following_records(context, table_name: str):
table: behave.model.Table = context.table
columns = ",".join(table.headings)

cursor = context.db_connection.cursor()
try:
    records = cursor.execute(
        f"select {columns} from {table_name} order by event_type, account_id, object_id, object_sub_id asc").fetchall()

If you are using a database backend which you can run locally with e.g. docker then you can build a fully ephemeral environment for every build! Alternatively a test database needs to be setup.

Finally, I am aware of alternative data transformation tools such as dataform and their approach to testing. However, even though dataform’s unit test feature is further along when compared to dbt test capabilities, they still lack the ability to perform multi-step complex scenarios.

Share on facebook
Share on twitter
Share on linkedin

More on the topic