Using dbt with Dagster, part two: Load dbt models as Dagster assets

This is part two of the Using dbt with Dagster software-defined assets tutorial.

At this point, you should have a fully configured dbt project that's ready to work with Dagster.

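In this part, you'll begin integrating dbt with Dagster. To do this, you'll:

  • Load the dbt models into Dagster as assets
  • Define a Dagster code location that supplies a dbt resource
  • View the assets in the Dagster UI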

Step 1: Load the dbt models as assets

In this step, you'll load the dbt models into Dagster as assets using the dagster-dbt library.

Open the __init__.py file, located in /tutorial_template/tutorial_dbt_dagster/assets, and add the following code:

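from dagster_dbt import load_assets_from_dbt_project

from dagster import file_relative_path

# Path to the dbt project, resolved relative to this file
DBT_PROJECT_PATH = file_relative_path(__file__, "../../jaffle_shop")

# Load each dbt model as a Dagster asset, prefixing its asset key with "jaffle_shop"
dbt_assets = load_assets_from_dbt_project(project_dir=DBT_PROJECT_PATH, key_prefix=["jaffle_shop"])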

Let's discuss what this example is doing, specifically the load_assets_from_dbt_project function. This function loads dbt models into Dagster as assets, creating one Dagster asset for each model.

When invoked, this function:

  1. Compiles your dbt project,
  2. Parses the metadata provided by dbt, and
  3. Generates a set of software-defined assets reflecting the models in the project. These assets share the same underlying op, which will invoke dbt to run the models represented by the loaded assets.
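
The three steps above can be pictured with a small, library-free sketch. It assumes a hand-written fragment shaped like dbt's manifest.json (a nodes mapping whose entries carry name, resource_type, and depends_on) and a hypothetical helper, sketch_assets_from_manifest, that is not part of dagster-dbt. It only illustrates how parsed dbt metadata could become prefixed asset keys and dependencies; the real library does considerably more.

```python
from typing import Any, Dict, List, Mapping, Sequence


def sketch_assets_from_manifest(
    manifest: Mapping[str, Any], key_prefix: Sequence[str] = ()
) -> Dict[str, Dict[str, Any]]:
    """Map each dbt model to a prefixed key and the keys of its upstream nodes.

    Hypothetical helper for illustration only; not the dagster-dbt implementation.
    """
    nodes = manifest["nodes"]

    def key_for(unique_id: str) -> List[str]:
        # Simplification: every node gets the same prefix plus its dbt name.
        return [*key_prefix, nodes[unique_id]["name"]]

    assets: Dict[str, Dict[str, Any]] = {}
    for unique_id, node in nodes.items():
        if node["resource_type"] != "model":
            continue  # this sketch only turns models into assets
        upstream = [key_for(dep) for dep in node["depends_on"]["nodes"] if dep in nodes]
        assets[unique_id] = {"key": key_for(unique_id), "deps": upstream}
    return assets


# Illustrative, hand-written fragment; a real manifest.json holds far more fields.
EXAMPLE_MANIFEST = {
    "nodes": {
        "seed.jaffle_shop.raw_customers": {
            "name": "raw_customers",
            "resource_type": "seed",
            "depends_on": {"nodes": []},
        },
        "model.jaffle_shop.stg_customers": {
            "name": "stg_customers",
            "resource_type": "model",
            "depends_on": {"nodes": ["seed.jaffle_shop.raw_customers"]},
        },
        "model.jaffle_shop.customers": {
            "name": "customers",
            "resource_type": "model",
            "depends_on": {"nodes": ["model.jaffle_shop.stg_customers"]},
        },
    }
}

sketched = sketch_assets_from_manifest(EXAMPLE_MANIFEST, key_prefix=["jaffle_shop"])
for unique_id, info in sketched.items():
    print(unique_id, info["key"], info["deps"])
```

Each model in the fragment yields one entry, which mirrors the one-asset-per-model behavior described above.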

load_assets_from_dbt_project is one of two ways to load dbt models into Dagster, and it's the one we recommend for small dbt projects. For larger projects, we recommend load_assets_from_dbt_manifest, which loads models from a dbt manifest.json file instead of compiling the project each time the function is invoked.
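
For the manifest-based approach, the manifest.json file has to be read into memory first. The following is a minimal sketch of that step, assuming dbt has already written the file to its default target directory inside the project. read_dbt_manifest is a hypothetical helper name, and the commented-out call at the end is an assumed usage that isn't executed here; check the dagster-dbt API reference for the exact signature in your version.

```python
import json
from pathlib import Path
from typing import Any, Dict


def read_dbt_manifest(project_dir: str, target_dir: str = "target") -> Dict[str, Any]:
    """Read <project_dir>/<target_dir>/manifest.json into a dictionary.

    Hypothetical helper; assumes dbt has already produced the manifest.
    """
    manifest_path = Path(project_dir) / target_dir / "manifest.json"
    with manifest_path.open(encoding="utf-8") as f:
        return json.load(f)


# Assumed usage (not executed here):
# from dagster_dbt import load_assets_from_dbt_manifest
# dbt_assets = load_assets_from_dbt_manifest(
#     read_dbt_manifest(DBT_PROJECT_PATH), key_prefix=["jaffle_shop"]
# )
```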

Let's take a look at the arguments we've supplied:

  • project_dir, which is the path to the dbt project
  • key_prefix, which is a prefix applied to the asset key of every model in the dbt project
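
To see where the project_dir value points, the sketch below reproduces the path arithmetic with the standard library. It assumes file_relative_path behaves like joining the directory of the given file with the relative path, and relative_to_file is a hypothetical stand-in written for this illustration. posixpath is used so the result is the same on every platform.

```python
import posixpath


def relative_to_file(dunder_file: str, relative_path: str) -> str:
    """Join the directory containing dunder_file with relative_path.

    Hypothetical stand-in for illustration; assumed to mirror file_relative_path.
    """
    return posixpath.join(posixpath.dirname(dunder_file), relative_path)


# Location of the assets module's __init__.py, as given in this tutorial
init_file = "/tutorial_template/tutorial_dbt_dagster/assets/__init__.py"

raw_path = relative_to_file(init_file, "../../jaffle_shop")
resolved_path = posixpath.normpath(raw_path)  # collapse the ".." segments

print(raw_path)
print(resolved_path)
```

The two ".." segments climb out of assets and tutorial_dbt_dagster, so the path lands on a jaffle_shop directory directly under /tutorial_template.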

Step 2: Define a Dagster code location

Next, you'll define the code location for your Dagster project. A code location, created using the Definitions object, is a collection of definitions in a Dagster project, such as assets, resources, and so on.

Assets loaded from dbt require a dbt resource, which is responsible for firing off dbt CLI commands. We'll use DbtCli to supply this resource to the assets loaded from the dbt project.

Open the __init__.py file, located in /tutorial_template/tutorial_dbt_dagster, and add the following code:

from dagster_dbt import DbtCli
from tutorial_dbt_dagster import assets
from tutorial_dbt_dagster.assets import DBT_PROJECT_PATH

from dagster import Definitions, load_assets_from_modules

# The "dbt" key maps to the resource that runs dbt CLI commands
resources = {"dbt": DbtCli(project_dir=DBT_PROJECT_PATH)}

# Collect every asset in the assets module and supply the resources to them
defs = Definitions(assets=load_assets_from_modules([assets]), resources=resources)

Let's take a look at what's happening here:

  • In the resources dictionary, we've configured the DbtCli resource under the dbt key and pointed it at the dbt project directory.
  • We've passed the assets and the resources dictionary to the Definitions object. This supplies the resource we created to our assets.
  • Using load_assets_from_modules, we've added all assets in the assets module as definitions. This way, any new assets we create are automatically included in the code location instead of needing to be added one by one.
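
The idea of picking up every asset in a module can be illustrated with plain module introspection. This is a conceptual sketch only: FakeAsset and collect_assets are hypothetical names invented for the example, and Dagster's actual loading logic is not shown or implied here.

```python
import types
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FakeAsset:
    """Hypothetical stand-in for an asset definition."""

    name: str


def collect_assets(module: types.ModuleType) -> List[FakeAsset]:
    """Return every FakeAsset found at module level, including those inside lists."""
    found: List[FakeAsset] = []
    for value in vars(module).values():
        if isinstance(value, FakeAsset):
            found.append(value)
        elif isinstance(value, (list, tuple)):
            found.extend(item for item in value if isinstance(item, FakeAsset))
    return found


# Build a throwaway module that resembles an assets module
demo = types.ModuleType("demo_assets")
demo.customers = FakeAsset("customers")
demo.dbt_assets = [FakeAsset("stg_customers"), FakeAsset("stg_orders")]
demo.DBT_PROJECT_PATH = "../../jaffle_shop"  # non-asset attributes are ignored

names = sorted(asset.name for asset in collect_assets(demo))
print(names)
```

Adding another FakeAsset attribute to the module would make it show up in the result without touching the collection code, which is the convenience the last bullet describes.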

Step 3: View the assets in the Dagster UI

In this step, you'll start Dagster's web UI and view the loaded assets.

  1. To start the UI, run the following in /tutorial_template:

    dagster dev
    

    This results in output similar to:

    Serving dagster-webserver on http://127.0.0.1:3000 in process 70635
    
  2. In your browser, navigate to http://127.0.0.1:3000. The page will display the assets:

    Asset graph in the UI, containing dbt models loaded as Dagster assets
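
If the page doesn't load, a quick reachability check can help distinguish a stopped webserver from a browser problem. The sketch below uses only the standard library and assumes the address shown in the output above; ui_is_reachable is a hypothetical helper, not part of Dagster.

```python
import urllib.error
import urllib.request


def ui_is_reachable(url: str = "http://127.0.0.1:3000", timeout: float = 2.0) -> bool:
    """Return True if an HTTP request to url succeeds, False otherwise.

    Hypothetical helper for illustration; the default URL is the one from the
    example output and may differ on your machine.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False


if __name__ == "__main__":
    print("Dagster UI reachable:", ui_is_reachable())
```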

What's next?

At this point, you've loaded your dbt models into Dagster as assets, supplied them with a dbt resource, and viewed them in the UI's asset graph. The next step is to add upstream Dagster assets and kick off a run that materializes them.