Before the introduction of the Definitions API, definitions were grouped into repositories, and there could be many repositories in a particular code location. Refer to the Repositories documentation for info on this previous API and mental model.
A code location is a collection of Dagster definitions loadable and accessible by Dagster's tools, such as the CLI, UI, and Dagster Cloud. A code location comprises:
A reference to a Python module that has an instance of Definitions in a top-level variable
A Python environment that can successfully load that module
Definitions within a code location have a common namespace and must have unique names. This allows them to be grouped and organized by code location in tools.
A single deployment can have one or multiple code locations.
Code locations are loaded in a different process and communicate with Dagster system processes over an RPC mechanism. This architecture provides several advantages:
When there is an update to user code, the Dagster webserver/UI can pick up the change without a restart.
You can use multiple code locations to organize jobs, but still work on all of your code locations using a single instance of the webserver/UI.
The Dagster webserver process can run in a separate Python environment from user code so job dependencies don't need to be installed into the webserver environment.
Each code location can be sourced from a separate Python environment, so teams can manage their dependencies (or even their Python versions) separately.
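The process separation described above can be illustrated with a small standard-library sketch. It is not Dagster's RPC mechanism: it only runs a stand-in piece of user code in a child process and reads back a JSON summary, so the parent process never imports the user code itself. The USER_CODE snippet, the list_definitions helper, and the job and asset names are all hypothetical.

```python
import json
import subprocess
import sys

# Hypothetical "user code": stands in for a module that defines jobs and assets.
USER_CODE = """
import json
definitions = {"jobs": ["daily_report"], "assets": ["raw_orders", "clean_orders"]}
print(json.dumps(definitions))
"""


def list_definitions(python_executable: str) -> dict:
    """Run the user code in a child process and return the summary it prints."""
    result = subprocess.run(
        [python_executable, "-c", USER_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# sys.executable is used for convenience; an interpreter from a different
# virtual environment could be passed instead, which is the point of the
# separation: the parent does not need the user code's dependencies.
definitions = list_definitions(sys.executable)
print(definitions["jobs"])
```

Because the parent only sees the serialized summary, re-running list_definitions after the user code changes picks up the new definitions without restarting the parent, which loosely mirrors the first advantage listed above.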
Definitions can be included in a Python file like my_file.py or a Python module. If using the latter, the Definitions object should be defined in the module's top-level __init__.py file.
Refer to the Running Dagster locally guide for more info about local development, including how to configure your local instance.
Dagster can load a file directly as a code location. The following example uses the -f argument to supply the name of the file:
dagster dev -f my_file.py
This command loads the definitions in my_file.py as a code location in the current Python environment.
You can also include multiple files at a time, where each file will be loaded as a code location:
dagster dev -f my_file.py -f my_second_file.py
Dagster can also load Python modules as code locations. With this approach, Dagster loads the Definitions object assigned to a top-level variable in the module's root __init__.py file. Because this style of development eliminates an entire class of Python import errors, we strongly recommend it for Dagster projects deployed to production.
The following example uses the -m argument to supply the name of the module:
dagster dev -m your_module_name
This command loads the Definitions object assigned to a top-level variable in the named module's root __init__.py file, using the current Python environment.
You can also include multiple modules at a time, where each module will be loaded as a code location:
dagster dev -m your_module_name -m your_second_module
To load definitions without supplying command line arguments, you can use the pyproject.toml file. This file, included in all Dagster example projects, contains a tool.dagster section with a module_name variable:
[tool.dagster]
module_name = "your_module_name" ## name of project's Python module
code_location_name = "your_code_location_name" ## optional, name of code location to display in the Dagster UI
With this section defined, you can run dagster dev with no arguments from the directory that contains the pyproject.toml file.
The dagster_cloud.yaml file is used to create and deploy code locations for Cloud deployments. Each code location entry in this file has a code_source property, which is used to specify how a code location is sourced. Code locations can be sourced from a Python file or module:
To load a code location from a Python file, set the python_file property under code_source in your dagster_cloud.yaml.
The workspace.yaml file is used to load code locations for open source (OSS) deployments. This file specifies how to load a collection of code locations and is typically used in advanced use cases. Refer to the Open source deployment guides for more info.
If you used @repository in previous Dagster versions, you might be interested in how Definitions and repositories differ. Check out the following table for a high-level comparison:
Definitions (Recommended)
Repositories
Minimum Dagster version
1.1.7
0.6
Description
Created by using the Definitions object assigned to a top-level variable