Selecting & Excluding#
Cosmos allows you to filter to a subset of your dbt project in each DbtDag / DbtTaskGroup using the select and exclude parameters in the RenderConfig class.
Since Cosmos 1.3, the selector parameter is available in RenderConfig when using LoadMode.DBT_LS to parse the dbt project into Airflow.
Since Cosmos 1.13, the selector parameter is available in RenderConfig when using LoadMode.DBT_MANIFEST to parse the dbt project into Airflow.
Using select and exclude#
The select and exclude parameters are lists, with values like the following:
tag:my_tag: include/exclude models with the tag my_tag
config.meta.some_key:some_value: include/exclude models with config.meta.some_key: some_value
config.materialized:table: include/exclude models with the config materialized: table
path:analytics/tables: include/exclude models in the analytics/tables directory. In this example, analytics/tables is a relative path, but absolute paths are also supported.
+node_name+1 (graph operators): include/exclude the node with name node_name, all its parents, and its first generation of children (dbt graph selector docs)
+/path/to/model_g+ (graph operators): include/exclude all the nodes in the absolute path /path/to/model_g, their parents and children. Relative paths are also supported.
+tag:nightly (graph operators): include/exclude all nodes that have the tag nightly and their parents
+config.materialized:view (graph operators): include/exclude all the nodes that have the materialization view and their parents
@node_name (@ operator): include/exclude the node with name node_name, all its descendants, and all ancestors of those descendants. This is useful in CI environments where you want to build a model and all its descendants, but you need the ancestors of those descendants to exist first.
tag:my_tag,+node_name (intersection): include/exclude node_name and its parents if they have the tag my_tag (dbt set operator docs)
['tag:first_tag', 'tag:second_tag'] (union): include/exclude nodes that have either tag:first_tag or tag:second_tag
resource_type:<resource>: include/exclude nodes with the given resource type: seed, snapshot, model, test, or source. For example, resource_type:source returns only nodes where resource_type == SOURCE
source:my_source: include/exclude nodes that have the source my_source and are of resource_type source
source:my_source+: include/exclude nodes that have the source my_source and their children
source:my_source.my_table: include/exclude nodes that have the source my_source and the table my_table
exposure:my_exposure: include/exclude nodes that have the exposure my_exposure and are of resource_type exposure
exposure:+my_exposure: include/exclude nodes that have the exposure my_exposure and their parents
Note
If you’re using the dbt_ls parsing method, these arguments are passed directly to the dbt CLI command.
If you’re using the dbt_manifest parsing method, Cosmos will filter the models in the manifest before creating the DAG. This does not directly use dbt’s CLI command, but should include all metadata that dbt would include.
If you’re using the custom parsing method, Cosmos does not currently read the dbt_project.yml file. You can still select/exclude models if you’re selecting on metadata defined in the model code or .yml files in the models directory.
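As a rough illustration of what filtering nodes by select and exclude means, the sketch below applies tag and config criteria to a small dictionary shaped loosely like the nodes section of a dbt manifest. It is plain Python written for this page, not Cosmos's implementation: the NODES data and the matches and select_nodes helpers are hypothetical, and only the tag: and config.<key>: methods are handled.

```python
# Hypothetical stand-in for the "nodes" section of a dbt manifest.
NODES = {
    "model.jaffle_shop.customers": {
        "tags": ["nightly", "finance"],
        "config": {"materialized": "table"},
    },
    "model.jaffle_shop.orders": {
        "tags": ["nightly"],
        "config": {"materialized": "view"},
    },
    "model.jaffle_shop.stg_payments": {
        "tags": ["finance"],
        "config": {"materialized": "view"},
    },
}


def matches(node, criterion):
    """Return True if a node satisfies one 'method:value' criterion."""
    method, _, value = criterion.partition(":")
    if method == "tag":
        return value in node["tags"]
    if method.startswith("config."):
        key = method[len("config."):]
        return str(node["config"].get(key)) == value
    raise ValueError(f"unsupported method in this sketch: {method}")


def select_nodes(nodes, select=(), exclude=()):
    """Return sorted node names kept after applying select and exclude."""

    def hit(node, items):
        # Separate list items form a union (any item may match);
        # comma-separated criteria inside one item form an intersection.
        return any(
            all(matches(node, c) for c in item.split(",")) for item in items
        )

    return sorted(
        name
        for name, node in nodes.items()
        if (not select or hit(node, select)) and not hit(node, exclude)
    )


print(select_nodes(NODES, select=["tag:nightly", "tag:finance"]))  # union
print(select_nodes(NODES, select=["tag:nightly,tag:finance"]))     # intersection
print(select_nodes(NODES, select=["tag:nightly"], exclude=["config.materialized:view"]))
```

The union call keeps all three toy models, the intersection call keeps only the one carrying both tags, and the last call drops the view from the nightly set.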
Examples:
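from cosmos import DbtDag, RenderConfig

# Other DbtDag arguments (project_config, profile_config, ...) are omitted in these snippets.
jaffle_shop = DbtDag(
    render_config=RenderConfig(
        select=["tag:my_tag"],  # only nodes tagged my_tag
    )
)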
from cosmos import DbtDag, RenderConfig

jaffle_shop = DbtDag(
    render_config=RenderConfig(
        select=["config.schema:prod"],  # only nodes configured with schema: prod
    )
)
from cosmos import DbtDag, RenderConfig

jaffle_shop = DbtDag(
    render_config=RenderConfig(
        select=["path:analytics/tables"],  # only nodes under analytics/tables
    )
)
from cosmos import DbtDag, RenderConfig

jaffle_shop = DbtDag(
    render_config=RenderConfig(
        select=["tag:include_tag1", "tag:include_tag2"],  # union
    )
)
from cosmos import DbtDag, RenderConfig

jaffle_shop = DbtDag(
    render_config=RenderConfig(
        select=["tag:include_tag1,tag:include_tag2"],  # intersection
    )
)
from cosmos import DbtDag, RenderConfig

jaffle_shop = DbtDag(
    render_config=RenderConfig(
        exclude=["node_name+"],  # node_name and its children
    )
)
from cosmos import DbtDag, RenderConfig

jaffle_shop = DbtDag(
    render_config=RenderConfig(
        # selects my_model, all its descendants,
        # and all ancestors needed to build those descendants
        select=["@my_model"],
    )
)
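To make the graph operators used above (+, a depth suffix such as +1, and @) more concrete, here is a small sketch over a toy dependency graph. It is plain Python for illustration only and is not how Cosmos or dbt resolve selections; the PARENTS graph and every function name are hypothetical.

```python
# Hypothetical dependency graph: each node maps to its direct parents.
PARENTS = {
    "raw_orders": [],
    "raw_payments": [],
    "stg_orders": ["raw_orders"],
    "stg_payments": ["raw_payments"],
    "orders": ["stg_orders", "stg_payments"],
    "report": ["orders"],
}


def children_of(node):
    """Direct children of a node, derived from the parent map."""
    return [name for name, parents in PARENTS.items() if node in parents]


def walk(start, step, depth=None):
    """Collect nodes reachable from start via step, optionally capped at depth hops."""
    seen, frontier, hops = set(), {start}, 0
    while frontier and (depth is None or hops < depth):
        frontier = {nxt for n in frontier for nxt in step(n)} - seen
        seen |= frontier
        hops += 1
    return seen


def ancestors(node, depth=None):
    return walk(node, lambda n: PARENTS[n], depth)


def descendants(node, depth=None):
    return walk(node, children_of, depth)


def at_operator(node):
    """@node: the node and its descendants, plus every ancestor of that set."""
    selected = {node} | descendants(node)
    for n in list(selected):
        selected |= ancestors(n)
    return selected


# "+stg_orders+1": the node, all its parents, and one generation of children
plus_one = {"stg_orders"} | ancestors("stg_orders") | descendants("stg_orders", depth=1)
print(sorted(plus_one))

# "stg_orders+" versus "@stg_orders"
print(sorted({"stg_orders"} | descendants("stg_orders")))
print(sorted(at_operator("stg_orders")))
```

In this toy graph, stg_orders+ leaves out stg_payments and the raw tables, while @stg_orders pulls them in because orders cannot be built without them.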
Using selector#
Note
Currently supported only with LoadMode.DBT_LS (since Cosmos 1.3) or LoadMode.DBT_MANIFEST (since Cosmos 1.13).
If select and/or exclude are used with selector, dbt will ignore the select and exclude parameters.
The selector parameter is a string that references a dbt YAML selector already defined in a dbt project.
Examples:
from cosmos import DbtDag, LoadMode, RenderConfig

jaffle_shop = DbtDag(
    render_config=RenderConfig(
        selector="my_selector",  # this selector must be defined in your dbt project
        load_method=LoadMode.DBT_LS,
    )
)
from pathlib import Path

from cosmos import DbtDag, LoadMode, ProjectConfig, RenderConfig

# Placeholder: point this at the directory that contains your dbt projects.
DBT_ROOT_PATH = Path("/path/to/dbt")

# Manifest read from the local filesystem
jaffle_shop = DbtDag(
    project_config=ProjectConfig(
        manifest_path=DBT_ROOT_PATH / "jaffle_shop" / "target" / "manifest.json",
        project_name="jaffle_shop",
    ),
    render_config=RenderConfig(
        selector="nightly_models",  # this selector must be defined in your dbt project
        load_method=LoadMode.DBT_MANIFEST,
    ),
)

# Manifest read from remote storage through an Airflow connection
jaffle_shop_remote = DbtDag(
    project_config=ProjectConfig(
        manifest_path="s3://cosmos-manifest-test/manifest.json",
        manifest_conn_id="aws_s3_conn",
        project_name="jaffle_shop",
    ),
    render_config=RenderConfig(
        selector="nightly_models",  # this selector must be defined in your dbt project
        load_method=LoadMode.DBT_MANIFEST,
    ),
)
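Because the selector has to exist in the dbt project, it can help to check a local manifest before pointing a DAG at it. The helper below is a hypothetical sketch and not part of Cosmos; it assumes the manifest stores its YAML selectors under a top-level "selectors" mapping keyed by selector name, which is worth verifying against the manifest your dbt version produces.

```python
import json
from pathlib import Path


def selector_names(manifest_path):
    """Return the sorted names of the YAML selectors recorded in a manifest file."""
    manifest = json.loads(Path(manifest_path).read_text())
    # Assumption: selectors live under a top-level "selectors" mapping keyed by name.
    return sorted(manifest.get("selectors", {}))


def require_selector(manifest_path, name):
    """Raise a descriptive error if the named selector is missing from the manifest."""
    available = selector_names(manifest_path)
    if name not in available:
        raise ValueError(f"selector {name!r} not found; available: {available}")
```

Calling require_selector(path_to_manifest, "nightly_models") before building the DAG turns a missing selector into an explicit error that lists the names that are available.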
Using selector with LoadMode.DBT_MANIFEST#
Since Cosmos 1.13, the selector parameter is also supported when using the LoadMode.DBT_MANIFEST parsing method.
When using this combination, Cosmos will read the preprocessed YAML selectors from the manifest file and use them to filter the dbt nodes to include in the Airflow DAG or Task Group.
The YAML selection parser expects the selectors to be defined in the dbt project and will parse the preprocessed selectors found in the manifest file. Modifying the selector definitions in the manifest file in any way may lead to undefined behavior.
The parser may or may not catch invalid selector definitions if the selectors in the manifest are altered.
The YAML selection parsing logic is based on the specification defined in the dbt documentation.
All graph operators and set operators are supported.
Parsing of the default and indirect_selection keywords is not currently supported.
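The idea of evaluating a selector definition built from set operators can be sketched as a small recursive function. This is a simplified illustration under stated assumptions, not Cosmos's parser: it assumes a definition is a nested dictionary using union, intersection, and exclude keys with method/value leaves, handles only the tag method, ignores graph operators, and uses hypothetical NODES data.

```python
# Hypothetical nodes and their tags.
NODES = {
    "customers": {"tags": {"nightly", "finance"}},
    "orders": {"tags": {"nightly"}},
    "legacy_orders": {"tags": {"nightly", "deprecated"}},
    "stg_payments": {"tags": {"finance"}},
}


def evaluate(definition):
    """Return the set of node names matched by a selector definition."""
    for op, combine in (("union", set.union), ("intersection", set.intersection)):
        if op in definition:
            items = definition[op]
            include = [d for d in items if "exclude" not in d]
            exclude = [x for d in items if "exclude" in d for x in d["exclude"]]
            selected = combine(*(evaluate(d) for d in include)) if include else set()
            # Anything matched by an exclude entry is removed from the result.
            return selected - set().union(*(evaluate(x) for x in exclude))
    if definition.get("method") == "tag":
        return {n for n, node in NODES.items() if definition["value"] in node["tags"]}
    raise ValueError(f"unsupported definition in this sketch: {definition}")


# Hypothetical definition: everything tagged nightly except deprecated models.
nightly_models = {
    "union": [
        {"method": "tag", "value": "nightly"},
        {"exclude": [{"method": "tag", "value": "deprecated"}]},
    ]
}
print(sorted(evaluate(nightly_models)))
```

With the toy data above, the nightly_models definition keeps customers and orders and drops legacy_orders.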
If the dbt YAML selector specification changes, Cosmos will attempt to keep up to date with the changes, but there may be a lag between dbt releases and Cosmos releases. Once a new Cosmos version is released with the updated selector parsing logic, users should upgrade Cosmos to ensure compatibility with the latest dbt selector specification. Whenever the YAML selector parser is updated, existing YAML selector caches are invalidated the next time the DAG is parsed.