Workflows

Workflows are server-side logic that can schedule and combine server tasks and worker tasks to automate complex operations.

Workflows are created from a workflow template chosen from a set maintained by the server administrators, plus data coming from user input.

See Workflows for an overview, and below for technical details.

Workflow noop

This is a workflow that does nothing, and is mainly used in tests.

  • task_data: empty

Workflow sbuild

This workflow takes a source package and creates sbuild work requests (see Sbuild task) to build it for a set of architectures.

  • task_data:

    • input (required): see Task PackageBuild

    • target_distribution (required string): vendor:codename to specify the environment to use for building. It will be used to determine distribution or environment, depending on backend.

    • backend (optional string): see Task PackageBuild

    • architectures (required list of strings): list of architectures to build. It can include all to build binaries for Architecture: all.

    • build_logs_collection (Single lookup with default category debian:package-build-logs, optional): collection where build logs should be retained; if unset, build logs are not added to any collection

    • environment_variant (optional string): variant of the environment we want to build on, e.g. buildd; appended during environment lookup for target_distribution above.

    • build_profiles (optional, default unset): select a build profile, see Task PackageBuild.

    • binnmu (optional, default unset): build a binNMU, see Task PackageBuild.

    • retry_delays (optional list): a list of delays to apply to each successive retry; each item is an integer suffixed with m for minutes, h for hours, d for days, or w for weeks.

The source package will be built on the intersection of the provided list of architectures and the architectures supported in the Architecture: field of the source package. Architecture: all packages are built in an amd64 environment.
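The architecture selection above can be sketched as follows (a minimal illustration, not debusine's implementation; the helper name is hypothetical, and Debian architecture wildcards such as linux-any are ignored):

```python
def build_architectures(requested, source_architectures):
    """Intersect the workflow's ``architectures`` task data with the
    tokens of the source package's ``Architecture:`` field (sketch)."""
    requested = set(requested)
    source = set(source_architectures)
    selected = set()
    for arch in requested - {"all"}:
        # "any" in the source package matches every concrete architecture.
        if "any" in source or arch in source:
            selected.add(arch)
    # Architecture: all binaries are built only if both sides list "all";
    # that build then runs in an amd64 environment.
    if "all" in requested and "all" in source:
        selected.add("all")
    return sorted(selected)

# An "Architecture: any all" source, requested for amd64, arm64 and all:
assert build_architectures(["amd64", "arm64", "all"], ["any", "all"]) == [
    "all", "amd64", "arm64"
]
```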

The workflow may also apply a denylist of architectures if it finds a debian:suite collection corresponding to the build distribution/environment, and that suite provides one.

If build_logs_collection exists, then the workflow adds update-collection-with-data and update-collection-with-artifacts event reactions to each sbuild work request to record their build logs there. See Category debian:package-build-logs.

If retry_delays is set, then the workflow adds a corresponding on_failure retry-with-delays action to each of the sbuild work requests it creates. This provides a simplistic way to retry dependency-wait failures. Note that this currently retries any failure, not just dependency-waits; this may change in future.

Workflow update_environments

This workflow schedules work requests to build tarballs and images, and adds them to a debian:environments collection.

  • task_data:

    • vendor (required): the name of the distribution vendor, used to look up the target debian:environments collection

    • targets (required): a list of dictionaries as follows:

      • codenames (required): the codename of an environment to build, or a list of such codenames

      • codename_aliases (optional): a mapping from build codenames to lists of other codenames; if given, add the output to the target collection under the aliases in addition to the build codenames. For example, trixie: [testing]

      • variants (optional): an identifier to use as the variant name when adding the resulting artifacts to the target collection, or a list of such identifiers; if not given, the default is not to set a variant name

      • backends (optional): the name of the debusine backend to use when adding the resulting artifacts to the target collection, or a list of such names; if not given, the default is not to set a backend name

      • architectures (required): a list of architecture names of environments to build for this codename

      • mmdebstrap_template (optional): a template to use to construct data for the Mmdebstrap task

      • simplesystemimagebuild_template (optional): a template to use to construct data for the SimpleSystemImageBuild task

For each codename in each target, the workflow creates a group. Then, for each architecture in that target, it fills in whichever of mmdebstrap_template and simplesystemimagebuild_template are present and uses them to construct child work requests. In each one, bootstrap_options.architecture is set to the target architecture, and bootstrap_repositories[].suite is set to the codename if it is not already set.
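The template expansion described above can be sketched as follows (a minimal illustration; the helper name is hypothetical, and real templates carry many more fields):

```python
import copy

def fill_template(template, architecture, codename):
    """Expand an mmdebstrap/simplesystemimagebuild template into task
    data for one architecture (illustrative sketch only)."""
    task_data = copy.deepcopy(template)
    # bootstrap_options.architecture is always set to the target architecture.
    task_data.setdefault("bootstrap_options", {})["architecture"] = architecture
    # bootstrap_repositories[].suite defaults to the codename if unset.
    for repo in task_data.get("bootstrap_repositories", []):
        repo.setdefault("suite", codename)
    return task_data

template = {"bootstrap_repositories": [{"mirror": "https://deb.debian.org/debian"}]}
data = fill_template(template, "arm64", "trixie")
```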

The workflow adds one event reaction to each child work request as follows for each combination of the codename (including any matching entries from codename_aliases), variant (variants, or [null] if missing/empty), and backend (backends, or [null] if missing/empty). {vendor} is the vendor from the workflow’s task data, and {category} is debian:system-tarball for mmdebstrap tasks and debian:system-image for simplesystemimagebuild tasks:

on_success:
  - action: "update-collection-with-artifacts"
    artifact_filters:
      category: "{category}"
    collection: "{vendor}@debian:environments"
    variables:
      codename: {codename}
      variant: {variant}  # omit if null
      backend: {backend}  # omit if null
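Enumerating that cross-product of codenames, variants, and backends can be sketched as follows (illustrative only; the helper name is hypothetical):

```python
from itertools import product

def environment_reactions(vendor, category, codenames, variants, backends):
    """One update-collection-with-artifacts reaction per combination of
    codename, variant, and backend (sketch of the rule described above)."""
    reactions = []
    for codename, variant, backend in product(
        codenames, variants or [None], backends or [None]
    ):
        variables = {"codename": codename}
        if variant is not None:
            variables["variant"] = variant
        if backend is not None:
            variables["backend"] = backend
        reactions.append({
            "action": "update-collection-with-artifacts",
            "artifact_filters": {"category": category},
            "collection": f"{vendor}@debian:environments",
            "variables": variables,
        })
    return reactions

# Two codenames (trixie plus its "testing" alias), no variants or backends:
reactions = environment_reactions(
    "debian", "debian:system-tarball", ["trixie", "testing"], [], []
)
```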

Workflow autopkgtest

This workflow schedules autopkgtests for a single source package on a set of architectures.

  • task_data:

    • prefix (string, optional): prefix this string to the item names provided in the internal collection

    • source_artifact (Single lookup, required): see Autopkgtest task

    • binary_artifacts (Multiple lookup, required): see Autopkgtest task

    • context_artifacts (Multiple lookup, optional): see Autopkgtest task

    • vendor (string, required): the distribution vendor on which to run tests

    • codename (string, required): the distribution codename on which to run tests

    • backend (string, optional): see Autopkgtest task

    • architectures (list of strings, optional): if set, only run tests on these architecture names

    • include_tests, exclude_tests, debug_level, extra_environment, needs_internet, fail_on, timeout: see Autopkgtest task

Tests will be run on the intersection of the provided list of architectures (if any) and the architectures provided in binary_artifacts. If only Architecture: all binary packages are provided in binary_artifacts, then tests are run on amd64.

The workflow creates an Autopkgtest task for each concrete architecture, with task data:

  • input.source_artifact: {source_artifact}

  • input.binary_artifacts: the subset of {binary_artifacts} that are for the concrete architecture or all

  • input.context_artifacts: the subset of {context_artifacts} that are for the concrete architecture or all

  • host_architecture: the concrete architecture

  • environment: {vendor}/match:codename={codename}

  • backend: {backend}

  • include_tests, exclude_tests, debug_level, extra_environment, needs_internet, fail_on, timeout: copied from workflow task data parameters of the same names

Any of the lookups in input.source_artifact, input.binary_artifacts, or input.context_artifacts may result in promises, and in that case the workflow adds corresponding dependencies. Binary promises must include an architecture field in their data.

Each work request provides its debian:autopkgtest artifact as output in the internal collection, using the item name {prefix}autopkgtest-{architecture}.
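The per-architecture task data and item names above can be sketched as follows (illustrative only: artifacts are modelled as plain dicts with an architecture key, and lookups and promises are ignored):

```python
def autopkgtest_task_data(arch, params):
    """Build the per-architecture Autopkgtest task data described above.
    ``params`` stands in for the workflow task data (sketch)."""

    def for_arch(artifacts):
        # Keep artifacts built for this architecture or for "all".
        return [a for a in artifacts if a["architecture"] in (arch, "all")]

    return {
        "input": {
            "source_artifact": params["source_artifact"],
            "binary_artifacts": for_arch(params["binary_artifacts"]),
            "context_artifacts": for_arch(params.get("context_artifacts", [])),
        },
        "host_architecture": arch,
        "environment": "{vendor}/match:codename={codename}".format(**params),
        "backend": params.get("backend"),
    }

def item_name(prefix, architecture):
    """The work request's output item name in the internal collection."""
    return "{prefix}autopkgtest-{architecture}".format(
        prefix=prefix, architecture=architecture
    )
```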

Todo

The selection of the host architecture for architecture-independent binary packages should be controlled by pipeline instructions. A similar mechanism might also control multiarch tests, such as testing i386 packages on an amd64 testbed.

Event reactions

The event_reactions field on a workflow is a dictionary mapping events to lists of actions. Each action is described by a dictionary in which the action key defines the action to perform and the remaining keys define the specifics of that action. See the sections below for details. The supported events are the following:

  • on_creation: event triggered when the work request is created

  • on_unblock: event triggered when the work request is unblocked

  • on_success: event triggered when the work request completes successfully

  • on_failure: event triggered when the work request fails or errors out

Supported actions

send-notification

Sends a notification of the event using an existing notification channel.

  • channel: name of the notification channel to use

  • data: parameters for the notification method

update-collection-with-artifacts

Adds or replaces artifact-based collection items with artifacts generated by the current work request.

  • collection (Single lookup, required): collection to update

  • name_template (string, optional): template used to generate the name for the collection item associated with a given artifact. Uses the str.format templating syntax (with variables inside curly braces).

  • variables (dict, optional): definition of the variables used to compute the name for the collection item. Keys and values in this dictionary are interpreted as follows:

    • Keys beginning with $ are handled using JSON paths. The part of the key after the $ is the name of the variable, and the value is a JSON path query to execute against the data dictionary of the target artifact in order to compute the value of the variable.

    • Keys that do not begin with $ simply set the variable named by the key to the value, which is a constant string.

    • It is an error to specify keys for the same variable name both with and without an initial $.

  • artifact_filters (dict, required): this parameter makes it possible to identify a subset of generated artifacts to add to the collection. Each key-value pair represents a Django ORM filter against the Artifact model, so that one can run work_request.artifact_set.filter(**artifact_filters) to identify the desired set of artifacts.

Note

When the name_template key is not provided, it is expected that the collection will compute the name for the new artifact-based collection item. Some collection categories might not even allow you to override the name. In this case, after any JSON path expansion, the variables field is passed to the collection manager’s add_artifact, so it may use those expanded variables to compute its own item names or per-item data.

As an example, you could register all the binary packages having Section: python and a dependency on libpython3.12 out of an sbuild task, with names of the form {package}_{version}, by using this action:

action: 'update-collection-with-artifacts'
artifact_filters:
  category: 'debian:binary-package'
  data__deb_fields__Section: 'python'
  data__deb_fields__Depends__contains: 'libpython3.12'
collection: 'internal@collections'
name_template: '{package}_{version}'
variables:
  $package: 'deb_fields.Package'
  $version: 'deb_fields.Version'
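Applying the rules for $-prefixed keys to an artifact's data can be sketched in Python (a toy dotted-path lookup stands in for real JSON path queries; the helper name is hypothetical):

```python
def expand_variables(variables, artifact_data):
    """Expand the variables dictionary against an artifact's data.
    Minimal sketch: real JSON path queries are richer than the dotted
    lookups handled here."""
    expanded = {}
    for key, value in variables.items():
        if key.startswith("$"):
            # JSON path query: the part after "$" names the variable.
            result = artifact_data
            for part in value.split("."):  # e.g. "deb_fields.Package"
                result = result[part]
            expanded[key[1:]] = result
        else:
            expanded[key] = value  # constant string
    return expanded

artifact_data = {"deb_fields": {"Package": "python3-foo", "Version": "1.0-1"}}
variables = {"$package": "deb_fields.Package", "$version": "deb_fields.Version"}
name = "{package}_{version}".format(**expand_variables(variables, artifact_data))
# name == "python3-foo_1.0-1"
```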

update-collection-with-data

Adds or replaces a bare collection item based on the current work request.

This is similar to update-collection-with-artifacts, except that it does not refer to artifacts. It can be used in situations where no artifact is available, such as in on_creation events.

  • collection (Single lookup, required): collection to update

  • category (string, required): the category of the item to add

  • name_template (string, optional): template used to generate the name for the collection item. Uses the str.format templating syntax (with variables inside curly braces, referring to keys in data).

  • data (dict, optional): data for the collection item. This may also be used to compute the name for the item, either via substitution into name_template or by rules defined by the collection manager.

Note

When the name_template key is not provided, it is expected that the collection will compute the name for the new bare collection item. Some collection categories might not even allow you to override the name.

retry-with-delays

This action is used in on_failure event reactions. It causes the work request to be retried automatically with various parameters, adding a dependency on a newly-created Delay task.

The current delay scheme is limited and simplistic, but we expect that more complex schemes can be added as variations on the parameters to this action.

  • delays (list, required): a list of delays to apply to each successive retry; each item is an integer suffixed with m for minutes, h for hours, d for days, or w for weeks.

The workflow data model for work requests gains a retry_count field, defaulting to 0 and incrementing on each successive retry. When this action runs, it creates a Delay task with its delay_until field set to the current time plus the item from delays corresponding to the current retry count, adds a dependency from its work request to that, and marks its work request as blocked on that dependency. If the retry count is greater than the number of items in delays, then the action does nothing.
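The delay handling above can be sketched as follows (illustrative helpers only; zero-based indexing of retry_count into delays is an assumption):

```python
from datetime import datetime, timedelta, timezone

UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}

def parse_delay(delay):
    """Parse a delay such as "30m", "2h", "3d" or "1w" into a timedelta."""
    return timedelta(**{UNITS[delay[-1]]: int(delay[:-1])})

def next_delay_until(delays, retry_count, now):
    """delay_until for the Delay task created on this retry, or None once
    the delays are exhausted (the "action does nothing" case)."""
    if retry_count >= len(delays):
        return None
    return now + parse_delay(delays[retry_count])

now = datetime(2025, 1, 1, tzinfo=timezone.utc)
assert next_delay_until(["30m", "2h", "1d"], 1, now) == now + timedelta(hours=2)
assert next_delay_until(["30m"], 1, now) is None
```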

Workflow implementation

On the Python side, a workflow is orchestrated by a subclass of Workflow, which derives from BaseTask and has its own subclass hierarchy.

When instantiating a “Workflow”, a new WorkRequest is created with:

  • task_type set to "workflow"

  • task_name pointing to the Workflow subclass used to orchestrate

  • task_data set to the workflow parameters instantiated from the template (or from the parent workflow)

This WorkRequest acts as the root of the WorkRequest hierarchy for the running workflow.

The Workflow class runs on the server with full database access and is in charge of:

  • on instantiation, laying out an execution plan in the form of a directed acyclic graph of newly created WorkRequest instances.

  • analyzing the results of any completed WorkRequest in the graph

  • possibly extending/modifying the graph after this analysis

WorkRequest elements in a workflow can only depend on each other: they cannot have dependencies on WorkRequest elements outside the workflow, though they may depend on work requests in other sub-workflows that are part of the same root workflow.

All the child work requests start in the blocked status using the deps unblock strategy. When the Workflow WorkRequest is ready to run, all the child WorkRequest elements that don’t have any further dependencies can immediately start.
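The deps unblock strategy can be modelled with a toy readiness check (work requests are reduced to names and statuses here; this is not debusine's implementation):

```python
def ready_to_start(children, dependencies):
    """Which blocked child work requests can start under the "deps"
    unblock strategy: those whose dependencies have all completed."""
    completed = {name for name, status in children.items() if status == "completed"}
    return sorted(
        name
        for name, status in children.items()
        if status == "blocked"
        and all(dep in completed for dep in dependencies.get(name, []))
    )

children = {"build-amd64": "blocked", "build-arm64": "blocked", "tests": "blocked"}
deps = {"tests": ["build-amd64", "build-arm64"]}
# With nothing completed yet, only the two builds can start immediately.
assert ready_to_start(children, deps) == ["build-amd64", "build-arm64"]
```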

WorkflowTemplate

The WorkflowTemplate model has (at least) the following fields:

  • name: a unique name given to the workflow within the workspace

  • workspace: a foreign key to the workspace containing the workflow

  • task_name: a name that refers back to the Workflow class to use to manage the execution of the workflow

  • task_data: JSON dict field representing a subset of the parameters needed by the workflow that cannot be overridden when instantiating the root WorkRequest

The root WorkRequest of the workflow copies the following fields from WorkflowTemplate:

  • workspace

  • task_name

  • task_data, combining the user-supplied data and the WorkflowTemplate-imposed data

Group of work requests

When a workflow generates a large number of related or similar work requests, it may want to hide all those work requests behind a group that appears as a single step in the visual representation of the workflow. This is implemented by a group key in the workflow_data dictionary of each task.

Advanced workflows / sub-workflows

Advanced workflows can be created by combining multiple limited-purpose workflows.

Sub-workflows are integrated in the general graph of their parent workflow as WorkRequests of type workflow.

From a user interface perspective, sub-workflows are typically hidden as a single step in the visual representation of the parent’s workflow.

Cooperation between workflows is defined at the level of workflows. Individual work requests should not concern themselves with this; they are designed to take inputs using lookups and produce output artifacts that are linked to the work request.

Sub-workflow coordination takes place through the workflow’s internal collection (which is shared among all sub-workflows of the same root workflow), providing a mechanism for some work requests to declare that they will provide certain kinds of artifacts which may then be required by work requests in other sub-workflows.

On the providing side, workflows use the update-collection-with-artifacts event reaction to add relevant output artifacts from work requests to the internal collection, and create promises to indicate to other workflows that they have done so. Providing workflows choose item names in the internal collection; it is the responsibility of workflow designers to ensure that these do not clash, and workflows that provide output artifacts have an optional prefix field in their task data to allow multiple instances of the same workflow to cooperate under the same root workflow.

On the requiring side, workflows look up the names of artifacts they require in the internal collection; each of those lookups may return nothing, or a promise including a work request ID, or an artifact that already exists, and they may use that to determine which child work requests they create. They use lookups in their child work requests to refer to items in the internal collection (e.g. internal@collections/name:build-amd64), and add corresponding dependencies on work requests that promise to provide those items.
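The requiring side's handling of a lookup result can be sketched as follows (the result shapes and the promise_work_request_id field name are hypothetical assumptions, not debusine's API):

```python
def dependency_for(lookup_result):
    """Decide how a requiring workflow reacts to an internal-collection
    lookup (sketch). Returns a work request ID to depend on, or None."""
    if lookup_result is None:
        return None  # nothing provides the item (yet)
    if "promise_work_request_id" in lookup_result:
        # A promise: depend on the work request that will provide it.
        return lookup_result["promise_work_request_id"]
    return None  # the artifact already exists; no dependency needed
```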

Sub-workflows may depend on other steps within the root workflow while still being fully populated in advance of being able to run. A workflow that needs more information before being able to populate child work requests should use workflow callbacks to run the workflow orchestrator again when it is ready. (For example, a workflow that creates a source package and then builds it may not know which work requests it needs to create until it has created the source package and can look at its Architecture field.)