OneWorkflow Concepts

Overview

This page explains the core concepts that are specific to OneWorkflow and that you need to understand in order to use it.

User-defined Computational Workflows

The main purpose of OneWorkflow is to enable user-defined computational workflows. What is meant by this term?

A computational workflow consists of the computational steps to be executed to solve a particular computational problem, the input data required for each step, and any ordering constraints between the steps.
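As a minimal illustration of this definition, the sketch below models a workflow as named steps, their input data and their ordering constraints, and derives a valid execution order with Python's standard graphlib module. The step names and file names are hypothetical and are not part of OneWorkflow.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: each maps to the input data it requires.
step_inputs = {
    "mesh": ["geometry.json"],
    "analysis": ["mesh.out", "loads.json"],
    "postprocess": ["analysis.out"],
}

# Ordering constraints: each step maps to the steps that must finish first.
depends_on = {
    "mesh": set(),
    "analysis": {"mesh"},
    "postprocess": {"analysis"},
}

# A topological sort yields an order that respects every constraint.
execution_order = list(TopologicalSorter(depends_on).static_order())
for step in execution_order:
    print(step, "needs", step_inputs[step])
```

Steps with no constraint between them could be run in any order, or in parallel.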

A user-defined computational workflow is then, as its name indicates, a custom computational workflow defined by the end user to solve a particular computational problem or class of problems.

In contrast, a predefined computational workflow is one that is predefined by an application. Predefined computational workflows are often impossible to extend or customize, which makes them cumbersome to integrate into the user's real analysis workflow. The key premise of user-defined computational workflows is that they should be easy to define and to customize to the user's real analysis workflow, that they enable a much higher level of automation and quality control, and that they produce results that are reproducible.

Command

The basic constituent of a computational workflow is a command. A command contains input to a particular application, and when executed from Python will run this application with the given input. Commands can be run individually or composed in order to automate more complex workflows. See the WorkerCommand section for more information.

Flow Model

To perform computational work in an efficient manner, the workflow typically needs to be decomposed into individual chunks of computational work that can be performed independently.

A chunk of work is referred to as a task. Executing these tasks may require some orchestration, both to ensure that tasks are performed in the correct order and to ensure that tasks that can run in parallel do so when computational resources are available. This application-specific logic is referred to as the workflow semantics.

The purpose of the OneWorkflow Flow Model is to enable the client to compose its computational workflow (including its workflow semantics) in a way that allows it to be executed autonomously by a general-purpose computational service. The computational service may reside either locally or in the cloud. This allows the user to choose the execution environment that best fits their needs: local execution if the resources available on the user's desktop are sufficient, cloud execution if more resources are needed.

The figure below shows the composition pattern of workflow primitives of the OneWorkflow Flow Model. Each primitive is explained below.

Figure: Workflow composition

For simple computational work, like running a particular application with given input, direct execution of a WorkerCommand is the recommended approach (see the WorkerCommand section). Service execution should be considered for more demanding computational work (see the Job section below).


Job

A Job is the Flow Model construct that represents an entire complex computational workflow. A Job is executed in batch by a computational service.

Computational work is assigned to a Job by setting its Job.Work property to a Work Item.

This way of decomposing computational work is needed when using a computational service to execute the work. Offloading computational work to a computational service is most useful for work that is long-running or that requires more computational resources than are available on the local computer.
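The relationship between a Job and its work can be pictured with the sketch below. The classes are hypothetical stand-ins written for illustration only. They are not the OneWorkflow classes, and their constructors and attribute names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class WorkUnit:
    """Hypothetical stand-in: an atomic unit of work carrying a message."""
    message: str


@dataclass
class ParallelWork:
    """Hypothetical stand-in: work items that can be processed independently."""
    work_items: List[WorkUnit] = field(default_factory=list)


@dataclass
class Job:
    """Hypothetical stand-in for a Job; `work` mirrors the Job.Work property."""
    name: str
    work: Optional[Union[WorkUnit, ParallelWork]] = None


# Compose the whole workflow first, then assign it to the job in one step.
job = Job(name="load-case-study")
job.work = ParallelWork([WorkUnit(f"run load case {i}") for i in range(3)])
print(len(job.work.work_items))
```

Because the Job holds the complete description of the work, it can be handed to a service and executed without further involvement from the client.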

Work Item

Work Item is an abstract concept that has several specializations, or Work Item types. Each work item type has different execution semantics.

The current OneWorkflow Flow Model allows composite workflows to be modelled using the following Work Item types:

Work Item Type    Description
WorkUnit          An atomic unit of computational work.
ParallelWork      A collection of work items that can be processed independently.

WorkUnit carries an arbitrary message to the Worker. The message is application-specific and should contain sufficient information for the Worker to do its work. A WorkUnit thus acts as an envelope for the message to the Worker. Each WorkUnit can be processed individually, which leaves the compute service free to distribute the processing among the available compute resources.

ParallelWork is a composite Work Item containing other Work Items, typically WorkUnits. The work items of a ParallelWork are processed independently, and in parallel when resources are available. ParallelWork has an optional ReductionTask WorkUnit that is executed after all the regular work items of the ParallelWork have completed.
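The execution semantics described above (independent items first, reduction last) can be sketched with the standard library as follows. This is an illustration under stated assumptions, not the OneWorkflow implementation: the classes are hypothetical, a thread pool stands in for the compute service, and the reduction is modelled as a plain callable rather than as a WorkUnit.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class WorkUnit:
    """Hypothetical envelope around an application-specific message."""
    message: int


@dataclass
class ParallelWork:
    """Hypothetical composite: independent work items plus an optional reduction."""
    work_items: List[WorkUnit] = field(default_factory=list)
    reduction_task: Optional[Callable[[List[int]], int]] = None


def process(unit: WorkUnit) -> int:
    # Stand-in for the worker: square the number carried by the message.
    return unit.message ** 2


def run_parallel(work: ParallelWork) -> object:
    # Work items are independent, so they may run concurrently.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(process, work.work_items))
    # The reduction runs only after every regular work item has completed.
    if work.reduction_task is not None:
        return work.reduction_task(results)
    return results


work = ParallelWork([WorkUnit(n) for n in (1, 2, 3)], reduction_task=sum)
total = run_parallel(work)
print(total)
```

Without a reduction task, the sketch simply returns the individual results in the order of the work items.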

Result

Result wraps the application-specific output data from the Worker. When the Worker finishes processing a Work Unit, it returns a Result.

WorkerCommand

The WorkerCommand is the most basic constituent of a computational workflow. It contains input to a particular application, and can be used in different ways:

  • Direct execution: It can be executed directly using the execute_command primitive. This runs the associated application on the local computer, and is a straightforward way to run a particular application from Python.
  • Service execution: It can be used as part of a more complex workflow to be run by one of the OneWorkflow Computational Services.

When executed by a OneWorkflow computational service, WorkerCommand is the worker message type used by OneWorkflow; it represents an imperative message to the worker to perform a particular task. WorkerCommands are specific to the computational task to be performed. The following general-purpose commands are part of the OneWorkflow command library:

Command                       Description
CompositeExecutableCommand    Command to run a list of commands in sequence or in parallel.
PythonCommand                 Command to run a given Python script.
ExecutableCommand             Command to run a given executable with given arguments.
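To make the idea of a command concrete, the sketch below mimics two of the commands listed above using only the standard library. The class names, fields and execute method are hypothetical and do not reflect the actual OneWorkflow signatures. The composite shown here covers only the sequential case.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class ExecutableCommandSketch:
    """Hypothetical: run a given executable with given arguments."""
    executable: str
    arguments: List[str]

    def execute(self) -> str:
        completed = subprocess.run(
            [self.executable, *self.arguments],
            capture_output=True, text=True, check=True,
        )
        return completed.stdout.strip()


@dataclass
class CompositeCommandSketch:
    """Hypothetical: run a list of commands one after another."""
    commands: List[ExecutableCommandSketch]

    def execute(self) -> List[str]:
        return [command.execute() for command in self.commands]


# The Python interpreter is used as the executable so the sketch runs anywhere.
first = ExecutableCommandSketch(sys.executable, ["-c", "print('meshing')"])
second = ExecutableCommandSketch(sys.executable, ["-c", "print('solving')"])
outputs = CompositeCommandSketch([first, second]).execute()
print(outputs)
```

The command object only describes what to run. Nothing is executed until execute is called, which is what allows the same description to be run directly or handed to a service.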

Worker

The Worker in OneWorkflow receives a WorkerCommand from the client and runs the task specified by it.
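The command-in, result-out contract of a worker can be sketched as below. All names (Command, Result, Worker, register, run) are hypothetical and chosen for illustration. The real Worker is provided by OneWorkflow and its interface is not shown on this page.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Command:
    """Hypothetical worker message: the task name plus its input."""
    task: str
    payload: Dict[str, Any]


@dataclass
class Result:
    """Hypothetical wrapper around application-specific output data."""
    succeeded: bool
    data: Any


class Worker:
    """Hypothetical worker that dispatches a command to a registered handler."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Dict[str, Any]], Any]] = {}

    def register(self, task: str, handler: Callable[[Dict[str, Any]], Any]) -> None:
        self._handlers[task] = handler

    def run(self, command: Command) -> Result:
        handler = self._handlers.get(command.task)
        if handler is None:
            # Unknown tasks are reported through the Result, not by raising.
            return Result(succeeded=False, data=f"unknown task: {command.task}")
        return Result(succeeded=True, data=handler(command.payload))


worker = Worker()
worker.register("scale", lambda p: p["value"] * p["factor"])
result = worker.run(Command("scale", {"value": 2.5, "factor": 4}))
print(result)
```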

OneWorkflow Computational Services

The OneWorkflow Computational services are responsible for orchestrating the execution of Jobs submitted by the client. The following execution environments are supported by OneWorkflow:

  • Local: The computational service runs on the local PC of the user.
  • Cloud: The computational service runs in the cloud.

The same API is supported by both computational services, which enables jobs to be specified independently of the execution environment.
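The benefit of a shared API can be illustrated with the sketch below, in which client code is written once against a common interface and then run against two interchangeable services. The interface, the two service classes and the submit method are hypothetical. A job is reduced to a list of task names to keep the example short.

```python
from typing import List, Protocol


class ComputationalService(Protocol):
    """Hypothetical common interface shared by both execution environments."""

    def submit(self, job: List[str]) -> List[str]: ...


class LocalService:
    def submit(self, job: List[str]) -> List[str]:
        return [f"local:{task}" for task in job]


class CloudService:
    def submit(self, job: List[str]) -> List[str]:
        # A real cloud service would send the job over the network;
        # this stand-in only tags the tasks so the sketch stays runnable.
        return [f"cloud:{task}" for task in job]


def run_job(service: ComputationalService, job: List[str]) -> List[str]:
    # The job is specified once and does not depend on where it runs.
    return service.submit(job)


job = ["mesh", "analysis"]
local_out = run_job(LocalService(), job)
cloud_out = run_job(CloudService(), job)
print(local_out, cloud_out)
```

Switching execution environment then amounts to passing a different service object. The job definition itself is unchanged.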