
Process types

balis edited this page Dec 11, 2013 · 12 revisions

Dataflow

This is the default process behavior: (i) wait for all data input signals; (ii) invoke the task function, passing the inputs; (iii) emit all output signals (which are expected to be returned by the function).

Inputs:

  • Zero or more data inputs
  • If there are zero data inputs, the firingInterval attribute should be defined; it specifies the interval (in milliseconds) at which the function of the process is invoked.

Outputs:

  • At least one data output

Function:

  • ins and outs arrays contain, respectively, all data inputs and information on all data outputs (ins[0] contains in1, outs[0] contains out1, etc.).
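The firing steps above can be sketched as follows. This is only an illustration: the (ins, outs, cb) signature, the data field on each signal, and the fireDataflow helper are assumptions made for the example, not something this page defines.

```javascript
// Hypothetical task function for a dataflow process with two data inputs
// and one data output. The signature and the `data` field are assumptions.
function sum(ins, outs, cb) {
  // ins[0] holds in1, ins[1] holds in2; outs[0] describes out1.
  outs[0].data = ins[0].data + ins[1].data;
  cb(null, outs); // return all output signals
}

// Toy stand-in for the engine: called once every data input has a signal.
function fireDataflow(fn, ins, outs) {
  let emitted = null;
  fn(ins, outs, (err, result) => {
    if (err) throw err;
    emitted = result;
  });
  return emitted;
}

const dataflowResult = fireDataflow(
  sum,
  [{ name: "in1", data: 2 }, { name: "in2", data: 3 }],
  [{ name: "out1" }]
);
console.log(dataflowResult[0].data); // 5
```

Note that the function fills in every element of outs, since a dataflow process emits all of its outputs on each firing.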

Foreach (serial/parallel)

A foreach process waits for a signal on any of its data inputs, invokes its function passing this signal, and emits the corresponding output signal. In the serial version of foreach, input signals are processed synchronously and in order, while in the parallel version they are processed asynchronously, with no ordering guarantee.

Inputs:

  • At least one data input
  • Data inputs must be assigned to the first N ports

Outputs:

  • Exactly the same number of data outputs as data inputs
  • Data outputs must be assigned to the first N ports

Function:

  • ins and outs arrays contain exactly one element: the currently processed data input and corresponding data output.
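A minimal sketch of the per-signal pairing described above, assuming a hypothetical (ins, outs, cb) function signature and a toy fireForeach driver. It shows only how the i-th input maps to the i-th output; it does not model the serial or parallel scheduling.

```javascript
// Hypothetical task function: ins and outs each hold exactly one element.
function square(ins, outs, cb) {
  outs[0].data = ins[0].data * ins[0].data;
  cb(null, outs);
}

// Toy driver (an assumption): a signal arriving on input port `portIndex`
// produces a signal on the output port with the same index.
function fireForeach(fn, portIndex, signal, outputPorts) {
  const out = { name: outputPorts[portIndex] };
  let emitted = null;
  fn([signal], [out], (err, result) => {
    if (err) throw err;
    emitted = result[0];
  });
  return emitted;
}

const foreachOutputs = ["out1", "out2"];
// A signal arrives on the second input, so the second output is emitted.
const fromPort2 = fireForeach(square, 1, { name: "in2", data: 4 }, foreachOutputs);
console.log(fromPort2.name, fromPort2.data); // out2 16
```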

Splitter

A splitter process consumes a single data input and emits a sequence of data outputs. The function associated with the splitter must behave in a special way: successive invocations should return consecutive data outputs, and when there are no more data outputs left, the function should return null. A typical application of this pattern is splitting data into chunks. Note that a regular dataflow process can also be used for this, but it would have to generate all the chunks in one step and produce an array of signals representing them. A splitter is particularly useful in combination with the NEXT control input port, which makes it possible to control the pace at which the chunks are produced.

Inputs:

  • Exactly ONE data port (data to be split); it MUST be the first input port
  • Must have control NEXT port (triggers emission of the next chunk)
  • May have control DONE port (commands the process to finish execution immediately after processing of the current chunk is finished)

Outputs:

  • Must have exactly ONE data port (emits consecutive chunks of input); it must be the first output port
  • May have control NEXT port (emitted after a chunk is emitted to the data port)
  • Must have control DONE port (emitted when there are no more chunks)

Function:

  • ins and outs always contain the single input and the single output data element of the task.
  • Each invocation of f(x) should return the next chunk of data x, or null if there are no more chunks.
  • The process does not specify how to split the data - it's baked into the function (e.g. split file into lines, collection into items, etc.)
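The "next chunk or null" contract can be sketched as below. Everything beyond that contract is an assumption made for illustration: the (ins, outs, cb) signature, keeping the position in a closure, signalling the end with cb(null, null), and the drainSplitter driver, which stands in for a series of NEXT signals.

```javascript
// Hypothetical factory: returns a splitter function that yields one line of
// the input text per invocation. State is kept in the closure (an assumption).
function makeLineSplitter() {
  let lines = null; // filled on the first invocation
  let next = 0;     // index of the next chunk to return
  return function splitLines(ins, outs, cb) {
    if (lines === null) lines = ins[0].data.split("\n");
    if (next >= lines.length) return cb(null, null); // no more chunks
    outs[0].data = lines[next++];
    cb(null, outs);
  };
}

// Toy driver: invokes the function repeatedly until it returns null.
function drainSplitter(fn, input) {
  const chunks = [];
  let done = false;
  while (!done) {
    fn([input], [{ name: "chunk" }], (err, result) => {
      if (err) throw err;
      if (result === null) done = true;
      else chunks.push(result[0].data);
    });
  }
  return chunks;
}

const splitterChunks = drainSplitter(makeLineSplitter(), { name: "text", data: "a\nb\nc" });
console.log(splitterChunks); // [ 'a', 'b', 'c' ]
```

Splitting into lines is just one choice; a function that slices a collection into items would follow the same shape.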

Choice

A choice process behaves similarly to dataflow, but in each firing it may emit only some, or none, of its output signals. To this end, the function of the choice task must explicitly set a flag denoting which of the outputs should be emitted. This behavior is useful for patterns such as conditional execution, data filtering, or data routing.

Inputs:

  • At least one data input

Outputs:

  • At least one data output

Function:

  • ins and outs arrays contain, respectively, all data inputs and information on all data outputs (ins[0] contains in1, outs[0] contains out1, etc.).
  • The function should set flag "condition": "true" in all elements of the outs array that should be emitted.
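As a sketch of the routing pattern, the function below marks exactly one of two outputs for emission using the "condition": "true" flag described above. The (ins, outs, cb) signature, the data field, and the fireChoice driver are assumptions made for the example.

```javascript
// Hypothetical task function: routes even numbers to outs[0], odd to outs[1].
function routeByParity(ins, outs, cb) {
  const value = ins[0].data;
  const target = value % 2 === 0 ? outs[0] : outs[1];
  target.data = value;
  target.condition = "true"; // only flagged outputs should be emitted
  cb(null, outs);
}

// Toy driver (an assumption): keeps only the outputs flagged by the function.
function fireChoice(fn, ins, outs) {
  let emitted = [];
  fn(ins, outs, (err, result) => {
    if (err) throw err;
    emitted = result.filter((o) => o.condition === "true");
  });
  return emitted;
}

const choiceEmitted = fireChoice(
  routeByParity,
  [{ name: "in1", data: 7 }],
  [{ name: "even" }, { name: "odd" }]
);
console.log(choiceEmitted.map((o) => o.name)); // [ 'odd' ]
```

Setting the flag on no element at all would correspond to a firing that emits nothing, which is how a filter can drop an input.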