A C++ Workflow Management System for e-science

In both scientific and business environments, considerable effort has gone into proposing workflow description languages based on the Petri Net formalism. The Grid Workflow Description Language (GWorkflowDL), introduced by the Fraunhofer FIRST research group, is one of the most interesting Petri Net-based workflow languages used in scientific environments. It is an XML-based language that uses the High-Level Petri Net (HLPN) modeling formalism. In the business environment, Yet Another Workflow Language (YAWL) extends the Petri Net formalism with several features in order to overcome the limitations of Petri Nets in describing business processes. Unfortunately, these languages are in practice not interoperable.

GWorkflowDL

The GWorkflowDL consists of two parts:

a generic part, used to define the structure of the workflow, reflecting the data and control flow in the application,
a middleware-specific part (extensions) that defines how the workflow should be executed in the context of a specific Grid computing middleware.

Fig. 1

The generic part of the workflow can be represented in the GWorkflowDL language as:

<workflow>
   <place ID="p1">
      <token>
         <data><t1 xsd:type="xs:int">3</t1></data>
      </token>
   </place>
   <place ID="p2">
      <token>
         <data><t2 xsd:type="xs:int">2</t2></data>
      </token>
   </place>
   <place ID="q0"/>
   <transition ID="sum">
      <inputPlace placeID="p1" edgeExpression="a1"/>
      <inputPlace placeID="p2" edgeExpression="a2"/>
      <outputPlace placeID="q0" edgeExpression="b"/>
      <operation/> <!-- generic operation -->
   </transition>
</workflow>

The workflow above carries no information about the operation associated with its Petri Net transition. That information is provided by the concrete part of the description. A concrete operation could be the invocation of a Web Service, the remote execution of a program, or the invocation of a local routine.
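To make the structure of the generic part more tangible, the following minimal Python sketch loads the generic workflow and extracts its places, tokens, and transitions. It is only an illustration, not the parser used by the WfMS: the function name `summarize` is hypothetical, and the `xmlns:xsd` declaration on the root element is an assumption added so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# Copy of the generic workflow shown above. The xmlns:xsd declaration on the
# root element is an assumption, added only so the fragment is parseable.
GENERIC_WORKFLOW = """\
<workflow xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <place ID="p1">
      <token>
         <data><t1 xsd:type="xs:int">3</t1></data>
      </token>
   </place>
   <place ID="p2">
      <token>
         <data><t2 xsd:type="xs:int">2</t2></data>
      </token>
   </place>
   <place ID="q0"/>
   <transition ID="sum">
      <inputPlace placeID="p1" edgeExpression="a1"/>
      <inputPlace placeID="p2" edgeExpression="a2"/>
      <outputPlace placeID="q0" edgeExpression="b"/>
      <operation/> <!-- generic operation -->
   </transition>
</workflow>
"""


def summarize(xml_text):
    """Return (marking, transitions) for a GWorkflowDL-like document."""
    root = ET.fromstring(xml_text)

    # Marking: place ID -> list of token values, taken from the text of the
    # first child element of each <data> element.
    marking = {}
    for place in root.findall("place"):
        marking[place.get("ID")] = [
            data[0].text for data in place.findall("token/data")
        ]

    # Transitions: place ID -> edge expression for every input/output edge.
    transitions = {}
    for tr in root.findall("transition"):
        op = tr.find("operation")
        transitions[tr.get("ID")] = {
            "inputs": {e.get("placeID"): e.get("edgeExpression")
                       for e in tr.findall("inputPlace")},
            "outputs": {e.get("placeID"): e.get("edgeExpression")
                        for e in tr.findall("outputPlace")},
            # An <operation> element without children is treated as abstract.
            "abstract": op is not None and len(op) == 0,
        }
    return marking, transitions


if __name__ == "__main__":
    marking, transitions = summarize(GENERIC_WORKFLOW)
    print(marking)
    print(transitions)
```

Token values are returned as strings here; a real engine would presumably convert them according to the declared type.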

The enactment process, performed by the WfMS, is responsible for mapping the abstract operations associated with Petri Net transitions to concrete operations. For example, the addition represented in the workflow above can be mapped to a Web Service invocation, as described in the following concrete workflow:

<workflow>
   <place ID="p1">
      <token>
         <data><t1 xsd:type="xs:int">3</t1></data>
      </token>
   </place>
   <place ID="p2">
      <token>
         <data><t2 xsd:type="xs:int">2</t2></data>
      </token>
   </place>
   <place ID="q0"/>
   <transition ID="sum">
      <inputPlace placeID="p1" edgeExpression="a1"/>
      <inputPlace placeID="p2" edgeExpression="a2"/>
      <outputPlace placeID="q0" edgeExpression="b"/>
      <op:operation xmlns:op="http://www.gridworkflow.org/gworkflowdl/operation">
         <op:wsOperation wsdl="http://localhost/plus?wsdl" operationName="plus">
            <op:in>n1</op:in>
            <op:in>n2</op:in>
            <op:out>q1</op:out>
         </op:wsOperation>
      </op:operation>
   </transition>
</workflow>

Alternatively, the same operation can be mapped to a local method invocation, as in the following concrete workflow:

<workflow>
   <place ID="p1">
      <token>
         <data><t1 xsd:type="xs:int">3</t1></data>
      </token>
   </place>
   <place ID="p2">
      <token>
         <data><t2 xsd:type="xs:int">2</t2></data>
      </token>
   </place>
   <place ID="q0"/>
   <transition ID="sum">
      <inputPlace placeID="p1" edgeExpression="a1"/>
      <inputPlace placeID="p2" edgeExpression="a2"/>
      <outputPlace placeID="q0" edgeExpression="b"/>
      <op:operation xmlns:op="http://www.gridworkflow.org/gworkflowdl/operation">
         <op:pyOperation operation="b = a1 + a2" />
      </op:operation>
   </transition>
</workflow>

A Petri Net-based engine can then use a concrete workflow written in GWorkflowDL to execute the workflow.
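The firing rule that such an engine applies can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual engine: the names `SUM`, `enabled`, and `fire` are hypothetical, the workflow is written directly as a dictionary rather than parsed from XML, and evaluating the operation string with `exec` is only a stand-in for whatever mechanism the engine really uses.

```python
# Hypothetical in-memory form of the "sum" transition of the concrete workflow.
SUM = {
    "inputs": {"p1": "a1", "p2": "a2"},   # place ID -> edge expression
    "outputs": {"q0": "b"},
    "operation": "b = a1 + a2",           # text of the pyOperation attribute
}


def enabled(marking, transition):
    """A transition is enabled when every input place holds a token."""
    return all(marking[place] for place in transition["inputs"])


def fire(marking, transition):
    """Fire the transition once, updating the marking in place."""
    if not enabled(marking, transition):
        raise RuntimeError("transition is not enabled")

    # Consume one token per input place and bind it to the variable named
    # by the edge expression.
    env = {var: marking[place].pop(0)
           for place, var in transition["inputs"].items()}

    # Evaluate the operation. exec is used for illustration only; it must
    # not be applied to untrusted workflow descriptions.
    exec(transition["operation"], {"__builtins__": {}}, env)

    # Produce one token per output place from the bound result variables.
    for place, var in transition["outputs"].items():
        marking[place].append(env[var])


if __name__ == "__main__":
    marking = {"p1": [3], "p2": [2], "q0": []}
    fire(marking, SUM)
    print(marking)
```

After firing, the tokens in p1 and p2 have been consumed and q0 holds a single token carrying their sum.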

Language Converters

As stated in the previous section, several workflow languages have been defined in the scientific environment. Converters have been provided in order to achieve interoperability among these languages. It is not always possible to translate a workflow description into a different formalism, but where it is possible, the translation can be done automatically by means of parsers and compilers. As a proof of concept, we present a case study on the automatic conversion of DAG-based workflows to the Petri Net-based Grid Workflow Description Language.

JDL to GWorkflowDL converter

The Job Description Language (JDL) describes a job to be executed on the Grid. The JDL adopted within the gLite middleware is based on Condor's Classified Advertisements (ClassAds): a record-like structure composed of a finite number of attributes separated by semicolons (;). In gLite, JDL is used to describe both simple jobs and simple DAG-based workflows. The following example shows a JDL workflow:

[
  Type = "dag";
  VirtualOrganisation = "infngrid";
  Rank = -other.GlueCEStateEstimatedResponseTime;
  //Requirements = (other.GlueCEInfoHostName == "ce06-lcg.cr.cnaf.infn.it");
  Requirements = other.GlueCEStateStatus == "Production";
  Nodes = [
    nodeA = [
      Description = [
        JobType = "normal";
        Executable = "/usr/bin/wc";
        Arguments = "-w rfc2616.txt";
        StdOutput = "file1.txt";
        InputSandbox = {"rfc2616.txt"};
        OutputSandbox = {"file1.txt"};
      ];
    ];
    nodeB = [
      Description = [
        JobType = "normal";
        Executable = "/bin/cat";
        Arguments = "rfc0821.txt";
        StdOutput = "file2.txt";
        InputSandbox = {"rfc0821.txt"};
        OutputSandbox = {"file2.txt"};
      ];
    ];
    nodeC = [
      Description = [
        JobType = "normal";
        Executable = "/usr/bin/wc";
        Arguments = "-w file2.txt";
        StdOutput = "file3.txt";
        InputSandbox = {root.Nodes.nodeB.OutputSandbox[0]};
        OutputSandbox = {"file3.txt"};
      ];
    ];
    nodeD = [
      Description = [
        JobType = "normal";
        Executable = "/usr/bin/diff";
        Arguments = "-s file1.txt file3.txt";
        StdOutput = "result.txt";
        InputSandbox = {root.Nodes.nodeA.OutputSandbox[0], root.Nodes.nodeC.OutputSandbox[0]};
        OutputSandbox = {"result.txt"};
      ];
    ];
  Dependencies = { {nodeB, nodeC}, {nodeC, nodeD}, {nodeA, nodeD} };
  ];
]

The JDL defines four nodes (nodeA, nodeB, nodeC, nodeD), each of which represents a job. The dependencies among the nodes are defined in the Dependencies attribute. The graphical representation of the workflow is shown in the following figure.

Fig. 2

The DAG in Fig. 2 can be translated into a Petri Net by applying the following rules:

a DAG node is represented by a Petri Net transition,
the flow of data among DAG nodes is modeled using data tokens.

The result of this conversion is depicted in the figure below: each node of the DAG has been converted into a Petri Net transition, and places have been introduced to manage the flow of data between transitions.

Fig. 3
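The two conversion rules can be sketched in Python as follows. This is a minimal illustration and not the converter itself: the function name `dag_to_petri_net`, the place naming scheme, and the extra start and end places are assumptions of this sketch and may differ from the net shown in Fig. 3.

```python
def dag_to_petri_net(nodes, dependencies):
    """Translate a DAG into a Petri Net structure.

    Every node becomes a transition. Every dependency (parent, child)
    becomes a place that carries data tokens from parent to child.
    """
    transitions = {n: {"inputs": [], "outputs": []} for n in nodes}
    places = []

    # One place per dependency edge.
    for parent, child in dependencies:
        place = f"{parent}_to_{child}"
        places.append(place)
        transitions[parent]["outputs"].append(place)
        transitions[child]["inputs"].append(place)

    # Assumption of this sketch: nodes without parents get a start place,
    # and nodes without children get an end place, so that every
    # transition has at least one input and one output.
    for n in nodes:
        if not transitions[n]["inputs"]:
            place = f"start_{n}"
            places.append(place)
            transitions[n]["inputs"].append(place)
        if not transitions[n]["outputs"]:
            place = f"end_{n}"
            places.append(place)
            transitions[n]["outputs"].append(place)

    return places, transitions


# Nodes and dependencies taken from the JDL example above.
NODES = ["nodeA", "nodeB", "nodeC", "nodeD"]
DEPENDENCIES = [("nodeB", "nodeC"), ("nodeC", "nodeD"), ("nodeA", "nodeD")]

if __name__ == "__main__":
    places, transitions = dag_to_petri_net(NODES, DEPENDENCIES)
    for name, edges in transitions.items():
        print(name, edges)
```

With the dependencies of the JDL example, nodeD ends up with two input places, one fed by nodeA and one by nodeC, so it becomes enabled only after both predecessors have produced their output.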

SCUFL to GWorkflowDL converter

Scufl is an XML-based language for modeling DAG-based workflows. It is the description language used by the Taverna WfMS, and in particular by the FreeFluo engine. Unlike JDL, Scufl provides several control structures that make the language more expressive. Since both Scufl and GWorkflowDL are XML-based, the conversion can be performed with an Extensible Stylesheet Language Transformations (XSLT) stylesheet. We do not go into the details of the translation process because the converter is still under development. However, we are currently able to translate simple Scufl workflows that interact with arbitrary Web Services, such as the one depicted in Fig. 4.

The workflow is an example of conditional execution: the input value of the source node condition determines whether the left branch (when the value is false) or the right branch (otherwise) is executed. The left branch performs a temperature conversion from Fahrenheit to Celsius using a Web Service; the right branch performs the inverse operation. The result of the conversion is stored in the corresponding sink node, tempF or tempC.

Fig. 4

Fig. 4 shows how the workflow can be converted into a Petri Net-based description that keeps the same semantics. However, the majority of Scufl workflows use local bindings to the Java platform, which makes the conversion process even more difficult.
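The conditional execution can be expressed in a Petri Net as two transitions that compete for the same input tokens, with mutually exclusive guards on the condition token. The Python sketch below illustrates this idea only; it is not the output of the converter. The place names (`condition`, `value`, `left_sink`, `right_sink`), the guard representation, and the use of local conversion formulas in place of the Web Service calls are all assumptions.

```python
def fahrenheit_to_celsius(f):
    return (f - 32.0) * 5.0 / 9.0


def celsius_to_fahrenheit(c):
    return c * 9.0 / 5.0 + 32.0


# Two transitions share the same input places. Their guards are mutually
# exclusive, so exactly one of them fires for a given condition token.
# Local functions stand in for the Web Service invocations.
TRANSITIONS = [
    {"name": "left", "guard": lambda cond: not cond,
     "operation": fahrenheit_to_celsius, "output": "left_sink"},
    {"name": "right", "guard": lambda cond: cond,
     "operation": celsius_to_fahrenheit, "output": "right_sink"},
]


def step(marking):
    """Fire the transition whose guard accepts the condition token.

    Returns the name of the transition that fired.
    """
    cond = marking["condition"].pop(0)
    value = marking["value"].pop(0)
    for t in TRANSITIONS:
        if t["guard"](cond):
            marking[t["output"]].append(t["operation"](value))
            return t["name"]


if __name__ == "__main__":
    marking = {"condition": [False], "value": [212.0],
               "left_sink": [], "right_sink": []}
    print(step(marking), marking)
```

A false condition token routes the value through the left branch (Fahrenheit to Celsius), and a true one through the right branch, matching the behaviour described for the Scufl workflow.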
