Introduction

The main goals of the CppWfMS project is the development of a Workflow Management System (WfMS) for the e-scienze with the following properties:

  • Mainly written in C++;
  • Based on Petri Nets formalisms;
  • Running on the top of the EGEE/gLite Grid middleware.
  • The C++ language has been chosen because of its strong type checking, expressivity and availability of libraries. The modeling formalism used to describe workflows is based on Petri Nets. Petri Nets have been chosen because of their formal semantics and the availability of several analysis tools. The CppWfMS has been tested on the top of the EGEE/gLite Grid middleware beacuse of its large adoption and relatively mature infrastucture.

    Petri Nets Formalism

    The Petri Nets formalism is a mathematical representations (modeling language) of discrete distributed systems introduced in the 1962 by Carl Adam Petri in his Ph.D thesis. It graphically depicts the structure of a distributed system as a directed bipartite graph. The execution of Petri nets is non-deterministic (well suited for modeling the concurrent behavior of distributed systems).

    Formal Definition

    A Place/Transition net is a structure N = (P, T , Pre, Post) where:

  • P is a set of places represented by circles, |P| = m;
  • T is a set of transitions represented by bars, |T| = n;
  • Pre : P × T → N is the pre­incidence function that specifies the arcs directed from places to transitions;
  • Post : T × P → N is the post­incidence function that specifies the arcs directed from transitions to places.
  • A Petri Net can be represented graphically as depicted in the following figure (where p1, p2 and q0 are places, sum is a transition; place p1 and p2 contains tokens):

    Fig. 1
    A marking is a function M : P → N that associates to each place a non negative number of tokens. The initial marking is also called M0. In the previous example, the initial marking is, M0 = [110].

    Dynamic Behavior

    The evolution of the state of a Petri Net is the conseguence of transition firing. A transition is enabled if its each input place contains at least as many tokens as the weight of the arc indicates. In Figure 2 the transition sum is enabled by means of the tokens placed in places p1 and p2.

    Fig 2.Fig. 3
    An enabled transition can fire in a non-deterministic way between time 0 and infinity. In Figure 3 is depicted the result of the firing of the sum transition; firing transforms the state: (i) subtracting the input arc weights from the token counts of the input places; (ii) adding the output arc weights to the token counts of the output places.

    Petri Nets can be used in order to model processes behaviour such as: sequence, choice and concurrency. The state and the evolution of the net can be modeled in a formal by means of the state equation, reachability (coverability) graph. The behavioral properties depends on the structure of the net and the initial marking: reachability, boundedness and safeness, liveness.

    High Level Petri Nets (HLPN) and Workflows

    Petri Nets formalism is very powerful, but in its first definition is not suitable for describing complex processes. For this reasons several extensions have been provided to the original model. For example, the Colored Petri Net is an extension which outcomes the indistinguibility of tokens of the original definition introducing the concept of type. Other extensions introduce the concept of time and hierarchy. The Petri Nets model extended with color, time and hierarchy is usually addressed with the term High Level Petri Nets (HLPN).

    Fig. 4

    Alike other formalisms (such as Directed Acyclic Graphs and Pi-Calculus) HLPN capture both the data and the control flow of a workflow. Recent studies have demostrated that the modling capabilities of HLPN outperforms other formalisms tanks to the following properties:

  • the formal semantics despite the graphical nature,
  • state-based structure (as opposed to the event-based one),
  • the availability of many analysis techniques.

  • gLite Grid Middleware

    gLite is the next generation middleware for Grid computing. Born from the collaborative efforts from academic and industrial research centers as part of the EGEE Project. The gLite Grid services follow a Service Oriented Architecture which facilitate interoperability among Grid services and allow easier compliance with upcoming standards. The architecture is not bound to specific implementations; services are expected to work together and they can be deployed and used independently. The gLite service decomposition has been largely influenced by the work performed in the LCG project.

    The main component in the gLite middleware is the Workload Management System (WMS) which takes care of finding the best available resources considering a set of users requirements and preferences (such as CPU architecture, OS, current load).

    Workflow Management System Architecture Overview »