
Friday, 28 November 2008

How to use NGinn rules engine

1. Required libraries

To use the NGinn.RippleBoo engine in your application, you need to add references to the following libraries:

  • NGinn.RippleBoo

  • Rhino.DSL.dll

  • Boo.Lang.dll

  • Boo.Lang.Compiler.dll

  • NLog.dll



2. Invoking RippleBoo

The code below shows how to configure a rule repository and how to execute some rules.

using System;
using System.Collections.Generic;
using NGinn.RippleBoo;

class TestMe
{
    private RuleRepository _repos;

    public TestMe()
    {
        // Create the repository once and reuse it - it caches compiled rule scripts.
        _repos = new RuleRepository();
        _repos.BaseDirectory = "c:\\rules";
        _repos.ImportNamespaces.Add("System");
    }

    public void RunSomeRules()
    {
        // Available in rules through the 'Variables' object.
        Dictionary<string, object> variables = new Dictionary<string, object>();
        variables["Email"] = "my@email.com";
        variables["Timestamp"] = DateTime.Now;

        // Available in rules through the 'Context' object.
        Dictionary<string, object> context = new Dictionary<string, object>();
        context["Output"] = Console.Out; // standard output writer (System.Console has no 'Output' property)

        _repos.EvaluateRules("some_rules.boo", variables, context);
    }
}


The RuleRepository class stores common configuration properties for your rules and lets you call rules stored as '*.boo' files in the base directory. It compiles the rule scripts and caches them, so subsequent evaluations are fast. If a rule script changes, it is recompiled automatically. You should create the rule repository once and hold on to it for as long as needed.
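
The compile-once, recompile-on-change behaviour can be pictured with a small sketch. It is written in Python for brevity and is not RippleBoo's actual implementation: the ScriptCache class, the compile_fn callback and the use of the file's modification time to detect changes are all assumptions made for illustration.

```python
import os
from typing import Any, Callable, Dict, Tuple


class ScriptCache:
    """Caches compiled scripts and recompiles one when its file changes.

    Hypothetical illustration; detecting changes via mtime is an assumption.
    """

    def __init__(self, base_directory: str, compile_fn: Callable[[str], Any]):
        self.base_directory = base_directory
        self.compile_fn = compile_fn  # turns source text into a "compiled" object
        self._cache: Dict[str, Tuple[int, Any]] = {}  # file name -> (mtime_ns, compiled)

    def get(self, file_name: str) -> Any:
        path = os.path.join(self.base_directory, file_name)
        mtime = os.stat(path).st_mtime_ns
        cached = self._cache.get(file_name)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # file unchanged since the last compile: reuse it
        with open(path, encoding="utf-8") as f:
            compiled = self.compile_fn(f.read())
        self._cache[file_name] = (mtime, compiled)
        return compiled
```

Holding one such object for the lifetime of the application is what makes the cache pay off, which is the reason for creating the repository only once.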

Rule evaluation is done in the 'RunSomeRules' method. To execute rules you call RuleRepository.EvaluateRules, passing the rule file name and two dictionaries.
The first one contains variables that rules can reference through the 'Variables' object. The second one contains context objects, which rules can reference through the 'Context' object.

Example rule:
ruleset "SomeRules":
    rule "R1":
        when Variables.Email.EndsWith("mydomain.com")
        action:
            Context.Output.WriteLine("Email from my domain")



This rule references the 'Email' variable and the 'Output' context object. Please note that you don't have to quote variable names in rules - that's thanks to the Boo language's magic IQuackFu interface.
The RuleRepository.EvaluateRules method is thread safe.
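
The idea behind unquoted names such as Variables.Email is that a member lookup which the object cannot resolve itself gets forwarded to the dictionary. Here is a rough Python analogue of that idea. It is not Boo's IQuackFu and not RippleBoo code; the QuackDict class is a hypothetical name used only for this sketch.

```python
from typing import Any, Dict


class QuackDict:
    """Exposes dictionary entries as attributes, e.g. variables.Email."""

    def __init__(self, values: Dict[str, Any]):
        self._values = values

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails, so '_values'
        # itself is still found the usual way.
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None


variables = QuackDict({"Email": "my@email.com"})
print(variables.Email.endswith("email.com"))  # prints True
```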

Thursday, 27 November 2008

Rules engine improved

First attempts to use the RippleBoo rules engine in real software showed that version 0.1 wasn't very useful, so I had to prepare version 0.2.
First of all, the structure of rule definition was changed - now it's more descriptive:



rule "SPAM":
    label "Spam? - move to spam"
    when IS_SPAM()
    except_rule "Friendly_spam"
    action:
        MOVE_TO "Spam"
    else_rule "WORK"



What we have here:

  • declaration of rule "SPAM"
  • label - for documentation
  • when - the rule condition
  • except_rule - the 'exception' rule, containing an exception to the rule condition. Our 'SPAM' rule will fire when its condition is true and the exception does not fire
  • action - executed when the rule fires
  • else_rule - the successor rule, evaluated when the rule condition is not satisfied


You should read it like so: when IS_SPAM() returns true, move the message to the 'Spam' folder, except for messages where the "Friendly_spam" rule applies. If IS_SPAM() returns false, do nothing but proceed to rule "WORK".
That's basically how Ripple Down Rules work. You should note that only one rule will fire - the one whose condition is satisfied and to which no exceptions apply. This can be a problem when you want to execute some code every time a rule condition is satisfied, whether or not there are exceptions. In that case you can either put your code in the rule condition, or use the special 'side_effect' block:



rule "VERY_IMPORTANT":
    label "Important? - mark high priority"
    when __msg.From == "customer_care@mybank.com"
    side_effect:
        __msg.Priority = "High"



The 'side_effect' block is executed right after the rule condition evaluates to true but BEFORE the exception rules are checked. In contrast, the 'action' block is executed only when the rule condition evaluates to true AND no exceptions apply (that is, no rule fires while evaluating the exception subtree).
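
To make the evaluation order concrete, here is a minimal sketch of these semantics in Python. It is not the RippleBoo implementation: the Rule dataclass, the evaluate function and the dictionary-based message are hypothetical stand-ins, and treating a rule that has no 'action' block as "fired" is an assumption of the sketch. The rules mirror the email rules used in this post.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Message = Dict[str, str]


@dataclass
class Rule:
    when: Callable[[Message], bool]
    action: Optional[Callable[[Message], None]] = None
    side_effect: Optional[Callable[[Message], None]] = None
    except_rule: Optional[str] = None
    else_rule: Optional[str] = None


def evaluate(rules: Dict[str, Rule], name: Optional[str], msg: Message) -> Optional[str]:
    """Returns the name of the rule that fired, or None if no rule fired."""
    while name is not None:
        rule = rules[name]
        if not rule.when(msg):
            name = rule.else_rule  # condition false: move on to the successor
            continue
        if rule.side_effect:
            rule.side_effect(msg)  # runs before the exception rules are checked
        fired = evaluate(rules, rule.except_rule, msg)
        if fired is not None:
            return fired  # an exception fired, so this rule's action is skipped
        if rule.action:
            rule.action(msg)
        return name
    return None


def move_to(folder: str) -> Callable[[Message], None]:
    def act(msg: Message) -> None:
        msg["Folder"] = folder
    return act


def mark_high(msg: Message) -> None:
    msg["Priority"] = "High"


RULES = {
    "SPAM": Rule(when=lambda m: "[--spam--]" in m["Subject"],
                 action=move_to("Spam"),
                 except_rule="Friendly_spam", else_rule="WORK"),
    "Friendly_spam": Rule(when=lambda m: "enlarge" in m["Subject"],
                          action=move_to("Useful_spam")),
    "WORK": Rule(when=lambda m: m["From"].endswith("mycompany.com"),
                 action=move_to("Work"), else_rule="VERY_IMPORTANT"),
    "VERY_IMPORTANT": Rule(when=lambda m: m["From"] == "customer_care@mybank.com",
                           side_effect=mark_high),
}

msg = {"Subject": "[--spam--] enlarge your inbox", "From": "someone@example.com"}
print(evaluate(RULES, "SPAM", msg), msg["Folder"])  # prints: Friendly_spam Useful_spam
```

Only one action runs per evaluation, while a 'side_effect' runs for every rule whose condition was true along the way.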


Here's an example rule definition file containing simple email message processing rules. The NGinn.RippleBoo engine also lets you declare your own 'local' variables and helper functions that can be used in rules:

#variable alias
__msg = Variables.Message

#helper function - check if message is spam
IS_SPAM = def() :
    return __msg.Subject.IndexOf("[--spam--]") >= 0

#helper - move message to specified folder
MOVE_TO = def(folder):
    Context.MessageDb.MoveMessage(__msg, folder)


ruleset "Email_default_rules":

    rule "SPAM":
        label "Spam? - move to spam"
        when IS_SPAM()
        except_rule "Friendly_spam"
        action:
            MOVE_TO "Spam"
        else_rule "WORK"

    rule "Friendly_spam":
        label "Interesting subject? - read!"
        when __msg.Subject.IndexOf("enlarge") >= 0
        action:
            MOVE_TO "Useful_spam"

    rule "WORK":
        label "Work? - move to WORK"
        when __msg.From.EndsWith("mycompany.com")
        action:
            MOVE_TO "Work"
        else_rule "VERY_IMPORTANT"


    rule "VERY_IMPORTANT":
        label "Important? - mark high priority"
        when __msg.From == "customer_care@mybank.com"
        side_effect:
            __msg.Priority = "High"




And here's a graphical representation of the ruleset defined above.



The picture is automatically generated from rule definition, using the GraphViz tool (useful, but very user-unfriendly, unix-style program).

Other features


Importantly, we can define several rulesets in a single file. The first ruleset is the default one, but RippleBoo also allows you to call the other rulesets.
You can call other rulesets from your actions by executing
goto_ruleset "another ruleset"

Think of secondary rulesets as sub-procedures that can be called from the main procedure.

There is also an option to execute rules from an external file:
goto_file "another_rules.boo"

Remember, you call goto_ruleset or goto_file from an action block inside some rule. Only one action will be executed, so you don't need to worry about continuation after the goto - because there will be no continuation. Simply put, there is no return from goto_ruleset or goto_file.



OK, I'll shed some light on using RippleBoo in your programs in the next posts, because right now I'm getting sick of code formatting in this blog engine. Does anyone know why it sucks so much and what I can do so it stops messing with my html?

Friday, 24 October 2008

Rules engine for NGinn

Today I have added a first working version of a rules engine to NGinn. The source code is in 'NGinn.RippleBoo' folder. 

The RippleBoo engine implements an algorithm called 'Ripple Down Rules' - basically it is a binary decision tree. Each rule has a simple "if <condition> then <action>" structure, where the condition is a boolean expression and the action is a block of instructions. Apart from that, a rule defines which rule is evaluated next by specifying a successor rule for the positive case and a successor rule for the negative case.

When a rule condition evaluates to true, its action is executed and the next rule to evaluate is the 'positive' successor rule. When the condition evaluates to false, the action is not executed and the next rule to evaluate is the 'negative' successor. In effect, we get a binary decision tree, but we don't have to worry about its completeness because it is guaranteed that at least one rule will fire no matter what the conditions are (because the first rule is always true).

Rules were implemented in the Boo language using the Rhino DSL library from the Rhino-tools package. Rhino DSL is a library for building DSLs (domain specific languages) in Boo. Here's a link to its author's blog: http://ayende.com/Blog/archive/2007/12/03/Implementing-a-DSL.aspx. The guy has done great work and many interesting examples of DSLs can be found there.

Below is an example ruleset in my "rule definition language". BTW, it's also a valid Boo script:


Ruleset "MyRules"


rule "R1", "R2", null, V.Counter < 9:
    log.Info("AAA");

rule "R2", "R3", null, V.Counter < 8:
    log.Info ("R2")

rule "R3", "R4", null, V.Counter < 7:
    log.Info ("R3")

rule "R4", null, "R5", V.Counter == 1:
    log.Info ("R4")

rule "R5", "R6", null, 1 == 1:
    log.Info ("R5: Counter is ${V.Counter}")

rule "R6", "X", null, 2 % 2 == 0:
    log.Info ("Rule six: {0}", date.Now)

rule "X", null, null, date.Today > date.Parse('2008-10-11'):
    log.Warn("The X Rule!!!")


Sorry for the formatting, I'll fix that in my spare time. Now a short explanation of what each 'rule' means.
The 'rule' keyword defines a new rule. It has 5 parameters:
  • rule Id
  • id of positive successor rule (null if there is no successor)
  • id of negative successor rule (null if there is no successor)
  • condition
  • and action (action starts in new line, after last colon - because Boo allows such syntax).

So this entry:

rule "R6", "X", null, 2 % 2 == 0:
    log.Info ("Rule six: {0}", date.Now)

means: 'define rule R6, which will fire if the expression "2 % 2 == 0" evaluates to true. If it is true, execute the action that writes the current date to the log file. The next rule to evaluate will be "X", or none if the rule doesn't fire.'
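
The way evaluation walks from rule to rule through the positive and negative successors can be sketched in a few lines of Python. This is only an illustration of the description above, not the engine's code: the Rule tuple and the run function are hypothetical, actions are replaced by recording the rule id, and the rule table is a shortened version of the example in which R5 is the last rule.

```python
from typing import Callable, Dict, List, NamedTuple, Optional

Variables = Dict[str, int]


class Rule(NamedTuple):
    positive: Optional[str]  # next rule when the condition is true
    negative: Optional[str]  # next rule when the condition is false
    condition: Callable[[Variables], bool]


def run(rules: Dict[str, Rule], start: str, variables: Variables) -> List[str]:
    """Walks the rule tree and returns the ids of the rules that fired."""
    fired: List[str] = []
    current: Optional[str] = start
    while current is not None:
        rule = rules[current]
        if rule.condition(variables):
            fired.append(current)  # stands in for executing the rule's action
            current = rule.positive
        else:
            current = rule.negative
    return fired


RULES = {
    "R1": Rule("R2", None, lambda v: v["Counter"] < 9),
    "R2": Rule("R3", None, lambda v: v["Counter"] < 8),
    "R3": Rule("R4", None, lambda v: v["Counter"] < 7),
    "R4": Rule(None, "R5", lambda v: v["Counter"] == 1),
    "R5": Rule(None, None, lambda v: True),
}

print(run(RULES, "R1", {"Counter": 5}))  # prints ['R1', 'R2', 'R3', 'R5']
```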

Currently the rules engine is a completely standalone project, but I plan to integrate it into the NGinn process engine. It will be used in many places, certainly as a part of process logic, but also for message routing and preprocessing.

The main problem is that Boo is not yet used in NGinn, except for the RippleBoo project. Currently the Script.Net language is the main script environment for NGinn processes and I wouldn't like to mix these two languages. So probably only one is here to stay, and chances are it will be Boo. Script.Net is more flexible and easier to use, but Boo is more mature, better tested and documented. The main issue with Boo is that it's a compiled language, so it will require more effort to integrate it with the NGinn engine, which is very 'dynamic' in nature.


Wednesday, 8 October 2008

BPMN - a close family

I have always considered BPMN (Business Process Modelling Notation) to be the 'best' language in its domain - very expressive and well thought out, able to describe real world situations without using strange hacks and without oversimplification. However, I have never thought about implementing it in NGinn - a full BPMN 1.1 implementation seemed too complex to be an open-source project objective, especially for a single-person project.

So I turned to less complex ideas after reading a bit about YAWL and decided to implement a similar language for .Net. But it turns out that YAWL (and NGinn as a consequence) uses concepts very similar to those found in BPMN. That's because all of them are based on Petri nets, but differ in everything that was added on top of the basic Petri-net specification. For example, BPMN defines several control structures based on non-local events, such as errors (exception handling), compensation and cancellation - quite useful. NGinn has no special constructs for exception handling and no notion of compensation. But when we analyze which 'workflow patterns' can be implemented in these languages, it turns out that there are no patterns in BPMN that could not be implemented in NGinn or YAWL. It's only a matter of convenience - for example, error handling or compensation is easy to do in BPMN and not so obvious in NGinn (custom logic required). Maybe material for version 2.0.

Here's a link to a very nice website about BPMN - Dive Into BPM. Enjoy the dive!

Tuesday, 7 October 2008

Today I'd like to describe some examples of processes that are known to work in the current version of NGinn. The focus is on control structures, not the actual task functionality (which is very incomplete for now). I have selected rather complex and not very obvious examples because the basic ones such as parallelism (AND-split), sequences and decisions (XOR-splits), well, should just work or there wouldn't be much to talk about.

Deferred choice with a timeout

This is a very common pattern - deferred choice with a timeout. It can be used for adding time limits to manual or other tasks. When a token is placed in 'start', both tasks are enabled - 'eval_candidate' and 'timeout'. When 'eval_candidate' completes first, 'timeout' is cancelled. When 'timeout' completes first (the deadline is reached), 'eval_candidate' is cancelled.

Deferred choice - complex situation


This process is an example of a more complex implicit choice. There are two places with an implicit choice (p1 and p2), each having two possible tasks. However, they share the t2 task. What happens here is that the system enables all tasks - t1, t2 and t3 - after tokens arrive at p1 and p2. This construction ensures that either t1 and t3 can complete, or t2 can complete. When t2 completes, t1 and t3 will be cancelled. When t1 completes, t2 will be cancelled and t3 will stay enabled. When t3 completes, t2 will be cancelled and t1 will stay enabled.
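
This behaviour falls out of plain token counting, which a short Python sketch can show. It is a simplified model and not NGinn code: the INPUTS table, the enabled and complete functions, and the simplification that a task consumes its input tokens at the moment it completes are all assumptions of the sketch.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical net for the situation above: the places each task takes a token from.
INPUTS: Dict[str, List[str]] = {"t1": ["p1"], "t2": ["p1", "p2"], "t3": ["p2"]}


def enabled(marking: Counter) -> List[str]:
    """A task is enabled when every one of its input places holds a token."""
    return [task for task, places in INPUTS.items()
            if all(marking[place] > 0 for place in places)]


def complete(marking: Counter, task: str) -> List[str]:
    """Consumes the task's input tokens and returns the tasks cancelled as a result."""
    before = set(enabled(marking))
    if task not in before:
        raise ValueError(f"{task} is not enabled")
    for place in INPUTS[task]:
        marking[place] -= 1
    # Tasks that were enabled before but lost a token they needed are cancelled.
    return sorted(before - set(enabled(marking)) - {task})


marking = Counter(p1=1, p2=1)
print(enabled(marking))         # prints ['t1', 't2', 't3']
print(complete(marking, "t1"))  # prints ['t2']
print(enabled(marking))         # prints ['t3']
```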

OR-join with 'escaping' tokens

This is a rather complex example, so I was very happy to see it working. Here's what we have. First of all, there's the t1 task with an OR-split. The split can choose the V1 path, the V2 path, or both of them. The eval_candidate4 task is the corresponding OR-join.

The catch here is that we have a deferred choice in place p1, and the eval_candidate3 task can 'steal' the token from p1, effectively moving it out of the OR-join's scope. Situations where only V1 or only V2 is chosen are not very interesting. However, if both V1 and V2 are chosen, the eval_candidate4 OR-join should wait for two tokens to arrive before eval_candidate4 can be enabled. But if eval_candidate3 steals the token, eval_candidate4 should 'change its mind' and wait for one token only. Why? Because no more tokens can arrive in such a situation - none of the OR-join's possible input paths contains any more tokens.

OR-join with tokens 'stolen' by a cancellation (cancel sets)

Here the situation is similar to the previous case - we have an OR-split and OR-join and two paths V1 and V2. However, there's this little red arrow from t3 to p2. This arrow is a cancellation (cancel set), meaning that when t3 completes all tokens should be removed from p2 (effectively cancelling the eval_candidate2 task). 

Effect is that when both V1 and V2 are chosen, you need to complete eval_candidate and eval_candidate2 before eval_candidate4 will be enabled. Alternatively, you can complete t3, then you will not have to do eval_candidate2. After you complete eval_candidate2, completing t3 has no side-effects.

Short update about current development status

Recently I have made some important changes to the NGinn engine and feel that it's getting close to what I'd like to achieve. Here's the list of most important changes made:

  1. ProcessInstance class makeover. The most important change is that tokens no longer have an identity. At the beginning I assumed that each token is an independent object, and tracking the relation between tasks and tokens was quite complicated. However, all tokens are the same and they don't carry information - so it was sensible to get rid of their identity. Now only numbers count - all we need to know about tokens is how many of them sit in each place. Results: 50% of the code thrown away while retaining the same functionality. Performance and clarity improved.
  2. Custom process state serialization. I have decided to use custom XML serialization instead of the binary serialization used previously. The main reason is that binary serialization doesn't support versioning, so upgrading the library breaks processes saved by an older version. It adds some work to task implementation, but we have complete control over persistence.
  3. Introduced distributed transactions (each step of a process is run in a separate transaction).
  4. Basic infrastructure is working. Now I need to concentrate on details and on providing complete functionality. In particular, task implementation is lagging quite far behind.
  5. A number of examples were added to the NGinn.XmlFormsWWW project. They demonstrate how to start and cancel process instances and how to handle worklist functionality (manual tasks). A simple TODO List web application is working (sort of).
Summing up, NGinn API is maturing and there are no heavy public interface modifications. It's time to start documenting it. I hope to use the engine in some commercial project, so this should speed things up and improve the quality. Sounds nice.
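
The first two points in the list can be pictured together: once tokens have no identity, the whole token state is a map from place id to a count, and such a map is easy to write out as versioned XML. The Python sketch below only illustrates that idea. The element and attribute names, the FORMAT_VERSION constant and both functions are made up for the example and are not NGinn's actual persistence format.

```python
import xml.etree.ElementTree as ET
from collections import Counter

FORMAT_VERSION = "1"  # hypothetical version tag, checked when loading


def marking_to_xml(marking: Counter) -> str:
    """Writes the number of tokens in each non-empty place as XML."""
    root = ET.Element("marking", version=FORMAT_VERSION)
    for place in sorted(marking):
        if marking[place] > 0:
            ET.SubElement(root, "place", id=place, tokens=str(marking[place]))
    return ET.tostring(root, encoding="unicode")


def marking_from_xml(text: str) -> Counter:
    """Reads the token counts back, rejecting an unknown format version."""
    root = ET.fromstring(text)
    if root.get("version") != FORMAT_VERSION:
        raise ValueError("unsupported marking format version")
    return Counter({p.get("id"): int(p.get("tokens")) for p in root.findall("place")})


state = Counter(start=1, p1=2)
restored = marking_from_xml(marking_to_xml(state))
print(restored == state)  # prints True
```

An explicit version attribute is what gives a loader the chance to recognize and upgrade old state, which is the kind of control binary serialization did not offer.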

Tuesday, 23 September 2008

Tasks in NGinn

Tasks are what the workflow is 'really' made of - they provide the functionality. The rest of the workflow definition - places and arrows - just defines how the tasks are interconnected and what the run-time dependencies between them are.
So let's try to describe the most common types of tasks and what they can do. NGinn is not complete yet, so this will be more of a wish list than typical technical documentation.

  1. Manual task

    Manual tasks are tasks that are assigned to people (application users). Usually application will provide some kind of 'TODO' list where each user can see tasks currently assigned to him and from where he/she can pick up next task to be done. NGinn provides 'Manual task' building block, but it does not contain actual TODO list or GUI implementation - this is application specific and NGinn does not restrict the implementation.
    Manual tasks have the following configurable parameters:
    • Assignee - id of person responsible for the task
    • Assignee group - id of group responsible for the task (either Assignee or Assignee group must be specified)
    • Task title (short summary)
    • Description (textual description of the task)
    Much more can be said about manual tasks; for example, we haven't touched at all on the subject of resource management (people database) and organizational structure (groups). I'm going to give you more details on this in the next posts.

  2. Timer task

    Timer tasks are used to introduce configurable delays into the process. At runtime, a timer task 'starts' when it is enabled (that is, when it gets all required input tokens) and then waits the specified amount of time before completing. The task has two parameters:
    • Delay amount (for example: 00:00:30, meaning 30 seconds) or
    • Due date (fixed moment in time when the task will complete). Exactly one of these parameters needs to be specified, depending on situation.


  3. Subprocess task

    As the name suggests, it's a task for starting a sub-process. When this task is enabled it initiates an instance of sub-process and waits until the sub-process completes. Then the task will also complete.
    To start a sub-process we need to know its name (definition ID) and we need to have input data in the correct structure (as defined by the process input variables). In this case the subprocess task's input data becomes the input data for the newly created process, and when the sub-process completes, its output data becomes the subprocess task's output data. Therefore we need to make sure the subprocess task has the same input/output data structure as the sub-process.

  4. Notification task

    Notification task is used for sending email / sms / other notifications to users.

  5. 'Receive Message' task

    The 'receive message' task waits for a message. It is used in scenarios where communication with external systems is necessary and our process needs to wait for some information sent by an external party. Each external message that can be received contains some data and must contain a special ID, called the Message Correlation ID (MCID). The MCID is a runtime parameter of the Receive Message task, that is, we need to specify the MCID for each Receive Message task. We are free to choose any MCID, but it must uniquely identify the task waiting for the message. By default (when not specified), the MCID is assumed to have the following structure: [process instance id].[task id], for example e3bc903829badca321.wait_task.
    The structure of message is defined by Receive Message task's output variables - the message should simply contain values of these variables. When message is received its contents are retrieved and put in Receive Message task's output variables. Then the task completes.
    The most important fact here is that the MCID must be known to the external party when it sends us the message. So either the MCID is mutually agreed on, or our process needs to send the MCID to the external system before it can receive the message from it.


  6. Script task

    Script tasks are used for adding custom logic to the process. Currently they can be programmed in 'Script.NET' language. Script tasks are synchronous, they cannot be 'put to sleep' and reactivated by NGinn execution engine. Script code can access and modify task's variables, but it can also communicate with other objects in application's runtime. They can be used for communication between business processes and the rest of the application.

  7. Empty task

    Empty task does nothing - completes just after being started. But all variable bindings do their work and they can be used for synchronization without side effects - and this is the main purpose of empty tasks.

  8. REST/WS call task

    Synchronous communication with external systems, using XML/HTTP or SOAP. The task sends an HTTP request containing its input data and expects to receive XML with the output data (the XML structure is defined by the task's output data structure). Currently there's no implementation of web service calls - I'm waiting for the right idea.

  9. Custom tasks

    Custom tasks can be used to introduce custom or application-specific components into the NGinn process description language. Custom tasks can be implemented in any CLR language; they just have to implement a few interfaces and conform to some rules. This is a good topic for a separate post.
And this is it, I consider the list to be complete and broad enough at the same time. Almost all real-time task examples can be implemented using one (or more) of NGinn tasks, and what can't be implemented or is difficult to stuff into built-in task type can be done as a custom task.
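
Two of the task descriptions above contain small rules that are easy to pin down in code: a timer task takes exactly one of 'delay' or 'due date', and a Receive Message task is found by its MCID, which defaults to [process instance id].[task id]. Here is a rough Python sketch of both. Every name in it (parse_delay, completion_time, default_mcid, MessageRouter) is hypothetical and none of it is NGinn API.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional

Payload = Dict[str, object]


def parse_delay(text: str) -> timedelta:
    """Parses an 'HH:MM:SS' delay such as '00:00:30' (30 seconds)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def completion_time(enabled_at: datetime,
                    delay: Optional[str] = None,
                    due_date: Optional[datetime] = None) -> datetime:
    """Returns when a timer task completes; exactly one parameter must be given."""
    if (delay is None) == (due_date is None):
        raise ValueError("specify exactly one of delay or due_date")
    if due_date is not None:
        return due_date
    return enabled_at + parse_delay(delay)


def default_mcid(process_instance_id: str, task_id: str) -> str:
    """Default Message Correlation ID: [process instance id].[task id]."""
    return f"{process_instance_id}.{task_id}"


class MessageRouter:
    """Hands each incoming message to the one task waiting under its MCID."""

    def __init__(self) -> None:
        self._waiting: Dict[str, Callable[[Payload], None]] = {}

    def wait_for(self, mcid: str, on_message: Callable[[Payload], None]) -> None:
        if mcid in self._waiting:
            raise ValueError(f"MCID {mcid} is already in use")  # must be unique
        self._waiting[mcid] = on_message

    def deliver(self, mcid: str, data: Payload) -> bool:
        handler = self._waiting.pop(mcid, None)
        if handler is None:
            return False  # nobody is waiting for this message
        handler(data)  # the data becomes the task's output variables
        return True


print(default_mcid("e3bc903829badca321", "wait_task"))  # prints e3bc903829badca321.wait_task
```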