Monday, July 29, 2013

Azure Worker Role - Simplified

Azure Cloud Services supports two types of roles: Web Role and Worker Role (VM Role was retired on May 31, 2013). A Web Role hosts web applications, web APIs, etc., while a Worker Role performs background or asynchronous tasks. Both run on non-persistent VMs that Azure spins up on demand; nothing stored on the VM survives once it comes down.

RoleEntryPoint Class & Role Lifetime

RoleEntryPoint is an abstract class that serves as the base class for the entry point class of a Web Role or a Worker Role. Depending on your needs, you can override one or more of the methods given below.
OnStart(): A virtual method used to initialize the role before it starts. This is the first method the Azure runtime calls.
Run(): A virtual method that a worker role must override; it is optional for a web role. Once this method exits, the role shuts down, which triggers the Azure fabric controller to start another instance of the role unless the exit is the result of scaling down the active instances. The lifetime of the role dictates the lifetime of the VM hosting it. In other words, once the role exits, the VM associated with the role also comes down.
Typically, worker roles implement a loop in this method and perform the required task(s) inside the loop. Because this method is called on the primary thread, and that thread must not exit or abort, it becomes difficult to implement multiple tasks or parallel processing.
In web roles, the base class Run() method is called, which puts the primary thread to sleep, and the IIS threading model comes into play to process client requests.
OnStop(): A virtual method used to clean up resources. Cleaning up local resources does not matter much, since this method is called just before the VM associated with the role shuts down; however, if you hold any shared resources, put the clean-up code here.

Typical Worker Role Implementation

The business processes are implemented in the Run() method of the worker role. Below is an implementation of a typical worker role Run() method.
public override void Run()
{
    while(true)
    {
        CloudQueueMessage message = _inputQ.GetMessage(TimeSpan.FromSeconds(5));
        if (message != null)
        {
            //Process the order based on the inventory
        }
        else
        {
            System.Threading.Thread.Sleep(200);
        }
    }
}
The method above performs a single background process. If there are no messages to process in the queue, the role sits idle, adding to your cost. If there are n such business processes that you want to run in the background, you might end up with a separate role for each process. Due to the programming model of the role, putting multiple processes into a single role becomes complex from a programming and maintenance standpoint.

Let us assume that you have 3 different business processes: Process Order (continuous process), Order Shipment (continuous process) and Settlement (EOD payment processing). To implement these, you can have 3 different roles that are scaled independently, or implement them in a single role and manage the processes using threading.
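
To give a feel for the single-role, hand-rolled threading option, here is a minimal sketch. It is not Azure SDK code: MultiProcessHost, Loop and the iteration counters are hypothetical names, the loop bodies are stand-ins for real queue processing, and the scheduled Settlement process is left out because it would need a timer on top of this.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a worker role that hosts several background
// processes by hand. None of these names come from the Azure SDK.
public class MultiProcessHost
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();

    // Loop iteration counters, only here so the behaviour can be observed.
    public int OrderIterations;
    public int ShipmentIterations;

    // Plays the part of Run(): blocks the caller until Stop() is called.
    public void Run()
    {
        Task orders = Task.Run(() => Loop(() => Interlocked.Increment(ref OrderIterations)));
        Task shipments = Task.Run(() => Loop(() => Interlocked.Increment(ref ShipmentIterations)));
        Task.WaitAll(orders, shipments);
    }

    // Plays the part of OnStop(): asks both loops to finish.
    public void Stop()
    {
        _cts.Cancel();
    }

    private void Loop(Action body)
    {
        while (!_cts.IsCancellationRequested)
        {
            body(); // stand-in for "read a message and process it"
            // Back off between polls; returns early if Stop() is called.
            _cts.Token.WaitHandle.WaitOne(50);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var host = new MultiProcessHost();
        Task run = Task.Run(() => host.Run());
        Thread.Sleep(300);   // let both loops spin a few times
        host.Stop();
        run.Wait();
        Console.WriteLine("orders={0}, shipments={1}",
            host.OrderIterations, host.ShipmentIterations);
    }
}
```

Even this toy version has to deal with cancellation, blocking the primary thread and per-process back-off, and it still lacks error handling, scheduling and a way to change the number of threads without redeploying.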

Simplifying Worker Role Implementation


To implement multiple business processes in a single worker role without making it complex, we’ll use the XecMe framework. You can reference its assemblies through the NuGet package.

You have to implement these business processes using the ITask interface. The worker role code above changes to the ITask implementation shown below. It is worth understanding the significance of the return value, ExecutionState.
public ExecutionState OnExecute(ExecutionContext context)
{
    CloudQueueMessage message = _inputQ.GetMessage(TimeSpan.FromSeconds(5));
    if (message != null)
    {
        //Process the order based on the inventory

        return ExecutionState.Executed;
    }
    else
    {
        return ExecutionState.Idle;
    }
}
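
One way to picture why the return value matters is the sketch below. It is an assumption about how a polling runner could react to the two states used above, inferred from the idlePollingPeriod setting in the configuration; it is not XecMe's actual implementation. SketchState, SketchRunner and Drive are hypothetical names, and an in-memory queue stands in for the Azure queue.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical mirror of the two states used in the post; not XecMe's type.
public enum SketchState { Executed, Idle }

public static class SketchRunner
{
    // Calls the task maxCalls times. After Executed it calls again at once;
    // after Idle it sleeps idlePollingMs first.
    // Returns how many times it slept.
    public static int Drive(Func<SketchState> task, int idlePollingMs, int maxCalls)
    {
        int sleeps = 0;
        for (int i = 0; i < maxCalls; i++)
        {
            if (task() == SketchState.Idle)
            {
                Thread.Sleep(idlePollingMs);
                sleeps++;
            }
        }
        return sleeps;
    }
}

public static class Demo
{
    public static void Main()
    {
        // In-memory stand-in for the input queue.
        var queue = new Queue<string>(new[] { "order-1", "order-2" });

        Func<SketchState> task = () =>
        {
            if (queue.Count == 0) return SketchState.Idle;
            queue.Dequeue(); // stand-in for processing the message
            return SketchState.Executed;
        };

        // Two calls find work, the remaining three find the queue empty.
        Console.WriteLine(SketchRunner.Drive(task, 10, 5)); // prints 3
    }
}
```

Under this assumption, returning Executed keeps a busy queue draining without delay, while returning Idle lets the runner back off instead of the task sleeping inside its own loop.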

And the role entry point implementation will change to
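public class MyWorkerRole: RoleEntryPoint
{
    public override bool OnStart()
    {
        // Return true so the runtime proceeds to Run()
        return base.OnStart();
    }

    public override void Run()
    {
        TaskManager.Start(new TaskManagerConfig());
        // Block the primary thread until the tasks complete
        TaskManager.WaitTaskToComplete();
    }

    // RoleEntryPoint.OnStop() returns void, not bool
    public override void OnStop()
    {
        TaskManager.Stop();
        base.OnStop();
    }
}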



Once you have all these processes implemented, it is a matter of how you want to configure these tasks. XecMe supports multiple ways to configure the tasks without changing a single line of code; see the XecMe configuration documentation for the details. The most suitable configuration for us is a Parallel Task Runner for Process Order and Order Shipment, and a Scheduled Task Runner for EOD Payment Processing.
<parallelTaskRunner name="Process Order" 
    taskType="Background.OrderProcess, Background, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" 
    minInstances="1" 
    maxInstances="2"
    idlePollingPeriod="500">
    <parameters>
        <parameter name="inputQueue" value="orderQ"/>
    </parameters>
</parallelTaskRunner>
<parallelTaskRunner name="Order Shipment" 
    taskType="Background.ShipmentProcess, Background, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" 
    minInstances="1" 
    maxInstances="2"
    idlePollingPeriod="500">
    <parameters>
        <parameter name="inputQueue" value="shipmentQ"/>
    </parameters>
</parallelTaskRunner>
<scheduledTaskRunner name="Settlement Recon"
    taskType="Background.ShipmentProcess, Background, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"
    recursion="Daily" 
    taskTime="22:00:00">
    <parameters>
        <parameter name="fileName" value="AUTH_YYYYMMDD.dat" />
    </parameters>
</scheduledTaskRunner>



The parallel task runner can run parallel instances of a task based on resource availability. In this case we configured a maximum of 2 threads for each of the 2 parallel tasks, so if enough resources (CPU) are available, it can run up to 2 instances of each parallel task. The scheduled task is triggered every day at 10 PM and processes the settlement reconciliation file. With this configuration we can have 2-3 active role instances based on the need.
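
As a rough illustration of the daily schedule, the sketch below computes the next 22:00 trigger and a dated file name. Both helpers are hypothetical and are not part of XecMe. In particular, treating YYYYMMDD in the fileName parameter as a date placeholder is an assumption; the post does not say how the task resolves it.

```csharp
using System;
using System.Globalization;

public static class DailySchedule
{
    // Returns the next occurrence of timeOfDay strictly after now.
    public static DateTime NextRun(DateTime now, TimeSpan timeOfDay)
    {
        DateTime candidate = now.Date + timeOfDay;
        return candidate > now ? candidate : candidate.AddDays(1);
    }

    // Assumption: the YYYYMMDD token stands for the processing date.
    public static string ResolveFileName(string pattern, DateTime date)
    {
        return pattern.Replace(
            "YYYYMMDD",
            date.ToString("yyyyMMdd", CultureInfo.InvariantCulture));
    }
}

public static class ScheduleDemo
{
    public static void Main()
    {
        TimeSpan taskTime = TimeSpan.Parse("22:00:00", CultureInfo.InvariantCulture);

        // Before 10 PM: the task fires later the same day.
        DateTime morning = new DateTime(2013, 7, 29, 9, 30, 0);
        Console.WriteLine(DailySchedule.NextRun(morning, taskTime).ToString("s")); // 2013-07-29T22:00:00

        // After 10 PM: the task fires the following day.
        DateTime lateEvening = new DateTime(2013, 7, 29, 23, 0, 0);
        Console.WriteLine(DailySchedule.NextRun(lateEvening, taskTime).ToString("s")); // 2013-07-30T22:00:00

        Console.WriteLine(DailySchedule.ResolveFileName("AUTH_YYYYMMDD.dat", morning)); // AUTH_20130729.dat
    }
}
```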