Saturday, January 16, 2010

StubFiber: How to Deterministically Test Concurrent Code

Writing correct concurrent code is considered by many to be one of the more difficult areas of computer science.  Verifying the correctness of concurrent code is deemed even more difficult.  Ask most developers how to verify the correctness of concurrent code deterministically, and you will be often met with a blank stare.  Even books such as Clean Code, an otherwise worthwhile read, is incredibly disappointing when it comes to testing concurrent code.  An entire chapter and appendix is dedicated to the topic of clean concurrent code.  The recommendation comes down to abstracting away threading (but with no examples of how to do this) and using third party tools to alter the thread scheduler to introduce artificial pauses in the hopes that they will expose bugs.

It turns out that writing correct concurrent code and testing it deterministically is actually not that difficult.  As the saying goes, if it hurts, you are doing it wrong.  The first step is architectural.  Instead of using shared-state concurrency with all the perils of nested locking, use message based concurrency.  Keep unparallalizable mutable state in a single thread and then communicate with other such threads via immutable messages.  Two libraries written by Mike Rettig, called Retlang (for C#) and Jetlang (for Java) greatly assist with message based concurrency by introducing the Fiber and Channel abstractions.  Fibers are abstractions for threads and Channels are abstractions for the conduit through which Fibers send and receive messages from other Fibers.  Since a Fiber is inherently synchronous with itself, it processes all of its messages in order, eliminating the concern of one message handler interrupting another.

By relying on the Fiber abstraction instead of threads themselves, ThreadFibers or PoolFibers can be substituted with StubFibers to obtain synchronous deterministic tests.  Any race conditions between threads can be made explicit in a test and thus correct execution can be validated for all interleaves of the thread schedule.  Below is an example in Retlang (C#):

public class Runner()
{
    private readonly Channel<CalculationResult> channel = new Channel<CalculationResult>();
    private readonly Channel<object> requestChannel = new Channel<object>();

    private readonly StubFiber fiberOne = new StubFiber { ExecutePendingImmediately = false; };
    private readonly CalculatorOne calculatorOne;

    private readonly StubFiber fiberTwo = new StubFiber { ExecutePendingImmediately = false; };
    private readonly CalculatorTwo calculatorTwo;

    private readonly ResultAggregator resultAggregator;

    public Runner()
    {
        calculatorOne = new CalculatorOne(fiberOne, channel, requestChannel);
        calculatorTwo = new CalculatorTwo(fiberTwo, channel, requestChannel);
        resultAggregator = new ResultAggregator(new StubFiber(), channel, requestChannel);
    }

    public void ExecuteAllPendingOnFiberOne()
    {
        fiberOne.ExecuteAllPending();
    }

    public void ExecuteAllPendingOnFiberTwo()
    {
        fiberTwo.ExecuteAllPending();
    }

    public void StartCalculations()
    {
        resultAggregator.StartCalculations();
    }

    public CombinedResult CombinedResult
    {
        get { return resultAggregator.Result; }
    }
}

[Test]
public void FiberExecutionOrderShouldNotChangeResult()
{
    var runner = new Runner();

    runner.ExecuteAllPendingOnFiberOne();
    runner.ExecuteAllPendingOnFiberTwo();

    Assert.AreEqual(CombinedResult.Expected, runner.CombinedResult);
}

[Test]
public void SecondFiberExecutionOrderShouldNotChangeResult()
{
    var runner = new Runner();

    runner.ExecuteAllPendingOnFiberTwo();
    runner.ExecuteAllPendingOnFiberOne();

    Assert.AreEqual(CombinedResult.Expected, runner.CombinedResult);
}
Message based concurrency is vastly superior to shared state concurrency.  Not only is the resulting design significantly easier to understand, it also has the nice side effect of being able to be tested deterministically.

4 comments:

  1. It's really nice to see more blogging about Retlang. :) I do think it's a pity you renamed it to StubFiber, though. I think there's still a role for SynchronousFiber in actual production code.

    ReplyDelete
  2. hi graham, i'm using retlang to buid an nzb downloader .. architecturally i'm implementing a service oriented architecture
    (based on the pipes/filters & process manager abstractions described in the enterprise integration patterns book you link
    to in the retlang google site).
    i have a requirement of multiple consumers (downloading fibers/threads) feeding off a single producer (telling which segment/files
    need to be downloaded). I need to be able to pause all downloads, be able to switch currently downloading nzb (its composed of
    multiple files) and not starve the consumers.
    i'm thinking of implementing the equivalent of the java blockingqueue class into C#, making it bounded in both max and min size.
    The only thing is i'm not sure how to go about it with retlang ... can you suggest an implementation ?

    ReplyDelete
  3. Juan, without knowing the specifics of your problem, a possible solution would be for the consuming fibers to notify the producing fiber when it is ready to take on more downloads and how many. It may help to publish a channel over a channel, so that the receiver can reply directly, but asynchronously. The producing fiber can then send that specific consumer the items to be processed. For pausing, publish a signal on a channel to all consumers. The flag set by this signal should be checked between socket calls.

    ReplyDelete
  4. Graham, thank ... I will into your suggestions over the weekend and let you know of any solution I may come up with

    ReplyDelete