As you already know, parallel processing on multicore processors will be supported in the next version of the .NET Framework. The .NET Framework 4.0 and Visual Studio 2010 will be released in the spring of this year. In this post I am going to talk about partitioners in ParallelFX.
Partitioning of the data source plays an important role in parallel data processing. PLINQ and TPL have a built-in partitioning concept which, in certain situations and advanced scenarios, does not produce efficient results.
Two main kinds of partitioning exist in ParallelFX.
Range partitioning: For collections with a known length, such as lists or arrays, range partitioning is the most commonly used and simplest way of partitioning. Each thread receives a contiguous range of the data, from an initial to a final index, so there is no possibility that two or more threads access the same element in the list. The only overhead involved in range partitioning is the initial work of creating the ranges; no additional synchronization is required after that. This partitioning method is very efficient when the workload is constant across all elements in the list. Its drawback is that a thread which finishes early cannot help the others complete their work.
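As a small sketch of the idea (the class and method names here, apart from the Partitioner API itself, are my own): Partitioner.Create(0, N) produces Tuple&lt;int, int&gt; index ranges, and each task walks its own range with no further locking.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class RangePartitionDemo
{
    // Fills a result array with the square roots of 'data' using explicit
    // range partitioning: each task receives a [Item1, Item2) index range
    // and processes it without touching any other task's elements.
    public static double[] SqrtAll(double[] data)
    {
        var results = new double[data.Length];
        Parallel.ForEach(Partitioner.Create(0, data.Length), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
                results[i] = Math.Sqrt(data[i]);
        });
        return results;
    }

    static void Main()
    {
        var data = new double[100000];
        for (int i = 0; i < data.Length; i++) data[i] = i;
        Console.WriteLine(SqrtAll(data)[9]); // prints 3
    }
}
```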
Chunk partitioning: For collections with an unknown number of elements, such as a LinkedList, partitioning comes down to each thread taking a number of elements from the list – the “chunk size” – and processing them. When a thread completes its chunk, it comes back and takes another one. The implementation guarantees that no two threads can take the same element of the collection. The chunk size is arbitrary and can range from one to several elements, depending on the nature of the operation. The chunk partitioner does incur synchronization overhead each time a thread needs to get another chunk; the amount of synchronization is inversely proportional to the size of the chunks.
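A minimal sketch of chunk partitioning in action (names other than the Partitioner API are my own): since LinkedList&lt;T&gt; has no indexer, Partitioner.Create(IEnumerable&lt;T&gt;) falls back to chunk partitioning, so a thread synchronizes with the partitioner only when it fetches its next chunk, not on every element.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class ChunkPartitionDemo
{
    // Sums a LinkedList<int> in parallel via the chunk partitioner.
    // The Interlocked.Add is our own reduction, kept deliberately simple
    // for the sketch; the partitioner itself only synchronizes per chunk.
    public static long ParallelSum(LinkedList<int> list)
    {
        long sum = 0;
        Parallel.ForEach(Partitioner.Create(list), item =>
        {
            Interlocked.Add(ref sum, item);
        });
        return sum;
    }

    static void Main()
    {
        var list = new LinkedList<int>();
        for (int i = 1; i <= 1000; i++) list.AddLast(i);
        Console.WriteLine(ParallelSum(list)); // prints 500500
    }
}
```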
The TPL partitioners support a dynamic number of partitions: partitions are created on the fly whenever the parallel loop spawns a new task.
The Partitioner class provides common partitioning strategies for arrays, lists, and enumerables. It contains a single method, Create, with several overloads, which are shown below.
public static class Partitioner
public static OrderablePartitioner<TSource> Create<TSource>(IEnumerable<TSource> source);
public static OrderablePartitioner<TSource> Create<TSource>(IList<TSource> list, bool loadBalance);
public static OrderablePartitioner<Tuple<int, int>> Create(int fromInclusive, int toExclusive);
public static OrderablePartitioner<Tuple<long, long>> Create(long fromInclusive, long toExclusive);
public static OrderablePartitioner<TSource> Create<TSource>(TSource[] array, bool loadBalance);
public static OrderablePartitioner<Tuple<int, int>> Create(int fromInclusive, int toExclusive, int rangeSize);
public static OrderablePartitioner<Tuple<long, long>> Create(long fromInclusive, long toExclusive, long rangeSize);
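To see what the range overloads produce, a small sketch: Create(0, 10, 4) splits [0, 10) into Tuple&lt;int, int&gt; ranges of at most four elements, and since the range partitioner supports dynamic partitions we can enumerate them directly.

```csharp
using System;
using System.Collections.Concurrent;

static class RangeOverloadDemo
{
    static void Main()
    {
        // Splits [0, 10) into ranges of (at most) 4 elements:
        // [0,4), [4,8), [8,10).
        var partitioner = Partitioner.Create(0, 10, 4);
        foreach (var range in partitioner.GetDynamicPartitions())
            Console.WriteLine("[{0}, {1})", range.Item1, range.Item2);
    }
}
```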
In this demo we will show some custom partitioner implementations. Let’s assume that we have to count the prime numbers below 500,000. For that we need a primality-check method, which is implemented below.
public bool IsPrime(long n)
{
    if (n < 2) return false;
    for (long i = 2; i * i <= n; i++)
        if (n % i == 0)
            return false;
    return true;
}
The method returns true if n is prime and false otherwise. The interval for the prime check is [1, 500 000], so the source is:
var src = Enumerable.Range(1, 500000).ToArray();
Implementation I is purely sequential and contains a simple LINQ query:
var count = (from p in src
             where IsPrime(p)
             select p).Count();
As we know, this code will be executed on a single core of a multicore processor.
Implementation II is the standard PLINQ query, analogous to the one above:
var count = (from p in src.AsParallel()
             where IsPrime(p)
             select p).Count();
The only difference between the two implementations is the AsParallel() call in the second one. Implementation III uses a custom partitioner built with the Partitioner class: we create the partitioner with the Create method, and ParallelFX does the rest.
var count2 = Partitioner.Create(src, true).AsParallel()
    .Where(p => IsPrime(p))
    .Count();
The test was run on a quad-core Intel processor, and the results are presented in the following picture.
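To reproduce the comparison yourself, a minimal Stopwatch harness along these lines can be used (the Benchmark and Time names are my own; the three queries are the ones from the post, and the actual timings will of course depend on your machine):

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;

static class Benchmark
{
    static bool IsPrime(long n)
    {
        if (n < 2) return false;
        for (long i = 2; i * i <= n; i++)
            if (n % i == 0)
                return false;
        return true;
    }

    // Runs one query, reports its elapsed time, and returns the count.
    static int Time(string name, Func<int> query)
    {
        var sw = Stopwatch.StartNew();
        int count = query();
        sw.Stop();
        Console.WriteLine("{0}: {1} primes in {2} ms", name, count, sw.ElapsedMilliseconds);
        return count;
    }

    static void Main()
    {
        var src = Enumerable.Range(1, 500000).ToArray();
        Time("sequential", () => src.Where(p => IsPrime(p)).Count());
        Time("PLINQ",      () => src.AsParallel().Where(p => IsPrime(p)).Count());
        Time("custom",     () => Partitioner.Create(src, true).AsParallel()
                                            .Where(p => IsPrime(p)).Count());
    }
}
```

All three queries must return the same count; only the elapsed times differ.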
The new Partitioner class lets you define a custom partitioner, which in some situations can be more efficient than the default partitioner in PLINQ. Use a custom partitioner in a PLINQ query when the per-element workload across the data source is not constant.