Multithreading a loop while preserving order

Time: 2010-12-28 01:52:32

Tags: c# multithreading parallel-processing

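I've started experimenting with multithreading for a CPU-intensive batch process I'm running. Basically, I'm trying to combine multiple single-page TIFFs into a single PDF document. This works fine with a foreach loop or standard iteration, but it can be very slow for documents of a few hundred pages. I tried the following, based on some multithreading examples I found, and it gives a significant performance improvement. However, it destroys the page order: instead of 1, 2, 3, 4 I get 1, 3, 4, 2, 6, 5, depending on which thread finishes first.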

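My question is: how can I keep using this technique while maintaining the page order, and if that is possible, will it negate the performance benefit of multithreading? Thanks in advance.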

PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);

int counter = split.Count();

// Source must be array or IList.
var source = Enumerable.Range(0, 100000).ToArray();
// Partition the entire source array.
var rangePartitioner = Partitioner.Create(0, counter);
double[] results = new double[counter];
// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        string f_prime = split[i].Replace(" ", "");
        PdfPage page = doc.AddPage();
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(f_prime);
        double x = 0;
        gfx.DrawImage(image, x, 0);

    }
});

3 Answers:

Answer 0 (score: 3)

I would just use the overload of Parallel.ForEach that passes the element index:

 Parallel.ForEach(rangePartitioner, (range, loopState, elementIndex) =>

Then, inside your loop, you can fill an array with the results of your work and, once all the work is done, go through the results in order.
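A minimal sketch of that idea, using only the base class library. ProcessPage is a hypothetical stand-in for the expensive per-page work (it is not part of the question's code), and the sample file names are made up. Each iteration writes only to its own slot of the results array, so the order of completion does not matter.

```csharp
using System;
using System.Threading.Tasks;

public static class IndexedResultsSketch
{
    // Hypothetical stand-in for the expensive per-page work (e.g. decoding one TIFF).
    public static string ProcessPage(string path)
    {
        return path.Replace(" ", "").ToUpperInvariant();
    }

    public static string[] ProcessAll(string[] paths)
    {
        var results = new string[paths.Length];

        // This overload passes each element's position in the source as a long.
        Parallel.ForEach(paths, (path, loopState, index) =>
        {
            // Each iteration touches only its own slot, so no locking is needed.
            results[index] = ProcessPage(path);
        });

        return results;
    }

    public static void Main()
    {
        string[] pages = ProcessAll(new[] { "page 1.tif", "page 2.tif", "page 3.tif" });

        // Consume the results sequentially, in source order.
        foreach (string page in pages)
        {
            Console.WriteLine(page);
        }
    }
}
```

The parallel part does the heavy lifting; only the final, cheap pass over the array is sequential, so most of the speed-up should survive.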

Answer 1 (score: 2)

Use .AsParallel().AsOrdered(), as described in this document: http://msdn.microsoft.com/en-us/library/dd460677.aspx

I think it would look something like this:

rangePartitioner.AsParallel().AsOrdered().ForAll(
    range => 
    {
        // Loop over each range element without a delegate invocation.
        ...
    });
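One caveat worth knowing: ForAll runs its delegate in parallel and does not emit side effects in source order, even on an ordered query. A variant that does rely on AsOrdered is to project each element with Select and materialize the results with ToArray. The sketch below assumes a hypothetical ProcessPage helper standing in for the per-page work; it is an illustration, not the answer author's code.

```csharp
using System;
using System.Linq;

public static class OrderedPlinqSketch
{
    // Hypothetical stand-in for the expensive per-page work.
    public static string ProcessPage(string path)
    {
        return path.Replace(" ", "").ToUpperInvariant();
    }

    public static string[] ProcessAll(string[] paths)
    {
        return paths
            .AsParallel()
            .AsOrdered()                          // keep results in source order
            .Select(path => ProcessPage(path))    // runs in parallel
            .ToArray();                           // merged back in order
    }

    public static void Main()
    {
        foreach (string page in ProcessAll(new[] { "page 1.tif", "page 2.tif", "page 3.tif" }))
        {
            Console.WriteLine(page);
        }
    }
}
```

The ordered results can then be added to the document one by one in a plain sequential loop.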

Answer 2 (score: 2)

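I'm not sure the other solutions will work the way he wants. The reason is that PdfPage page = doc.AddPage(); creates a new page and adds it to the document in one step, so the pages will always end up out of order: the order is decided by whichever thread reaches doc first.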

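If AddPage is a fast operation, you could create all 100 pages up front without doing any processing, then go back and render the TIFF images into the pages.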

PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);

int counter = split.Count();

// Source must be array or IList.
var source = Enumerable.Range(0, 100000).ToArray();
// Partition the entire source array.
var rangePartitioner = Partitioner.Create(0, counter);

double[] results = new double[counter];

PdfPage[] pages = new PdfPage[counter];
for (int i = 0; i < counter; ++i) 
{
    pages[i] = doc.AddPage();
}

// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        string f_prime = split[i].Replace(" ", "");
        PdfPage page = pages[i];
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(f_prime);
        double x = 0;
        gfx.DrawImage(image, x, 0);
    }
});
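The same two-phase pattern can be tried without PDFsharp. In the sketch below, SketchDocument and SketchPage are hypothetical stand-ins for PdfDocument and PdfPage: the sequential loop fixes the page order, and the parallel loop only fills in pages that already exist.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for PdfPage.
public sealed class SketchPage
{
    public string Content { get; set; } = "";
}

// Hypothetical stand-in for PdfDocument; AddPage appends to the end.
public sealed class SketchDocument
{
    private readonly List<SketchPage> pages = new List<SketchPage>();

    public IReadOnlyList<SketchPage> Pages
    {
        get { return pages; }
    }

    public SketchPage AddPage()
    {
        var page = new SketchPage();
        pages.Add(page);
        return page;
    }
}

public static class TwoPhaseSketch
{
    public static SketchDocument Build(string[] paths)
    {
        var doc = new SketchDocument();
        var slots = new SketchPage[paths.Length];

        // Phase 1 (sequential): create the pages, which fixes their order.
        for (int i = 0; i < paths.Length; i++)
        {
            slots[i] = doc.AddPage();
        }

        // Phase 2 (parallel): fill each page; iteration i touches only slots[i].
        Parallel.For(0, paths.Length, i =>
        {
            slots[i].Content = paths[i].Replace(" ", "");
        });

        return doc;
    }

    public static void Main()
    {
        SketchDocument doc = Build(new[] { "page 1.tif", "page 2.tif", "page 3.tif" });
        foreach (SketchPage page in doc.Pages)
        {
            Console.WriteLine(page.Content);
        }
    }
}
```

Whether drawing into different pages of one real PdfDocument from several threads is safe is not established here; that would need to be checked against the library.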

Edit

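I think there is a more elegant solution, but without knowing the properties of PdfPage I didn't want to offer it earlier. If you can tell which page number a PdfPage corresponds to, you can make things very simple: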

PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);

int counter = split.Count();

// Source must be array or IList.
var source = Enumerable.Range(0, 100000).ToArray();
// Partition the entire source array.
var rangePartitioner = Partitioner.Create(0, counter);

double[] results = new double[counter];

// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        PdfPage page = doc.AddPage();
        // Only use i as a loop not as the index
        int pageIndex = page.PageIndex; // This is what I don't know
        string f_prime = split[pageIndex].Replace(" ", "");
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(f_prime);
        double x = 0;
        gfx.DrawImage(image, x, 0);
    }
});
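A sketch of this variant with hypothetical types, since the real PdfPage API is the unknown here. LockedDocument and IndexedPage are made up for illustration; the assumption is that adding a page is guarded by a lock and that each page knows its own index. Every thread then processes whichever source file matches the page it was handed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical page type that knows its own position in the document.
public sealed class IndexedPage
{
    public IndexedPage(int pageIndex)
    {
        PageIndex = pageIndex;
    }

    public int PageIndex { get; }
    public string Content { get; set; } = "";
}

// Hypothetical document type; AddPage is serialized with a lock.
public sealed class LockedDocument
{
    private readonly object gate = new object();
    private readonly List<IndexedPage> pages = new List<IndexedPage>();

    public IndexedPage AddPage()
    {
        lock (gate)
        {
            var page = new IndexedPage(pages.Count);
            pages.Add(page);
            return page;
        }
    }

    public IndexedPage[] Snapshot()
    {
        lock (gate)
        {
            return pages.ToArray();
        }
    }
}

public static class PageIndexSketch
{
    public static IndexedPage[] Build(string[] paths)
    {
        var doc = new LockedDocument();

        // The loop variable only counts iterations; the page decides which file is used.
        Parallel.For(0, paths.Length, iteration =>
        {
            IndexedPage page = doc.AddPage();
            page.Content = paths[page.PageIndex].Replace(" ", "");
        });

        return doc.Snapshot();
    }

    public static void Main()
    {
        foreach (IndexedPage page in Build(new[] { "page 1.tif", "page 2.tif", "page 3.tif" }))
        {
            Console.WriteLine(page.PageIndex + ": " + page.Content);
        }
    }
}
```

Only the short AddPage call is serialized; the per-page work after it still runs in parallel.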