Parallel.For confusion (collection loses order)

Asked: 2016-02-12 15:13:52

Tags: c# concurrency task-parallel-library parallel.for

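I've run into a problem while trying to use Parallel.For() to load some files into a database. The file ID passed to the database function is somehow incorrect; that is, the database returns the wrong data. I tried to verify this by using a concurrent dictionary to collect the id/name pairs with and without parallelism. As I see it, the two lists should be identical once the loops finish, but they are not. The code below simulates what I'm doing in a very simplified way.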

Does this make sense?

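using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class Program
    {
        //static so that the static methods below can reach it
        static readonly ConcurrentDictionary<int, string> _cd = new ConcurrentDictionary<int, string>();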
        static void Main()
        {
            //simulate the situation
            int[] idList = new int[] {1, 8, 12, 19, 25, 99};
            string[] fileList = new string[] {"file1", "file8", "file12", "file19", "file25", "file99"};

            //run in serial first
            ProcessFiles(idList, fileList); 

            //write out pairs to text file
            foreach (var item in _cd)
            {
                var key = item.Key;
                var val = item.Value;
                string line = string.Format("fileId is {0} and fileName is {1}", key, val);

                File.AppendAllText(@"c:\serial.txt", line + Environment.NewLine);
            }
            //results of text file (all good): 
            //fileId is 1 and fileName is file1
            //fileId is 8 and fileName is file8
            //fileId is 12 and fileName is file12
            //fileId is 19 and fileName is file19
            //fileId is 25 and fileName is file25
            //fileId is 99 and fileName is file99

            _cd.Clear();

            //now run in parallel
            ProcessFilesInParallel(idList, fileList); 

            //write out pairs to text file  
            foreach (var item in _cd)
            {
                var key = item.Key;
                var val = item.Value;
                string line = string.Format("fileId is {0} and fileName is {1}", key, val);

                File.AppendAllText(@"c:\parallel.txt", line + Environment.NewLine);
            }

            //results of text file (1. some, not all, are mismatched and 2. not all elements got added): 
            //fileId is 8 and fileName is file8
            //fileId is 12 and fileName is file19
            //fileId is 19 and fileName is file12
            //fileId is 25 and fileName is file25
        }

        private static void ProcessFiles(int[] Ids, string[] files)
        {
            int fileId = 0;
            string fileName = string.Empty;

            for (var i = 0; i < Ids.Length; i++)
            {
                fileId = Ids[i];
                fileName = GetControlFileMetaDataFromDB(fileId);

                _cd.TryAdd(fileId, fileName);
            }
        }

        private static void ProcessFilesInParallel(int[] Ids, string[] files)
        {
            int fileId = 0;
            string fileName = string.Empty;

            Parallel.For(0, Ids.Length, i => 
            {
                fileId = Ids[i];

                //this is returning the wrong fileName 
                fileName = GetControlFileMetaDataFromDB(fileId);

                _cd.TryAdd(fileId, fileName);
            });
        }

        private static string GetControlFileMetaDataFromDB(int fileId)
        {
            //removed for brevity:
            //1. connect to oracle
            //2. call function, passing file id
            //3. iterate over data reader and look for the filename 

            while (reader.Read())
            {
                //strip out the filename and return it to the caller
                string raw = reader[0].ToString();
                int endPos = raw.IndexOf("txt");
                if (endPos != -1)
                {
                    endPos += 3;
                    int startPos = raw.IndexOf(":\\") - 1; 
                    string path = raw.Substring(startPos, endPos - startPos);
                    return Path.GetFileName(path);
                }
            }

            //no row contained a .txt path
            return string.Empty;
        }
    }

2 Answers:

Answer 0 (score: 7)

You have declared fileId and fileName outside the Parallel.For, which means every iteration shares the same two variables.

Because iterations can run in parallel on different threads, one iteration can reassign the variables while another iteration is still using them.

What you need to do is move the variable declarations into the loop, so that they are local to each iteration:

Parallel.For(0, Ids.Length, i => 
{
    int fileId = Ids[i];

    //each iteration now has its own fileId and fileName
    string fileName = GetControlFileMetaDataFromDB(fileId);

    _cd.TryAdd(fileId, fileName);
});
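
To see the per-iteration locals working end to end, here is a minimal, self-contained sketch. It is not the asker's real code: LookupFileName is a hypothetical stand-in for GetControlFileMetaDataFromDB (a short sleep imitates the database round trip), and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class PerIterationLocalsDemo
{
    // Hypothetical stand-in for GetControlFileMetaDataFromDB: the sleep
    // imitates a database round trip so iterations really overlap.
    static string LookupFileName(int fileId)
    {
        Thread.Sleep(10);
        return "file" + fileId;
    }

    // Builds the id -> name map. fileId and fileName are declared inside
    // the lambda, so no state is shared between iterations.
    public static ConcurrentDictionary<int, string> Load(int[] ids)
    {
        var result = new ConcurrentDictionary<int, string>();
        Parallel.For(0, ids.Length, i =>
        {
            int fileId = ids[i];
            string fileName = LookupFileName(fileId);
            result.TryAdd(fileId, fileName);
        });
        return result;
    }

    static void Main()
    {
        int[] ids = { 1, 8, 12, 19, 25, 99 };
        var map = Load(ids);

        // The dictionary enumerates in no particular order, so sort by key for display.
        foreach (var pair in map.OrderBy(p => p.Key))
            Console.WriteLine("fileId is {0} and fileName is {1}", pair.Key, pair.Value);
    }
}
```

With the locals inside the lambda, every id should end up paired with its own name and no entry should be dropped, however the iterations are scheduled.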

Answer 1 (score: 1)

The problem here is in the ProcessFilesInParallel(int[] Ids, string[] files) function. The iterations of the Parallel.For loop execute in parallel, and you declared fileId and fileName outside the scope of the loop body, so these variables are shared by all iterations, which causes a race condition.

You can fix this by moving the fileId and fileName variables inside the loop body:

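Parallel.For(0, Ids.Length, i => 
{
    int fileId = Ids[i];
    string fileName = GetControlFileMetaDataFromDB(fileId);

    _cd.TryAdd(fileId, fileName);
});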

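Also, in the title of the question, "parallel.for confusion (collection lose order)", you say the collection loses its order. As you can read here, no execution order is defined for parallel loops.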
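
If the results need to come out in the same order as the input, one option is to have each iteration write into the array slot that matches its index. This is a sketch under assumptions rather than part of the original answer: LookupFileName is a hypothetical stand-in for the database call, and the class and method names are invented for illustration.

```csharp
using System;
using System.Threading.Tasks;

static class OrderedResultsDemo
{
    // Hypothetical stand-in for the database lookup.
    static string LookupFileName(int fileId)
    {
        return "file" + fileId;
    }

    // Returns names[i] for ids[i]. Iterations may run in any order, but
    // each one writes only its own slot, so the result follows input order.
    public static string[] LoadInInputOrder(int[] ids)
    {
        var names = new string[ids.Length];
        Parallel.For(0, ids.Length, i =>
        {
            names[i] = LookupFileName(ids[i]);
        });
        return names;
    }

    static void Main()
    {
        int[] ids = { 1, 8, 12, 19, 25, 99 };
        string[] names = LoadInInputOrder(ids);

        for (int i = 0; i < ids.Length; i++)
            Console.WriteLine("fileId is {0} and fileName is {1}", ids[i], names[i]);
    }
}
```

Writing to distinct array elements needs no locking, and Parallel.For returns only after every iteration has finished, so the array is complete when it is read. Sorting the dictionary by key before writing it out would be another way to get a stable order.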