Question

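I have implemented a code in C++ and MPI that is supposed to do millions of computations and, for each CPU working on its data, save millions of numbers in about 7 files. I am using about 10,000 cores, which gives a total of about 70,000 files, with millions of lines of data to be written in parallel.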

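I used ofstream for the writing, but for some reason the MPI code breaks in the middle and the files seem to be empty. I want each processor to write its 7 files independently of all the other processors. According to my search this can be done using MPI, but although I have read about it in many resources, I can't understand how it can be used for independent writing with file names that are specified dynamically during execution. If that is the correct way, can someone please explain it in as much detail as possible? And if not, please explain your alternative suggestion in as much detail as possible.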

My current writing code, which doesn't work, looks something like this:

if (rank == 0)
{
    if (mkdir("Database", 0777) == -1) // creating a directory
    {
        // a failure (for example, the directory already exists) is ignored here
    }
    rowsCount = fillCombinations(BCombinations,       RCombinations,
                                 BList,               RList,
                                 maxCombinations,     BIndexBegin,
                                 BIndexEnd,           RIndexBegin,
                                 RIndexEnd,
                                 BCombinationsIndex,  RCombinationsIndex
                                );
}

// then broadcast all the arrays that will be used in the computations and, at the root,
// send each slave the indexes it should work on; then, at the slave:

for (int cc = BeginIndex ; cc <= EndIndex; cc++)
        {


           // begin by specifying the values that will be used 
           // and making files for each B and R in the list


            BIndex = BCombinationsIndex[cc];
            RIndex = RCombinationsIndex[cc];



            //creating files to save data in and indicating the R and B by their index 
            //specifying files names

           std::string str1;
           std::ostringstream buffer1;
           buffer1 << "Database/";
           str1 = buffer1.str();

           //specifying file names

            std::ostringstream pFileName;
            std::string ppstr2;
            std::ostringstream ppbuffer2;
            ppbuffer2 <<"P_"<<"Beta_"<<(BIndex+1)<<"_Rho_"<<(RIndex+1)<<"_sampledP"<< ".txt";
            ppstr2 = ppbuffer2.str();
            pFileName <<str1.c_str()<<ppstr2.c_str();
            std::string p_file_name = pFileName.str();

            std::ostringstream eFileName;
            std::string eestr2;
            std::ostringstream eebuffer2;
            eebuffer2 <<"E_"<<"Beta_"<<(BIndex+1)<<"_Rho_"<<(RIndex+1)<<"_sampledE"<< ".txt";
            eestr2 = eebuffer2.str();
            eFileName <<str1.c_str()<< eestr2.c_str();
            std::string e_file_name = eFileName.str();

            // and so on for the 7 files .... 


            //creating the files
            std::ofstream pFile;
            std::ofstream eFile;

            // and so on for the 7 files .... 

            //opening the files
            pFile.open(p_file_name.c_str());
            eFile.open(e_file_name.c_str());

            // and so on for the 7 files .... 
            // then I start the writing in the files and at the end ...



            pFile.close();

            eFile.close();
}
// end of the segment loop

Answer 1

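The standard C++/C libraries are not good enough for accessing that many files. Even the BG/L/P kernel will collapse if you try to access several hundred thousand files at the same time, which is quite close to your number. A large number of physical files also stresses the parallel file system with additional metadata.

Sophisticated supercomputers generally have a large number of dedicated I/O nodes, so why not use the standard MPI functions for parallel I/O? That should be enough for the number of files you'd like to save.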

You can start here: http://www.open-mpi.org/doc/v1.4/man3/MPI_File_open.3.php

Good luck!

Answer 2

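Do you need to do the I/O yourself? If not, you could try the HDF5 library, which has become quite popular among scientists using HPC. It is worth a look, and it might simplify your work. For example, you can write everything into the same file and avoid having thousands of files. (Note that your performance might also depend on the file system of your cluster.)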

Answer 3

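Create either 7 threads or 7 processes, whichever you are using, and append the thread ID or process ID to the name of the file being written. That way there should be no contention.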
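A minimal sketch of that naming scheme, reusing the Beta/Rho pattern from the question. The helper name rank_file_name and the "_rank" suffix are made up for illustration; with MPI, the rank returned by MPI_Comm_rank would play the role of the process ID.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: builds a per-process file name such as
// "Database/P_Beta_1_Rho_2_rank42.txt". Indexes are zero-based on input
// and printed one-based, as in the question's code.
std::string rank_file_name(const std::string& dir, const std::string& quantity,
                           int betaIndex, int rhoIndex, int rank) {
    assert(rank >= 0);
    std::ostringstream name;
    name << dir << '/' << quantity
         << "_Beta_" << (betaIndex + 1)
         << "_Rho_" << (rhoIndex + 1)
         << "_rank" << rank << ".txt";
    return name.str();
}
```

Because the rank is part of the name, two processes can never open the same file, even if they happen to work on the same Beta/Rho pair.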

Answer 4

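Well, the Blue Gene architecture might only have a few years left, but the problem of how to do "scalable I/O" will remain with us for some time.

First, MPI-IO is essentially a requirement at this scale, particularly the collective I/O features. Even though this paper was written for /L, the lessons are still relevant: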

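collective open lets the library set up some optimizations
collective reads and writes can be transformed into requests that line up nicely with GPFS file system block boundaries (which is important for lock management and minimizing overhead)
the selection and placement of "I/O aggregators" can be done in a way that is mindful of the machine's topology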

https://press3.mcs.anl.gov/romio/2006/02/15/romio-on-blue-gene-l/

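The selection of aggregators is pretty complex on /Q, but the idea is that these aggregators are selected to balance I/O over all the available "system call I/O forwarding" (ciod) links: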

https://press3.mcs.anl.gov/romio/2015/05/15/aggregation-selection-on-blue-gene/

Writing files independently in parallel in C++ and MPI
