How to scatter an array of strings in MPI C++

Time: 2017-12-12 18:15:31

Tags: c++ mpi

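What I want to do is run a basic MapReduce operation on some strings. I want to:

  1. distribute an (equal) share of a list of strings to all of my processes,
  2. in each process: map the received strings to objects of a custom class (e.g. WordWithFrequency),
  3. gather the objects and send them to the processes again for further processing.

This should be a simple task, but I can't find the right way to do it. Here is my broken code: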

    #include <iostream>
    #include <fstream>
    #include <mpi.h>
    #include <vector>
    
    ...
    
    int main(int argc, char *argv[]) {
        // Initialize the MPI environment
        MPI_Init(&argc, &argv);
    
        // Find out the process rank and the world size
        int world_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
        int world_size;
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    
        vector<string> words = { "a", "bc", "d" };
        const int wordsLength = words.size();
        const int wordsPerProcess = wordsLength / world_size;
    
        string *subWords = new string[wordsPerProcess];
        MPI_Scatter(&words, wordsPerProcess, MPI_CHAR, subWords, wordsPerProcess, ???customDataType???, 0, MPI_COMM_WORLD);
    
        printf("Process %d got words:\n", world_rank);
        for (int i = 0; i < wordsPerProcess; ++i) {
            cout << subWords[i] << endl;
        }
    
        ...
    

The output is a few strange characters that change from run to run:

    Process 0 got words:
    �R
    
    
    Process 1 got words:
    

1 answer:

Answer 0: (score: 0)

With Boost.MPI this is a very simple task:

    #include <boost/mpi.hpp>
    ...

    int main(int argc, char *argv[]) {
        // Initialize the MPI environment.
        mpi::environment env(argc, argv);
        mpi::communicator world;

        vector<string> words = { "foo", "bar", "baz", "..." };
        const int wordCount = words.size();
        const int wordsPerProcess = wordCount / world.size();
        vector<vector<string> > wordsByProcess(world.size(), vector<string>());
        for (int j = 0; j < world.size(); ++j) {
            for (int k = 0, wordIndex = j * wordsPerProcess + k;
                 k < wordsPerProcess && wordIndex < wordCount; ++k, ++wordIndex) {
                wordsByProcess[j].push_back(words[wordIndex]);
            }
        }

        vector<string> subWords;
        mpi::scatter(world, wordsByProcess, subWords, 0);
        // subWords is equal to wordsByProcess[world.rank()] here in every process.

`scatter` takes a vector of elements indexed by the rank of the process to which the corresponding element will be sent. For details, see: http://www.boost.org/doc/libs/1_41_0/doc/html/boost/mpi/scatter.html
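One thing to watch for when building that rank-indexed vector: with plain integer division (`wordCount / world.size()`), any leftover words are never assigned to a process when the count is not evenly divisible, e.g. 4 words on 3 processes. Below is a minimal sketch of one way to build the buckets so that nothing is dropped. The helper name `partitionByRank` is hypothetical (it is not part of Boost.MPI), and handing the extra words to the lowest ranks is just one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost.MPI): split `words` into one bucket
// per rank. The first (words.size() % worldSize) ranks receive one extra word,
// so no word is dropped when the count is not divisible by the rank count.
std::vector<std::vector<std::string>> partitionByRank(
        const std::vector<std::string>& words, int worldSize) {
    assert(worldSize > 0);
    const std::size_t ranks = static_cast<std::size_t>(worldSize);
    const std::size_t base = words.size() / ranks;   // words every rank gets
    const std::size_t extra = words.size() % ranks;  // leftover words

    std::vector<std::vector<std::string>> buckets(ranks);
    std::size_t next = 0;  // index of the next word to hand out
    for (std::size_t rank = 0; rank < ranks; ++rank) {
        const std::size_t count = base + (rank < extra ? 1 : 0);
        buckets[rank].assign(words.begin() + next,
                             words.begin() + next + count);
        next += count;
    }
    return buckets;
}
```

On the root process, the result of `partitionByRank(words, world.size())` could then be passed as the input vector, as in `mpi::scatter(world, buckets, subWords, 0)`. Ranks that end up with an empty bucket simply receive an empty `subWords`.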