Multithreaded file reading produces the same result for every thread

Date: 2018-02-01 17:12:15

Tags: c++ multithreading file-io

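Basically, my problem is the one in the title. I am trying to write a multithreaded application that reads a file and sums its contents. This works correctly with a single thread. However, when I introduce more threads, they all produce the same output. How can I fix this?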

Code

void *sumThread(void *);
pthread_mutex_t keepOut = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t keepOutSum = PTHREAD_MUTEX_INITIALIZER;
int counter = 0, line_count = 0;
char* loc;
double total = 0;

void split(const string& s, char c, vector<string>& v)
{
    string::size_type i = 0;
    string::size_type j = s.find(c);

    while (j != string::npos)
    {
        v.push_back(s.substr(i, j - i));
        i = ++j;
        j = s.find(c, j);

        if (j == string::npos)
            v.push_back(s.substr(i, s.length()));
    }
}

int main(int argc, char* argv[])
{

    if (argc < 2)
    {

        cerr << "Usage: " << argv[0] << " filename" << endl;
        return 1;
    }

    string line;
    loc = argv[1];
    ifstream myfile(argv[1]);
    myfile.unsetf(ios_base::skipws);

    line_count = std::count(std::istream_iterator<char>(myfile),
                            std::istream_iterator<char>(),
                            '\n');

    myfile.clear();
    myfile.seekg(-1, ios::end);
    char lastChar;
    myfile.get(lastChar);
    if (lastChar != '\r' && lastChar != '\n')
        line_count++;

    myfile.setf(ios_base::skipws);
    myfile.clear();
    myfile.seekg(0, ios::beg);

    pthread_t thread_id[NTHREADS];

    for (int i = 0; i < NTHREADS; ++i)
    {
        pthread_create(&thread_id[i], NULL, sumThread, NULL);
    }

    for (int i = 0; i < NTHREADS; ++i)
    {
        pthread_join(thread_id[i], NULL);
    }

    cout << setprecision(2) << fixed << total << endl;
    return 0;
}

void *sumThread(void *)
{

    pthread_mutex_lock(&keepOut);
    int threadNo = counter;
    counter++;
    pthread_mutex_unlock(&keepOut);

    ifstream myfile(loc);
    double runningTotal = 0;
    string line;

    if (myfile.is_open())
    {
        for (int i = threadNo; i < line_count; i += NTHREADS)
        {
            vector < string > parts;

            getline(myfile, line);
            // ... and process out the 3rd element in the CSV.
            split(line, ',', parts);

            if (parts.size() != 3)
            {
                cerr << "Unable to process line " << i
                        << ", line is malformed. " << parts.size()
                        << " parts found." << endl;
                continue;
            }

            // Add this value to the account running total.
            runningTotal += atof(parts[2].c_str());
        }
        myfile.close();
    }
    else
    {
        cerr << "Unable to open file";
    }

    pthread_mutex_lock(&keepOutSum);

    cout << threadNo << ":  " << runningTotal << endl;
    total += runningTotal;
    pthread_mutex_unlock(&keepOutSum);
    pthread_exit (NULL);
}

Sample output

 2:  -46772.4
 0:  -46772.4
 1:  -46772.4
 3:  -46772.4
 -187089.72

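Each thread is supposed to read and sum numbers from the file, and the results are added together once the threads are done. However, all threads appear to return the same number, even though the threadNo variable is clearly different for each one, as the output shows.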

1 Answer:

Answer 0 (score: 1)

Your problem is here:

for (int i = threadNo; i < line_count; i += NTHREADS) {
    vector<string> parts;

    getline(myfile, line);

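getline() knows nothing about the value of i, so it simply reads consecutive lines from the file without skipping any. Each thread opens its own ifstream, which starts at the beginning of the file, so all threads end up reading the same first few lines.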
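One possible way to act on this is sketched below. It is only an illustration under stated assumptions, not the asker's actual fix. Each thread still reads every line, because a text stream cannot jump straight to line N, but it keeps only the lines whose index matches its own thread number. The helper name sumStridedLines is hypothetical, and the field splitting uses std::getline with a ',' delimiter instead of the question's split() function.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post).
// Sums the third comma-separated field of every line whose zero-based
// index satisfies index % nThreads == threadNo. Lines that belong to
// other threads are still read, then discarded.
double sumStridedLines(std::istream& in, int threadNo, int nThreads)
{
    double runningTotal = 0;
    std::string line;

    for (long index = 0; std::getline(in, line); ++index)
    {
        if (index % nThreads != threadNo)
            continue; // this line belongs to another thread

        // Assumption: simple CSV with no quoted or escaped commas.
        std::vector<std::string> parts;
        std::istringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ','))
            parts.push_back(field);

        if (parts.size() != 3)
            continue; // malformed line; the original code logs an error here

        runningTotal += std::atof(parts[2].c_str());
    }
    return runningTotal;
}
```

Inside sumThread, the for loop could then be replaced by a call such as runningTotal = sumStridedLines(myfile, threadNo, NTHREADS). Note that every thread still scans the whole file, so this corrects the result without reducing the amount of I/O per thread.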