Question

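I have four text files, each containing a different set of words.

noun.txt has 7 words, Article.txt has 5 words, verb.txt has 6 words, and Preposition.txt has 5 words.

In the code below, in my second for loop, a count array keeps track of how many words I have read in and from which file. So, for example, count[0] should be 5 words, but count[1] has 8 words when it should be 7. I went back and checked the text file and I did not make a mistake: it has 7 words. Is this a problem with how ifstream behaves?

I have also been told that eof() is not good practice. What is the industry best practice for reading data accurately? In other words, is there something better I can use than !infile.eof()?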

#include <cstdlib>
#include <iostream>
#include <fstream>
#include <cctype>
#include <array> // std::array
#include <string> // std::string

using namespace std;

const int MAX_WORDS = 100;

class Cwords{
    public:
        std::array<string,4> partsOfSpeech;
};

int main()
{
    Cwords elements[MAX_WORDS];

    int count[4] = {0,0,0,0};

    ifstream infile;

    string file[4] = {"Article.txt",
                      "Noun.txt",
                      "Preposition.txt",
                      "verb.txt"};

    for(int i = 0; i < 4; i++){
        infile.open(file[i]);
        if(!infile.is_open()){
            cout << "ERROR: Unable to open file!\n";
            system("PAUSE");
            exit(1);
        }

        for(int j = 0;!infile.eof();j++){
            infile >> elements[j].partsOfSpeech[i];
            count[i]++;
        }

        infile.close();
    }

    ofstream outfile;
    outfile.open("paper.txt");

    if(!outfile.is_open()){
        cout << "ERROR: Unable to open or create file.\n";
        system("PAUSE");
        exit(1);
    }



    outfile.close();
    system("PAUSE");
    return 0;
}

Answer 1

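The simple answer for reading data correctly is: always test, after reading, that the read operation succeeded. This test does not involve eof() at all (any book that teaches testing eof() before reading deserves to be burned immediately).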

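The main loop for reading the file should look like this:

for (int j = 0; infile >> elements[j].partsOfSpeech[i]; ++j){ ++count[i]; }
Incidentally, although the language is called "C++" rather than "++C", don't use post-increment unless you actually use the result of the expression. In most cases it makes no difference, but sometimes it matters, and then post-increment can be considerably slower than pre-increment.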

Answer 2

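Have you checked that there is no extra whitespace or newline at the end of the text file? Your extra last "word" may come from trailing characters that precede eof.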
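To make the effect of a trailing newline concrete, here is a minimal sketch that counts words in an in-memory string in two ways. Both function names are made up for this illustration, and a std::istringstream stands in for the ifstream from the question; the stream-state behavior being shown is the same for both.

```cpp
#include <sstream>
#include <string>

// Illustrative helper (not from the question): counts the way the
// question's loop does, testing eof() BEFORE each read.
int countWithEofLoop(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    int count = 0;
    while (!in.eof()) {
        in >> word;  // may fail, but the result is never checked
        ++count;     // so a failed read is still counted
    }
    return count;
}

// Illustrative helper: counts by testing the extraction itself.
int countWithCheckedRead(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    int count = 0;
    while (in >> word) {  // true only if a word was actually read
        ++count;
    }
    return count;
}
```

For the text "a b c" both functions return 3, because reading the last word runs into the end of the input and sets eof. For "a b c\n" the eof() version returns 4: after "c" the stream has only seen the newline, so eof() is still false and one more (failed) read gets counted. That would be consistent with one file in the question ending in a newline and another not.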

Answer 3

You may have a blank line at the end of the file, which looks "empty". My suggestion is to use code like the following:

#include <boost/algorithm/string.hpp>
#include <string>

...

    std::string line;
    int cnt = 0;
    // Test the extraction itself rather than eof(), so a failed
    // final read is never stored.
    while (infile >> line) {
        boost::algorithm::trim(line);
        if (line.size() > 0)
            words[filenr][cnt++] = line;
    }

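Note that I strongly recommend having an "outer" object indexed by list type (0 for Article.txt, 1 for Noun.txt, and so on) and an "inner" object that is a vector holding the words. Your implementation is the other way around, which is suboptimal because you have to carry empty slots in the partsOfSpeech array. Also note that in your example the hard upper limit of "100" words per file is very dangerous: it can lead to a buffer overflow! It is better to use std::vector for the actual word lists, because vectors grow automatically.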

Reading data from a file gives the wrong count. What are the best practices for reading data?
