Reading a CSV file and detecting the last field in the file

Time: 2018-09-25 01:41:01

Tags: c++ file parsing stringstream

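I'm trying to read a CSV file with three fields per line, the last of which is an integer. My program crashes in the stoi call on the last line of the file because there is no trailing newline, and I'm not sure how to detect that I've reached the last line. The first two getline calls read the first two fields; the third getline reads what should be an integer, using '\n' as its only delimiter, but that doesn't work for the last line of input. Is there a workaround?

The fields I expect are of type [int, string, int], and the middle field must be allowed to contain spaces, so I don't think using stringstream would work.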

while (! movieReader.eof() ) { // while we haven't reached end of file
    stringstream ss;
    getline(movieReader, buffer, ','); // get movie id and convert it to integer
    ss << buffer; // converting id from string to integer
    ss >> movieid;
    getline(movieReader, movieName, ','); // get movie name
    getline(movieReader, buffer, '\n');
    pubYear = stoi(buffer); // buffer will be an integer, the publish year
    auto it = analyze.getMovies().emplace(movieid, Movie(movieid, movieName, pubYear ) );
    countMovies++;
}

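1 answer:

Answer 0 (score: 0)

To read and write objects, the conventional approach is to overload the stream extraction and stream insertion operators:

Example csv: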

1, The Godfather, 1972
2, The Shawshank Redemption, 1994
3, Schindler's List, 1993
4, Raging Bull, 1980
5, Citizen Kane, 1941

Code:

#include <cctype>    // std::isspace
#include <cstdlib>   // EXIT_FAILURE
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <iterator>

void skip_to(std::istream &is, char delim) // helper function to skip the rest of a line
{
    char ch;
    while (is.get(ch) && ch != delim);  // stops at delim or when get() fails (e.g. at eof)
}

std::istream& eat_whitespace(std::istream &is) // stream manipulator that eats up whitespace
{
    int ch;  // peek() returns an int so that eof can be told apart from any valid char
    while ((ch = is.peek()) != std::char_traits<char>::eof() && std::isspace(ch))
        is.get();
    return is;
}

class Movie
{
    int movieid;
    std::string movieName;
    int pubYear;

    friend std::istream& operator>>(std::istream &is, Movie &movie)
    {
        Movie temp;  // use a temporary to not mess up movie with a half-
        char ch;     // extracted dataset if we fail to extract some field.

        if (!(is >> temp.movieid))  // try to extract the movieid
            return is;

        if (!(is >> std::skipws >> ch) || ch != ',') { // read the next non white char
            is.setstate(std::ios::failbit);            // and check it's a comma
            return is;
        }

        is >> eat_whitespace; // skip all whitespace before the movieName
        if (!std::getline(is, temp.movieName, ',')) {  // read the movieName up to the
            return is;                                 // next comma
        }

        if (!(is >> temp.pubYear))                     // extract the pubYear
            return is;

        skip_to(is, '\n');  // skip the rest of the line (or till eof())
        is.clear();

        movie = temp;  // all went well, assign the temporary
        return is;
    }

    friend std::ostream& operator<<(std::ostream &os, Movie const &movie)
    {
        os << "Nr. " << movie.movieid << ": \"" << movie.movieName << "\" (" << movie.pubYear << ')';
        return os;
    }
};

int main()
{
    char const * movies_file_name{ "foo.txt" };
    std::ifstream is{ movies_file_name };

    if (!is.is_open()) {
        std::cerr << "Couldn't open \"" << movies_file_name << "\"!\n\n";
        return EXIT_FAILURE;
    }

    std::vector<Movie> movies{ std::istream_iterator<Movie>{is},
                               std::istream_iterator<Movie>{} };

    for (auto const & m : movies)
        std::cout << m << '\n';
}

Output:

Nr. 1: "The Godfather" (1972)
Nr. 2: "The Shawshank Redemption" (1994)
Nr. 3: "Schindler's List" (1993)
Nr. 4: "Raging Bull" (1980)
Nr. 5: "Citizen Kane" (1941)