Question

A few hours ago I asked how to read from a file with a specific format in order to implement operator >>. The file has this format:

Salad;Tomatoe 50;Fresh lettuce 100;Potatoe 60;Onion 10
Macaroni;Macaroni 250;Tomatoe 60;Oil 10
Fish and chips;fish 30;potatoe 30;Oil 40

I have the following class:

...
#include <list> //I'm using list of the STL
....
class recipe{
 private:
   list<pair<string,unsigned int>> ing; //A list of the ingredients of one recipe. String for the name of the ingredient and unsigned int for the quantity of each ingredient
 public:
  ....

 //My solution for operator >>

istream & operator >> (istream &i, recipe &other){
   string line, data, name_ing;
   string code, name;
   unsigned int plate, quantity;
   list<pair<string,unsigned int>> ings;

   getline(i,line);

   stringstream s (line);

   getline(s,data,';');
   code = data;
   getline(s,data,';');
   plate = atoi(data.c_str());
   getline(s,data,';');
   name = data;

   while(getline(s,data,' ')){
     name_ing = data;

    getline(s,data,';');
    quantity = atoi(data.c_str());

    pair<string,unsigned int> ingredient;
    ingredient.first = name_ing;
    ingredient.second = quantity;

    ings.push_back(ingredient);   
}

   recipe a_recipe(code,plate,name,0,0,0,0,0,ings);
   other = a_recipe;

   return i;
}

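So now I have another problem: I don't know how to read ingredients that consist of two words, for example "Fresh lettuce 100", because the output will be: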

 Salad;Tomatoe 50;Fresh 0;Potatoe 60;Onion 10

It does not read "lettuce" or the quantity. Any help?

Answer 1

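As said before:

To solve the problem at hand, there is a more or less standard approach. You want to read CSV data.

In your case it is a bit more difficult, because you have nested CSV data: first a ";"-separated list, and then, within each field, a space-separated list. The second one is a bit imprecise, because an ingredient can contain more than one space before the quantity, as in "Red Pepper 2".

Now, how could this be done? C++ is an object-oriented language. You can create objects consisting of data and member functions that operate on this data. We will define a class "Recipe" and overload the inserter and extractor operators, because only the class should know how this works. Once that is done, input and output become easy.

The extractor, which is the core of the question, is, as said, a bit tricky. How can it be done?

In the extractor we first read a complete line from the std::istream using std::getline. After that we have a std::string containing "data fields" separated by semicolons. The std::string needs to be split up and the "data field" contents stored. Additionally, the ingredients need to be split.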

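Splitting up a string is also called tokenizing, and the "data field" contents are also called "tokens". C++ has a standard facility for this purpose: std::sregex_token_iterator.

And because we have something that was designed for exactly this purpose, we should use it.

It is an iterator that iterates over a string, hence "sregex". The begin part defines the input range we operate on, then there is a std::regex describing what should or should not be matched in the input string. The matching strategy is selected by the last parameter.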

0 --> give me the stuff that I defined in the regex (1, 2, ... select the corresponding capture group) and
-1 --> give me what is NOT matched by the regex.

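We can use this iterator to store the tokens in a std::vector. The std::vector has a range constructor that takes 2 iterators as parameters and copies the data between the first and the second iterator into the std::vector.

The statement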

std::vector token(std::sregex_token_iterator(line.begin(), line.end(), separator, -1), {});

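defines a variable "token" as a std::vector, splits up the std::string and puts the tokens into the std::vector. Note that with CTAD the element type is deduced from the iterator, so it is std::ssub_match (which converts implicitly to std::string) rather than std::string itself. After having the data in the std::vector, we copy it to the data members of our class.

For the second split, we create 2 simple lambdas and copy the data into the ingredients list.

Very simple.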

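Next step. We want to read from a file. The file also consists of repeated data of the same kind: lines.

And, as above, we can iterate over data of the same kind, whether it comes from a file or from somewhere else. For this purpose C++ has std::istream_iterator. This is a template: as template parameter it takes the type of data that it should read, and as constructor parameter it takes a reference to an input stream. It does not matter whether the input stream is std::cin, a std::ifstream or a std::istringstream. The behaviour is identical for all kinds of streams.

And since we do not have files here on SO, I use (in the example below) a std::istringstream to hold the input CSV file. But of course you can open a file by defining std::ifstream csvFile(filename). No problem.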

We can now read the complete CSV file, split it into tokens and get all the data, simply by defining a new variable and using the range constructor again.

std::vector cookBook(std::istream_iterator<Recipe>(sourceFile), {});

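This very simple one-liner reads the complete CSV file and does all the expected work.

Please note: I am using C++17 and can define the std::vector without template arguments. The compiler can deduce the argument from the given function parameters. This feature is called CTAD ("class template argument deduction").

Additionally, you can see that I do not use the "end()" iterator explicitly.

This iterator is constructed from the empty brace-enclosed initializer list with the correct type: the std::vector constructor requires the second argument to have the same type as the first one, so it is deduced to be that type.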

I hope I could answer your basic question. See the complete C++ example below:

#include <algorithm>
#include <iostream>
#include <regex>
#include <string>
#include <list>
#include <vector>
#include <iterator>
#include <sstream>

// Data types for ingredients and quantity
using Ingredients = std::pair<std::string, int>;

// Some helper functions
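// Remove leading and trailing spaces (the regex has no capture group, so replace with an empty string)
auto trim = [](const std::string & s) { return std::regex_replace(s, std::regex("^ +| +$"), ""); };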
auto split = [](const std::string & s) {size_t pos{ s.rfind(' ') }; return Ingredients(s.substr(0, pos), std::stoi(s.substr(pos))); };

std::regex separator{ ";" };

// Our recipe class
struct Recipe {
    // data
    std::string title{};
    std::list<Ingredients> ingredients{};

    // Overload extractor
    friend std::istream& operator >> (std::istream& is, Recipe& r) {

        // We will read one line into this temporary
        std::string line{};
        if (std::getline(is, line)) {
            // Tokenize the base string
            std::vector token(std::sregex_token_iterator(line.begin(), line.end(), separator, -1), {});
            // get the recipe title
            r.title = token[0];
            // And, get the ingredients
            r.ingredients.clear();
            std::transform(std::next(token.begin()), token.end(), std::back_inserter(r.ingredients), 
                [](const std::string& s) { return split(trim(s)); });
        }
        return is;
    }

    // Overload inserter
    friend std::ostream& operator << (std::ostream& os, const Recipe& r) {
        // Print one recipe
        os << "---- Recipe: " << r.title << "\n-- Ingredients:\n\n";
        for (const auto& [ingredient, quantity] : r.ingredients) 
            os << ingredient << " --> " << quantity << "\n";
        return os;
    }
};

// Source file with CSV data. I added "Red Pepper 2" to Salad
std::istringstream sourceFile{ R"(Salad;Tomatoe 50;Lettuce 100;Potatoe 60;Red Pepper 2;Onion 10
Macaroni;Macaroni 250;Tomatoe 60;Oil 10
Fish and chips;fish 30;potatoe 30;Oil 40)" };

int main() {
    // Read all data from the file with the following one-liner
    std::vector cookBook(std::istream_iterator<Recipe>(sourceFile), {});

    // Show some debug output
    std::copy(cookBook.begin(), cookBook.end(), std::ostream_iterator<Recipe>(std::cout, "\n"));
    return 0;
}

Again: what a pity that nobody will read this.

Answer 2

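I suggest that you make a type out of the ingredient and amount parts instead of using std::pair<std::string, unsigned>. With that you can add streaming operators for that type too (without risking that they get used by other types that happen to be std::pair<std::string, unsigned> but that you do not want to support). It breaks the problem down somewhat and makes it simpler to implement and understand.

That said, I suggest that you use something other than a space as the separator between the ingredient name and the amount, since the space complicates parsing (as shown in the code).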

Here's an example with comments:

#include <cstdlib>
#include <iostream>
#include <list>
#include <sstream>
#include <string>
#include <tuple>

// a simple ingredient type
struct ingredient {
    std::string name{};
    unsigned amount{};
};

// read an ingredient "<name> <amount>"
std::istream& operator>>(std::istream& is, ingredient& i) {
    std::string entry;
    if(std::getline(is, entry, ';')) { // read until ; or EOL

        // find the last space in "entry"
        if(size_t pos = entry.rfind(' '); pos != std::string::npos) {

            // extract the trailing amount
            if(unsigned am = static_cast<unsigned>(
                   // Create a substring from the last space+1 and convert it to an
                   // unsigned (long). The static_cast<unsigned> silences a warning about
                   // the possibility to get the wrong value if it happens to be larger
                   // than an unsigned can hold.
                   std::strtoul(entry.substr(pos + 1).c_str(), nullptr, 10));
               // and check that we extracted something else than zero
               am != 0)
            {        // extracted the amount successfully
                i.name = entry.substr(0, pos); // put the name part in i.name
                i.amount = am;                 // and the amount part in i.amount
            } else { // extracting the amount resulted in 0
                // set failbit state on is
                is.setstate(std::ios::failbit);
            }
        } else { // no space found, set failbit
            is.setstate(std::ios::failbit);
        }
    }
    return is;
}

// output an ingredient
std::ostream& operator<<(std::ostream& os, const ingredient& i) {
    return os << i.name << " " << i.amount;
}

class recipe {
public:
    std::string const& name() const { return rname; }

    // convenience iterators to iterate over ingredients, const
    auto begin() const { return ing.cbegin(); }
    auto end() const { return ing.cend(); }

    // non-const if you'd like to be able to change an ingredient property while iterating
    auto begin() { return ing.begin(); }
    auto end() { return ing.end(); }

private:
    std::list<ingredient> ing{};     // the new type in use
    std::string rname{};             // recipe name

    friend std::istream& operator>>(std::istream&, recipe&);
};

std::istream& operator>>(std::istream& i, recipe& other) {
    std::string line;
    if(std::getline(i, line)) {
        std::istringstream ss(line);
        if(std::getline(ss, other.rname, ';')) {
            // only read the recipe's name here and delegate reading each ingredient
            // to a temporary object of your new ingredient type
            other.ing.clear();             // remove any prior ingredients from other
            ingredient tmp;
            while(ss >> tmp) {             // extract as normal
                other.ing.push_back(tmp);  // and put in ing if successful
            }
        }
    }
    return i;
}

// output one recipe in the same format as it can be read
std::ostream& operator<<(std::ostream& os, const recipe& other) {
    os << other.name();
    for(auto& i : other) {
        os << ';' << i;
    }
    return os << '\n';
}

int main() {
    std::istringstream is(
        "Salad;Tomatoe 50;Fresh lettuce 100;Potatoe 60;Onion 10\n"
        "Macaroni;Macaroni 250;Tomatoe 60;Oil 10\n"
        "Fish and chips;fish 30;potatoe 30;Oil 40\n");
    recipe r;
    while(is >> r) {
        std::cout << r;
    }
}

Reading word by word from a file using operator >>
