C++ - Parsing a .txt file with multiple delimiters and displaying the strings

Time: 2017-12-01 20:25:52

Tags: c++ string parsing variables output

Hello everyone! I'm new to C++ and, alas, I'm making silly mistakes. Here is a snippet of the .txt file's contents:

<tag attr1="value1" attr2="value2" ... >

What I want to do is parse the .txt file and produce the following output:

Tag: tag
name: attr1
value: value1
name: attr2
value: value2

What I have so far does not work (my problem is with the delimiters):

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <fstream>

using namespace std;

struct tagline {
    string tag;
    string attributeN;
    string attributeV;
};

int main() {
    vector<tagline> information;
    string line;
    tagline t;

    ifstream readFile("file.txt");
    while (getline(readFile, line)) {
        stringstream in(line);
        getline(in, t.tag);
        getline(in, t.attributeN, '=');
        getline(in, t.attributeV, '"');
        information.push_back(t);
    }

    vector<tagline>::iterator it = information.begin();

    for (; it != information.end(); it++) {
        cout << "Tag: " << (*it).tag << " \n"
             << "name: " << (*it).attributeN << " \n"
             << "value: " << (*it).attributeV << " \n";
    }
    return 0;
}

All I get is the snippet printed back exactly as it is formatted in the .txt file:

<tag attr1="value1" attr2="value2" ... >

I would be glad if someone could help. Thanks!

3 Answers:

Answer 0: (score: 3)

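This would be better handled with an HTML/XML parser (depending on what the file actually contains).

That said, you are not parsing the lines correctly.

Your first call, getline(in,t.tag);, does not specify a delimiter, so it reads the whole line rather than just the first word. You would have to use getline(in, t.tag, ' '); instead.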

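Also, your lines can contain multiple attributes, but you read and store only the first one and ignore the rest. You need a loop to read all of them, and a std::vector to store them in.

Try something more like this: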

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <fstream>

using namespace std;

struct tagattribute {
    string name;
    string value;
};

struct tagline {
    string tag;
    vector<tagattribute> attributes;
};

int main() {
    vector<tagline> information;
    string line;

    ifstream readFile("file.txt");
    while (getline(readFile, line)) {
        istringstream in(line);

        tagline t;
        tagattribute attr;

        in >> ws;

        char ch = in.get();
        if (ch != '<')
            continue;

        if (!(in >> t.tag))
            continue;

        do
        {
            in >> ws;

            ch = in.peek();
            if (ch == '>')
                break;

            if (getline(in, attr.name, '=') &&
                in.ignore() &&
                getline(in, attr.value, '"'))
            {
                t.attributes.push_back(attr);
            }
            else
                break;
        }
        while (true);

        information.push_back(t);
    }

    vector<tagline>::iterator it = information.begin();
    for(; it != information.end(); ++it) {
        cout << "Tag: " << it->tag << "\n";

        vector<tagattribute>::iterator it2 = it->attributes.begin();
        for(; it2 != it->attributes.end(); ++it2) {
            cout << "name: " << it2->name << "\n"
            << "value: " << it2->value << "\n";
        }

        cout << "\n";
    }

    return 0;
}


Alternatively, consider writing some custom operator>> overloads to help with the parsing, for example:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <fstream>

using namespace std;

struct tagattribute {
    string name;
    string value;
};

istream& operator>>(istream &in, tagattribute &attr)
{
    getline(in, attr.name, '=');
    in.ignore();
    getline(in, attr.value, '"');
    return in;
}

struct tagline {
    string tag;
    vector<tagattribute> attributes;
};

istream& operator>>(istream &in, tagline &t)
{
    tagattribute attr;

    in >> ws;

    char ch = in.get();
    if (ch != '<')
    {
        in.setstate(ios_base::failbit);
        return in;
    }

    if (!(in >> t.tag))
        return in;

    do
    {
        in >> ws;

        ch = in.peek();
        if (ch == '>')
        {
            in.ignore();
            break;
        }

        if (!(in >> attr))
            break;

        t.attributes.push_back(attr);
    }
    while (true);

    return in;
}

int main() {
    vector<tagline> information;
    string line;

    ifstream readFile("file.txt");
    while (getline(readFile, line)) {
        istringstream in(line);
        tagline t;     

        if (in >> t)
            information.push_back(t);
    }

    vector<tagline>::iterator it = information.begin();
    for(; it != information.end(); ++it) {
        cout << "Tag: " << it->tag << "\n";

        vector<tagattribute>::iterator it2 = it->attributes.begin();
        for(; it2 != it->attributes.end(); ++it2) {
            cout << "name: " << it2->name << "\n"
            << "value: " << it2->value << "\n";
        }

        cout << "\n";
    }

    return 0;
}


Answer 1: (score: 1)

Well, I would try something like this, using this wonderful answer:

struct xml_skipper : std::ctype<char> {
    xml_skipper() : ctype(make_table()) { }
private:
    static mask* make_table() {
        const mask* classic = classic_table();
        static std::vector<mask> v(classic, classic + table_size);
        v[','] |= space;
        v['"'] |= space;
        v['='] |= space;
        v['<'] |= space;
        v['>'] |= space;
        return &v[0];
    }
};

Then all you have to do is keep reading:

ifstream readFile("file.txt");
while(getline(readFile,line)){
    istringstream in(line);
    in.imbue(std::locale(in.getloc(), new xml_skipper));
    in >> t.tag >> t.attributeN >> t.attributeV;
    information.push_back(t);
}
//...

Note that this breaks if a value or an attribute name contains whitespace.

If you want something more serious, you need to write a lexer, a syntax tree builder, and a semantic tree builder.

Full code:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <fstream>

using namespace std;

struct tagline{
    string tag;
    string attributeN;
    string attributeV;
};

struct xml_skipper : std::ctype<char> {
    xml_skipper() : ctype(make_table()) { }
private:
    static mask* make_table() {
        const mask* classic = classic_table();
        static std::vector<mask> v(classic, classic + table_size);
        v[','] |= space;
        v['"'] |= space;
        v['='] |= space;
        v['<'] |= space;
        v['>'] |= space;
        return &v[0];
    }
};

int main(){
    vector<tagline> information;
    string line;
    tagline t;
    std::istringstream readFile{"<tag attr1=\"value1\" attr2=\"value2\" ... >"};
    while(getline(readFile,line)){
        istringstream in(line);
        in.imbue(std::locale(in.getloc(), new xml_skipper));
        in >> t.tag >> t.attributeN >> t.attributeV;
        information.push_back(t);
    }


    vector<tagline>::iterator it = information.begin();

    for(; it != information.end(); it++){
        cout << "Tag: " << (*it).tag << " \n"
             << "name: " << (*it).attributeN << " \n"
             << "value: " << (*it).attributeV << " \n";
    }
}


Answer 2: (score: 0)

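If your input may vary within the bounds of the XML specification, an XML parser is probably a better choice than parsing the string "manually". Just to illustrate what that could look like, see the following code. It is based on tinyxml2, which only requires adding a single .cpp/.h file pair to your project. You can of course use any other XML library; this is for demonstration purposes only: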

#include <iostream>
#include "tinyxml2.h"
using namespace std;       // needed for the unqualified cout and endl below
using namespace tinyxml2;

int main()
{
    const char* test = "<tag attr1='value1' attr2 = \"value2\"/>";
    XMLDocument doc;
    doc.Parse(test);
    XMLElement *root = doc.RootElement();
    if (root) {
        cout << "Tag: " << root->Name() << endl;
        const XMLAttribute *attrib = root->FirstAttribute();
        while (attrib) {
            cout << "name: " << attrib->Name() << endl;
            cout << "value: " << attrib->Value() << endl;
            attrib = attrib->Next();
        }
    }
}