How to parse a string that represents a tree-like data structure

Date: 2018-10-18 09:00:01

Tags: c++ parsing emv

I have a string in the following format:

  

0002010212431438742295520465365303566540810000.005802NG5917Arinfbnuattest4086005Lagos61052340163049091

I need to parse each token contained in the text into a struct with three members, for example:

struct token {
    string id;
    int length;
    string value;
};

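Each token is identified by its first two characters (00 to 99), which form its ID. The ID is followed by a two-digit number giving the length of the value, and the value itself comes right after that length.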
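The flat layout described above (two-character ID, two-digit decimal length, then the value) can be sketched as follows. This is only a minimal illustration that ignores nested templates; the names `readTwoDigits` and `parseFlat` are hypothetical and not part of the original code, and the choice to throw `std::invalid_argument` on malformed input is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

struct token {
    std::string id;
    int length;
    std::string value;
};

// Hypothetical helper: reads two decimal digits at `pos`,
// throwing if they are missing or are not digits.
int readTwoDigits(const std::string& data, std::size_t pos) {
    if (pos + 2 > data.size() ||
        !std::isdigit(static_cast<unsigned char>(data[pos])) ||
        !std::isdigit(static_cast<unsigned char>(data[pos + 1]))) {
        throw std::invalid_argument("expected two digits at offset " +
                                    std::to_string(pos));
    }
    return (data[pos] - '0') * 10 + (data[pos + 1] - '0');
}

// Hypothetical flat parser: splits `data` into ID / length / value records.
std::vector<token> parseFlat(const std::string& data) {
    std::vector<token> out;
    std::size_t pos = 0;
    while (pos < data.size()) {
        readTwoDigits(data, pos);                 // validates the ID digits
        token t;
        t.id = data.substr(pos, 2);
        t.length = readTwoDigits(data, pos + 2);  // decimal length, 00 to 99
        pos += 4;
        if (pos + t.length > data.size()) {
            throw std::invalid_argument("value runs past the end of the data");
        }
        t.value = data.substr(pos, t.length);
        pos += t.length;
        out.push_back(t);
    }
    return out;
}
```

Called on the string from the question, `parseFlat` yields ten tokens, the second of which has ID 02, length 12 and value 431438742295.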

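The problem is that some IDs (tokens) also represent a collection of tokens, and within each collection the IDs start again from 00... I tried to solve it this way...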

tlv* Decoder::parsetlv(std::string data)
{
    tlv* root = new tlv();

    tlv* tlv_list = root;
    tlv* temp = nullptr;


    for (size_t index = 0; index < data.length(); temp = tlv_list, tlv_list = tlv_list->next) {

        if (!tlv_list) {
            temp->next = new tlv();
            tlv_list = temp->next;
        }


        tlv_list->Id = data.substr(index, 2);
        auto tempId = tlv_list->Id;
        index = index + 2;

        tlv_list->length = data.substr(index, 2);
        index = index + 2;

        int length = atoi(tlv_list->length.c_str());
        tlv_list->value = data.substr(index, length);
        if (any_of(_parentTagsIdentifiers, 72, tlv_list->Id)) {
            //place of horror
            tlv_list->child = tlv_list;

        }
        index = index + length;

    }

    return root;
}

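The problem I found with this implementation is that child IDs get confused with parent IDs because they use the same ID values. The difference is that a parent ID sits directly under the so-called root, whereas a child ID sits under (that is, after) another ID, which in this case is called a template ID.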

My implementation uses a kind of linked list, but ideas based on any C++ container are welcome.
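One possible container-based approach is to give every node a `std::vector` of children and to recurse only into the value of a template, so that an ID such as 02 inside a template is never compared against the root-level list. The sketch below makes several assumptions: the names `Node` and `parseLevel` are hypothetical, the set of template IDs is supplied by the caller (for example 62 and 64, which the question's `doToString` labels as templates), only one level of nesting is handled, and C++17 is assumed for a `std::vector` of an incomplete type.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct Node {
    std::string id;
    std::string value;           // raw value; for a template, the encoded children
    std::vector<Node> children;  // empty unless this node is a template
};

// Parses `data` as a sequence of ID(2) + length(2, decimal) + value records.
// `templateIds` lists the IDs that act as containers at THIS level only.
std::vector<Node> parseLevel(const std::string& data,
                             const std::set<std::string>& templateIds) {
    std::vector<Node> nodes;
    std::size_t pos = 0;
    while (pos < data.size()) {
        if (pos + 4 > data.size()) {
            throw std::invalid_argument("truncated ID/length header");
        }
        for (std::size_t i = pos; i < pos + 4; ++i) {
            if (!std::isdigit(static_cast<unsigned char>(data[i]))) {
                throw std::invalid_argument("non-digit in ID or length");
            }
        }
        Node n;
        n.id = data.substr(pos, 2);
        std::size_t len = (data[pos + 2] - '0') * 10 + (data[pos + 3] - '0');
        pos += 4;
        if (pos + len > data.size()) {
            throw std::invalid_argument("value runs past the end of the data");
        }
        n.value = data.substr(pos, len);
        pos += len;
        if (templateIds.count(n.id) != 0) {
            // Recurse into the value only. An empty set is passed because
            // this sketch assumes children are never templates themselves.
            n.children = parseLevel(n.value, {});
        }
        nodes.push_back(std::move(n));
    }
    return nodes;
}
```

Because the children are parsed from the template's own value, a child with ID 02 ends up in `children` of node 62 while the root-level 02 stays a plain leaf. For instance, `parseLevel("0002010204431462130103abc0202xy63049091", {"62", "64"})` returns four root nodes, and the third one (ID 62) holds two children with IDs 01 and 02.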

Edit

Here is the code that uses the decoder:

const char* doToString(const char * dataId)
{
if (strncmp("00", dataId, 2) == 0) {
    return "Payload Format";
}
else if (strncmp("01", dataId, 2) == 0) {
    return "Point of Initiation Method";
}
else if (strncmp("02", dataId, 2) == 0) {
    return "Visa Card id";
}
else if (strncmp("52", dataId, 2) == 0) {
    return "Merchant Category code";
}
else if (strncmp("53", dataId, 2) == 0) {
    return "Transaction Currency Code";
}
else if (strncmp("54", dataId, 2) == 0) {
    return "Transaction Amount";
}
else if (strncmp("55", dataId, 2) == 0) {
    return "Tip or Convenience Indicator";
}
else if (strncmp("56", dataId, 2) == 0) {
    return "Value of Convenience Fee Fixed";
}
else if (strncmp("57", dataId, 2) == 0) {
    return "Value of Convenience Fee Percentage";
}
else if (strncmp("58", dataId, 2) == 0) {
    return "Country Code";
}
else if (strncmp("59", dataId, 2) == 0) {
    return "Merchant Name";
}
else if (strncmp("60", dataId, 2) == 0) {
    return "Merchant City";
}
else if (strncmp("61", dataId, 2) == 0) {
    return "Merchant Postal Code";
}
else if (strncmp("62", dataId, 2) == 0) {
    return "Additional Data Field Template";
}
else if (strncmp("63", dataId, 2) == 0) {
    return "Cyclic Redundancy Check";
}
else if (strncmp("64", dataId, 2) == 0) {
    return "Merchant Info Lang Template";
}
// Fall back for unrecognized IDs so that every path returns a value.
return "Unknown";
}

void realTostring(string data) {
Qr::Decoder dec;
const Qr::tlv* head = dec.parsetlv(data);
const Qr::tlv* qr = head;
string name;
while (qr) {
    if (!qr->child) {
        std::cout << doToString(qr->Id.c_str()) << " " << qr->length << " " 
  << qr->value << std::endl;
    }
    else if(qr->child) {
        std::cout << qr->Id;
        std::cout << "Additional Child" <<" "<< qr->child->Id << " "
   << qr->child->length <<" "<<qr->child->value << std::endl;
    }
    qr = qr->next;
}
deleteTlv(head);
}

Here is the output:

Payload Format 02 01
02Additional Child 02 12 431438742295
Merchant Category code 04 6536
Transaction Currency Code 03 566
Transaction Amount 08 10000.00
Country Code 02 NG
Merchant Name 17 Arinfbnuattest408
Merchant City 05 Lagos
Merchant Postal Code 05 23401
Cyclic Redundancy Check 04 9091
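
As the second line of the output shows, the token with ID 02 is being treated as a child of template ID 62 because the two share the same ID. The second line should instead read Visa Card id.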
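One way to keep root-level and child labels apart when printing is to key the name lookup on the enclosing template as well as on the ID. The sketch below is only an illustration: `describe` is a hypothetical function, the root-level labels are taken from `doToString` above, and the two child labels under 62 are illustrative values that should be checked against the specification.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical lookup: returns a label for `id` given the enclosing template.
// `parentId` is empty for root-level tokens.
std::string describe(const std::string& parentId, const std::string& id) {
    static const std::map<std::pair<std::string, std::string>, std::string> names = {
        {{"", "00"}, "Payload Format"},
        {{"", "02"}, "Visa Card id"},
        {{"", "62"}, "Additional Data Field Template"},
        // Illustrative child labels (assumption; verify against the spec).
        {{"62", "01"}, "Bill Number"},
        {{"62", "02"}, "Mobile Number"},
    };
    auto it = names.find(std::make_pair(parentId, id));
    if (it != names.end()) {
        return it->second;
    }
    return "Unknown (" + id + ")";
}
```

With this table, `describe("", "02")` and `describe("62", "02")` resolve to different labels even though the ID is the same, which is the distinction the output above is missing.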

1 Answer:

Answer 0 (score: 0)

Here is sample code that breaks the data into TLV format.

#include <iostream>
#include <vector>
#include <string>
#include <cstdio>
using namespace std;

class TLV
{
    public:

        string tag;
        unsigned int length;
        string value;

        TLV(string data)
        {
            value = "";
            //If there is enough data
            if(data.length() >= 4)
            {
                tag = data.substr(0,2);
                // The length field is two hex characters, so take exactly two.
                sscanf(data.substr(2,2).c_str(),"%2x",&length);

                //If there is enough data
                if(data.length() >= 4 + length*2) value = data.substr(4,length*2);
            }
        }

        static void parseTLV(string data, vector<TLV*> &res)
        {
            while(data.length() >= 4)
            {
                TLV *t = new TLV(data);
                // Free the rejected token before stopping so it does not leak.
                if(t->value == "") { delete t; break; }
                res.push_back(t);
                data = data.substr(4+(t->length+t->length));
            }

            if(data.length() != 0)
            {
                //Whole data is not in TLV format. Can throw some error
                cout<<"ERROR [1] :: ["<<data<<"]\n";
            }
        }
};


int main()
{
    string data = "0007AAAAAAAAAAAAAA010FAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA0220AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
    vector<TLV*> res;
    TLV::parseTLV(data, res);
    for(TLV *t:res)
    {
        printf("%s | %02X | %s |\n",t->tag.c_str(),t->length,t->value.c_str());
    }
    // Release the tokens allocated by parseTLV.
    for(TLV *t:res) delete t;
    return 0;
}