Parsing input into tokens using different delimiters in C++

Time: 2014-09-21 06:25:16

Tags: c++ parsing delimiter c-strings

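So I am trying to parse input data and break it into string tokens. The problem is that I sometimes need different delimiters on the same line as well as on different lines, all in the same pass. Here is an example of what I need to parse: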

Input:

<15>
algorithm [binary tree] analysis heap
<1>
[binary search tree] analysis
complexity algorithm [2-3 tree]
<5>
tree [b+ tree] [Binary Tree]
<8>
graph clique Tree
<5>
tree [full binary tree]
[complete binary tree]
<-1>

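So from the input above, I need to parse the numbers inside the angle brackets, using <> as the delimiters. I already have that done and working in my code. Then I need to parse all the data on the other lines, using a space as the delimiter for single words and "[]" as the delimiters for bracketed phrases, which need to keep their spaces.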
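The two rules described above (split on spaces, but keep a bracketed phrase together) can be sketched without strtok by scanning the line one character at a time. This is only a minimal sketch under stated assumptions: the helper name tokenizeLine is hypothetical and not part of the original code, brackets are assumed not to nest, and an unterminated '[' is assumed to run to the end of the line.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original code): splits one line into
// tokens. Text inside [...] becomes a single token with its spaces kept;
// everything else is split on spaces and tabs.
std::vector<std::string> tokenizeLine(const std::string& line)
{
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < line.size())
    {
        if (line[i] == ' ' || line[i] == '\t')
        {
            ++i; // skip whitespace between tokens
        }
        else if (line[i] == '[')
        {
            // bracketed phrase: take everything up to the next ']'
            std::size_t close = line.find(']', i + 1);
            if (close == std::string::npos)
                close = line.size(); // assumption: unterminated '[' runs to end of line
            tokens.push_back(line.substr(i + 1, close - i - 1));
            i = (close < line.size()) ? close + 1 : close;
        }
        else
        {
            // plain word: runs until whitespace or the next '['
            std::size_t end = line.find_first_of(" \t[", i);
            if (end == std::string::npos)
                end = line.size();
            tokens.push_back(line.substr(i, end - i));
            i = end;
        }
    }
    return tokens;
}
```

Calling tokenizeLine("algorithm [binary tree] analysis heap") yields the four tokens algorithm, binary tree, analysis and heap. Scanning by hand avoids the main limitation of strtok here, which is that one call can only apply a single delimiter set to the whole remaining string.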

So this is what I have right now:

// create a file-reading object
ifstream fin;
fin.open("input.txt"); // open a file
if (!fin.good())
return 1; // exit if file not found

// read each line of the file
while (!fin.eof())
{
// read an entire line into memory
char buf[maxChars];
fin.getline(buf, maxChars);
cout << "The line I'm about to read is: "<<  buf << endl;

int n = 0; // a for-loop index

// array to store memory addresses of the tokens in buf
char* token[maxTokens] = {}; // initialize to 0

// parse the line into <> delimited tokens first
token[0] = strtok(buf, "<>"); // number tokens with <> delimiters
if (token[0])
{
  token[1] = strtok(token[0], "[]"); // break tokens by brackets
  for (n = 1; n < 20; n++)
  {
    token[n] = strtok(NULL, " "); // break tokens by spaces
    if (!token[n]) break; // no more tokens
  }
}
}

// process (print) the tokens
for (int i = 0; i < n; i++) // n = #of tokens
  cout << "Token[" << i << "] = " << token[i] << endl;
cout << endl;
}

I get this kind of output:

The line I'm about to read is: <15>
Token[0] = 15      
The line I'm about to read is: algorithm [binary tree] analysis heap
Token[0] = algorithm 
Token[1] = binary tree
Token[2] =  analysis heap

The line I'm about to read is: <1>
Token[0] = 1

The line I'm about to read is: [binary search tree] analysis
Token[0] = [binary search tree
Token[1] =  analysis

The line I'm about to read is: complexity algorithm [2-3 tree]
Token[0] = complexity algorithm 
Token[1] = 2-3 tree

The line I'm about to read is: <5>
Token[0] = 5

The line I'm about to read is: tree [b+ tree] [Binary Tree]
Token[0] = tree 
Token[1] = b+ tree
Token[2] =  
Token[3] = Binary Tree

The line I'm about to read is: <8>
Token[0] = 8

The line I'm about to read is: graph clique Tree
Token[0] = graph clique Tree

The line I'm about to read is: <5>
Token[0] = 5

The line I'm about to read is: tree [full binary tree]
Token[0] = tree 
Token[1] = full binary tree

The line I'm about to read is: [complete binary tree]
Token[0] = [complete binary tree

The line I'm about to read is: <-1>
Token[0] = -1

Basically I need it to look like:

Token[0] = 15
Token[0] = algorithm
Token[1] = binary tree
Token[2] = analysis
Token[3] = heap
and so on...
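For the number lines, one possible way to get from <15> to 15 (and from <-1> to -1) is to check the first and last characters and convert what lies between them. This is a sketch only: the helper name parseHeader is hypothetical, it assumes C++11, and it assumes a number line contains nothing except the brackets and an integer.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not from the original code): returns true and
// stores the number in `value` when the line has the form <number>,
// e.g. "<15>" or "<-1>". Any other line returns false.
bool parseHeader(const std::string& line, long& value)
{
    if (line.size() < 3 || line.front() != '<' || line.back() != '>')
        return false;

    // text between the angle brackets
    const std::string inner = line.substr(1, line.size() - 2);

    char* end = nullptr;
    const long parsed = std::strtol(inner.c_str(), &end, 10);
    if (end == inner.c_str() || *end != '\0')
        return false; // no digits, or extra characters after the integer

    value = parsed;
    return true;
}
```

In a reading loop written as while (std::getline(fin, line)), each line could first be passed to parseHeader, and only lines for which it returns false would then be split into word and bracket tokens. Using the getline call itself as the loop condition also avoids processing an extra empty line at the end of the file, which can happen with while (!fin.eof()).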

0 answers:

No answers