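I have strings of the following forms:
a = x + y
or abc = xyz + 5
or 6 + 5
or f(p)
What I need is to tokenize the string so that I can read every operator and operand.
So for a = x + y the returned tokens should be a, =, x, +, y, and for abc=xyz+5 they should be abc, =, xyz, +, 5. Note that there may or may not be spaces between the operators and operands.
This is what I tried: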
void tokenize(std::vector<std::string>& tokens, const char* input, const char* delimiters) {
    const char* s = input;
    const char* e = s;
    while (*e != 0) {
        e = s;
        while (*e != 0 && strchr(delimiters, *e) == 0) {
            ++e;
        }
        if ( *e != ' ' && strchr(delimiters, *e) != 0 ){
            std::string op = "";
            op += *e;
            tokens.push_back(op);
        }
        if (e - s > 0) {
            tokens.push_back(std::string(s,e - s));
        }
        s = e + 1;
    }
}
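Answer 0 (score: 5)
You can use this implementation. The first parameter is the std::string to tokenize and the second is the set of delimiter characters. It returns a vector of the token strings. Very simple and effective. Note that the delimiter characters themselves are not returned as tokens.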
#include <string>
#include <vector>

std::vector<std::string> tokenizeString(const std::string& str, const std::string& delimiters)
{
    std::vector<std::string> tokens;
    // Skip delimiters at the beginning: lastPos is the start of the first token.
    std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after that: pos is one past the end of the token.
    std::string::size_type pos = str.find_first_of(delimiters, lastPos);
    while (std::string::npos != pos || std::string::npos != lastPos)
    {
        // Found a token, add it to the vector.
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Skip the delimiters that follow. Note the "not_of".
        lastPos = str.find_first_not_of(delimiters, pos);
        // Find the delimiter that ends the next token.
        pos = str.find_first_of(delimiters, lastPos);
    }
    return tokens;
}
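Because the question asks for the operators to be returned as tokens as well, here is a minimal sketch of a variant that keeps them. The function name tokenizeKeepingOperators and its dropped/kept parameters are hypothetical (they are not part of the answer above); it assumes every operator is a single character.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not from the answer above.
// Characters in `dropped` only separate tokens; characters in `kept`
// separate tokens and are also emitted as one-character tokens.
std::vector<std::string> tokenizeKeepingOperators(const std::string& str,
                                                  const std::string& dropped,
                                                  const std::string& kept)
{
    std::vector<std::string> tokens;
    const std::string all = dropped + kept;
    std::string::size_type pos = 0;
    while (pos < str.size()) {
        // Locate the next separator of either kind.
        const std::string::size_type next = str.find_first_of(all, pos);
        if (next == std::string::npos) {
            // No more separators: the rest of the string is one token.
            tokens.push_back(str.substr(pos));
            break;
        }
        // Emit the operand in front of the separator, if there is one.
        if (next > pos) {
            tokens.push_back(str.substr(pos, next - pos));
        }
        // Emit the separator itself only when it is a kept one (an operator).
        if (kept.find(str[next]) != std::string::npos) {
            tokens.push_back(std::string(1, str[next]));
        }
        pos = next + 1;
    }
    return tokens;
}
```

Calling tokenizeKeepingOperators("abc=xyz+5", " ", "+-=") should give abc, =, xyz, +, 5, and the spaced form "a = x + y" should give a, =, x, +, y. Multi-character operators such as == would need extra handling.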
Answer 1 (score: 4)
This example uses the Boost tokenizer to get the desired behavior:
#include <boost/tokenizer.hpp>
#include <iostream>
#include <string>
using namespace std;
using namespace boost;
int main(int , char* [])
{
    const string formula = " ABC + BYZ =6 +5";
    typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
    // First argument: dropped delimiters (spaces). Second: kept delimiters (operators).
    boost::char_separator<char> sep(" ", "+-=");
    tokenizer tokens(formula, sep);
    for (tokenizer::iterator tok_iter = tokens.begin(); tok_iter != tokens.end(); ++tok_iter)
        std::cout << "<" << *tok_iter << "> ";
    return 0;
}
Output:
<ABC> <+> <BYZ> <=> <6> <+> <5>
Spaces are skipped and the delimiters are included as tokens.
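The question also lists f(p) as an input, which a kept-delimiter set of "+-=" does not split. As a rough standard-library sketch (the function name scanTokens is hypothetical, and treating letters, digits and underscores as operand characters is an assumption), one can classify characters instead of listing delimiters: runs of operand characters form one token, whitespace is skipped, and any other character becomes a one-character token.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: splits by character class rather than by a delimiter list.
std::vector<std::string> scanTokens(const std::string& input)
{
    // Assumption: operands consist of letters, digits and underscores.
    auto isOperandChar = [](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) != 0 || ch == '_';
    };

    std::vector<std::string> tokens;
    std::string::size_type i = 0;
    while (i < input.size()) {
        if (std::isspace(static_cast<unsigned char>(input[i]))) {
            ++i;  // Whitespace only separates tokens.
        } else if (isOperandChar(input[i])) {
            // Consume the whole identifier or number.
            const std::string::size_type start = i;
            while (i < input.size() && isOperandChar(input[i])) {
                ++i;
            }
            tokens.push_back(input.substr(start, i - start));
        } else {
            // Anything else (=, +, parentheses, ...) is a one-character token.
            tokens.push_back(std::string(1, input[i]));
            ++i;
        }
    }
    return tokens;
}
```

With this sketch, scanTokens("f(p)") should give f, (, p, ) and scanTokens("abc=xyz+5") should give abc, =, xyz, +, 5. It does not recognize multi-character operators or decimal numbers.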