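I'm trying to work with a comma-separated database dump. I only need to read the first word of each line, which tells me whether it is the line I need; if it is, I want to tokenize the line and save each separated string in a vector.
I've had trouble keeping all of the data types in order. I'm using the getline method: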
string line;
vector<string> tokens;
// Iterate through each line of the file
while( getline( file, line ) )
{
    // Here is where I want to tokenize. strtok, however, takes a character array and not a string.
}
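The catch is that I only want to keep reading and tokenize a line if its first word is the one I'm after. Here is an example of a line from the file: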
example,1,200,200,220,10,550,550,550,0,100,0,-84,255
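So, if I'm after the string example, it should go ahead and tokenize the rest of that line for my use, and then stop reading from the file.
Should I be using strtok, stringstream, or something else?
Thanks!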
Answer 0 (score: 1)
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

void processFile(ifstream& file) {
    string line;
    const string prefix = "example,";
    // Get all lines from the file; the loop ends at end of file or on a read error
    while (getline(file, line)) {
        // Compare the beginning of the line with your prefix
        if (line.compare(0, prefix.size(), prefix) == 0) {
            // Homemade tokenization
            vector<string> tokens;
            // find() returns string::size_type, so use that type when comparing with npos
            string::size_type oldpos = 0;
            string::size_type pos;
            while ((pos = line.find(',', oldpos)) != string::npos) {
                tokens.push_back(line.substr(oldpos, pos - oldpos));
                oldpos = pos + 1;
            }
            tokens.push_back(line.substr(oldpos)); // don't forget the last bit
            // And here you are! tokens now holds every field of the matching line
        }
    }
}
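Since the question also asks about stringstream, here is a minimal sketch of that route. It assumes the fields never contain quoted or escaped commas, and the name find_record is made up for illustration. It reads only the first field of each line, tokenizes the rest only on a match, and returns as soon as the first matching line is found:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the fields of the first line whose first
// comma-separated field equals `key`, or an empty vector if no line matches.
std::vector<std::string> find_record(std::istream& in, const std::string& key)
{
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string field;
        // Read only the first field before deciding whether to go on.
        if (!std::getline(fields, field, ',') || field != key)
            continue;
        std::vector<std::string> tokens;
        tokens.push_back(field);
        // Tokenize the remainder of the matching line.
        while (std::getline(fields, field, ','))
            tokens.push_back(field);
        return tokens; // stop reading after the first match
    }
    return {};
}
```

Because it takes a std::istream&, the same function works with an ifstream or with an istringstream in a test. Empty fields in the middle of a line are kept, but an empty field after a trailing comma is not reported by getline.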
Answer 1 (score: 0)
How do I tokenize a string in C++?
http://www.daniweb.com/software-development/cpp/threads/27905
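Hope this helps, though I'm not a proficient C/C++ programmer. It would be good if you could specify the language you're using, either with tags or in the post itself.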
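On the strtok point raised in the question: strtok modifies the buffer it is given, so it cannot be pointed at a const std::string directly. One workaround is to tokenize a mutable, null-terminated copy. This is only a sketch, and split_with_strtok is a made-up name:

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` on commas using std::strtok.
std::vector<std::string> split_with_strtok(const std::string& line)
{
    // strtok writes '\0' bytes into its input, so work on a mutable,
    // null-terminated copy instead of the string itself.
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buffer.data(), ","); tok != nullptr;
         tok = std::strtok(nullptr, ","))
        tokens.emplace_back(tok);
    return tokens;
}
```

Keep in mind that strtok treats consecutive delimiters as one, so empty fields are silently dropped, and it keeps internal state between calls, so it is not safe to use from several threads or in nested tokenizing loops.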
Answer 2 (score: 0)
#ifndef TOKENIZER_H
#define TOKENIZER_H
#include <string>
#include <vector>
#include <sstream>
class Tokenizer
{
public:
    Tokenizer();
    ~Tokenizer();
    void Tokenize(std::string& str, std::vector<std::string>& tokens);
};
#endif /* TOKENIZER_H */
#include "Tokenizer.h"
using namespace std;

// Split str on any character in delimiters and append the pieces to tokens.
// Consecutive delimiters are treated as one, so empty fields are skipped.
static void tok(const string& str, vector<string>& tokens, const string& delimiters = ",")
{
    // Skip any delimiters at the beginning
    string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the end of the first token
    string::size_type pos = str.find_first_of(delimiters, lastPos);
    while (string::npos != pos || string::npos != lastPos)
    {
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Move past the delimiters to the start of the next token
        lastPos = str.find_first_not_of(delimiters, pos);
        pos = str.find_first_of(delimiters, lastPos);
    }
}

Tokenizer::Tokenizer()
{
}

void Tokenizer::Tokenize(string& str, vector<string>& tokens)
{
    tok(str, tokens);
}

Tokenizer::~Tokenizer()
{
}
#include "Tokenizer.h"
#include <string>
#include <vector>
#include <iostream>
using namespace std;

int main()
{
    // Variables used below
    vector<string> t;
    string s = "This is one string,This is another,And this is another one aswell.";
    // What you need to include:
    Tokenizer tokenizer;
    tokenizer.Tokenize(s, t); // s = the string to tokenize, t = the vector that receives the tokens
    // Below is just to show the tokens in the vector<string> (C++11 or later)
    for (const auto& token : t)
        cout << token << endl;
    return 0;
}
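One caveat with the find_first_not_of / find_first_of loop: it collapses runs of delimiters, so a line such as a,,b produces two tokens instead of three. If column positions matter in the dump, a variant that keeps empty fields may be preferable. The sketch below puts the two behaviors side by side; both function names are hypothetical:

```cpp
#include <string>
#include <vector>

// Same search pattern as the answer above: runs of delimiters collapse,
// so empty fields are dropped.
std::vector<std::string> split_skip_empty(const std::string& s, char delim = ',')
{
    std::vector<std::string> out;
    std::string::size_type start = s.find_first_not_of(delim);
    while (start != std::string::npos) {
        std::string::size_type end = s.find(delim, start);
        out.push_back(s.substr(start, end - start));
        start = s.find_first_not_of(delim, end);
    }
    return out;
}

// Hypothetical variant: every delimiter ends a field, so empty fields
// are kept and column positions are preserved.
std::vector<std::string> split_keep_empty(const std::string& s, char delim = ',')
{
    std::vector<std::string> out;
    std::string::size_type start = 0;
    std::string::size_type end;
    while ((end = s.find(delim, start)) != std::string::npos) {
        out.push_back(s.substr(start, end - start));
        start = end + 1;
    }
    out.push_back(s.substr(start)); // the last field has no trailing delimiter
    return out;
}
```

Note that split_keep_empty always returns at least one field, so an empty input line yields a single empty string rather than an empty vector.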