I'm not familiar with reading files in C++, but I've done a lot of work with pyspark. I have a txt file with the following contents:
1 52 Hayden Smith 18:16 15 M Berlin
2 54 Mark Puleo 18:25 15 M Berlin
3 97 Peter Warrington 18:26 29 M New haven
4 305 Matt Kasprzak 18:53 33 M Falls Church
5 272 Kevin Solar 19:17 16 M Sterling
6 394 Daniel Sullivan 19:35 26 M Sterling
7 42 Kevan DuPont 19:58 18 M Boylston
8 306 Chris Goethert 20:00 43 M Falls Church
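As you can see, there are 8 columns and 351 rows (I'm only showing 8 of them). For each row, [0] is the rank, [1] is the BIB, [2] is the first name, [3] is the last name, [4] is the time, [5] is the age, [6] is the sex, and [7] is the town. For example, in the first row, 1 is the rank, 52 is the BIB, Hayden Smith is the name, 18:16 is the time, 15 is the age, M is male, and Berlin is the town.
I have a sorted linked structure called class SortedLinked, and an item type class called class Runner.
You don't need to worry about the SortedLinked class.
Class Runner has four private attributes: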
string name, int age, int min, int sec
In my driver file, I can do this:
SortedLinked mylist; // initialize a sorted list
Runner M("Jordan", 22, 20, 20); // initialize a Runner called Jordan, who is 22 years old and finished the race in 20 min 20 sec
mylist.add(M); // add Runner M to my sorted list
So I need to read the text file and, for each row, create a Runner object holding the runner's name, age, minutes, and seconds, then insert that Runner into the sorted linked list.
If this were pyspark, I could do this:
file = sc.textFile("hdfs")  # we usually use hdfs in pyspark
newfile = file.map(lambda line: line.split('\t'))  # columns are separated by tabs, except that [2] and [3] are separated by a space
ColumnIneed = newfile.map(lambda r: [r[2], r[3], r[4], r[5]])  # I only need columns [2][3][4][5]
mylist = ColumnIneed.collect()  # turn the RDD into a list
Then I can just transform every row into a Runner object.
However, in C++ this is all I know:
ifstream infile;
string s, sAll;
if(infile.is_open())
{
while(getline(infile, s))
{
s = s.rstrip('\n') // does NOT work in C++
name, age, time = s.split('\t') // does NOT work in C++, and I don't need all the columns
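So, my questions:
1. I need to access each line and strip the newline character.
2. I only need columns [2] [3] [4] [5] (columns are separated by tabs).
3. Column [4] is the time, which is a string in the text file; I need to split it on ":" and convert the parts into minutes and seconds.
4. Columns [2] [3] are the first and last name; I need to combine them into the string name.
5. Columns [2] [3] are separated by a space.
Ideally, I would like to do something like this: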
while(I need a loop)
{
eachline = access each line;
eachline.strip('\n') //strip newline
eachline.split('\t') //split Tabs
string name = eachline[2][3];
string time = eachline[4];
int min;
int sec;
min, sec = time.split(':')
int age = eachline[5];
Runner M(name, age, min, sec) // I don't know if this works, because it looks like Runner M is overwritten each time a new line is read.
mylist.add(M) // add M to my linked list; no need to worry about this step, it is already done.
}
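If you have a better way, I would really appreciate it.
Answer 0 (score: 1)
A code snippet (it needs the headers <fstream>, <iostream>, <sstream>, and <string>):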
std::ifstream in;
in.open(/*path to file*/);
std::string line;
if(in.is_open())
{
while(std::getline(in, line)) //get 1 row as a string
{
std::istringstream iss(line); //put line into stringstream
std::string word;
while(iss >> word) //read word by word
{
std::cout << word << std::endl;
}
/*
// Alternative: extract typed fields directly; adapt the order to your input line.
int row;
int age;
std::string name;
iss >> row >> age >> name;
// min and sec still need to be parsed from the time field.
// By convention, variable names shouldn't start with a capital letter.
// You don't overwrite M: each iteration creates a new local Runner, and
// mylist.add(M) puts a copy of it into the container. M is destroyed at the
// end of the block. Move semantics could avoid the copy, but learn the C++
// basics first.
Runner M(name, age, min, sec);
mylist.add(M);
*/
}
}