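I would like to read some data I have into C++ and then start manipulating it. I am quite new to this, but I have a little basic knowledge. The most obvious approach that strikes me (perhaps because I used Excel before) is to read the data into a 2D array. Here is the code I have so far.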
#include <iostream>
#include <fstream>
#include <algorithm>
#include <string>
#include <sstream>
using namespace std;
string C_J;
int main()
{
float data[1000000][10];
ifstream C_J_input;
C_J_input.open("/Users/RT/B/CJ.csv");
if (!C_J_input) return -1;
for(int row = 0; row <1000000; row++)
{
string line;
getline(C_J_input, C_J, '?');
if ( !C_J_input.good() )
break;
stringstream iss(line);
for(int col = 0; col < 10; col++)
{
string val;
getline(iss, val, ',');
if (!iss.good() )
break;
stringstream converter(val);
converter >> data[row][col];
}
}
cout << data;
return 0;
}
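Once the data is read in, I would like to go through it row by row and analyze it, looking for certain things, but I think that is a topic for another question once I have the data loaded.
Please let me know if this is a bad question in any way, and I will try to add anything that might make it better.
Thanks!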
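Answer 0 (score: 2)
As the asker requested, here is how you can load the file into one string, split it into lines, and then split each line further into elements: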
#include <cstdlib>
#include <iostream>
#include <iterator>
#include <string>
#include <fstream>
#include <vector>
#include <sstream>
//This takes a string and splits it with a delimiter and returns a vector of strings
std::vector<std::string> &SplitString(const std::string &s, char delim, std::vector<std::string> &elems)
{
std::stringstream ss(s);
std::string item;
while (std::getline(ss, item, delim))
{
elems.push_back(item);
}
return elems;
}
int main(int argc, char* argv[])
{
//load the file with ifstream
std::ifstream t("test.csv");
if (!t)
{
std::cout << "Unknown File" << std::endl;
return 1;
}
//this is just a block of code designed to load the whole file into one string
std::string str;
//this sets the read position to the end
t.seekg(0, std::ios::end);
str.reserve(t.tellg());//this reserves enough memory in the string to hold the file up to the read position (which is the end)
t.seekg(0, std::ios::beg);//this sets the read position back to the beginning to start reading it
//this takes everything in the stream (the file data) and loads it into the string.
//istreambuf_iterator is used to loop through the contents of the stream (t), and in this case go up to the end.
str.assign((std::istreambuf_iterator<char>(t)),
std::istreambuf_iterator<char>());
//if (sizeof(rawData) != *rawSize)
// return false;
//if the file has size (is not empty) then analyze
if (str.length() > 0)
{
//the file is loaded
//split by delimiter (which is the newline character)
std::vector<std::string> lines;//this holds a string for each line in the file
SplitString(str, '\n', lines);
//each element in the vector holds a vector of elements (strings between commas)
std::vector<std::vector<std::string> > LineElements;
//for each line (taken by const reference to avoid copying the string)
for (const auto &it : lines)
{
//this is a vector of elements in this line
std::vector<std::string> elementsInLine;
//split with the comma, this would separate "one,two,three" into {"one","two","three"}
SplitString(it, ',', elementsInLine);
//take the elements in this line, and add it to the line-element vector
LineElements.push_back(elementsInLine);
}
//this displays each element in an organized fashion
//for each line
for (const auto &it : LineElements)
{
//for each element IN that line
for (const auto &i : it)
{
//if it is not the last element in the line, then insert comma
//(addresses are compared, not values, so a field that equals the last one still gets its comma)
if (&i != &it.back())
std::cout << i << ',';
else
std::cout << i;//last element does not get a trailing comma
}
//the end of the line
std::cout << '\n';
}
}
else
{
std::cout << "File Is empty" << std::endl;
return 1;
}
system("PAUSE");//Windows-only: keeps the console window open; remove this line on other platforms
return 0;
}
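The program above keeps every element as a string, while the question wants numeric data. A minimal sketch of one way to convert a split line into floats follows. The helper name ToFloats and the choice to store 0.0f for fields that cannot be parsed are assumptions made for illustration; they are not part of the answer's code.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not in the original answer): converts the elements of
// one split line into floats using std::stof.
std::vector<float> ToFloats(const std::vector<std::string> &elements)
{
    std::vector<float> values;
    values.reserve(elements.size());
    for (const auto &e : elements)
    {
        try
        {
            values.push_back(std::stof(e));
        }
        catch (const std::logic_error &)
        {
            // std::stof throws std::invalid_argument or std::out_of_range,
            // both derived from std::logic_error. Storing 0.0f here is an
            // arbitrary choice for this sketch.
            values.push_back(0.0f);
        }
    }
    return values;
}
```

Inside the loop over lines, ToFloats(elementsInLine) could be pushed into a std::vector<std::vector<float> > instead of storing the strings.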
Answer 1 (score: 1)
At first glance I see a few obvious problems that will slow your progress down considerably, so I will list them here:
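1) You are using two disconnected variables to read the lines:
- C_J, which receives the data from the getline function
- line, which is used as the source for the stringstream
I am pretty sure that C_J is completely unnecessary. I think you simply wanted to do: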
getline(C_J_input, line, ...) // so that the textline read will fly to the LINE var
// ...and later
stringstream iss(line); // no change
or, alternatively:
getline(C_J_input, C_J, ...) // no change
// ...and later
stringstream iss(C_J); // so that ISS will read the textline we've just read
Otherwise, the stringstream will never see what getline read from the file: getline writes the data to a different place (C_J) than the one the stringstream looks at (line).
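2) You are passing '?' as the line delimiter to getline(). CSV files usually use the newline character to separate rows of data. Of course, your input file may really use '?'; I don't know. But if you want to split on newlines, omit that argument: getline then uses '\n' as its default delimiter, which is probably what you want.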
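3) Your float array is huge: float data[1000000][10] is roughly 40 MB of local storage, far more than a typical default stack. Consider using a list instead. It will grow nicely as you read lines. You can even nest them, so list<list<float>> is quite usable too. I would actually probably use list<vector<float>>, since the number of columns is constant. Using a huge preallocated array is not a good idea, because sooner or later there will be a file with one line too many, and then it all blows up.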
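4) Your code contains a really huge loop that iterates a fixed number of times. The loop itself is fine, but the number of lines will vary. You do not actually need to count the lines, especially if you use list<> to store the values. Just as you check whether the file opened correctly with if(!C_J_input), you can also check whether you have reached the end of the file: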
if(C_J_input.eof())
; // will fire ONLY if you are at the end of the file.
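The four points above can be combined into one short sketch. The function name read_table is hypothetical, and the loop tests the stream state returned by getline instead of calling eof() directly; that substitution is mine, chosen because it also stops on read errors. Fields that fail to convert are stored as 0, which is an assumption made only for this example.

```cpp
#include <istream>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every row of comma-separated floats from `in`.
std::list<std::vector<float>> read_table(std::istream &in)
{
    std::list<std::vector<float>> table;  // grows as rows arrive (point 3)
    std::string line;
    // Default delimiter '\n' (point 2). The loop ends when a read fails,
    // for example at end of file, so no row count is needed (point 4).
    while (std::getline(in, line))
    {
        std::stringstream iss(line);      // same variable getline filled (point 1)
        std::vector<float> row;
        std::string val;
        // Testing getline's result keeps the last field of the line too.
        while (std::getline(iss, val, ','))
        {
            std::stringstream converter(val);
            float f = 0.0f;
            converter >> f;               // f stays 0 if the field is not numeric
            row.push_back(f);
        }
        table.push_back(row);
    }
    return table;
}
```

With the question's stream, read_table(C_J_input) would return one vector<float> per line of the file, with no fixed limit on the number of rows.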
Uh... well, that's a start. Good luck!