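I wrote a C# method that converts a CSV file to a sparse data format. The code works well, but it sometimes fails. Does anyone have a C# method for converting a CSV file to a sparse file?
The sparse data format is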
[label] [column_index]:[value] [column_index]:[value] ...
[label] [column_index]:[value] [column_index]:[value] ...
One thing to note: whenever a column value is 0, that column_index is skipped. For example,
400,0.39,0,0.098,0.4387
421,0.63,0.23,0,0.14
is represented as
400 1:0.39 3:0.098 4:0.4387
421 1:0.63 2:0.23 4:0.14
My C# code is:
public static string getSparse(string path)
{
    string final_string = String.Empty;
    string[] line = System.IO.File.ReadAllLines(path);
    int count = 0, m = 0;
    foreach (string ln in line)
    {
        string[] num = ln.Split(new char[] { ',' });
        foreach (string n in num)
        {
            if (count == 0)
                final_string += n + " ";
            else
            {
                if (n == "0")
                {
                    count++;
                    continue;
                }
                final_string += count + ":" + n + " ";
            }
            count++;
        }
        count = 0;
        final_string += Environment.NewLine;
        m++;
    }
    return final_string;
}
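
For comparison, here is a minimal sketch of one possible rework, not a confirmed fix, since the post does not say how the method fails. It assumes the trouble shows up on large files, where building the whole result with `+=` on a string copies the accumulated text on every append and `ReadAllLines` holds the entire file in memory. The names `SparseConverter`, `ToSparseLine`, and `ConvertFile` are hypothetical. The sketch also assumes that values such as `0.0` or ` 0` (with whitespace) and empty cells should be skipped like `0`, which the original `n == "0"` comparison does not do.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class SparseConverter
{
    // Converts one CSV row ("label,v1,v2,...") to "label i:v i:v ...".
    public static string ToSparseLine(string csvLine)
    {
        string[] fields = csvLine.Split(',');
        var sb = new StringBuilder(fields[0].Trim());

        for (int i = 1; i < fields.Length; i++)
        {
            string value = fields[i].Trim();

            // Assumption: empty cells and anything that parses to zero
            // ("0", "0.0", "-0") are skipped.
            double parsed;
            bool isZero = value.Length == 0 ||
                (double.TryParse(value, NumberStyles.Float,
                                 CultureInfo.InvariantCulture, out parsed)
                 && parsed == 0.0);
            if (isZero)
                continue;

            // The column index is the field position, so skipped zeros
            // leave gaps in the numbering, as in the example rows.
            sb.Append(' ').Append(i).Append(':').Append(value);
        }
        return sb.ToString();
    }

    // Streams the input line by line and writes each converted row
    // directly to the output file instead of returning one big string.
    public static void ConvertFile(string inputPath, string outputPath)
    {
        using (var writer = new StreamWriter(outputPath))
        {
            foreach (string line in File.ReadLines(inputPath))
            {
                if (line.Trim().Length == 0)
                    continue; // ignore blank lines
                writer.WriteLine(ToSparseLine(line));
            }
        }
    }
}
```

Calling `SparseConverter.ToSparseLine("400,0.39,0,0.098,0.4387")` gives `400 1:0.39 3:0.098 4:0.4387`, matching the first example row. Unlike the original method, this version does not leave a trailing space at the end of each line; if the consumer of the sparse file expects one, append it in `ToSparseLine`.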