Suppose I have the following sample data:
Sample.csv:
Dog,25
Cat,23
Cat,20
Dog,0
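I want to load it into an IDataView, transform it into a form that can be used for ML (no strings, etc.), and then save it as a .csv again, say for analysis with another tool or language.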
// Load data:
var sampleCsv = Path.Combine("Data", "Sample.csv");
var columns = new[]
{
    new TextLoader.Column("type", DataKind.String, 0),
    new TextLoader.Column("age", DataKind.Int16, 1),
};
var mlContext = new MLContext(seed: 0);
var dataView = mlContext.Data.LoadFromTextFile(sampleCsv, columns, ',');
// Transform
var pipeline =
    mlContext.Transforms.Categorical.OneHotEncoding("type",
        // This outputKind adds just one column, while the others add several:
        outputKind: OneHotEncodingEstimator.OutputKind.Key);
var transformedDataView = pipeline.Fit(dataView).Transform(dataView);
// transformedDataView:
// Dog,1,25
// Cat,2,23
// Cat,2,20
// Dog,1,0
How can I get the two numeric columns and write them to a .csv file?
Answer 0 (score: 1)
I use the following code in my own project to create a .csv file. I hope it helps.
// Note: SpikePrediction, transformedData, dict, _dataCol, _dataPath, _modelName and
// FileHandeling.resultFolder come from the answerer's project and are not shown here.
var predictions = mlContext.Data.CreateEnumerable<SpikePrediction>(transformedData, reuseRowObject: false);
SavePredictions(predictions.ToArray());

private void SavePredictions(SpikePrediction[] predictions) {
    if (dict.Count() != predictions.Count()) {
        Console.WriteLine("> Cannot save predictions because it does not correspond with the dataset length");
        return;
    }
    // Header row: the original column names plus a "Label" column.
    List<string> predictionsCol = _dataCol.ToList();
    predictionsCol.Add("Label");
    var fullResultFilePath = Path.Combine(_dataPath, FileHandeling.resultFolder, $"{_modelName}.csv");
    using (var stream = File.CreateText(fullResultFilePath)) {
        stream.WriteLine(string.Join(",", predictionsCol));
        // One line per row: the two input values followed by the prediction.
        for (int i = 0; i < predictions.Count(); i++) {
            var label = predictions[i];
            stream.WriteLine(string.Join(",", new string[] { dict[i].Item1.Split("T")[0].Substring(1), dict[i].Item2, label.Prediction[0].ToString() }));
        }
    }
}
Answer 1 (score: 0)
You can create a class for the output data:
class TempOutput
{
    // Note that the property types should match the column types in the DataView
    public UInt32 type { get; set; }
    public Int16 age { get; set; }
}
Then use CreateEnumerable<> to read all rows from the DataView and write them to a .csv file:
// Requires: using System.IO; using System.Linq;
File.WriteAllLines(sampleCsv + ".output",
    mlContext.Data.CreateEnumerable<TempOutput>(transformedDataView, false)
        .Select(t => string.Join(",", t.type, t.age)));
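As a possible alternative that avoids the intermediate class, ML.NET also exposes mlContext.Data.SaveAsText, which writes an IDataView straight to a stream. The following is only a minimal sketch under assumptions: it needs the Microsoft.ML NuGet package, the class name SaveAsTextSketch and the helper SaveNumericColumns are hypothetical names made up for illustration, and the way key values are rendered in the file may not match the 1/2 values shown in the question, so check the output before relying on it.

```csharp
using System.IO;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Transforms;

public static class SaveAsTextSketch
{
    // Hypothetical helper (not part of ML.NET): loads the CSV from the question,
    // encodes "type" as a key column, and writes "type" and "age" to outputCsv.
    public static void SaveNumericColumns(string inputCsv, string outputCsv)
    {
        var mlContext = new MLContext(seed: 0);
        var columns = new[]
        {
            new TextLoader.Column("type", DataKind.String, 0),
            new TextLoader.Column("age", DataKind.Int16, 1),
        };
        var dataView = mlContext.Data.LoadFromTextFile(inputCsv, columns, separatorChar: ',');

        // Same encoding as in the question, followed by an explicit column selection
        // so it is clear which columns end up in the file.
        var pipeline = mlContext.Transforms.Categorical
            .OneHotEncoding("type", outputKind: OneHotEncodingEstimator.OutputKind.Key)
            .Append(mlContext.Transforms.SelectColumns("type", "age"));
        var transformed = pipeline.Fit(dataView).Transform(dataView);

        using (var stream = File.Create(outputCsv))
        {
            // schema: false leaves out the loader-schema comment line;
            // forceDense: true asks for every value to be written on every row.
            mlContext.Data.SaveAsText(transformed, stream,
                separatorChar: ',', headerRow: true, schema: false, forceDense: true);
        }
    }

    public static void Main()
    {
        var input = Path.Combine("Data", "Sample.csv");
        SaveNumericColumns(input, input + ".output");
    }
}
```

Compared with CreateEnumerable<>, this keeps the column list in one place (the SelectColumns call) instead of mirroring it in a class, at the cost of less control over how each value is formatted.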