Converting the reading of a delimited file from F# to C#

Date: 2013-11-06 23:13:11

Tags: c# csv f# f#-data

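I know this is a rather dense question, but I'm trying to do some comparisons between F# and C#. I borrowed an F# script from http://www.clear-lines.com/blog/post/Nearest-Neighbor-Classification-part-2.aspx and am trying to write the equivalent operation in a C# program, in order to test the operations and syntax. This piece is part of a larger script, which I will turn into an F# program that performs a k-means analysis on the given data.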

Here is the F# part:

open System
open System.IO

let elections =
    let file = @"C:\Users\Deines\Documents\Election2008.txt"
    // Read every line and split it into its comma-separated fields
    let fileAsLines =
        File.ReadAllLines(file)
        |> Array.map (fun line -> line.Split(','))
    // Fields 1-3 are the numeric columns used as features
    let dataset =
        fileAsLines
        |> Array.map (fun line ->
            [| Convert.ToDouble(line.[1]);
               Convert.ToDouble(line.[2]);
               Convert.ToDouble(line.[3]) |])
    // Field 4 is the label (REP or DEM)
    let labels = fileAsLines |> Array.map (fun line -> line.[4])
    dataset, labels

Here is a sample of the data (Election2008.txt):

AL,32.7990,-86.8073,4447100,REP 
AK,61.3850,-152.2683,626932,REP 
AZ,33.7712,-111.3877,5130632,REP 
AR,34.9513,-92.3809,2673400,REP 
CA,36.1700,-119.7462,33871648,DEM 
CO,39.0646,-105.3272,4301261,DEM 
CT,41.5834,-72.7622,3405565,DEM 
DE,39.3498,-75.5148,783600,DEM 
DC,38.8964,-77.0262,572059,DEM 
FL,27.8333,-81.7170,15982378,DEM 

1 answer:

Answer 0 (score: 4)

You can do the same basic operation in C# like this:

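// Requires: using System; using System.IO; using System.Linq;
Tuple<double[][], string[]> GetElections()
{
    var file = @"C:\Users\Deines\Documents\Election2008.txt";
    // File.ReadLines is lazy: the file is read again each time fileAsLines is enumerated
    var fileAsLines = File.ReadLines(file).Select(line => line.Split(','));
    // Fields 1-3 are the numeric columns
    var dataset = fileAsLines.Select(line => new[]
                                             {
                                                 Convert.ToDouble(line[1]),
                                                 Convert.ToDouble(line[2]),
                                                 Convert.ToDouble(line[3])
                                             }).ToArray();
    // Field 4 is the label
    var labels = fileAsLines.Select(line => line[4]).ToArray();
    return Tuple.Create(dataset, labels);
}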
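Because File.ReadLines is lazy, the method above walks the file once for the dataset and once for the labels. Here is a minimal sketch of a single-pass variant. The ElectionParsing class and ParseElections method are hypothetical names, and the method takes the lines as a parameter purely so it can be tried without the file. Parsing with the invariant culture and trimming the labels are my own assumptions (the sample rows end in a space, and Convert.ToDouble follows the current culture); they are not part of the original script.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class ElectionParsing
{
    // Hypothetical helper: parses already-read lines into (dataset, labels).
    public static Tuple<double[][], string[]> ParseElections(IEnumerable<string> lines)
    {
        // ToList() materializes the split rows once, so the source is enumerated a single time
        var rows = lines.Select(line => line.Split(',')).ToList();

        // Assumption: numbers use '.' as the decimal separator, so parse culture-independently
        var dataset = rows.Select(r => new[]
        {
            double.Parse(r[1], CultureInfo.InvariantCulture),
            double.Parse(r[2], CultureInfo.InvariantCulture),
            double.Parse(r[3], CultureInfo.InvariantCulture)
        }).ToArray();

        // Assumption: trailing whitespace on the label is not meaningful
        var labels = rows.Select(r => r[4].Trim()).ToArray();

        return Tuple.Create(dataset, labels);
    }
}
```

With the real file, this would be called as ElectionParsing.ParseElections(File.ReadLines(file)).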

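That said, C# developers would rarely write it this way. You would more likely create a custom type to hold the results (with named values) and read the file into that, i.e.: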

class ElectionResult
{
     public ElectionResult(string label, double x, double y, int amount)
     {
         this.Label = label;
         this.Location = new Point(x, y);
         this.Amount = amount;
     }
     public string Label { get; private set; }
     // Point is assumed to be a double-based type such as System.Windows.Point
     public Point Location { get; private set; }
     public int Amount { get; private set; }
}

IList<ElectionResult> GetElectionResults()
{
    var file = @"C:\Users\Deines\Documents\Election2008.txt";
    var fileAsLines = File.ReadLines(file).Select(line => line.Split(','));

    // Build one ElectionResult per line: label, two coordinates, and the integer amount
    return fileAsLines.Select(line => new ElectionResult(line[4],
                                                 Convert.ToDouble(line[1]),
                                                 Convert.ToDouble(line[2]),
                                                 Convert.ToInt32(line[3])))
                      .ToList();
}

This makes it far more usable for a typical C# developer, since there is no pattern matching to pull the arrays out of the Tuple result.
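To illustrate the point about tuples: a caller of the Tuple-returning version reaches the two arrays through the positional Item1 and Item2 properties, whereas F# can bind both directly (let dataset, labels = elections). Here is a minimal sketch; the TupleConsumption class and Describe method are hypothetical names made up for this example.

```csharp
using System;

public static class TupleConsumption
{
    // Hypothetical consumer of the Tuple<double[][], string[]> result
    public static string Describe(Tuple<double[][], string[]> elections)
    {
        // Item1 and Item2 carry no descriptive names; the caller must remember which is which
        double[][] dataset = elections.Item1;
        string[] labels = elections.Item2;

        return string.Format("{0} rows, first label {1}", dataset.Length, labels[0]);
    }
}
```

With the custom type, the same information is available through named members such as Label and Amount, which is what the answer is recommending.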