F# CsvTypeProvider: extract the same columns from slightly different CSV files

Time: 2016-09-04 10:22:26

Tags: csv f# type-providers f#-data

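I am writing a program that reads football matches from different CSV files. The columns I am interested in are present in all of the files, but the files have differing numbers of columns.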

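This led me to create a separate provided type (each with its own sample file) and a separate mapping function for every file variant, as listed here. A loadGames function then consumes these mapping functions.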

open System
open FSharp.Data

// One provided type per file layout, each inferred from its own sample file.
type GamesFile14 = CsvProvider<"./data/sample_14.csv">
type GamesFile15 = CsvProvider<"./data/sample_15.csv">
type GamesFile1617 = CsvProvider<"./data/sample_1617.csv">

// The three mappers are identical except for the row type they accept.
let mapRows14 (rows: seq<GamesFile14.Row>) =
    rows
    |> Seq.map (fun c ->
        { Division = c.Div
          Date = DateTime.Parse c.Date
          HomeTeam = { Name = c.HomeTeam; Score = c.FTHG; Shots = c.HS; ShotsOnTarget = c.HST; Corners = c.HC; Fouls = c.HF }
          AwayTeam = { Name = c.AwayTeam; Score = c.FTAG; Shots = c.AS; ShotsOnTarget = c.AST; Corners = c.AC; Fouls = c.AF }
          Odds = { H = float c.B365H; U = float c.B365D; B = float c.B365A } })

let mapRows15 (rows: seq<GamesFile15.Row>) =
    rows
    |> Seq.map (fun c ->
        { Division = c.Div
          Date = DateTime.Parse c.Date
          HomeTeam = { Name = c.HomeTeam; Score = c.FTHG; Shots = c.HS; ShotsOnTarget = c.HST; Corners = c.HC; Fouls = c.HF }
          AwayTeam = { Name = c.AwayTeam; Score = c.FTAG; Shots = c.AS; ShotsOnTarget = c.AST; Corners = c.AC; Fouls = c.AF }
          Odds = { H = float c.B365H; U = float c.B365D; B = float c.B365A } })

let mapRows1617 (rows: seq<GamesFile1617.Row>) =
    rows
    |> Seq.map (fun c ->
        { Division = c.Div
          Date = DateTime.Parse c.Date
          HomeTeam = { Name = c.HomeTeam; Score = c.FTHG; Shots = c.HS; ShotsOnTarget = c.HST; Corners = c.HC; Fouls = c.HF }
          AwayTeam = { Name = c.AwayTeam; Score = c.FTAG; Shots = c.AS; ShotsOnTarget = c.AST; Corners = c.AC; Fouls = c.AF }
          Odds = { H = float c.B365H; U = float c.B365D; B = float c.B365A } })

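It seems to me that there must be a better way to solve this.

Is there any way to make my mapping function more generic, so that I don't have to repeat the same function over and over?

Is it possible to create the CsvProvider dynamically based on the resource, or do I need to explicitly declare a sample for each variant of my CSV files, as in the code above?

Any other suggestions?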

1 answer:

Answer 0 (score: 2)

In your scenario, you might get better results from FSharp.Data's CsvFile type. It takes a more dynamic approach to CSV parsing and uses the dynamic ? operator for data access. You lose some of the type-safety guarantees that the type provider gives you, because every CSV file is loaded into the same CsvRow type. That means nothing guarantees at compile time that a given column is present in a file, and you have to be prepared for runtime errors. But in your case that is exactly what you want, because it lets your three functions be rewritten as follows:

let mapRows14 rows =
    rows
    |> Seq.map (fun c ->
        { Division = c?Div
          Date = DateTime.Parse c?Date
          HomeTeam = { Name = c?HomeTeam; Score = c?FTHG; Shots = c?HS; ShotsOnTarget = c?HST; Corners = c?HC; Fouls = c?HF }
          AwayTeam = { Name = c?AwayTeam; Score = c?FTAG; Shots = c?AS; ShotsOnTarget = c?AST; Corners = c?AC; Fouls = c?AF }
          Odds = { H = float c?B365H; U = float c?B365D; B = float c?B365A } })

Give CsvFile a try and see whether it solves your problem.
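As a rough end-to-end sketch of the CsvFile approach, the script below loads several files and maps them with a single function. Several parts are assumptions rather than anything from the question: the Team, Odds and Game record definitions (field names are taken from the question, field types are guesses), the loadGames signature (the original function is not shown), and the `#r "nuget: ..."` line, which presumes a recent F# Interactive. Since ? returns each value as a string, the sketch converts numeric fields explicitly.

```fsharp
// Script form; in a compiled project, reference the FSharp.Data package instead.
#r "nuget: FSharp.Data"

open System
open FSharp.Data
open FSharp.Data.CsvExtensions // brings the dynamic ? operator into scope

// Assumed record shapes: field names come from the question, field types are guesses.
type Team =
    { Name: string; Score: int; Shots: int; ShotsOnTarget: int; Corners: int; Fouls: int }

type Odds = { H: float; U: float; B: float }

type Game =
    { Division: string; Date: DateTime; HomeTeam: Team; AwayTeam: Team; Odds: Odds }

// A single mapper for every layout. c?Column returns the raw string,
// so numeric fields are converted explicitly; a missing column or an
// unparsable value raises an exception at run time.
let mapRows (rows: seq<CsvRow>) =
    rows
    |> Seq.map (fun c ->
        { Division = c?Div
          Date = DateTime.Parse(c?Date)
          HomeTeam =
            { Name = c?HomeTeam; Score = int (c?FTHG); Shots = int (c?HS)
              ShotsOnTarget = int (c?HST); Corners = int (c?HC); Fouls = int (c?HF) }
          AwayTeam =
            { Name = c?AwayTeam; Score = int (c?FTAG); Shots = int (c?AS)
              ShotsOnTarget = int (c?AST); Corners = int (c?AC); Fouls = int (c?AF) }
          Odds = { H = float (c?B365H); U = float (c?B365D); B = float (c?B365A) } })

// Hypothetical loader: only a guess at the shape of the question's loadGames.
let loadGames (paths: seq<string>) =
    paths
    |> Seq.collect (fun path -> CsvFile.Load(path).Rows |> mapRows)
    |> Seq.toList
```

Calling `loadGames [ "./data/sample_14.csv"; "./data/sample_15.csv"; "./data/sample_1617.csv" ]` would then read all three layouts through the same code path. The trade-off is the one described above: a renamed or missing column only shows up as an exception when the row is read, so it may be worth wrapping the conversions in error handling if the files are not fully trusted.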