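I'd like to try using F# to read a comma-delimited file into memory, dedupe it on one field, and then write the result to a pipe-delimited file.
I've written an example in C# of what I want the program to do: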
var input = new StreamReader(@"D:\input.txt");
var addresses = new Dictionary<string, AddressModel>();
while (!input.EndOfStream)
{
    var address = new AddressModel(input);
    if (!addresses.ContainsKey(address.Id))
        addresses.Add(address.Id, address);
}
var output = new StreamWriter(@"D:\CSharp.txt");
foreach (var address in addresses.Values)
{
    output.WriteLine(address.ToString());
}
output.Flush();
With AddressModel defined as:
class AddressModel
{
    public string Id { get; set; }
    public string StreetName { get; set; }
    public int ZipCode { get; set; }

    public AddressModel(StreamReader inputStream)
    {
        if (inputStream == null) return;
        var input = inputStream.ReadLine();
        if (input == null) return;
        var split = input.Split(new char[] { ',' }, StringSplitOptions.None);
        Id = split[0];
        ZipCode = int.Parse(split[1]);
        StreetName = BuildStreet(split);
    }

    private string BuildStreet(string[] items)
    {
        var street = "";
        if (!string.IsNullOrWhiteSpace(items[5]))
            street += items[5];
        if (!string.IsNullOrWhiteSpace(items[6]))
            street += string.IsNullOrWhiteSpace(street) ? items[6] : " " + items[6];
        if (!string.IsNullOrWhiteSpace(items[7]))
            street += string.IsNullOrWhiteSpace(street) ? items[7] : " " + items[7];
        if (!string.IsNullOrWhiteSpace(items[8]))
            street += string.IsNullOrWhiteSpace(street) ? items[8] : " " + items[8];
        return street;
    }

    public override string ToString()
    {
        return string.Format("{0}|{1}|{2}", Id, StreetName, ZipCode);
    }
}
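So what I want the program to do is read the file line by line, construct a new AddressModel object from each line, check whether that item already exists in the dictionary, add it if it does not, and then write the contents of the dictionary to a second text file.
Of course, if I'm thinking about this in a "too object-oriented" way and it can be done more functionally, I'd appreciate it if someone could point me in the right direction.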
Answer 0 (score: 3)
You can write the main program like this:
open System
open System.Collections.Generic

let lines = IO.File.ReadLines @"D:\input.txt"
let addresses = new Dictionary<string, AddressModel>()

// Assumes AddressModel has a constructor that takes a single line of text.
lines |> Seq.iter (fun line ->
    let address = AddressModel line
    // Keep only the first address seen for each Id.
    if not (addresses.ContainsKey address.Id) then
        addresses.Add (address.Id, address))

IO.File.WriteAllLines(@"D:\CSharp.txt", Seq.map string addresses.Values)
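As you can see, the structure is not much different from the C# version; the difference is that you can use higher-order functions such as map and iter.
As for your Address class, you can either reuse your C# class or write an F# function that parses each line: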
let parseLine (input:string) =
    let split = input.Split [|','|]
    let id, zipCode = split.[0], Int32.Parse split.[1]
    let street =
        split.[5..8]
        |> Array.filter (String.IsNullOrWhiteSpace >> not)
        |> String.concat " "
    (id, zipCode, street)
// Field order follows the C# ToString in the question: Id|StreetName|ZipCode.
let printLine (id, zipCode, street) = sprintf "%s|%s|%i" id street zipCode
Then you can update your main program like this:
open System
open System.Collections.Generic

let lines = IO.File.ReadLines @"D:\input.txt"
let addresses = new Dictionary<string, (string*int*string)>()

lines |> Seq.map parseLine |> Seq.iter (fun ((id,_,_) as line) ->
    // Keep only the first parsed line seen for each id.
    if not (addresses.ContainsKey id) then
        addresses.Add (id, line))

IO.File.WriteAllLines(@"D:\CSharp.txt", Seq.map printLine addresses.Values)
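Now, you don't need the dictionary step at all if its only purpose is to get distinct IDs. You can use Seq.distinctBy, as suggested in the other answer, so your code simplifies further to: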
let lines =
    IO.File.ReadLines @"D:\input.txt"
    |> Seq.map parseLine
    |> Seq.distinctBy (fun (id,_,_) -> id)

IO.File.WriteAllLines(@"D:\CSharp.txt", Seq.map printLine lines)
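To see why Seq.distinctBy can stand in for the "add only if the key is absent" dictionary loop, here is a minimal sketch on in-memory data. The sample rows are hypothetical and only illustrate the (id, zipCode, street) tuple shape; when a key occurs more than once, Seq.distinctBy keeps the first occurrence and discards the later ones, which is the same outcome as the dictionary loop.

```fsharp
// Hypothetical sample rows in the (id, zipCode, street) shape used above.
let rows =
    [ ("A1", 12345, "Main St")
      ("B2", 54321, "Oak Ave")
      ("A1", 99999, "Duplicate Rd") ]   // same id as the first row

// Keep the first row for each id; later rows with the same id are dropped.
let distinctRows =
    rows
    |> Seq.distinctBy (fun (id, _, _) -> id)
    |> Seq.toList

printfn "%A" distinctRows
```

Because the result is a lazy sequence, nothing is read or deduplicated until it is consumed, for example by Seq.toList here or by WriteAllLines in the file-based version.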
Update
Here is the final suggested code:
open System
let parseLine (input:string) =
    let split = input.Split [|','|]
    let id, zipCode = split.[0], Int32.Parse split.[1]
    let street =
        split.[5..8]
        |> Array.filter (String.IsNullOrWhiteSpace >> not)
        |> String.concat " "
    (id, zipCode, street)
// Field order follows the C# ToString in the question: Id|StreetName|ZipCode.
let printLine (id, zipCode, street) = sprintf "%s|%s|%i" id street zipCode
let lines =
    IO.File.ReadLines @"D:\input.txt"
    |> Seq.map parseLine
    |> Seq.distinctBy (fun (id,_,_) -> id)

IO.File.WriteAllLines(@"D:\CSharp.txt", Seq.map printLine lines)
Answer 1 (score: 0)
You can use Seq.distinctBy, which uses a Dictionary internally.
open System.IO

type Contact = {Id:string; Name:string}

let lines = File.ReadLines(@"D:\input.txt")

let output =
    lines
    |> Seq.map toContact
    |> Seq.distinctBy (fun c -> c.Id)
    |> Seq.map contactToStr

File.WriteAllLines(@"D:\CSharp.txt", output)
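This assumes you have a Contact type, a function that builds a Contact from a string (toContact), and a function that builds a string from a Contact (contactToStr), defined before the pipeline, for example: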
let toContact (str:string) =
    let values = str.Split(',')
    {Id = values.[0]; Name = values.[1]}

let contactToStr contact = sprintf "%s|%s" contact.Id contact.Name
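The same record-based pattern could be applied to the address format from the question. The following is only a sketch under assumptions: the Address record, toAddress, addressToStr and dedupe are hypothetical names introduced here, and the column positions (0 for the id, 1 for the zip code, 5 to 8 for the street parts) are taken from the C# AddressModel in the question.

```fsharp
open System

// Hypothetical record mirroring the C# AddressModel from the question.
type Address = { Id: string; StreetName: string; ZipCode: int }

// Build an Address from one comma-delimited line.
// Like the C# version, this throws on short lines or a non-numeric zip code.
let toAddress (line: string) : Address =
    let fields = line.Split [|','|]
    let street =
        fields.[5..8]
        |> Array.filter (String.IsNullOrWhiteSpace >> not)
        |> String.concat " "
    { Id = fields.[0]; StreetName = street; ZipCode = Int32.Parse fields.[1] }

// Format as Id|StreetName|ZipCode, the order used by the C# ToString.
let addressToStr (a: Address) = sprintf "%s|%s|%i" a.Id a.StreetName a.ZipCode

// Parse, keep the first address per Id, and format for output.
let dedupe (lines: seq<string>) : seq<string> =
    lines
    |> Seq.map toAddress
    |> Seq.distinctBy (fun a -> a.Id)
    |> Seq.map addressToStr

// Hypothetical sample input: two lines share the id "A1".
let sample =
    [ "A1,12345,x,y,z,12,Main,,St"
      "B2,54321,x,y,z,,Oak,Ave,"
      "A1,99999,x,y,z,1,Other,,Rd" ]

let result = dedupe sample |> Seq.toList
printfn "%A" result
```

To run it against files, the result of dedupe (File.ReadLines @"D:\input.txt") can be passed to File.WriteAllLines(@"D:\CSharp.txt", ...) as in the pipeline above. A record gives the fields names, which tends to read more clearly than positional tuples once there are more than two or three of them.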