Question

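I'm writing in C#, and in the past I have successfully parsed text files with http://www.filehelpers.net/. My file format has changed from a more standard .csv format, and I now need to parse a file that looks like this: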

custID: 1732
name: Juan Perez
balance: 435.00
date: 11-05-2002

custID: 554
name: Pedro Gomez
balance: 12342.30
date: 06-02-2004

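How do I parse a file like this? I can't find an example. Instead of splitting on a delimiter, I need to find each keyword and then read the value that follows the ':'.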

Answer 1

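Here is an example (see the .NET Fiddle) that will need to be tailored to your actual file. A basic regular expression can parse the file, and LINQ can then project the matches into class instances. The pattern below is written for the sample data, so you will need to modify it for your own situation.

For this example I create a target class, Customer, and leave every property as a string rather than converting it to its final type: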

public class Customer
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Balance { get; set; }
    public string Date { get; set; }
}

Here is sample data that matches the Customer object. It simulates data that has already been read from the file into a string:

string data = @"
custID: 1732 
name: Juan Perez
balance: 435.00
date: 11-05-2002

custID: 554
name: Pedro Gomez
balance: 12342.30
date: 06-02-2004";

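With the data and the target entity in hand, we use a regular expression to describe the pattern we want to parse. Named captures, (?<NameHere> ), separate the data from the labels and make extraction easier than relying on numeric group indexes.

// Requires: using System.Linq; using System.Text.RegularExpressions;
string pattern = @"custID:(?<ID>[^\r\n]+)\s+name:(?<Name>[^\r\n]+)\s+balance:(?<Balance>[^\r\n]+)\s+date:(?<Date>[^\r\n]+)";

var KVPs = Regex.Matches(data, pattern)
                .OfType<Match>()
                .Select (mt => new Customer()
                {
                    // Trim, because each capture includes the spaces around the value.
                    Id = mt.Groups["ID"].Value.Trim(),
                    Name = mt.Groups["Name"].Value.Trim(),
                    Balance = mt.Groups["Balance"].Value.Trim(),
                    Date    = mt.Groups["Date"].Value.Trim(),
                })
                .ToList();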

When this runs (in LinqPad, for example), the list contains two Customer instances, one for each record in the sample data.
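If you later want typed values instead of strings, the conversion could look roughly like the sketch below. TypedCustomer and FromStrings are hypothetical names that are not part of the example above, and the sketch assumes '.' as the decimal separator and MM-dd-yyyy dates, which the sample data alone does not confirm.

```csharp
using System;
using System.Globalization;

public class TypedCustomer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Balance { get; set; }
    public DateTime Date { get; set; }

    // Hypothetical factory. Assumes '.' as the decimal separator and
    // MM-dd-yyyy dates; adjust both if your file uses other conventions.
    public static TypedCustomer FromStrings(string id, string name, string balance, string date)
    {
        return new TypedCustomer
        {
            Id = int.Parse(id.Trim(), CultureInfo.InvariantCulture),
            Name = name.Trim(),
            Balance = decimal.Parse(balance.Trim(), CultureInfo.InvariantCulture),
            Date = DateTime.ParseExact(date.Trim(), "MM-dd-yyyy", CultureInfo.InvariantCulture)
        };
    }
}
```

Using the invariant culture keeps the parsing independent of the regional settings of the machine that runs it.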

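The data you want to parse looks like an INI file. I discuss how to use (more advanced) regular expressions to parse that kind of information into a dictionary in my article "INI Files Meet Regex and Linq in C# to Avoid the WayBack Machine of Kernal32.Dll".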
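As a rough illustration of the dictionary idea (this is not the code from the linked article, just a minimal sketch), each blank-line-separated block can be read into its own dictionary, which also removes the dependency on field order. KeyValueParser and Parse are hypothetical names, and the sketch assumes that a key never contains a ':'.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeyValueParser
{
    // One "key: value" line. The key is everything before the first ':'.
    private static readonly Regex LinePattern =
        new Regex(@"^\s*(?<Key>[^:]+?)\s*:\s*(?<Value>.*?)\s*$");

    public static List<Dictionary<string, string>> Parse(string text)
    {
        var records = new List<Dictionary<string, string>>();
        Dictionary<string, string> current = null;

        foreach (var rawLine in text.Split('\n'))
        {
            if (string.IsNullOrWhiteSpace(rawLine))
            {
                // A blank line ends the current record.
                current = null;
                continue;
            }

            var match = LinePattern.Match(rawLine);
            if (!match.Success)
            {
                continue; // Skip lines that are not "key: value".
            }

            if (current == null)
            {
                current = new Dictionary<string, string>();
                records.Add(current);
            }

            current[match.Groups["Key"].Value] = match.Groups["Value"].Value;
        }

        return records;
    }
}
```

Calling KeyValueParser.Parse(data) on the sample string should give one dictionary per customer, so records[0]["name"] would hold the first customer's name.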

Answer 2

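Adapting OmegaMan's solution to FileHelpers isn't straightforward, but the following may help you get started.

Assume for the moment that you have only one record. Then the following works:

// Requires: using System; using System.Linq; using FileHelpers;
// Assert is assumed to come from a unit-test framework such as NUnit or MSTest.
[DelimitedRecord(":")]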
public class ImportRecord
{
    [FieldTrim(TrimMode.Both)]
    public string Key;
    [FieldTrim(TrimMode.Both)]
    public string Value;
}

class Program
{
    static void Main(string[] args)
    {
        var engine = new FileHelperEngine<ImportRecord>();

        string fileAsString = @"custID: 1732" + Environment.NewLine +
                              @"name: Juan Perez" + Environment.NewLine +
                              @"balance: 435.00" + Environment.NewLine +
                              @"date: 11-05-2002" + Environment.NewLine;

        ImportRecord[] validRecords = engine.ReadString(fileAsString);

        var dictionary = validRecords.ToDictionary(r => r.Key, r => r.Value);

        Assert.AreEqual(dictionary["custID"], "1732");
        Assert.AreEqual(dictionary["name"], "Juan Perez");
        Assert.AreEqual(dictionary["balance"], "435.00");
        Assert.AreEqual(dictionary["date"], "11-05-2002");

        Console.ReadKey();
    }
}

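But as soon as you have more than one record, you end up with duplicate dictionary keys and the code above no longer works. There are ways around this, though. For example, if every record has the same number of lines (4 in your example), you can do something like this: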

[DelimitedRecord(":")]
[IgnoreEmptyLines()]
public class ImportRecord
{
    [FieldTrim(TrimMode.Both)]
    public string Key;
    [FieldTrim(TrimMode.Both)]
    public string Value;
}

public class Customer
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Balance { get; set; }
    public string Date { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        var engine = new FileHelperEngine<ImportRecord>();

        string fileAsString =
@"custID: 1732
name: Juan Perez
balance: 435.00
date: 11-05-2002

custID: 554
name: Pedro Gomez
balance: 12342.30
date: 06-02-2004";

        ImportRecord[] validRecords = engine.ReadString(fileAsString);

        // Batch is not part of System.Linq; this overload is assumed to come from the MoreLINQ package.
        var customers = validRecords
            .Batch(4, x => x.ToDictionary(r => r.Key, r => r.Value))
            .Select(dictionary => new Customer()
                {
                    Id = dictionary["custID"],
                    Name = dictionary["name"],
                    Balance = dictionary["balance"],
                    Date = dictionary["date"]
                }).ToList();

        Customer customer1 = customers[0];
        Assert.AreEqual(customer1.Id, "1732");
        Assert.AreEqual(customer1.Name, "Juan Perez");
        Assert.AreEqual(customer1.Balance, "435.00");
        Assert.AreEqual(customer1.Date, "11-05-2002");

        Customer customer2 = customers[1];
        Assert.AreEqual(customer2.Id, "554");
        Assert.AreEqual(customer2.Name, "Pedro Gomez");
        Assert.AreEqual(customer2.Balance, "12342.30");
        Assert.AreEqual(customer2.Date, "06-02-2004");

        Console.WriteLine("All OK");
        Console.ReadKey();
    }
}

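Another alternative is to pre-parse the content so that it is converted into a more conventional CSV file. That is, use File.ReadAllText() to get a string, then replace the newlines with a field delimiter and the blank lines with newlines. Then read the transformed string with FileHelperEngine.ReadString().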
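A minimal sketch of that pre-parsing step might look like this. KeyValueToCsv and ToCsv are hypothetical names, the sketch also drops the key labels so that only the values remain, and it assumes that no value contains a comma.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeyValueToCsv
{
    // Hypothetical helper: turns blank-line-separated "key: value" blocks
    // into one comma-separated line per record, dropping the key labels.
    // Assumes that no value contains a comma.
    public static string ToCsv(string text)
    {
        // Split into records wherever two or more line breaks follow each other.
        var records = Regex.Split(text.Trim(), @"(?:\r?\n[ \t]*){2,}");

        var csvLines = records.Select(record =>
            string.Join(",",
                record.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
                      // Keep only the text after the first ':' on each line.
                      .Select(line => line.Substring(line.IndexOf(':') + 1).Trim())));

        return string.Join(Environment.NewLine, csvLines);
    }
}
```

The resulting string could then be passed to ReadString() on an engine whose record class is marked [DelimitedRecord(",")] and declares the four fields in file order.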
