Question

My text file contains the following information:

add comment=user1 disabled=yes name=userA password=123456 profile=\
    Internet-128K service=pppoe
add name=user2 password=123 profile=Internet-2M service=pppoe
add disabled=yes name=user3 password=316 profile=Internet-2M service=\
    pppoe
add disabled=yes name=user4 password=1216 profile=Internet-512K service=\
    pppoe
add caller-id=8C:89:A5:68:18:9A name=user5 password=308 profile=\
    Internet-256K remote-ipv6-prefix=::/64 service=pppoe
...

As you can see, each entry starts with add and contains some fields, such as comment, disabled, name, password, profile, and so on. Now I want to extract those fields from each entry. How can I do this?

Answer 1

First you can extract each block, and then extract all the fields from it:

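using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

string text = File.ReadAllText("sample.txt");
// Each block runs from "add " up to the next line that starts with "add", or to the end of the text.
string[] items = Regex.Matches(text, @"add .*?(?=\r?\nadd|$)", RegexOptions.Singleline)
                      .Cast<Match>()
                      .Select(m => m.Value.Trim())
                      .ToArray();
foreach (string item in items)
{
    // Join continuation lines: drop the trailing backslash, the line break, and the indentation.
    string line = Regex.Replace(item, @"\\\s*\r?\n\s*", string.Empty);
    // Names may contain hyphens (caller-id); a value ends before the next " name=" or at the end of the record.
    KeyValuePair<string, string>[] pairs = Regex.Matches(line, @"(?<name>[\w-]+)=(?<value>.*?)(?=\s+[\w-]+=|$)")
                                                .Cast<Match>()
                                                .Select(m => new KeyValuePair<string, string>(m.Groups["name"].Value, m.Groups["value"].Value))
                                                .ToArray();

    Console.WriteLine(line);
    foreach (var pair in pairs)
        Console.WriteLine("{0} = {1}", pair.Key, pair.Value);
}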
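
If the fields need to be looked up by name afterwards, the same two steps could be wrapped in a helper that returns one dictionary per record. The following is only a sketch: ParseRecords and the inline sample string are hypothetical names introduced for illustration, and the simpler value pattern \S* assumes that values never contain whitespace, which holds for the data shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: turns the export text into one dictionary per "add" record.
// Assumes field values contain no whitespace (true for the sample data).
List<Dictionary<string, string>> ParseRecords(string text)
{
    // Join continuation lines: a trailing backslash, a line break, then indentation.
    string joined = Regex.Replace(text, @"\\[ \t]*\r?\n[ \t]*", string.Empty);

    return joined
        .Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
        .Select(line => line.Trim())
        .Where(line => line.StartsWith("add "))
        .Select(line => Regex.Matches(line, @"(?<name>[\w-]+)=(?<value>\S*)")
            .Cast<Match>()
            .ToDictionary(m => m.Groups["name"].Value, m => m.Groups["value"].Value))
        .ToList();
}

// Two records copied from the question, including the backslash continuations.
string sample =
    "add comment=user1 disabled=yes name=userA password=123456 profile=\\\n" +
    "    Internet-128K service=pppoe\n" +
    "add caller-id=8C:89:A5:68:18:9A name=user5 password=308 profile=\\\n" +
    "    Internet-256K remote-ipv6-prefix=::/64 service=pppoe\n";

foreach (var record in ParseRecords(sample))
{
    // TryGetValue avoids an exception for records that lack a given field.
    record.TryGetValue("disabled", out string disabled);
    Console.WriteLine("{0}: profile={1}, disabled={2}",
        record["name"], record["profile"], disabled ?? "no");
}
```

A dictionary per record makes optional fields such as disabled or caller-id easy to test for, at the cost of throwing if a record repeats a field name.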

Answer 2

I came up with a solution that does not use regular expressions, and it seems to work:

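using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Join continuation lines first: a line that ends with a backslash continues on the next line.
List<string> records = new List<string>();
string pending = string.Empty;
foreach (string rawLine in File.ReadAllLines("sample.txt"))
{
    string part = rawLine.Trim();
    bool continued = part.EndsWith("\\");
    pending += continued ? part.Substring(0, part.Length - 1) : part;
    if (continued) continue;
    records.Add(pending);
    pending = string.Empty;
}

List<Dictionary<string, string>> listDict = new List<Dictionary<string, string>>();
records.Where(record => record.StartsWith("add ")).ToList().ForEach(line =>
{
    IEnumerable<string> kvpList = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Skip(1);
    Dictionary<string, string> lineDict = new Dictionary<string, string>();
    kvpList.ToList().ForEach(kvpItem =>
    {
        // Split on the first '=' only, so a value may itself contain '='.
        string[] kvp = kvpItem.Split(new[] { '=' }, 2);
        if (kvp.Length == 2)
            lineDict[kvp[0]] = kvp[1];
    });
    listDict.Add(lineDict);
});

// Output for debug purposes
listDict.ForEach(resultLine =>
{
    resultLine.ToList().ForEach(resultPair => Console.Write("{0}:{1} ", resultPair.Key, resultPair.Value));
    Console.WriteLine();
});
Console.ReadLine();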
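
Once the records are in a list of dictionaries, individual fields can be queried with LINQ. The sketch below is an illustration only: the listDict literal is a hypothetical stand-in filled with a few values from the question rather than the result of reading the file, and disabledNames and byProfile are names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a parsed list, using values from the question.
var listDict = new List<Dictionary<string, string>>
{
    new Dictionary<string, string> { ["name"] = "user2", ["profile"] = "Internet-2M" },
    new Dictionary<string, string> { ["name"] = "user3", ["profile"] = "Internet-2M", ["disabled"] = "yes" },
    new Dictionary<string, string> { ["name"] = "user4", ["profile"] = "Internet-512K", ["disabled"] = "yes" },
};

// Names of disabled users; TryGetValue tolerates records that lack the field.
List<string> disabledNames = listDict
    .Where(d => d.TryGetValue("disabled", out string flag) && flag == "yes")
    .Select(d => d["name"])
    .ToList();

// User names grouped by profile.
Dictionary<string, List<string>> byProfile = listDict
    .GroupBy(d => d["profile"])
    .ToDictionary(g => g.Key, g => g.Select(d => d["name"]).ToList());

Console.WriteLine(string.Join(", ", disabledNames));
foreach (var entry in byProfile)
    Console.WriteLine("{0}: {1}", entry.Key, string.Join(", ", entry.Value));
```

Optional fields such as disabled are read with TryGetValue because not every record carries them, while name and profile are indexed directly on the assumption that every record has both.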

Regex: how to extract some fields from text
