Question

I have a plain text file that looks something like this:

Ford\tTaurus
  F-150
  F-250
Toyota\tCamry
  Corsica

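In other words, a two-level hierarchy in which the first child sits on the same line as its parent, while each subsequent child sits on its own following line and is distinguished from a parent by a two-space prefix. (The \t above stands for a literal tab character in the text.)

I need to convert it to this using RegEx: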

Ford\tTaurus
Ford\tF-150
Ford\tF-250
Toyota\tCamry
Toyota\tCorsica

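So I need to capture the parent (the text between \r\n and \t that does not start with \s\s) and apply it in the middle of every \r\n\s\s that follows, until the next parent.

I have a feeling this can be done with some sort of nested groups, but I think I need more caffeine or something; I can't seem to work out the pattern.

(Using .NET with IgnoreWhitespace off and Multiline off)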

Answer 1

Is there any particular reason you want to use a regex for this? Here's code that does what I think you want, without having to work out a regex:

using System;
using System.IO;

class Test
{
    static void Main(string[] args)
    {
        string currentManufacturer = null;

        using (TextReader reader = File.OpenText(args[0]))
        using (TextWriter writer = File.CreateText(args[1]))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string car;
                if (line.StartsWith("  "))
                {
                    if (currentManufacturer == null)
                    {
                        // Handle this properly in reality :)
                        throw new Exception("Invalid data");
                    }
                    car = line.Substring(2);
                }
                else
                {
                    string[] bits = line.Split('\t');
                    if (bits.Length != 2)
                    {
                        // Handle this properly in reality :)
                        throw new Exception("Invalid data");
                    }
                    currentManufacturer = bits[0];
                    car = bits[1];
                }
                writer.WriteLine("{0}\t{1}", currentManufacturer, car);
            }
        }
    }
}

Answer 2

Doing this with a regex is easy (though neither wise nor fast).

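Replace

(?<=^(Ford\t|Toyota\t).*?)^ {2}

with $1. Make sure that ^ and $ match at the start and end of each line (RegexOptions.Multiline) and that . matches newlines (RegexOptions.Singleline). The two spaces after the final ^ consume the child prefix, so only indented lines are rewritten and $1 takes the place of the prefix.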
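The same idea can be sketched in C# without hard-coding the manufacturer names. This is only an illustration under stated assumptions: the class name FillDown and the method FillParents are hypothetical, the parent is taken to be whatever precedes the tab on a line, and child lines are assumed to contain no tab. It relies on .NET's support for variable-length lookbehind, where the lazy .*? walks backwards and therefore stops at the nearest preceding parent line.

```csharp
using System;
using System.Text.RegularExpressions;

static class FillDown
{
    // Hypothetical helper, not part of the answer above.
    // Assumes every parent line has the form "<parent>\t<child>" and
    // every child line starts with exactly two spaces and has no tab.
    public static string FillParents(string text)
    {
        return Regex.Replace(
            text,
            // Lookbehind: the nearest earlier line that starts with "<parent>\t".
            // Then match the two-space prefix at the start of the current line.
            @"(?<=^([^\t\r\n]+\t).*?)^ {2}",
            "$1", // replace the prefix with the captured "<parent>\t"
            RegexOptions.Multiline | RegexOptions.Singleline);
    }

    static void Main()
    {
        string input =
            "Ford\tTaurus\r\n  F-150\r\n  F-250\r\nToyota\tCamry\r\n  Corsica";
        Console.WriteLine(FillParents(input));
    }
}
```

Because the lookbehind rescans earlier text for every indented line, the cost grows with the distance back to the parent, which is one reason the line-by-line loop in Answer 1 is likely the better choice for large files.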
