C# regex match case - split a string and write the output to a file

Time: 2017-01-26 13:04:55

Tags: c# regex split match streamreader

Basically, I have a text file of records in this format:

(1909, 'Ford', 'Model T'),
(1926, 'Chrysler', 'Imperial'),
(1948, 'Citroën', '2CV'),

I want to output them to a text file in the following format:

new Vehicle() { Id = 1, Year = 1909, Make = "Ford", Model = "Model T" },
new Vehicle() { Id = 2, Year = 1926, Make = "Chrysler", Model = "Imperial" },
new Vehicle() { Id = 3, Year = 1948, Make = "Citroën", Model = "2CV" },

I know I need to split each line into its relevant parts, for example by following similar questions, but I have hit a mental block on how to get the matching string sections for Year, Make and Model.

So far I have found a pattern that matches everything between the parentheses, but I am not sure how to group the values and split them on the commas.

Any help is greatly appreciated.

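4 answers:

Answer 0: (score: 1)

Why not use string.Split(',')? It is usually faster than a Regex and fits your case (after first removing the trailing ',' from each line, of course; the parentheses and quotes also have to be trimmed).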
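The Split approach can be sketched roughly as follows. This is only an illustration: the class name SplitSketch and the helper ParseRecord are hypothetical, and it assumes that no Make or Model value contains a comma.

```csharp
using System;

static class SplitSketch
{
    // Hypothetical helper: turns "(1909, 'Ford', 'Model T')," into { "1909", "Ford", "Model T" }.
    // Assumes the values themselves never contain a comma.
    public static string[] ParseRecord(string line)
    {
        // Strip whitespace, the trailing comma and the surrounding parentheses.
        string inner = line.Trim().TrimEnd(',').Trim('(', ')');
        string[] parts = inner.Split(',');
        for (int i = 0; i < parts.Length; i++)
        {
            // Remove the padding and the single quotes around Make and Model.
            parts[i] = parts[i].Trim().Trim('\'');
        }
        return parts;
    }

    public static void Main()
    {
        string[] fields = ParseRecord("(1909, 'Ford', 'Model T'),");
        Console.WriteLine(string.Join(" | ", fields));
    }
}
```

If a value could contain a comma, this simple split would break and one of the regex or parser answers would be the safer choice.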

Answer 1: (score: 1)

A regular expression that captures them as groups:

\((\d+),\s+[']([\w\së]+)['],\s+[']([\w\s]+)[']\)[,]*

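Note the ë in Citroën: the pattern lists it explicitly. Characters outside a-z and A-Z (such as ë or ÿ) only need to be listed when \w is restricted to ASCII, for example with RegexOptions.ECMAScript; by default, \w in .NET already matches Unicode letters.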
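Whether ë has to be listed depends on how the engine defines \w. A quick check, assuming the standard .NET Regex class (the class name WordClassCheck and the helper IsAllWordChars are made up for this sketch):

```csharp
using System;
using System.Text.RegularExpressions;

static class WordClassCheck
{
    // True when every character of text is matched by \w under the given options.
    public static bool IsAllWordChars(string text, RegexOptions options)
    {
        return Regex.IsMatch(text, @"^\w+$", options);
    }

    public static void Main()
    {
        // Default options: \w covers Unicode letters, so ë is matched.
        Console.WriteLine(IsAllWordChars("Citroën", RegexOptions.None));
        // ECMAScript mode: \w is limited to [a-zA-Z_0-9], so ë is not matched.
        Console.WriteLine(IsAllWordChars("Citroën", RegexOptions.ECMAScript));
    }
}
```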

To use it in code, you read the groups like this:

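using System;
using System.Text.RegularExpressions;

string cars = @"(1909, 'Ford', 'Model T'),";
string pattern = @"\((\d+),\s+[']([\w\së]+)['],\s+[']([\w\s]+)[']\)[,]*";
var lResult = Regex.Match(cars, pattern);

// Groups[0] is the whole match; Groups[1] to Groups[3] are Year, Make and Model.
if (lResult.Success)
    foreach (var iGroup in lResult.Groups)
        Console.WriteLine(iGroup);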

lResult.Groups holds the information about the car; you only have to write it to the file in the format you need.

C# 6.0:

Console.WriteLine($"new Vehicle() {{ Id = 1, Year = {lResult.Groups[1]}, Make = \"{lResult.Groups[2]}\", Model = \"{lResult.Groups[3]}\" }},");

Old syntax:

Console.WriteLine("new Vehicle() { Id = 1, Year = " + lResult.Groups[1] + ", Make = \"" + lResult.Groups[2] + "\", Model = \"" + lResult.Groups[3] + "\" },");

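Once you automate this in a for loop, you can easily add the Id.

In my example, Groups[0] holds the entire matched string, which is why my indices run from 1 to 3.

As @Toto pointed out, \w already includes \d, so there is no need to list \d separately.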
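The loop mentioned above, including the Id counter and the file output, might look roughly like this. It is a sketch rather than a definitive implementation: it reuses the pattern from this answer, the names VehicleConverter and ToInitializers are hypothetical, and "cars.txt" and "vehicles.txt" are placeholder file names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

static class VehicleConverter
{
    // Pattern taken from this answer: group 1 = Year, group 2 = Make, group 3 = Model.
    static readonly Regex RecordPattern =
        new Regex(@"\((\d+),\s+[']([\w\së]+)['],\s+[']([\w\s]+)[']\)[,]*");

    // Hypothetical helper: converts input records into "new Vehicle() { ... }," lines.
    public static List<string> ToInitializers(IEnumerable<string> lines)
    {
        var output = new List<string>();
        int id = 1;
        foreach (string line in lines)
        {
            Match m = RecordPattern.Match(line);
            if (!m.Success) continue; // skip lines that are not records; the Id is not consumed
            output.Add($"new Vehicle() {{ Id = {id}, Year = {m.Groups[1].Value}, Make = \"{m.Groups[2].Value}\", Model = \"{m.Groups[3].Value}\" }},");
            id++;
        }
        return output;
    }

    public static void Main()
    {
        // Placeholder file names; this fails if cars.txt does not exist.
        File.WriteAllLines("vehicles.txt", ToInitializers(File.ReadLines("cars.txt")));
    }
}
```

Keeping the conversion separate from the file access makes it possible to try the pattern on a few in-memory strings before pointing it at the real file.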

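Answer 2: (score: 1)

If you are willing to use a parser framework (which may be overkill), you could use Sprache, for example. A sample without proper error handling: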

using System;
using System.Collections.Generic;
using System.Linq;
using Sprache;

Parser<string> stringContent =
    from open in Parse.Char('\'').Once()
    from content in Parse.CharExcept('\'').Many().Text()
    from close in Parse.Char('\'').Once()
    select content;

Parser<string> numberContent = Parse.Digit.AtLeastOnce().Text();
Parser<string> element = stringContent.XOr(numberContent);

Parser<List<string>> elements =
    from e in element.DelimitedBy(Parse.Char(',').Token())
    select e.ToList();

Parser<List<string>> parser =
    from open in Parse.Char('(').Once()
    from content in elements
    from close in Parse.Char(')').Once()
    select content;

var input = new List<string> { "(1909, 'Ford', 'Model T')", "(1926, 'Chrysler', 'Imperial')", "(1948, 'Citroën', '2CV')" };

foreach (var line in input)
{
    var parsed = parser.Parse(line);
    var year = Int32.Parse(parsed[0]);
    var make = parsed[1];
    var model = parsed[2];

    Console.WriteLine(">> " + year + " " + make + " " + model);
}

Answer 3: (score: 1)

You can use this snippet, which is based on named capture groups:

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var cars = new List<string>() {
    "(1909, 'Ford', 'Model T')",
    "(1926, 'Chrysler', 'Imperial')",
    "(1948, 'Citroën', '2CV')",
};

var regex = @"(?<Year>\d+).*?'(?<Brand>.*?)'.*?'(?<Model>.*?)'";

foreach (var car in cars)
{
    var match = Regex.Match(car, regex);
    if (match.Success)
    {
        Console.WriteLine($"{match.Groups["Brand"]} make {match.Groups["Model"]} in {match.Groups["Year"]}");
    }
}

This will print:

Ford make Model T in 1909
Chrysler make Imperial in 1926
Citroën make 2CV in 1948