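Help building a regular expression

Time: 2014-01-23 19:27:57

Tags: c# regex

I need help building a regular expression.

In my MVC5 view, I have a textarea containing one or more groups of integers; each group can contain 6, 7, or 8 characters.

In my controller, I need to extract all of these numbers from the input string and put them into an array.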

Examples: 123456 123457 123458  or

123456 123457 123458

or 123456,123457,123458

The groups may or may not have 1 or 2 leading zeros:

00123456,00123457 123458

This is what I ended up with:

    public string[] ExtractWorkOrderNumbers(string myText)
    {
        var result = new List<string>();
        var regex = new Regex(@"( |,)*(\d+)");
        var m = regex.Match(myText);

        while (m.Success)
        {
            // Group 2 holds the digits; add each distinct value once.
            var wo = m.Groups[2].Value;
            if (!result.Contains(wo))
            {
                result.Add(wo);
            }
            m = m.NextMatch();
        }
        return result.ToArray();
    }

2 answers:

Answer 0: (score: 1)

Assumption: zero or more spaces and/or commas are used as separators.

    [TestMethod()]
    public void TestMethod3()
    {
        var myText = "123456 1234567, 123458, 00123456, 01234567";
        var regex = new Regex(@"( |,)*(\d+)");
        var m = regex.Match(myText);
        var matchCount = 0;
        while (m.Success)
        {
            Console.WriteLine("Match" + (++matchCount));
            for (int i = 1; i <= 2; i++)
            {
                Group g = m.Groups[i];
                Console.WriteLine("Group" + i + "='" + g + "'");
                CaptureCollection cc = g.Captures;
                for (int j = 0; j < cc.Count; j++)
                {
                    Capture c = cc[j];
                    Console.WriteLine("Capture" + j + "='" + c + "', Position=" + c.Index);
                }
            }
            m = m.NextMatch();
        }
    }

Output: (for each match, Group2 is the number you want and Group1 is the separator)

Match1
Group1=''
Group2='123456'
Capture0='123456', Position=0
Match2
Group1=' '
Capture0=' ', Position=6
Group2='1234567'
Capture0='1234567', Position=7
Match3
Group1=' '
Capture0=',', Position=14
Capture1=' ', Position=15
Group2='123458'
Capture0='123458', Position=16
Match4
Group1=' '
Capture0=',', Position=22
Capture1=' ', Position=23
Group2='00123456'
Capture0='00123456', Position=24
Match5
Group1=' '
Capture0=',', Position=32
Capture1=' ', Position=33
Group2='01234567'
Capture0='01234567', Position=34
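
If only the numbers are needed and not the separators, the loop over groups and captures can be skipped. The following is a minimal sketch rather than part of the answer above: it assumes each number is a standalone run of 6 to 8 digits with any leading zeros counted in that length, and the class name `WorkOrderParser` is hypothetical. It collects the values with `Regex.Matches` and removes duplicates with LINQ's `Distinct`.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WorkOrderParser // hypothetical name, for illustration only
{
    // Assumption: a number is a standalone run of 6 to 8 digits, leading zeros included.
    // The lookarounds reject longer digit runs instead of matching part of them.
    private static readonly Regex NumberPattern = new Regex(@"(?<!\d)\d{6,8}(?!\d)");

    public static string[] ExtractWorkOrderNumbers(string myText)
    {
        return NumberPattern.Matches(myText)
            .Cast<Match>()            // MatchCollection is non-generic on older frameworks
            .Select(m => m.Value)     // the whole match is the number
            .Distinct()               // drop repeated values
            .ToArray();
    }
}
```

For example, `WorkOrderParser.ExtractWorkOrderNumbers("00123456,00123457 123458 123458")` yields three strings: `00123456`, `00123457`, and `123458`. Separators never need to be listed in the pattern, so newlines or tabs between numbers work as well.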

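Answer 1: (score: 0)

Using the named capturing groups feature of regular expressions (Regex), we can extract data from a matched pattern. In your case, we can extract the integer portion of the text string that follows any leading zeros: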

    using System;
    using System.Text.RegularExpressions;

    // At most two leading zeros, then a 6- to 8-digit number that does not itself start with zero.
    // Digits after the first may be zero, so values such as 100234 are accepted.
    var regex = new Regex(@"^0{0,2}(?<Value>[1-9]\d{5,7})$");

    var firstString = "123456";
    var secondString = "01234567";
    var thirdString = "0012345678";

    var firstMatch = regex.Match(firstString);
    var secondMatch = regex.Match(secondString);
    var thirdMatch = regex.Match(thirdString);

    int firstValue = 0;
    int secondValue = 0;
    int thirdValue = 0;

    if (firstMatch.Success)
      int.TryParse(firstMatch.Groups["Value"].Value, out firstValue);

    if (secondMatch.Success)
      int.TryParse(secondMatch.Groups["Value"].Value, out secondValue);

    if (thirdMatch.Success)
      int.TryParse(thirdMatch.Groups["Value"].Value, out thirdValue);

    Console.WriteLine("First Value  = {0}", firstValue);
    Console.WriteLine("Second Value = {0}", secondValue);
    Console.WriteLine("Third Value  = {0}", thirdValue);

Output:

First Value  = 123456
Second Value = 1234567
Third Value  = 12345678
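
Because the pattern in this answer is anchored with `^` and `$`, it only matches a string that holds exactly one number, whereas the textarea in the question holds several. A possible adaptation, sketched under assumptions: the anchors are swapped for lookarounds so the same named group can be applied across the whole input, and each number is taken to have 6 to 8 significant digits after up to two leading zeros. The class name `WorkOrderValues` and method name `Extract` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WorkOrderValues // hypothetical name, for illustration only
{
    // Assumption: up to two leading zeros, then 6 to 8 digits not starting with zero.
    // The lookarounds stand in for ^ and $ so several numbers can share one string.
    private static readonly Regex Pattern =
        new Regex(@"(?<![0-9])0{0,2}(?<Value>[1-9][0-9]{5,7})(?![0-9])");

    public static int[] Extract(string input)
    {
        var values = new List<int>();
        foreach (Match m in Pattern.Matches(input))
        {
            // The named group holds at most 8 ASCII digits, so it always fits in an int.
            values.Add(int.Parse(m.Groups["Value"].Value));
        }
        return values.ToArray();
    }
}
```

Called as `WorkOrderValues.Extract("123456, 01234567 0012345678")`, this returns the same three values printed above: 123456, 1234567, and 12345678. Digit runs with three or more leading zeros, or with more than 8 significant digits, are skipped rather than partially matched.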