Regex - ignore whitespace

Date: 2021-07-02 00:02:42

Tags: c# regex

I have a regular expression:

Regex.Match(result, @"\bTop Rate\b.*?\s*\s*([\d,\.]+)", RegexOptions.IgnoreCase);

and then I parse the captured value as an int:

topRate = int.Parse(topRateMatch.Groups[1].Value, System.Globalization.NumberStyles.AllowThousands);

Example:

Top Rate: 888,888
Output: 888888

With my current regex I get the int output just fine. However, I noticed a problem when there is whitespace between the digits, for example:

Top Rate: 8         88,888

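I only get 8. Is there a way to ignore any whitespace that may or may not be present between the digits, or after the "Top Rate" label?

Examples: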

Top Rate:                       8                      88,888
Expected output: 888888

Top Rate:                       8     88,888
Expected output: 888888

Top Rate: 8                      88,888
Expected output: 888888

Top Rate: 8 8 8,888
Expected output: 888888

Top Rate: 888,          8  88
Expected output: 888888
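
The behaviour described above can be reproduced with a minimal sketch. The class and method names (OriginalPatternRepro, Capture) are made up for illustration; only the pattern and RegexOptions.IgnoreCase come from the question. The character class [\d,\.]+ does not include whitespace, so the capture stops at the first space inside the number.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OriginalPatternRepro
{
    // The pattern from the question, unchanged.
    private const string Pattern = @"\bTop Rate\b.*?\s*\s*([\d,\.]+)";

    // Hypothetical helper: returns the raw text of group 1,
    // or an empty string when the pattern does not match.
    public static string Capture(string input)
    {
        Match m = Regex.Match(input, Pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // No whitespace inside the number: the whole value is captured.
        Console.WriteLine(Capture("Top Rate: 888,888"));          // 888,888

        // Whitespace inside the number: the capture ends at the first space.
        Console.WriteLine(Capture("Top Rate: 8         88,888")); // 8
    }
}
```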

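4 answers:

Answer 0 (score: 2)

First of all, you cannot skip or omit whitespace while matching and capturing a number; that can only be done by extracting multiple matches after the given string. However, there is a simple two-step approach.

You can add \s to the character class to match any whitespace, or \p{Zs}\t to match only horizontal whitespace. I suggest capturing a digit first with \d, followed by an optional non-capturing group that ends with a digit pattern, so that the captured value always starts and ends with a digit: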

\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)

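See the regex demo. Note that repeating \s*\s* makes no sense, because a single \s* already matches zero or more whitespace characters. Even that single \s* is redundant here, since .*? matches any zero or more characters other than LF, as few as possible. To make the pattern match across lines, add the RegexOptions.Singleline option.

Details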

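  • \bTop Rate\b - the whole phrase Top Rate, delimited by word boundaries
  • .*? - any zero or more characters other than a newline, as few as possible
  • (\d(?:[\d,.\s]*\d)?) - Group 1:
    • \d - a digit
    • (?:[\d,.\s]*\d)? - an optional non-capturing group that matches zero or more digits, commas, dots or whitespace characters, followed by a digit.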
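
As a rough illustration of the points above, the following sketch shows what group 1 captures and how RegexOptions.Singleline changes the result. The class and method names (CaptureSketch, Capture) and the sample inputs with a trailing comma or a line break are assumptions made for this example; only the pattern is taken from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureSketch
{
    // The pattern suggested in this answer.
    private const string Pattern = @"\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)";

    // Hypothetical helper: returns the raw text of group 1,
    // or an empty string when the pattern does not match.
    public static string Capture(string input, RegexOptions options)
    {
        Match m = Regex.Match(input, Pattern, options);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // Whitespace inside the number is now part of the capture.
        Console.WriteLine("[" + Capture("Top Rate: 8 8 8,888", RegexOptions.None) + "]");      // [8 8 8,888]

        // Trailing separators are dropped because the group must end with a digit.
        Console.WriteLine("[" + Capture("Top Rate: 888,  ", RegexOptions.None) + "]");         // [888]

        // Without Singleline, .*? cannot cross the line break after the label.
        Console.WriteLine("[" + Capture("Top Rate:\n888,888", RegexOptions.None) + "]");       // []

        // With Singleline, the dot also matches the line break.
        Console.WriteLine("[" + Capture("Top Rate:\n888,888", RegexOptions.Singleline) + "]"); // [888,888]
    }
}
```

The captured text still contains the separators at this point, which is why a second step is needed before parsing.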

Next, once you have the match, keep only the digits.

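using System;
using System.Linq;                    // needed for Where/ToArray
using System.Text.RegularExpressions;

var text = "Top Rate: 8                      88,888";
var result = Regex.Match(text, @"\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)", RegexOptions.Singleline);
if (result.Success)
{
    // Drop everything except digits from the captured value.
    Console.WriteLine( new string(result.Groups[1].Value.Where(c => char.IsDigit(c)).ToArray()) );
}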

See the C# demo. For multiple matches:

var text = "Top Rate: 8                      88,888 and Top Rate:                       8  \n   88,888";
var results = Regex.Matches(text, @"\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)", RegexOptions.Singleline)
        .Cast<Match>()
        .Select(x => new string(x.Groups[1].Value.Where(c => char.IsDigit(c)).ToArray()));
foreach (var s in results)
{
    Console.WriteLine( s );
}

See this C# demo.
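
Since the question ultimately needs an int, the two steps can be combined into one helper. This is only a sketch under stated assumptions: the names TopRateParser and TryParseTopRate are hypothetical, RegexOptions.IgnoreCase is carried over from the question, and only ASCII digits are kept so that int.TryParse accepts the result. Note that dots are discarded as well, so an input like 1.5 would be read as 15.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class TopRateParser
{
    // Pattern from the answer above; IgnoreCase is taken from the question.
    private static readonly Regex TopRatePattern = new Regex(
        @"\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: returns true and sets topRate when a value is found
    // and fits into an int; returns false otherwise.
    public static bool TryParseTopRate(string input, out int topRate)
    {
        topRate = 0;
        Match m = TopRatePattern.Match(input);
        if (!m.Success)
        {
            return false;
        }

        // Step 2: keep ASCII digits only, dropping commas, dots and whitespace.
        string digits = new string(
            m.Groups[1].Value.Where(c => c >= '0' && c <= '9').ToArray());

        return int.TryParse(digits, NumberStyles.None, CultureInfo.InvariantCulture, out topRate);
    }

    public static void Main()
    {
        string[] samples =
        {
            "Top Rate:                       8                      88,888",
            "Top Rate: 8 8 8,888",
            "Top Rate: 888,          8  88",
        };

        foreach (string sample in samples)
        {
            // Each of these inputs prints 888888.
            if (TryParseTopRate(sample, out int value))
            {
                Console.WriteLine(value);
            }
        }
    }
}
```

Using TryParse instead of Parse avoids an exception when the digits overflow an int or when no ASCII digits remain.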

Answer 1 (score: 0)

Something like this?

using System;
using System.Text.RegularExpressions;
                    
public class Program
{
  public static void Main()
  {
    string[] texts = {
      "This should Not match the Top Rate thing",
      " Top Rate    : 888,888 ",
      "Top    Rate   : 8 8 8 , 8 8 8 ",
    };
    Regex rxNonDigit = new Regex(@"\D+"); // matches 1 or more characters other than decimal digits.
    Regex rxTopRate = new Regex(@"
      ^           # match start of the string (no RegexOptions.Multiline), followed by
      \s*         # zero or more lead-in whitespace characters, followed by
      Top         # the literal 'Top', followed by
      \s+         # 1 or more whitespace characters,followed by
      Rate        # the literal 'Rate', followed by
      \s*         # zero or more whitespace characters, followed by
      :           # a literal colon ':', followed by
      \s*         # zero or more whitespace characters followed by
      (?<rate>    # a named (explicit) capture group, containing
        \d+       # - 1 or more decimal digits, followed by
        (         # - an unnamed group, containing
          (\s|,)+ #     - interstitial whitespace or a comma, followed by
          \d+     #     - 1 or more decimal digits
        )*        #   the whole of which is repeated zero or more times
      )           # followed by
      \s*         # zero or more lead-out whitespace characters, followed by
      $           # end of the string (no RegexOptions.Multiline)
    ", RegexOptions.IgnorePatternWhitespace|RegexOptions.ExplicitCapture );

    foreach ( string text in texts )
    {
      Match m = rxTopRate.Match(text);
      if (!m.Success)
      {
        Console.WriteLine("No Match: '{0}'", text);
      }
      else
      {
        string rawValue = m.Groups["rate"].Value;
        string cleanedValue = rxNonDigit.Replace(rawValue, "");
        Decimal value = Decimal.Parse(cleanedValue);

        Console.WriteLine(@"Matched: '{0}' >>> '{1}' >>> '{2}' >>> {3}",
          text,
          rawValue,
          cleanedValue,
          value
        );
      }
    }

  }
    
}

Answer 2 (score: 0)

I checked, and found that a small change to the regex achieves your goal.

First: [screenshot not preserved]

Second: [screenshot not preserved]

Answer 3 (score: -2)

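// Strings are immutable in C#, so build a new string without the commas
// instead of assigning to TopRate[x].
String TopRate = "88,888";
var sb = new System.Text.StringBuilder();
for (int x = 0; x < TopRate.Length; x++)
{
    if (TopRate[x] != ',')   // compare against a char literal, not a string
    {
        sb.Append(TopRate[x]);
    }
}
TopRate = sb.ToString();     // "88888"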