I have a regular expression:
var topRateMatch = Regex.Match(result, @"\bTop Rate\b.*?\s*\s*([\d,\.]+)", RegexOptions.IgnoreCase);
and then I parse the captured value as an int:
topRate = int.Parse(topRateMatch.Groups[1].Value, System.Globalization.NumberStyles.AllowThousands);
Example:
Top Rate: 888,888
Output: 888888
With my current regex I get the int output just fine. However, I noticed that it breaks when there are spaces between the digits, for example:
Top Rate: 8 88,888
I only get 8. Is there a way to ignore any whitespace that may or may not appear between the digits or after the "Top Rate" label?
Examples:
Top Rate: 8 88,888
Expected output: 888888
Top Rate: 8 88,888
Expected output: 888888
Top Rate: 8 88,888
Expected output: 888888
Top Rate: 8 8 8,888
Expected output: 888888
Top Rate: 888, 8 88
Expected output: 888888
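The behavior described above can be reproduced with a short sketch. The class and method names (ReproDemo, ParseWithOriginalRegex) are hypothetical, and CultureInfo.InvariantCulture is passed explicitly as an assumption so that "," is treated as the thousands separator regardless of the machine's culture. The character class [\d,\.]+ contains no whitespace, so the capture stops at the first space.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ReproDemo
{
    // Hypothetical helper wrapping the regex and parse call from the question.
    // Throws FormatException if the pattern does not match (Group 1 is then empty).
    public static int ParseWithOriginalRegex(string result)
    {
        Match topRateMatch = Regex.Match(
            result, @"\bTop Rate\b.*?\s*\s*([\d,\.]+)", RegexOptions.IgnoreCase);
        return int.Parse(
            topRateMatch.Groups[1].Value,
            NumberStyles.AllowThousands,
            CultureInfo.InvariantCulture); // assumption: invariant culture, "," groups digits
    }

    public static void Main()
    {
        // No spaces between the digits: the whole number is captured.
        Console.WriteLine(ParseWithOriginalRegex("Top Rate: 888,888"));  // 888888
        // A space after the first digit ends the capture early.
        Console.WriteLine(ParseWithOriginalRegex("Top Rate: 8 88,888")); // 8
    }
}
```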
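Answer 0 (score: 2)
First of all, you cannot skip or omit whitespace while matching and capturing the number in a single capture; that would only be possible by extracting multiple matches after the given string. However, there is an easy two-step approach.
You can add \s to the character class to match any whitespace, or add \p{Zs} and \t to match only horizontal whitespace. I suggest capturing a digit first with \d and then using an optional non-capturing group that ends with a digit pattern, so that the captured number is guaranteed to start and end with a digit: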
\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)
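See the regex demo. Note that repeating \s*\s* makes no sense: \s* already matches zero or more whitespace characters. Even a single \s* is redundant here, because .*? matches any characters other than LF, as few as possible, and that includes spaces and tabs. To make the pattern match across lines, add the RegexOptions.Singleline option.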
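As a rough illustration of the two points above, the sketch below compares the question's pattern with and without \s*\s* on a single-line input, and then compares the \s character class with the horizontal-whitespace variant [\p{Zs}\t] on an input that contains a line break. The class name WhitespaceDemo, the helper Capture, and the sample strings are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Hypothetical helper: returns Group 1 of the first match, or "" if there is none.
    public static string Capture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string oneLine = "Top Rate: 888,888";
        // With and without \s*\s*: both capture "888,888".
        Console.WriteLine(Capture(@"\bTop Rate\b.*?\s*\s*([\d,\.]+)", oneLine));
        Console.WriteLine(Capture(@"\bTop Rate\b.*?([\d,\.]+)", oneLine));

        string twoLines = "Top Rate: 8 88\n999";
        // \s also matches the line break, so the capture runs on into "999".
        string withAnyWhitespace = Capture(@"\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)", twoLines);
        // [\p{Zs}\t] matches spaces and tabs only, so the capture stops at "8 88".
        string withHorizontalOnly = Capture(@"\bTop Rate\b.*?(\d(?:[\d,.\p{Zs}\t]*\d)?)", twoLines);

        Console.WriteLine(withAnyWhitespace.Replace("\n", "\\n")); // 8 88\n999
        Console.WriteLine(withHorizontalOnly);                      // 8 88
    }
}
```

Which variant is appropriate depends on whether a number may legitimately wrap onto the next line in the input.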
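Details:
\bTop Rate\b - the whole-word phrase Top Rate
.*? - any zero or more characters other than a newline (any character at all when RegexOptions.Singleline is set), as few as possible
(\d(?:[\d,.\s]*\d)?) - Group 1:
\d - a digit
(?:[\d,.\s]*\d)? - an optional non-capturing group matching zero or more digits, commas, dots or whitespace characters, followed by a digit.
Next, once you have the match, keep only the digits: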
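using System;
using System.Linq;
using System.Text.RegularExpressions;

var text = "Top Rate: 8 88,888";
var result = Regex.Match(text, @"\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)", RegexOptions.Singleline);
if (result.Success)
{
    // Keep only the digit characters of Group 1; prints 888888
    Console.WriteLine(new string(result.Groups[1].Value.Where(c => char.IsDigit(c)).ToArray()));
}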
See the C# demo. For multiple matches:
var text = "Top Rate: 8 88,888 and Top Rate: 8 \n 88,888";
var results = Regex.Matches(text, @"\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)", RegexOptions.Singleline)
.Cast<Match>()
.Select(x => new string(x.Groups[1].Value.Where(c => char.IsDigit(c)).ToArray()));
foreach (var s in results)
{
Console.WriteLine( s );
}
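To get back to the int the question asks for, the digits-only string can be handed to int.TryParse. This is a minimal sketch, not part of the original answer: TopRateParser and TryParseTopRate are hypothetical names, and it assumes the value fits in an int and that "." only ever appears as a grouping character (a decimal such as 8.5 would be read as 85).

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class TopRateParser
{
    private static readonly Regex TopRatePattern = new Regex(
        @"\bTop Rate\b.*?(\d(?:[\d,.\s]*\d)?)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: returns false if there is no match or the digits do not fit in an int.
    public static bool TryParseTopRate(string text, out int topRate)
    {
        topRate = 0;
        Match match = TopRatePattern.Match(text);
        if (!match.Success)
        {
            return false;
        }

        // Step 2: drop commas, dots and whitespace, keeping ASCII digits only.
        string digits = new string(
            match.Groups[1].Value.Where(c => c >= '0' && c <= '9').ToArray());

        return int.TryParse(digits, NumberStyles.None, CultureInfo.InvariantCulture, out topRate);
    }

    public static void Main()
    {
        string[] samples =
        {
            "Top Rate: 888,888",
            "Top Rate: 8 8 8,888",
            "Top Rate: 888, 8 88",
            "No rate here",
        };

        foreach (string sample in samples)
        {
            if (TryParseTopRate(sample, out int value))
                Console.WriteLine($"{sample} -> {value}");
            else
                Console.WriteLine($"{sample} -> no match");
        }
    }
}
```

Using TryParse rather than Parse means an overlong digit run returns false and leaves the decision to the caller, where Parse would throw.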
Answer 1 (score: 0)
Something like this?
using System;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        string[] texts = {
            "This should Not match the Top Rate thing",
            " Top Rate : 888,888 ",
            "Top Rate : 8 8 8 , 8 8 8 ",
        };
        Regex rxNonDigit = new Regex(@"\D+"); // matches 1 or more characters other than decimal digits.
        Regex rxTopRate = new Regex(@"
            ^           # match start of the string, followed by
            \s*         # zero or more lead-in whitespace characters, followed by
            Top         # the literal 'Top', followed by
            \s+         # 1 or more whitespace characters, followed by
            Rate        # the literal 'Rate', followed by
            \s*         # zero or more whitespace characters, followed by
            :           # a literal colon ':', followed by
            \s*         # zero or more whitespace characters, followed by
            (?<rate>    # a named (explicit) capture group, containing
              \d+       # - 1 or more decimal digits, followed by
              (         # - an unnamed group, containing
                (\s|,)+ #   - interstitial whitespace or a comma, followed by
                \d+     #   - 1 or more decimal digits
              )*        #   the whole of which is repeated zero or more times
            )           # followed by
            \s*         # zero or more lead-out whitespace characters, followed by
            $           # end of the string
            ", RegexOptions.IgnorePatternWhitespace|RegexOptions.ExplicitCapture );
        foreach ( string text in texts )
        {
            Match m = rxTopRate.Match(text);
            if (!m.Success)
            {
                Console.WriteLine("No Match: '{0}'", text);
            }
            else
            {
                string rawValue = m.Groups["rate"].Value;
                string cleanedValue = rxNonDigit.Replace(rawValue, "");
                Decimal value = Decimal.Parse(cleanedValue);
                Console.WriteLine(@"Matched: '{0}' >>> '{1}' >>> '{2}' >>> {3}",
                    text,
                    rawValue,
                    cleanedValue,
                    value
                );
            }
        }
    }
}
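Answer 2 (score: 0)
Answer 3 (score: -2)
using System;
using System.Text;

string topRate = "88,888";
// Strings are immutable in C#, so build a new string instead of assigning to topRate[x].
var digits = new StringBuilder();
for (int x = 0; x < topRate.Length; x++)
{
    // Compare against the char literal ',' (not the string ",") and skip every comma.
    if (topRate[x] != ',')
    {
        digits.Append(topRate[x]);
    }
}
topRate = digits.ToString(); // "88888"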