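I have a program written in C# that extracts patterns from a CSV file containing examination results. One of the regular expressions, which is meant to match a 4-digit centre number, also matches other strings, such as date-time strings that contain slashes.
The 4-digit regular expression captures a named group called centreNumber: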
(?<centreNumber>[0-9]{4})
The matches logged for this pattern include:
matched centre number -> 6319
matched centre number -> 4/22/2017 6:28:17 PM
matched centre number -> 2016 MALAWI SCHOOL CERTIFICATE OF EDUCATION EXAMINATIONS
Sample input, shown line by line as it appears in the CSV:
CENTRE NO: LIKOMA SECONDARY
CAND.ID
0035
4/22/2017 6:28:17 PM
CENTRE NO: LIKOMA SECONDARY
CAND.ID
5035
4/22/2017 6:28:17 PM
CENTRE NO: CHIFUNGA COMMUNITY
CAND.ID
0224
4/22/2017 6:28:46 PM
CENTRE NO: CHIKONDE COMMUNITY
CAND.ID
0238
4/22/2017 6:28:46 PM
Expected output for the sample input above:
0035
5035
0224
0238
To access the named group, I have loaded the regex into a constant:
// Declared in classes.AppConstants:
public const String REGEX_MSCE_CENTRE_NO = @"(?<centreNumber>[0-9]{4})";

// Build the regex once, outside the read loop
Regex cNoRegex = new Regex(classes.AppConstants.REGEX_MSCE_CENTRE_NO, RegexOptions.Compiled | RegexOptions.IgnoreCase);
using (StreamReader sr = new StreamReader(filepath))
{
    while (!sr.EndOfStream)
    {
        var oneLine = sr.ReadLine(); // read a single line from the CSV
        MatchCollection matches = cNoRegex.Matches(oneLine);
        if (matches.Count == 1)
        {
            Console.WriteLine("matched centre number -> " + oneLine);
        }
    }
}
Answer 0 (score: 3)
As noted in FLydog57's comment, all we need here are start and end anchors, which should solve the problem. For example:
^[0-9]{4}$
^\d{4}$
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string pattern = @"^[0-9]{4}$";
        string input = @"6319
4/22/2017 6:28:17 PM
2016 MALAWI SCHOOL CERTIFICATE OF EDUCATION EXAMINATIONS
2016";
        RegexOptions options = RegexOptions.Multiline;

        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}
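The same idea can be applied to the line-by-line loop from the question. The unanchored pattern finds "2017" inside the date line and "2016" at the start of the title line, which is why both were logged; with anchors, only lines that consist of exactly four digits match. The following is a minimal sketch, not a drop-in replacement: the class name CentreNumberDemo and the helper ExtractCentreNumbers are hypothetical, and the call to Trim() assumes that stray whitespace or a trailing '\r' may appear in the input.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CentreNumberDemo
{
    // Anchored version of the question's pattern; the named group is kept.
    private static readonly Regex CentreNo =
        new Regex(@"^(?<centreNumber>[0-9]{4})$", RegexOptions.Compiled);

    // Hypothetical helper: returns the centre numbers found in the given lines.
    public static List<string> ExtractCentreNumbers(IEnumerable<string> lines)
    {
        var result = new List<string>();
        foreach (var line in lines)
        {
            // Assumption: trimming is acceptable for this input
            // (removes padding or a stray '\r' before matching).
            Match m = CentreNo.Match(line.Trim());
            if (m.Success)
            {
                // Read the value through the named group rather than the whole line.
                result.Add(m.Groups["centreNumber"].Value);
            }
        }
        return result;
    }

    public static void Main()
    {
        var sample = new[]
        {
            "CENTRE NO: LIKOMA SECONDARY",
            "CAND.ID",
            "0035",
            "4/22/2017 6:28:17 PM",
            "2016 MALAWI SCHOOL CERTIFICATE OF EDUCATION EXAMINATIONS",
            "5035"
        };

        foreach (var number in ExtractCentreNumbers(sample))
        {
            Console.WriteLine("matched centre number -> " + number);
        }
    }
}
```

Because each line is matched on its own, RegexOptions.Multiline is not needed here, and RegexOptions.IgnoreCase has no effect on a digits-only pattern. In the original loop, the lines could come from sr.ReadLine() instead of the sample array.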