我正在尝试编写一个正则表达式,该表达式可用于查找字符串中的日期,该字符串的前面(或后面)可以有空格,数字,文本,行尾等。该表达式应处理美国日期格式是
1)年的月份名称天,即2019年1月10日,或
2)mm / dd / yy-即11/30/19
我找到这个是因为月份名称(Day Year)
(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}
(感谢Veverke在这里Regex to match date like month name day comma and year
这是mm / dd / yy(以及m / d / y的各种组合)
(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}
(感谢Steven Levithan和Jan Goyvaerts在这里https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/ch04s04.html
我试图这样合并它们
((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})|((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})
,当我在输入字符串“ Pad on 1/1/2019”中搜索“ on [regex above]”时,它确实找到了日期,但没有找到单词“ on”。如果我只是使用
,则会找到该字符串(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}
任何人都可以看到我在做什么吗?
修改
我正在使用下面的c#.net代码:
string stringToSearch = "Paid on 1/1/2019";
string searchPattern = @"on ((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})|((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})";
var match = Regex.Match(stringToSearch, searchPattern, RegexOptions.IgnoreCase);
string foundString;
if (match.Success)
foundString= stringToSearch.Substring(match.Index, match.Length);
例如
string searchPattern = @"on ((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})|((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})";
stringToSearch = "Paid on Jan 1, 2019";
found = "on Jan 1, 2019" -- worked as expected, found the word "on" and the date
stringToSearch = "Paid on 1/1/2019";
found = "1/1/2019" -- did not work as expected, found the date but did not include the word "on"
如果我反转图案
string searchPattern = @"on ((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})|((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})"";
stringToSearch = "Paid on Jan 1, 2019";
found = "Jan 1, 2019" -- did not work as expected, found the date but did not include the word "on"
stringToSearch = "Paid on 1/1/2019";
found = "on 1/1/2019" -- worked as expected, found the word "on" and the date
谢谢
答案 0 :(得分:0)
您的表情似乎都很好,两者都可以。如果您想捕获目标输出之前或之后的任何内容,只需在左右添加两个边界,即可为您完成。例如,请查看this test:
(.*)(((1[0-2]|0?[1-9])\/(3[01]|[12][0-9]|0?[1-9])\/(?:[0-9]{2})?[0-9]{2})|((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}))(.*)
例如,您可以在其中添加类似于(.*)
的两个组,然后将原始表达式包装在一组中。
该图直观地显示了您的表达式的工作方式,并且您可能希望在此link中测试其他表达式:
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = @"(.*)(((1[0-2]|0?[1-9])\/(3[01]|[12][0-9]|0?[1-9])\/(?:[0-9]{2})?[0-9]{2})|((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}))(.*)";
string input = @"Paid on Jan 1, 2019 And anything else that you wish to have after
Paid on 1/1/2019 And anything else that you wish to have after";
RegexOptions options = RegexOptions.Multiline;
foreach (Match m in Regex.Matches(input, pattern, options))
{
Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
}
}
}
此JavaScript演示显示您的表达式有效:
const regex = /(.*)(((1[0-2]|0?[1-9])\/(3[01]|[12][0-9]|0?[1-9])\/(?:[0-9]{2})?[0-9]{2})|((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}))(.*)/gm;
const str = `Paid on Jan 1, 2019 And anything else that you wish to have after
Paid on 1/1/2019 And anything else that you wish to have after`;
const subst = `\nGroup 1: $1 \nGroup 2: $2 \nGroup 3: $3 \nGroup 4: $4 `;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
此JavaScript代码段返回100万次for
循环以提高性能。
const repeat = 1000000;
const start = Date.now();
for (var i = repeat; i >= 0; i--) {
const string = 'Paid on Jan 1, 2019';
const regex = /(.*)(((1[0-2]|0?[1-9])\/(3[01]|[12][0-9]|0?[1-9])\/(?:[0-9]{2})?[0-9]{2})|((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}))(.*)/gm;
var match = string.replace(regex, "\nGroup #1: $1\nGroup #2: $2 \n");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. ");
您可能希望在月份名称周围减少捕获组,并且可以根据需要将所有捕获组添加到一个捕获组中。