Given this sample text extracted from a PDF:
Professional Learning - August 31 Labor Day - September 3 Intersession- October 19 Professional Learning - October 22 Thanksgiving Break - November 21-23 Winter Break - December 24 - January 4 Martin Luther King, Jr. Day - January 21 Presidents’ Day - February 18 Spring Break - March 18-22 Teacher Comp Day - April 19
My goal is to capture every month and date, i.e. it should capture all of the following:
August 31
October 19
March 18-22
December 24 - January 4
December 24-January 4
The hard part is capturing ranges whose start and end months differ. I came up with this RegExp:
/(January|February|March|April|May|August|September|October|November|December)\s([0-9]*-?[0-9]+)(\s*-\s*(January|February|March|April|May|August|September|October|November|December)\s([0-9]*-?[0-9]+))?/g
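It works for everything except the last two examples listed above. On regexr, it shows the range being captured correctly in capture group #3, but I can't access it in JavaScript. Take the following snippet as an example: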
const string = 'Professional Learning - August 31 Labor Day - September 3 Intersession- October 19 Professional Learning - October 22 Thanksgiving Break - November 21-23 Winter Break - December 24 - January 4 Martin Luther King, Jr. Day - January 21 Presidents’ Day - February 18 Spring Break - March 18-22 Teacher Comp Day - April 19';
const subRegex = '(January|February|March|April|May|August|September|October|November|December)\\s([0-9]*-?[0-9]+)';
const dateRegex = new RegExp(`${subRegex}(\s*-\s*${subRegex})?`, 'g');
console.log(string.match(dateRegex));
It seems I can capture December 24 and January 4 separately, but not together. Is there any way to capture them together?
Answer 0 (score: 1)
You just need to tweak (and perhaps simplify) your original RE slightly:
const str = 'Professional Learning - August 31 Labor Day - September 3 Intersession- October 19 Professional Learning - October 22 Thanksgiving Break - November 21-23 Winter Break - December 24 - January 4 Martin Luther King, Jr. Day - January 21 Presidents’ Day - February 18 Spring Break - March 18-22 Teacher Comp Day - April 19';
// str2 has "December 24-January 4" instead - no spaces
const str2 = 'Professional Learning - August 31 Labor Day - September 3 Intersession- October 19 Professional Learning - October 22 Thanksgiving Break - November 21-23 Winter Break - December 24-January 4 Martin Luther King, Jr. Day - January 21 Presidents’ Day - February 18 Spring Break - March 18-22 Teacher Comp Day - April 19';
const re = /(January|February|March|April|May|August|September|October|November|December) [\d-]+([ -]*(January|February|March|April|May|August|September|October|November|December) \d+)?/g;
console.log(str.match(re));
console.log(str2.match(re));
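A side note on why the new RegExp version in the question behaves differently from the regex literal: in an ordinary template literal, \s is not a recognized escape sequence, so it collapses to a plain s before the pattern ever reaches the RegExp constructor. The sketch below is one possible way to keep the question's pattern and build it with String.raw instead, then read the capture groups with matchAll (match with the g flag returns only the full matches). The names months, sub, fixed, broken, sample, and ranges are illustrative, and the month list is copied from the question unchanged, so June and July are still missing.

```javascript
// Month list copied from the question (June and July are absent there as well).
const months = 'January|February|March|April|May|August|September|October|November|December';

// String.raw leaves backslashes untouched, so \s reaches RegExp as \s.
const sub = String.raw`(${months})\s([0-9]*-?[0-9]+)`;
const fixed = new RegExp(String.raw`${sub}(\s*-\s*${sub})?`, 'g');

// For comparison: in an ordinary template literal, \s becomes a plain "s".
const broken = `(\s*-\s*)`;

// Shortened sample in the same shape as the question's text.
const sample = 'Winter Break - December 24 - January 4 Spring Break - March 18-22';

// With the g flag, match() returns only the full matches.
const fullMatches = sample.match(fixed);

// matchAll() yields one match object per hit, with capture groups intact.
const ranges = [...sample.matchAll(fixed)].map((m) => ({
  text: m[0],
  startMonth: m[1],
  startDays: m[2],
  endMonth: m[4], // undefined when the range stays within one month
  endDays: m[5],
}));

console.log(broken);      // (s*-s*)
console.log(fullMatches); // [ 'December 24 - January 4', 'March 18-22' ]
console.log(ranges);
```

Doubling the backslashes inside the template literal (\\s*-\\s*) should have the same effect as String.raw; which one to use is mostly a matter of taste.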