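I have this string:
Fully furnished self contained 2 bedroom suite just 5 minute walk to UVIC is available September 1st.
Right now I am using preg_match to extract the date from it. This is the regular expression: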
'/\bavailable\\s(?P<date_available>[?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?|immediately]+[\\s\d]+)[st|nd|rd|th]?/i'
Currently this regex can extract the date from strings like:
Available september 1st.
Available September 2nd
available september 3rd
available september 4th
available sept 1
Example output:
Array
(
[0] => available September 1
[date_available] => September 1
[1] => September 1
)
But I can't find a way to extract the date from strings like these:
Available for september 1st.
Available in September 2nd
available since september 3rd
available at september 4th
Can anyone help me with this? Thanks.
Answer 0 (score: 1)
Use a wildcard of A-Z, 2 to 5 letters, before the date (this matches things like "on"):
$regex = '/\bavailable[ ]*(?:[a-z]{2,5})?[ ]*' .
'(?P<date_available>immediately|now|' .
'(?:(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?' .
'|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?' .
'|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)' .
'[ ]+[\d]+))' .
//end <date_available>
'(?:st|nd|rd|th)?/i';
Usage:
$lines = array(
'Fully furnished self contained 2 bedroom suite just 5 minute walk to UVIC is available now.',
'bedroom suite just 5 minute walk to UVIC is available on September 34.',
'bedroom suite just 5 minute walk to somewhere is available on Apr 1.',
);
foreach ($lines as $line) {
echo $line, "\n<br>\n";
if (preg_match($regex, $line, $matches) === 1) {
print_r($matches['date_available']);
} else {
echo "Does not match.";
}
echo "\n<br>\n";
}
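Answer 1 (score: 0)
The following works for all of your examples, although I haven't put in the PHP "named subpattern", because I don't know the exact syntax for it: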
\bavailable\s+(?:(?:for|in|at|since)\s+)?((?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sept(?:ember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+\d{1,2}(?:st|nd|rd|th)?)
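For reference, here is a minimal sketch of how the pattern above could be wrapped for preg_match with the (?P<date_available>...) syntax that the question already uses. The variable names ($months, $regex, $samples, $found) are only illustrative, and the sample strings are the ones from the question.

```php
<?php
// Month alternation taken from the pattern above, split out for readability.
$months = 'Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?'
        . '|Aug(?:ust)?|Sept(?:ember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?';

// Same pattern, with the capturing group given a name and the /i flag added.
$regex = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?'
       . '(?P<date_available>(?:' . $months . ')\s+\d{1,2}(?:st|nd|rd|th)?)/i';

$samples = array(
    'Available for september 1st.',
    'Available in September 2nd',
    'available since september 3rd',
    'available at september 4th',
    'available sept 1',
);

// Collect the named capture from every line that matches.
$found = array();
foreach ($samples as $sample) {
    if (preg_match($regex, $sample, $m) === 1) {
        $found[] = $m['date_available'];
    }
}
print_r($found);
```

Note that this pattern spells the short form as "Sept", so a bare "Sep 1" would not match unless the alternative is loosened.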
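Answer 2 (score: 0)
I actually couldn't get yours to work at all. It looks as if you are trying to use character classes with square brackets [ ] instead of using parentheses ( ) for grouping and alternation.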
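To make the difference concrete, here is a small sketch comparing the two forms on the ordinal suffix alone. The patterns and test inputs are made up for illustration and are not taken from the question.

```php
<?php
// [st|nd|rd|th] is a character class: it matches ONE character
// out of the set s, t, |, n, d, r, h.
$class = '/^\d+[st|nd|rd|th]?$/';

// (?:st|nd|rd|th) is a non-capturing group with alternation:
// it matches one whole suffix.
$group = '/^\d+(?:st|nd|rd|th)?$/';

foreach (array('1st', '1s', '1|') as $input) {
    printf(
        "%-4s class: %d  group: %d\n",
        $input,
        preg_match($class, $input),
        preg_match($group, $input)
    );
}
```

The character-class version rejects "1st" (it can only consume one of the two suffix letters) but accepts nonsense such as "1s" and "1|", while the grouped version behaves the other way round.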
The following is probably the shortest I can get it based on your requirements:
$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?((?:immediately|now)|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+?\d{1,2}(?:st|nd|rd|th)?)/i';
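This does not include the named subpattern, since the required match will always be in $matches[1], but if you want a named subpattern you can always put one in: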
$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?(?P<date_available>(?:immediately|now)|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+?\d{1,2}(?:st|nd|rd|th)?)/i';
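As a quick check of the point about $matches[1], the sketch below runs the named version of the pattern against one of the question's sample phrases. The input sentence is made up around that phrase, and the pattern is only split across lines for readability.

```php
<?php
// Named-subpattern version of the pattern, split over several lines.
$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?'
         . '(?P<date_available>(?:immediately|now)'
         . '|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?'
         . '|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)'
         . '\s+?\d{1,2}(?:st|nd|rd|th)?)/i';

// Hypothetical listing text built around a sample from the question.
$text = 'Suite is available since september 3rd';

if (preg_match($pattern, $text, $matches) === 1) {
    // A named group is stored under its name AND under its numeric index.
    var_dump($matches['date_available'] === $matches[1]);
    echo $matches['date_available'], "\n";
}
```

Because the named group is also the first capturing group, existing code that reads $matches[1] keeps working after the name is added.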
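In response to @EthanB's earlier solution: it looks like you are not capturing the ordinal suffix (st, nd, rd, th) of the date. If that is intentional and the suffix is not needed, you can shorten the pattern by leaving that part out; there is no point in trying to match anything after the day number.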
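A small sketch of that last point, using a cut-down version of the earlier pattern (only two months are listed here to keep it short, which is a simplification for illustration): when the suffix group sits outside the named group, dropping it changes the full match but not the captured date.

```php
<?php
// Shared prefix: optional 2-5 letter word, then the named date group.
// Month list shortened to two entries purely for this illustration.
$core = '/\bavailable[ ]*(?:[a-z]{2,5})?[ ]*'
      . '(?P<date_available>immediately|now|(?:Sep(?:tember)?|Apr(?:il)?)[ ]+\d+)';

$withSuffix    = $core . '(?:st|nd|rd|th)?/i'; // trailing suffix matched, not captured
$withoutSuffix = $core . '/i';                 // trailing suffix not matched at all

$line = 'suite is available on September 4th.';

preg_match($withSuffix, $line, $a);
preg_match($withoutSuffix, $line, $b);

// Only the full match ($a[0] vs $b[0]) differs between the two patterns.
echo $a[0], ' | ', $a['date_available'], "\n";
echo $b[0], ' | ', $b['date_available'], "\n";
```

If the suffix should be part of the extracted value instead, the suffix group has to move inside the named group, as in the patterns of this answer.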