Need to extract a date from a string using preg_match

Asked: 2012-08-23 08:38:31

Tags: php preg-match

I have this string:

Fully furnished self contained 2 bedroom suite just 5 minute walk to UVIC is available September 1.

Right now I am using preg_match to extract the date. This is the regular expression:

'/\bavailable\\s(?P<date_available>[?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?|immediately]+[\\s\d]+)[st|nd|rd|th]?/i'

At the moment this regex can extract the date from strings such as:

Available september 1st.
Available September 2nd
available september 3rd
available september 4th
available sept 1

Example output:

Array
(
    [0] => available September 1
    [date_available] => September 1
    [1] => September 1
)

But I can't find a way to extract the date from strings such as:

Available for september 1st.
Available in September 2nd
available since september 3rd
available at september 4th

Can anyone help me with this? Thanks.

3 answers:

Answer 0: (score: 1)

Allow an optional word of 2 to 5 letters (a-z) after "available", which matches things like "on":

$regex = '/\bavailable[ ]*(?:[a-z]{2,5})?[ ]*' .
    '(?P<date_available>immediately|now|' .
    '(?:(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?' .
    // the "e" in June and the "t" in Sept are optional, so "Jun" and "Sept" also match
    '|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?' .
    '|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)' .
    '[ ]+[\d]+))' .
    // end of <date_available>; the ordinal suffix is matched but not captured
    '(?:st|nd|rd|th)?/i';

Usage:

$lines = array(
    'Fully furnished self contained 2 bedroom suite just 5 minute walk to UVIC is available now.',
    'bedroom suite just 5 minute walk to UVIC is available on September 34.',
    'bedroom suite just 5 minute walk to somewhere is available on Apr 1.',
    );

foreach ($lines as $line) {
    echo $line, "\n<br>\n";
    if (preg_match($regex, $line, $matches) === 1) {
        print_r($matches['date_available']);
    } else {
        echo "Does not match.";
    }
    echo "\n<br>\n";
}

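Answer 1: (score: 0)

The following works for all of your examples, although I have not added PHP "named subpatterns" because I don't know their exact syntax: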

\bavailable\s+(?:(?:for|in|at|since)\s+)?((?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sept(?:ember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+\d{1,2}(?:st|nd|rd|th)?)
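
A minimal sketch of how the pattern above might be used from PHP. It assumes the `(?P<date_available>...)` named-group syntax from the question and the `/i` flag; the `$samples` array is illustrative test data taken from the question, not part of this answer.

```php
<?php
// The pattern above wrapped in PHP delimiters, with its capture group
// given the name used in the question (assumed syntax: (?P<name>...)).
$regex = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?'
    . '(?P<date_available>(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?'
    . '|May|June?|July?|Aug(?:ust)?|Sept(?:ember)?|Oct(?:ober)?'
    . '|Nov(?:ember)?|Dec(?:ember)?)\s+\d{1,2}(?:st|nd|rd|th)?)/i';

// Sample inputs copied from the question.
$samples = array(
    'Available for september 1st.',
    'Available in September 2nd',
    'available since september 3rd',
    'available at september 4th',
    'available sept 1',
);

// Collect the captured date from every sample that matches.
$found = array();
foreach ($samples as $sample) {
    if (preg_match($regex, $sample, $matches) === 1) {
        $found[] = $matches['date_available'];
    }
}
print_r($found);
```

Note that `Sept(?:ember)?` accepts "Sept" and "September" but not the bare abbreviation "Sep".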

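Answer 2: (score: 0)

I actually couldn't get your regex to work at all. It looks as though you are trying to use character classes with square brackets [ ] where you need grouping and alternation with parentheses ( ).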
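
The difference can be seen in a small sketch, using the ordinal-suffix part of the question's pattern as the example. The variable names are only illustrative.

```php
<?php
// Character class: matches exactly ONE of the characters s, t, |, n, d, r, h.
$class = '/^[st|nd|rd|th]$/';
// Non-capturing group with alternation: matches "st", "nd", "rd" or "th".
$group = '/^(?:st|nd|rd|th)$/';

$results = array(
    'class vs "st"' => preg_match($class, 'st'), // 0: "st" is two characters
    'class vs "|"'  => preg_match($class, '|'),  // 1: "|" is a literal member
    'group vs "st"' => preg_match($group, 'st'), // 1
    'group vs "|"'  => preg_match($group, '|'),  // 0
);
print_r($results);
```

Inside square brackets the `|` is an ordinary character, so a class built from month names only ever matches a run of individual letters, not whole words.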

The following is probably the shortest I could get it for your requirements:

$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?((?:immediately|now)|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+?\d{1,2}(?:st|nd|rd|th)?)/i';

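This does not include the named subpattern, because the required match will always be in $matches[1], but if you want the named subpattern you can always put one in: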

$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?(?P<date_available>(?:immediately|now)|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+?\d{1,2}(?:st|nd|rd|th)?)/i';

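In response to @EthanB's earlier solution: you don't appear to be capturing the ordinal suffix of the day (st, nd, rd, th). If that is intentional and the suffix isn't needed, you can shorten the pattern further by leaving it out, since there is no point in trying to match anything after the day number.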
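
A sketch of what that shortened variant might look like, assuming the named-group pattern from this answer with the optional suffix group removed. This is an illustration, not a pattern posted in the thread.

```php
<?php
// Hypothetical shortened variant: the optional ordinal suffix is dropped,
// so the match simply ends after the day number.
$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?'
    . '(?P<date_available>immediately|now'
    . '|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?'
    . '|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+\d{1,2})/i';

// A date followed by a suffix: the suffix is left out of the capture.
preg_match($pattern, 'available since september 3rd', $withSuffix);
// A non-date availability keyword.
preg_match($pattern, 'Suite is available now.', $immediate);

echo $withSuffix['date_available'], "\n"; // september 3
echo $immediate['date_available'], "\n";  // now
```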