How do I extract this element from a huge string of output?

Time: 2014-02-27 18:33:59

Tags: php regex string

I have tried doing this with regular expressions and preg_replace, but I am really bad at regex.

<div class="eab-guest-actions"><a href="#cancel-attendance" class="eab-guest-cancel_attendance" data-eab-user_id="COULD BE ANYTHING" data-eab-event_id="COULD BE ANYTHING">Cancel attendance</a></div>

Does anyone have tips, steps, or a way to handle this with a regular expression?

2 answers:

Answer 0 (score: 1)

You can use:

$pattern = '~<div class="eab-guest-actions">.*?</div>~s';
$str = preg_replace($pattern, '', $str);
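
To make the effect concrete, here is a minimal sketch of that pattern run against a larger string. The surrounding `<p>` markup and the attribute values are made up for illustration; only the `eab-guest-actions` div comes from the question. It uses preg_match to pull the element out and preg_replace to strip it.

```php
<?php
// Hypothetical input: only the eab-guest-actions div is taken from the question.
$str = '<p>Before</p>'
     . '<div class="eab-guest-actions"><a href="#cancel-attendance" class="eab-guest-cancel_attendance" data-eab-user_id="42" data-eab-event_id="7">Cancel attendance</a></div>'
     . '<p>After</p>';

// The s modifier lets . match newlines, in case the element spans several lines.
$pattern = '~<div class="eab-guest-actions">.*?</div>~s';

// Extract the element: $m[0] holds the whole <div>...</div> match.
$element = preg_match($pattern, $str, $m) ? $m[0] : null;

// Remove the element: preg_replace returns null on failure, so check the result.
$result = preg_replace($pattern, '', $str);
if ($result === null) {
    throw new RuntimeException('preg_replace failed with code ' . preg_last_error());
}

echo $result, PHP_EOL; // <p>Before</p><p>After</p>
```

Note that the lazy `.*?` stops at the first `</div>`, so this only works as long as the element has no nested div inside it.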

Or this (more efficient):

$pattern = '~<div class="eab-guest-actions">(?>[^<]++|<(?!/div>))*</div>~';

If you want to be more precise:

$start = '<div class="eab-guest-actions"><a href="#cancel-attendance" class="eab-guest-cancel_attendance"';

$pattern = '~\Q' . $start . '\E(?>[^<]++|<(?!/div>))*</div>~';

Details:

\Q...\E escapes any special characters between the two markers, which is handy for embedding a literal string in a pattern.

(?>             # open an atomic group
    [^<]++      # all that is not a <
  |             # OR
    <(?!/div>)  # a < not followed by /div>
)*              # repeat the group zero or more times.
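
One caveat with \Q...\E: PHP locates the closing `~` delimiter before the pattern is compiled, so the literal must not contain `~` (or `\E`). If that cannot be guaranteed, preg_quote with the delimiter as its second argument is a possible alternative. A minimal sketch, with made-up surrounding markup and attribute values:

```php
<?php
$start = '<div class="eab-guest-actions"><a href="#cancel-attendance" class="eab-guest-cancel_attendance"';

// preg_quote escapes regex metacharacters and, via the second argument, the ~ delimiter.
$pattern = '~' . preg_quote($start, '~') . '(?>[^<]++|<(?!/div>))*</div>~';

// Hypothetical input built around the literal prefix from the answer.
$str = '<p>Before</p>'
     . $start . ' data-eab-user_id="42" data-eab-event_id="7">Cancel attendance</a></div>'
     . '<p>After</p>';

// The atomic group consumes text and any "<" that does not begin "</div>",
// so the match ends at the first closing </div>.
if (preg_match($pattern, $str, $m)) {
    echo $m[0], PHP_EOL; // the whole <div>...</div> element
}

$cleaned = preg_replace($pattern, '', $str);
echo $cleaned, PHP_EOL; // <p>Before</p><p>After</p>
```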

Answer 1 (score: 1)

If you are trying to read the data-eab-user_id and data-eab-event_id attributes of the a element, you can do that with DOMDocument:

<?php
    $html = '<div class="eab-guest-actions"><a href="#cancel-attendance" class="eab-guest-cancel_attendance" data-eab-user_id="USERIDHERE" data-eab-event_id="EVENTIDHERE">Cancel attendance</a></div>';

    $userID = $eventID = null;

    // loadXML() is an instance method; calling it statically is an error on PHP 8.
    $document = new DOMDocument();
    $document->loadXML($html);
    $xpath = new DOMXPath($document);
    // The leading // finds the <a> at any depth in the document.
    $elements = $xpath->query("//a[@class='eab-guest-cancel_attendance']");
    // query() returns a DOMNodeList (or false for an invalid expression),
    // so empty() is not a meaningful check here.
    if ($elements !== false) {
        foreach ($elements as $element) {
            $userID = $element->getAttribute('data-eab-user_id');
            $eventID = $element->getAttribute('data-eab-event_id');
        }
    }

    var_dump($userID, $eventID);
    //string(10) "USERIDHERE"
    //string(11) "EVENTIDHERE"
?>


Note that we use loadXML instead of loadHTML, because this is not a complete HTML document.
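
If the real input is a larger chunk of page output rather than this one snippet, loadXML may reject it, since XML needs a single root element and properly closed tags. In that case loadHTML is worth trying: it also accepts fragments and wraps them in html/body itself. A rough sketch, assuming the surrounding markup (invented here) is ordinary HTML:

```php
<?php
// Hypothetical input: a fragment with two top-level elements and an unclosed <br>,
// which is fine for an HTML parser but not well-formed XML.
$html = '<p>Some other output<br>on the page</p>'
      . '<div class="eab-guest-actions"><a href="#cancel-attendance" class="eab-guest-cancel_attendance" data-eab-user_id="USERIDHERE" data-eab-event_id="EVENTIDHERE">Cancel attendance</a></div>';

// Collect parser warnings internally instead of emitting them as PHP warnings.
$previous = libxml_use_internal_errors(true);
$document = new DOMDocument();
$document->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$userID = $eventID = null;
$xpath = new DOMXPath($document);
foreach ($xpath->query("//a[@class='eab-guest-cancel_attendance']") as $element) {
    $userID = $element->getAttribute('data-eab-user_id');
    $eventID = $element->getAttribute('data-eab-event_id');
}

var_dump($userID, $eventID);
```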