PHP - How can I retrieve and process all images and their parent anchor tags from a block of content?

Asked: 2012-05-08 22:49:36

Tags: php regex image xpath anchor

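Note: I'm using WordPress, but I don't believe it's relevant to the answer, so I've asked here. If I'm wrong, please let me know or move the question.

OK, I'm loading blocks of rich content (via WordPress) that often contain many images wrapped in anchor tags. I'd like to step through all of them so that I can output each one as an a tag with its related img inside it.

I have already found this handy regex code, which grabs the images nicely: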

            // Get all of the post content in a variable
            $posttext = $post->post_content;
            //$posttext1 = get_cleaned_excerpt();

            // We will search for the src="" in the post content
            $regular_expression = '~src="[^"]*"~';
            $regular_expression1 = '~<img [^\>]*\ />~';

            // WE will grab all the images from the post in an array $allpics using preg_match_all
            preg_match_all( $regular_expression, $posttext, $allpics );

            // Count the number of images found.
            $NumberOfPics = count($allpics[0]);

            // This time we replace/remove the images from the content
             // ($posttext is used here because $posttext1 is commented out above)
             $only_post_text = preg_replace( $regular_expression1, '' , $posttext);
            /*Only text will be printed*/

            // Check to see if we have at least 1 image
            if ( $NumberOfPics > 0 )
            {

            $this_post_id = get_the_ID();


            for ( $i=0; $i < $NumberOfPics ; $i++ )
            {           $str1=$allpics[0][$i];
            $str1=trim($str1);
            $len=strlen($str1);
            $imgpath=substr_replace(substr($str1,5,$len),"",-1);



            $theImageSrc = $imgpath;
            global $blog_id;
            if (isset($blog_id) && $blog_id > 0) {
                $imageParts = explode('/files/', $theImageSrc);
                if (isset($imageParts[1])) {
                    $theImageSrc = '/blogs.dir/' . $blog_id . '/files/' . $imageParts[1];
                }
            }

            ?>

            <img class="alignleft" src='<?php echo get_bloginfo('template_directory').'/timthumb.php?src=' . $theImageSrc  . '&h=150&w=150'; ?>' height="150" width="150" alt=""/>

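I'd really like to wrap that img at the bottom in its relevant parent a. Any help is much appreciated.

An example of the content to be searched might be: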

<h5>
    <a href="http://www.example.com/imagefoo.jpg">
        <img class="size-thumbnail wp-image-4091 alignleft" src="http://www.example.com/imagefoo-150x150.jpg" alt="" width="150" height="150" />
    </a>
</h5>
<h5>
    <a href="http://www.example.com/Image-Bar.jpg">
        <img class="wp-image-4087 alignleft" title="Image - Bar" src="http://www.example.com/Image-Bar-150x150.jpg" alt="" width="150" height="150" />
    </a>
</h5>
<h5>
    <a href="http://www.example.com/Image-Alphe.jpg">
        <img class="wp-image-4090 alignleft" title="Image-Alpha" src="http://www.example.com/Image-Alpha-150x150.jpg" alt="" width="150" height="150" />
    </a>
</h5>
<a href="http://www.example.com/EXAMPLE-image-150.jpg"><img class="size-thumbnail wp-image-4088 alignleft" title="EXAMPLE-image-150" src="http://www.example.com/EXAMPLE-image-150-150x150.jpg" alt="" width="150" height="150" /></a>
<h5>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</h5>

<a href="http://www.example.com/insanely-long-permalink-created-as-if-by-a-madman-who-knows-no-bounds-of-shame/" rel="attachment wp-att-2780">
    <img class="alignright size-thumbnail wp-image-2780" title="Exhibition Title: Image Name by Artist Person" src="http://www.example.com/wp-content/uploads/2011/12/ExtraordinaryImage-150x150.jpg" alt="Example UK | Exhibition: Image by Artist Person" width="150" height="150" />
</a>
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

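Edit: Here is the working code based on my needs. It uses XPath and is based on cHao's answer below. (For what it's worth, I found Tizag's webpage and the EarthInfo page very useful together as an XPath primer.)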

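            // Get all of the post content in a variable
            $posttext = $post->post_content;

            // Instantiate first: calling loadHTML() statically fails on PHP 8.
            $document = new DOMDocument();
            $document->loadHTML($posttext);
            $xpath = new DOMXPath($document);
            // Needed by the rel="lightbox[...]" attribute in the markup below.
            $this_post_id = get_the_ID();
            // For each link that directly contains an image, read the link's
            // href and title and the image's src.
            foreach ($xpath->query('//a/img/..') as $link) :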


                $img = $link->getElementsByTagName('img')->item(0);
                $link_src = $link->getAttribute('href');
                $link_title = $link->getAttribute('title');
                $img_src = $img->getAttribute('src');


                $theImageSrc = $img_src;
                global $blog_id;
                if (isset($blog_id) && $blog_id > 0) {
                    $imageParts = explode('/files/', $theImageSrc);
                    if (isset($imageParts[1])) {
                        $theImageSrc = '/blogs.dir/' . $blog_id . '/files/' . $imageParts[1];
                    }
                }

                ?>

                <a href="<?php echo $link_src; ?>" rel="lightbox[<?php echo $this_post_id; ?>]" title="<?php if ($link_title) {
                    echo $link_title;
                } else { the_title(); } ?>" class="cboxElement">
                <img class="alignleft" src='<?php echo get_bloginfo('template_directory').'/timthumb.php?src=' . $theImageSrc  . '&h=150&w=150'; ?>' height="150" width="150" alt=""/>
            </a>

            <?php

            endforeach;

            ?>
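
The path rewrite and thumbnail URL inside the loop above can be pulled out into small functions, which makes them easier to check outside a template. The following is only a sketch: it assumes the same `/blogs.dir/<id>/files/` layout and `timthumb.php` query parameters as the code above, and the function names and the theme URL are hypothetical, not part of WordPress.

```php
<?php
// Hypothetical helpers; the names are illustrative and not part of WordPress.

/** Map a multisite ".../files/..." URL to its "/blogs.dir/<id>/files/..." path. */
function multisite_image_path(string $src, int $blogId): string
{
    if ($blogId > 0) {
        // Split on the first "/files/" only and keep everything after it.
        $parts = explode('/files/', $src, 2);
        if (isset($parts[1])) {
            return '/blogs.dir/' . $blogId . '/files/' . $parts[1];
        }
    }
    return $src; // not a multisite upload: leave the source untouched
}

/** Build a timthumb.php URL with the same src, h and w parameters as above. */
function timthumb_url(string $templateDir, string $src, int $size = 150): string
{
    // http_build_query() URL-encodes the src value, unlike the plain
    // string concatenation in the template.
    $query = http_build_query(['src' => $src, 'h' => $size, 'w' => $size], '', '&');
    return $templateDir . '/timthumb.php?' . $query;
}

// Assumed example values; in a template these would come from
// $img->getAttribute('src'), $blog_id and get_bloginfo('template_directory').
$path = multisite_image_path('http://www.example.com/files/2011/12/pic.jpg', 3);
$url  = timthumb_url('http://www.example.com/wp-content/themes/mytheme', $path);

// Escape the URL before printing it into an HTML attribute.
$tag = '<img class="alignleft" src="' . htmlspecialchars($url, ENT_QUOTES)
     . '" height="150" width="150" alt="">';
echo $tag, "\n";
```

Escaping the attribute value is a deliberate difference from the template above, which echoes the values as they are.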

1 Answer:

Answer 0 (score: 2)

You'd be better off not using regex to find the images. Regular expressions are a poor fit for parsing HTML.
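
To make that concrete, here is a small sketch that runs the `src` pattern from the question against a made-up fragment. The markup is an assumption chosen for illustration: the image uses single-quoted attributes, and a script tag happens to carry a double-quoted `src`.

```php
<?php
// The pattern is the one from the question; the markup is invented for illustration.
$pattern = '~src="[^"]*"~';

$html = "<a href='/full.jpg'><img src='/thumb.jpg' alt=''></a>" // single-quoted src
      . '<script src="/js/app.js"></script>';                    // not an image at all

preg_match_all($pattern, $html, $matches);

// $matches[0] now holds only the script's src attribute; the image was missed.
print_r($matches[0]);
```

Both the miss and the false hit come from matching on text rather than on document structure.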

Instead, take a look at the DOMDocument and DOMXPath classes.

$document = new DOMDocument();
$document->loadHTML($posttext);
$xpath = new DOMXPath($document);

# for each link that has an image directly inside it, set its href equal to
# the image's src.
foreach ($xpath->query('//a[img]') as $link) {
    $img = $link->getElementsByTagName('img')->item(0);
    $src = $img->getAttribute('src');

    # do your mangling of $src here, resulting in $href.
    # for example...
    $href = preg_replace('/-\d+x\d+(?=\.[^.]*$)/', '', $src);

    $link->setAttribute('href', $href);
}

$fixed_html = $document->saveHTML();
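
If the goal is to read the link and image data rather than rewrite the HTML, the same two classes can collect it into an array. This is a minimal sketch under some assumptions: the function names `strip_size_suffix` and `extract_linked_images` are hypothetical, the sample markup is adapted from the question, and only images that are direct children of a link are considered.

```php
<?php
// Sketch only; the function names are illustrative, not a WordPress or PHP API.

/** Strip a trailing "-WIDTHxHEIGHT" size suffix that sits before the file extension. */
function strip_size_suffix(string $src): string
{
    return preg_replace('/-\d+x\d+(?=\.[^.]*$)/', '', $src);
}

/** Return one array per <a> element that has an <img> as a direct child. */
function extract_linked_images(string $html): array
{
    // Collect parser warnings internally instead of printing them for sloppy markup.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    $pairs = [];
    foreach ($xpath->query('//a[img]') as $link) {
        // First <img> child of this particular link.
        $img = $xpath->query('img', $link)->item(0);
        $pairs[] = [
            'href'  => $link->getAttribute('href'),
            'title' => $img->getAttribute('title'),
            'src'   => $img->getAttribute('src'),
            'full'  => strip_size_suffix($img->getAttribute('src')),
        ];
    }
    return $pairs;
}

$html = '<h5><a href="http://www.example.com/imagefoo.jpg">'
      . '<img src="http://www.example.com/imagefoo-150x150.jpg" alt="" /></a></h5>'
      . '<p>No link here: <img src="http://www.example.com/lonely.jpg" /></p>';

$images = extract_linked_images($html);
print_r($images);
```

Each entry can then be printed as an `a` wrapping an `img`. The unlinked image in the sample is skipped because the query only selects anchors.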