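Problem restyling inline-styled divs with the PHP HTML DOM parser

Time: 2010-12-31 15:32:09

Tags: php css

Hello, I'm screen scraping a weather website whose divs carry inline styles and have no class or ID. Here is their code: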

<div class="TodaysForecastContainer">

                    <div class="TodaysForecastContainerInner">
                        <div style="font-size:12px;"><u>This morning</u></div>
                        <div style="position:absolute;top:17px;left:3px;">
                            <a href="forecastPublicExtended.asp#Period0" target="_blank">
                                <img src="./images/wimages/b_cloudy.gif" height="50px" width="50px" alt="weather image">        
                            </a>                    </div>
                        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
                            Sunny Breaks                            </div>
                    </div>

                    <div class="TodaysForecastContainerInner">
                        <div style="font-size:12px;"><u>This afternoon</u></div>
                        <div style="position:absolute;top:17px;left:3px;">
                            <a href="forecastPublicExtended.asp#Period0" target="_blank">
                                <img src="./images/wimages/b_pcloudy.gif" height="50px" width="50px" alt="weather image">       
                            </a>                    </div>
                        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
                            Mix of Sun and Cloud                            </div>
                    </div>

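The problem is the absolutely positioned inline styles: these divs have no class or ID. I'd like to add a class name and remove the inline style on the div with "This morning" and on the div containing the image, remove the link, and do the same for the div holding the description (e.g. Sunny Breaks). This has to apply to every TodaysForecastContainerInner, since there are about 4 forecasts. The result should look something like this: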

<div class="day">This morning</div><div class="thumbnail"><img src="sample.jpg"></div><div class="description">Sunny Breaks</div>

I'm using:

foreach($html->find('.TodaysForecastContainerInner div') as $e)
echo $e->innertext . '<br>';

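This strips the divs and leaves only the u and img tags. I can style the other two divs through their u and img tags, but I just can't find a way to style the div that holds the description. I'm only a beginner in PHP, so I hope someone can give me some advice. Thank you very much.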

2 answers:

Answer 0: (score: 1)

Take a look at the phpQuery library. It lets you do jQuery-like manipulation in PHP. This code does basically what you are trying to do:

<?php

include 'phpQuery-onefile.php';

$text = <<<EOF
<div class="TodaysForecastContainer">
    <div class="TodaysForecastContainerInner">
        <div style="font-size:12px;"><u>This morning</u></div>
        <div style="position:absolute;top:17px;left:3px;">
                <a href="forecastPublicExtended.asp#Period0" target="_blank">
                        <img src="./images/wimages/b_cloudy.gif" height="50px" width="50px" alt="weather image">        
                </a>
        </div>
        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
            Sunny Breaks
        </div>
    </div>
    <div class="TodaysForecastContainerInner">
        <div style="font-size:12px;"><u>This afternoon</u></div>
        <div style="position:absolute;top:17px;left:3px;">
            <a href="forecastPublicExtended.asp#Period0" target="_blank">
                <img src="./images/wimages/b_pcloudy.gif" height="50px" width="50px" alt="weather image">       
            </a>
        </div>
        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
            Mix of Sun and Cloud
        </div>
    </div>
</div>
EOF;

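// Parse the scraped markup into a phpQuery document
$doc = phpQuery::newDocumentHTML( $text );

// Each forecast block holds three divs: heading, image, description
$containers = pq('.TodaysForecastContainerInner', $doc);
foreach( $containers as $container ) {
    $div = pq('div', $container);

    // Heading: drop the inline style and replace the <u> wrapper with its text
    $div->eq(0)->removeAttr('style')->addClass('day')->html( pq( 'u', $div->eq(0) )->html() );
    // Image: keep only the <img> (the link goes away) and strip its size and alt attributes
    $div->eq(1)->removeAttr('style')->addClass('thumbnail')->html( pq( 'img', $div->eq(1))->removeAttr('height')->removeAttr('width')->removeAttr('alt') );
    // Description: only the inline style has to go
    $div->eq(2)->removeAttr('style')->addClass('description');
}

print $doc;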

Result:

<div class="TodaysForecastContainer">
  <div class="TodaysForecastContainerInner">
    <div class="day">This morning</div>
    <div class="thumbnail"><img src="./images/wimages/b_cloudy.gif"></div>
    <div class="description">
      Sunny Breaks
    </div>
  </div>
  <div class="TodaysForecastContainerInner">
    <div class="day">This afternoon</div>
    <div class="thumbnail"><img src="./images/wimages/b_pcloudy.gif"></div>
    <div class="description">
      Mix of Sun and Cloud
    </div>
  </div>
</div>
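
If pulling in phpQuery is not an option, roughly the same cleanup can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath). This is only a sketch under assumptions: restyleForecast() is a made-up helper name, each TodaysForecastContainerInner is assumed to hold exactly the three child divs shown in the question, and the input is assumed to be plain ASCII or Latin-1. Unlike the phpQuery version, it also trims the whitespace around the heading and description text.

```php
<?php
// Sketch only: restyleForecast() is a hypothetical helper, not part of any library.
function restyleForecast(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // scraped markup is rarely valid; keep warnings quiet
    $doc->loadHTML($html);              // assumption: ASCII/Latin-1 input
    libxml_clear_errors();

    $xpath  = new DOMXPath($doc);
    $inners = $xpath->query(
        '//div[contains(concat(" ", normalize-space(@class), " "), " TodaysForecastContainerInner ")]'
    );

    foreach ($inners as $inner) {
        // Direct child divs in document order: heading, image, description
        $divs = $xpath->query('./div', $inner);
        if ($divs->length < 3) {
            continue;                   // unexpected layout: leave this block untouched
        }
        $day   = $divs->item(0);
        $thumb = $divs->item(1);
        $desc  = $divs->item(2);

        // Heading and description: keep only the trimmed text (this drops the <u> wrapper)
        setOnlyChild($day, $doc->createTextNode(trim($day->textContent)));
        setOnlyChild($desc, $doc->createTextNode(trim($desc->textContent)));

        // Image: detach the <img> first, then discard the link that wrapped it
        $img = $xpath->query('.//img', $thumb)->item(0);
        if ($img !== null) {
            foreach (['height', 'width', 'alt'] as $attr) {
                $img->removeAttribute($attr);
            }
            $img->parentNode->removeChild($img);
            setOnlyChild($thumb, $img);
        }

        // Swap the inline styles for class names
        foreach ([[$day, 'day'], [$thumb, 'thumbnail'], [$desc, 'description']] as [$el, $class]) {
            $el->removeAttribute('style');
            $el->setAttribute('class', $class);
        }
    }

    // loadHTML() wraps the fragment in <html><body>; return only what is inside <body>
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Remove every existing child of $el, then append $child as its only content.
function setOnlyChild(DOMElement $el, DOMNode $child): void
{
    while ($el->firstChild !== null) {
        $el->removeChild($el->firstChild);
    }
    $el->appendChild($child);
}

// Usage with one forecast block from the question
$sample = <<<'HTML'
<div class="TodaysForecastContainer">
    <div class="TodaysForecastContainerInner">
        <div style="font-size:12px;"><u>This morning</u></div>
        <div style="position:absolute;top:17px;left:3px;">
            <a href="forecastPublicExtended.asp#Period0" target="_blank">
                <img src="./images/wimages/b_cloudy.gif" height="50px" width="50px" alt="weather image">
            </a>
        </div>
        <div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
            Sunny Breaks
        </div>
    </div>
</div>
HTML;

echo restyleForecast($sample);
```

Selecting only the direct child divs (./div) keeps the positions stable even if the site later nests extra divs inside one of the three.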

Answer 1: (score: 0)

This is easier to do on the client than on the server.

This jQuery + JavaScript will clear your inline styles and apply a class name to each div:

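$(document).ready(function() {
    // Every div inside any .TodaysForecastContainerInner, in document order
    var target = $('.TodaysForecastContainerInner div');
    for (var x = 0; x < target.length; x++) {
        target.eq(x).attr('style', '');   // blank out the inline style
        // Positional class name; x keeps counting across containers, so with
        // three divs per container the second one gets A_3, A_4, A_5
        target.eq(x).addClass("A_" + x);
    }
});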

Result:

<div class="TodaysForecastContainerInner">
    <div style="" class="A_0"><u>This morning</u></div>
    <div style="" class="A_1">
        <a target="_blank" href="forecastPublicExtended.asp#Period0">
            <img height="50px" width="50px" alt="weather image" src="./images/wimages/b_cloudy.gif">        
        </a>                    </div>
    <div style="" class="A_2">
        Sunny Breaks                            </div>
</div>

You can then use a stylesheet to make it look the way you want.