Hello, I am screen scraping a weather website whose markup uses divs with inline styles and no classes or IDs. Here is their code:
<div class="TodaysForecastContainer">
<div class="TodaysForecastContainerInner">
<div style="font-size:12px;"><u>This morning</u></div>
<div style="position:absolute;top:17px;left:3px;">
<a href="forecastPublicExtended.asp#Period0" target="_blank">
<img src="./images/wimages/b_cloudy.gif" height="50px" width="50px" alt="weather image">
</a> </div>
<div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
Sunny Breaks </div>
</div>
<div class="TodaysForecastContainerInner">
<div style="font-size:12px;"><u>This afternoon</u></div>
<div style="position:absolute;top:17px;left:3px;">
<a href="forecastPublicExtended.asp#Period0" target="_blank">
<img src="./images/wimages/b_pcloudy.gif" height="50px" width="50px" alt="weather image">
</a> </div>
<div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
Mix of Sun and Cloud </div>
</div>
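The problem is the absolutely positioned inline styles and the lack of classes or IDs. I would like to add a class name and remove the inline style on the div containing "This morning" and on the div containing the image, remove the link, and do the same for the div with the description (e.g. Sunny Breaks). This should apply to every TodaysForecastContainerInner, since there are about 4 forecasts. The result should look something like:
<div class="day">This morning</div><div class="thumbnail"><img src="sample.jpg"></div><div class="description">Sunny Breaks</div>
I am currently using: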
foreach($html->find('.TodaysForecastContainerInner div') as $e)
    echo $e->innertext . '<br>';
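This strips out all the divs and leaves me with the u and img tags, which I use to style the first two pieces, but I have nothing to target for the div that holds the description. I am just a PHP beginner and would appreciate any advice. Thank you very much.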
Answer 0 (score: 1)
Take a look at the phpQuery library. It lets you do jQuery-like manipulation in PHP. This code does essentially what you are trying to do:
<?php
include 'phpQuery-onefile.php';
$text = <<<EOF
<div class="TodaysForecastContainer">
<div class="TodaysForecastContainerInner">
<div style="font-size:12px;"><u>This morning</u></div>
<div style="position:absolute;top:17px;left:3px;">
<a href="forecastPublicExtended.asp#Period0" target="_blank">
<img src="./images/wimages/b_cloudy.gif" height="50px" width="50px" alt="weather image">
</a>
</div>
<div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
Sunny Breaks
</div>
</div>
<div class="TodaysForecastContainerInner">
<div style="font-size:12px;"><u>This afternoon</u></div>
<div style="position:absolute;top:17px;left:3px;">
<a href="forecastPublicExtended.asp#Period0" target="_blank">
<img src="./images/wimages/b_pcloudy.gif" height="50px" width="50px" alt="weather image">
</a>
</div>
<div style="position:absolute; top:25px; left:57px; text-align:left; height:47px; width:90px;">
Mix of Sun and Cloud
</div>
</div>
EOF;
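// Parse the scraped markup into a phpQuery document.
$doc = phpQuery::newDocumentHTML( $text );
$containers = pq('.TodaysForecastContainerInner', $doc);
foreach( $containers as $container ) {
    // The three divs of one forecast: heading, image, description.
    $div = pq('div', $container);
    // Heading: drop the inline style and unwrap the <u> tag.
    $div->eq(0)->removeAttr('style')->addClass('day')->html( pq( 'u', $div->eq(0) )->html() );
    // Image: drop the inline style and the link, keep a bare <img>.
    $div->eq(1)->removeAttr('style')->addClass('thumbnail')->html( pq( 'img', $div->eq(1))->removeAttr('height')->removeAttr('width')->removeAttr('alt') );
    // Description: only the inline style needs to go.
    $div->eq(2)->removeAttr('style')->addClass('description');
}
print $doc;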
Result:
<div class="TodaysForecastContainer">
<div class="TodaysForecastContainerInner">
<div class="day">This morning</div>
<div class="thumbnail"><img src="./images/wimages/b_cloudy.gif"></div>
<div class="description">
Sunny Breaks
</div>
</div>
<div class="TodaysForecastContainerInner">
<div class="day">This afternoon</div>
<div class="thumbnail"><img src="./images/wimages/b_pcloudy.gif"></div>
<div class="description">
Mix of Sun and Cloud
</div>
</div>
Answer 1 (score: 0)
This is easier to do on the client than on the server.
This jQuery + JavaScript will clear your inline styles and apply a class name to each div:
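$(document).ready(function() {
    // Matches every div inside every forecast container.
    var target = $('.TodaysForecastContainerInner div');
    for (var x = 0; x < target.length; x++) {
        target.eq(x).attr('style', '');   // blank out the inline style
        // x counts across all containers, so the second forecast gets A_3 to A_5.
        target.eq(x).addClass("A_" + x);
    }
});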
Result:
<div class="TodaysForecastContainerInner">
<div style="" class="A_0"><u>This morning</u></div>
<div style="" class="A_1">
<a target="_blank" href="forecastPublicExtended.asp#Period0">
<img height="50px" width="50px" alt="weather image" src="./images/wimages/b_cloudy.gif">
</a> </div>
<div style="" class="A_2">
Sunny Breaks </div>
</div>
You can then use a stylesheet to make it look the way you want.
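Because the selector in the snippet above matches the divs of every forecast at once, the numeric class names differ from one forecast to the next. A minimal sketch of a per-container variant follows, assuming each TodaysForecastContainerInner holds its three divs in the order heading, image, description, as in the markup from the question. The function name relabelForecasts and the ROLE_CLASSES list are made up for illustration, and the sketch uses the plain DOM API rather than jQuery so that it stands on its own.

```javascript
// Hypothetical class names, taken from the markup the question asks for.
const ROLE_CLASSES = ['day', 'thumbnail', 'description'];

// Relabel the direct children of each forecast container by their
// position inside that container (0, 1, 2), not by a global index.
function relabelForecasts(containers) {
  for (const container of containers) {
    Array.from(container.children).forEach((child, i) => {
      child.removeAttribute('style'); // drop the inline positioning
      if (i < ROLE_CLASSES.length) {
        child.classList.add(ROLE_CLASSES[i]);
      }
    });
  }
}

// In a browser, run it once the DOM has been parsed.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', () => {
    relabelForecasts(
      document.querySelectorAll('.TodaysForecastContainerInner')
    );
  });
}
```

With this variant every forecast gets the same three class names, so a single set of stylesheet rules covers all of them. It leaves the u and a wrappers in place; removing those would need an extra step.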