I am trying to extract all the text from the HTML fragment below using XPath. Here is a snapshot of the HTML:
<div class="grid-block sub-list">
<ul>
<li><a href="http://yellowpages.sulekha.com/music-system-dealers_visakhapatnam" title="Music System Dealers in Visakhapatnam">Music System Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/projector-dealers_visakhapatnam" title="Projector Dealers in Visakhapatnam">Projector Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/satellite-tv-dealers_visakhapatnam" title="Satellite TV Dealers in Visakhapatnam">Satellite TV Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/tv-dealers_visakhapatnam" title="TV Dealers in Visakhapatnam">TV Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/bean-bags-dealers_visakhapatnam" title="Bean Bag Dealers in Visakhapatnam">Bean Bag Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/epabx-pbx-dealers_visakhapatnam" title="EPABX Dealers in Visakhapatnam">EPABX Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/generators-sales_visakhapatnam" title="Generators Dealers in Visakhapatnam">Generators Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/industrial-voltage-stabilizer-manufacturers_visakhapatnam" title="Industrial Voltage Stabilizers Dealers in Visakhapatnam">Industrial Voltage Stabilizers Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/online-ups-dealers_visakhapatnam" title="Online UPS Dealers in Visakhapatnam">Online UPS Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/photocopier-dealers_visakhapatnam" title="Photocopier Dealers in Visakhapatnam">Photocopier Dealers</a>
</li>
</ul>
</div>
Right now I get everything run together on one line:
Music System DealersProjector DealersSatellite TV DealersTV DealersBean Bag DealersEPABX DealersGenerators DealersIndustrial Voltage Stabilizers DealersOnline UPS DealersPhotocopier Dealers
The expected output is:
Music System Dealers
Projector Dealers
Satellite TV Dealers
TV Dealers
Bean Bag Dealers
EPABX Dealers
Generators Dealers
Industrial Voltage Stabilizers Dealers
Online UPS Dealers
Photocopier Dealers
Here is how I tried to extract it:
"".join(response.xpath('//body//*[not(self::script or self::style)]/text()').extract())
Can anyone help me get the expected output?
Answer 0 (score: 0)
Add some CSS like this:
.sub-list li{
float:left;
width:100%;
margin-bottom:5px;
}
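Note that CSS only changes how a browser renders the page; it does not change the strings that XPath returns. The question joins the extracted text nodes with an empty string, so one possible fix is to join them with a newline instead. The following is a minimal sketch of that idea, not a tested Scrapy spider: it uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy's selector (the fragment happens to be well-formed XML), and the `fragment` variable and the shortened three-item list are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the markup from the question (three items only).
# ElementTree is a stand-in here; it works because the fragment is well-formed.
fragment = """
<div class="grid-block sub-list">
<ul>
<li><a href="http://yellowpages.sulekha.com/music-system-dealers_visakhapatnam" title="Music System Dealers in Visakhapatnam">Music System Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/projector-dealers_visakhapatnam" title="Projector Dealers in Visakhapatnam">Projector Dealers</a></li>
<li><a href="http://yellowpages.sulekha.com/satellite-tv-dealers_visakhapatnam" title="Satellite TV Dealers in Visakhapatnam">Satellite TV Dealers</a>
</li>
</ul>
</div>
""".strip()

root = ET.fromstring(fragment)

# Take only the link text inside each <li>, dropping empty or
# whitespace-only strings so they do not produce blank lines.
texts = [a.text.strip() for a in root.findall(".//li/a") if a.text and a.text.strip()]

# Join with a newline instead of an empty string.
output = "\n".join(texts)
print(output)

# Assumed Scrapy equivalent (untested here), keeping the question's API:
# "\n".join(response.xpath('//div[contains(@class, "sub-list")]//li/a/text()').extract())
```

Selecting the `a` text directly, rather than every text node under `body`, also avoids picking up the whitespace between tags.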