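I'm really confused about how to use lxml. I usually use regular expressions because I can extract all the data in one go, but I don't know how to parse these values with lxml: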
data = tree.xpath('//div[@class="featured-item col-xs-12 col-sm-6 col-md-4 col-lg-3 center-block app_730_2"]')
# extract data from div class: featured-item col-xs-12 col-sm-6 col-md-4 col-lg-3 center-block app_730_2
"M4A4 | Poseidon " + "Factory New"
"9462141"
"195.00"
"https://steamcommunity-a.akamaihd.net/economy/image/-9a81dlWLwJ2UUGcVs_nsVtzdOEdtWwKGZZLQHTxDZ7I56KU0Zwwo4NUX4oFJZEHLbXH5ApeO4YmlhxYQknCRvCo04DEVlxkKgpou-6kejhjxszYfi5H5di5mr-HnvD8J_WCkmkEvp0pi7zDodv3jAHj-UM5ZGr7INfHJAc9MlzV-FK_kO281pa_ot2XnrA-A3kA/256fx256f"
"Chroma 2 Case Key"
"9462120"
"2.11"
"https://steamcommunity-a.akamaihd.net/economy/image/-9a81dlWLwJ2UUGcVs_nsVtzdOEdtWwKGZZLQHTxDZ7I56KU0Zwwo4NUX4oFJZEHLbXX7gNTPcUxuxpJSXPbQv2S1MDeXkh6LBBOie3rKFRh16PKd2pDvozixtSOwaP2ar7SlzIA6sEo2rHCpdyhjAGxr0A6MHezetG0RZXdTA/256fx256f"
The HTML I need to parse:
<div class="featured-item col-xs-12 col-sm-6 col-md-4 col-lg-3 center-block app_730_2">
<div>
<a class="glyphicon glyphicon-search market-name market-search-icon opskins-search-button" href="/?loc=shop_search&sort=lh&search_item=M4A4+%7C+Poseidon+%28Factory+New%29" title="Search"></a> <a class="market-name market-link" href="?loc=shop_view_item&item=9462141">
M4A4 | Poseidon
</a>
<div class="item-desc">
<small class="text-muted">Factory New</small>
<small style="color:#777777">Classified Rifle</small>
<small class="item-warning"></small>
</div>
<img class="item-img" src="https://steamcommunity-a.akamaihd.net/economy/image/-9a81dlWLwJ2UUGcVs_nsVtzdOEdtWwKGZZLQHTxDZ7I56KU0Zwwo4NUX4oFJZEHLbXH5ApeO4YmlhxYQknCRvCo04DEVlxkKgpou-6kejhjxszYfi5H5di5mr-HnvD8J_WCkmkEvp0pi7zDodv3jAHj-UM5ZGr7INfHJAc9MlzV-FK_kO281pa_ot2XnrA-A3kA/256fx256f">
<div class="item-add">
<div class="item-amount">$195.00</div>
<div class="market-name" style="padding-bottom:0.3em;"><i class="stm stm-steam" title="Steam Analyst"></i> <a style="color:white;" href="http://csgo.steamanalyst.com/id/115787731/" target="_BLANK">Suggested Price: $258.52</a>
</div>
<div class="item-buttons text-center"><a href="steam://rungame/730/76561202255233023/+csgo_econ_action_preview%20S76561198236464786A5000169384D16322433520890898502" class="btn btn-primary" style="margin-right:4px">Inspect</a>
<button class="btn btn-orange" type="button" id="shopItem" onclick="addToCart(9462141)">Add to Cart</button><span style="margin-left:3px;"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/apps/730/69f7ebe2735c366c65c0b33dae00e12dc40edbe4.jpg" data-appid="730" style="opacity: 0.7; display:inline"></span>
</div>
</div>
</div>
</div>
<div class="featured-item col-xs-12 col-sm-6 col-md-4 col-lg-3 center-block app_730_2">
<div>
<a class="glyphicon glyphicon-search market-name market-search-icon opskins-search-button" href="/?loc=shop_search&sort=lh&search_item=Chroma+2+Case+Key" title="Search"></a> <a class="market-name market-link" href="?loc=shop_view_item&item=9462120">
Chroma 2 Case Key
</a>
<div class="item-desc">
<small class="text-muted"></small>
<small style="color:#777777">Base Grade Key</small>
<small class="item-warning"></small>
</div>
<img class="item-img" src="https://steamcommunity-a.akamaihd.net/economy/image/-9a81dlWLwJ2UUGcVs_nsVtzdOEdtWwKGZZLQHTxDZ7I56KU0Zwwo4NUX4oFJZEHLbXX7gNTPcUxuxpJSXPbQv2S1MDeXkh6LBBOie3rKFRh16PKd2pDvozixtSOwaP2ar7SlzIA6sEo2rHCpdyhjAGxr0A6MHezetG0RZXdTA/256fx256f">
<div class="item-add">
<div class="item-amount">$2.11</div>
<div class="market-name" style="padding-bottom:0.3em;"><i class="stm stm-steam" title="Steam Analyst"></i> <a style="color:white;" href="http://csgo.steamanalyst.com/id/100994798/" target="_BLANK">Suggested Price: $2.70</a>
</div>
<div class="item-buttons text-center">
<button class="btn btn-orange" type="button" id="shopItem" onclick="addToCart(9462120)">Add to Cart</button><span style="margin-left:3px;"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/apps/730/69f7ebe2735c366c65c0b33dae00e12dc40edbe4.jpg" data-appid="730" style="opacity: 0.7; display:inline"></span>
</div>
</div>
</div>
</div>
PS: Do I need to run a for loop over every match of
'//div[@class="featured-item col-xs-12 col-sm-6 col-md-4 col-lg-3 center-block app_730_2"]',
or does lxml return each piece of data as a list?
Answer 0 (score: 1)
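xpath returns a list of matching elements, so you need a for loop to pull the child elements out of each match.
Sample code is below, with the HTML stored in data: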
data ='''<div class="featured-item col-xs-12 col-sm-6 col-md-4 col-lg-3 center-block app_730_2">
<div>
<a class="glyphicon glyphicon-search market-name market-search-icon opskins-search-button" href="/?loc=shop_search&sort=lh&search_item=M4A4+%7C+Poseidon+%28Factory+New%29" title="Search"></a> <a class="market-name market-link" href="?loc=shop_view_item&item=9462141">
M4A4 | Poseidon
</a>
<div class="item-desc">
<small class="text-muted">Factory New</small>
<small style="color:#777777">Classified Rifle</small>
<small class="item-warning"></small>
</div>
<img class="item-img" src="https://steamcommunity-a.akamaihd.net/economy/image/-9a81dlWLwJ2UUGcVs_nsVtzdOEdtWwKGZZLQHTxDZ7I56KU0Zwwo4NUX4oFJZEHLbXH5ApeO4YmlhxYQknCRvCo04DEVlxkKgpou-6kejhjxszYfi5H5di5mr-HnvD8J_WCkmkEvp0pi7zDodv3jAHj-UM5ZGr7INfHJAc9MlzV-FK_kO281pa_ot2XnrA-A3kA/256fx256f">
<div class="item-add">
<div class="item-amount">$195.00</div>
<div class="market-name" style="padding-bottom:0.3em;"><i class="stm stm-steam" title="Steam Analyst"></i> <a style="color:white;" href="http://csgo.steamanalyst.com/id/115787731/" target="_BLANK">Suggested Price: $258.52</a>
</div>
<div class="item-buttons text-center"><a href="steam://rungame/730/76561202255233023/+csgo_econ_action_preview%20S76561198236464786A5000169384D16322433520890898502" class="btn btn-primary" style="margin-right:4px">Inspect</a>
<button class="btn btn-orange" type="button" id="shopItem" onclick="addToCart(9462141)">Add to Cart</button><span style="margin-left:3px;"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/apps/730/69f7ebe2735c366c65c0b33dae00e12dc40edbe4.jpg" data-appid="730" style="opacity: 0.7; display:inline"></span>
</div>
</div>
</div>
</div>
<div class="featured-item col-xs-12 col-sm-6 col-md-4 col-lg-3 center-block app_730_2">
<div>
<a class="glyphicon glyphicon-search market-name market-search-icon opskins-search-button" href="/?loc=shop_search&sort=lh&search_item=Chroma+2+Case+Key" title="Search"></a> <a class="market-name market-link" href="?loc=shop_view_item&item=9462120">
Chroma 2 Case Key
</a>
<div class="item-desc">
<small class="text-muted"></small>
<small style="color:#777777">Base Grade Key</small>
<small class="item-warning"></small>
</div>
<img class="item-img" src="https://steamcommunity-a.akamaihd.net/economy/image/-9a81dlWLwJ2UUGcVs_nsVtzdOEdtWwKGZZLQHTxDZ7I56KU0Zwwo4NUX4oFJZEHLbXX7gNTPcUxuxpJSXPbQv2S1MDeXkh6LBBOie3rKFRh16PKd2pDvozixtSOwaP2ar7SlzIA6sEo2rHCpdyhjAGxr0A6MHezetG0RZXdTA/256fx256f">
<div class="item-add">
<div class="item-amount">$2.11</div>
<div class="market-name" style="padding-bottom:0.3em;"><i class="stm stm-steam" title="Steam Analyst"></i> <a style="color:white;" href="http://csgo.steamanalyst.com/id/100994798/" target="_BLANK">Suggested Price: $2.70</a>
</div>
<div class="item-buttons text-center">
<button class="btn btn-orange" type="button" id="shopItem" onclick="addToCart(9462120)">Add to Cart</button><span style="margin-left:3px;"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/apps/730/69f7ebe2735c366c65c0b33dae00e12dc40edbe4.jpg" data-appid="730" style="opacity: 0.7; display:inline"></span>
</div>
</div>
</div>
</div>'''
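import lxml.html

# Parse the HTML string into an element tree
html = lxml.html.fromstring(data)
divs = html.xpath('//div[@class="featured-item col-xs-12 col-sm-6 col-md-4 col-lg-3 center-block app_730_2"]')
for x in divs:
    # Item name: first non-empty text node among the <a> tags
    a = x.xpath('.//a/text()')[0]
    print(a.strip())
    # Wear/condition: the list is empty when the <small> tag has no text
    small = x.xpath('.//small[@class="text-muted"]/text()')
    if small:
        print(small[0])
    # Price, including the leading "$"
    div = x.xpath('.//div[@class="item-amount"]/text()')[0]
    print(div)
    # Item id: the value after the last "=" in the second link's href
    a_href = x.xpath('.//a/@href')
    item = a_href[1].split('=')[-1]
    print(item)
    # Image URL
    img = x.xpath('.//img[@class="item-img"]/@src')[0]
    print(img)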
Output:
M4A4 | Poseidon
Factory New
$195.00
9462141
https://steamcommunity-a.akamaihd.net/economy/image/-9a81dlWLwJ2UUGcVs_nsVtzdOEdtWwKGZZLQHTxDZ7I56KU0Zwwo4NUX4oFJZEHLbXH5ApeO4YmlhxYQknCRvCo04DEVlxkKgpou-6kejhjxszYfi5H5di5mr-HnvD8J_WCkmkEvp0pi7zDodv3jAHj-UM5ZGr7INfHJAc9MlzV-FK_kO281pa_ot2XnrA-A3kA/256fx256f
Chroma 2 Case Key
$2.11
9462120
https://steamcommunity-a.akamaihd.net/economy/image/-9a81dlWLwJ2UUGcVs_nsVtzdOEdtWwKGZZLQHTxDZ7I56KU0Zwwo4NUX4oFJZEHLbXX7gNTPcUxuxpJSXPbQv2S1MDeXkh6LBBOie3rKFRh16PKd2pDvozixtSOwaP2ar7SlzIA6sEo2rHCpdyhjAGxr0A6MHezetG0RZXdTA/256fx256f
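If you want the values collected per item (closer to the format in the question, with the "$" removed) rather than printed, the same loop can build a list of dicts. This is only a rough sketch: parse_items and SAMPLE are names made up for illustration, the sample HTML is a trimmed stand-in for the real page with example.com image URLs, and contains() is swapped in for the exact class match so the long class string does not have to be repeated (note that contains() is a substring test, so it would also match a class such as "featured-item-x").

```python
import re
import lxml.html

# Trimmed, hypothetical sample that mirrors the structure of the page above
# (image URLs replaced with example.com placeholders).
SAMPLE = '''
<div>
<div class="featured-item col-xs-12 center-block app_730_2">
  <a class="market-name market-search-icon" href="/?loc=shop_search" title="Search"></a>
  <a class="market-name market-link" href="?loc=shop_view_item&amp;item=9462141">
    M4A4 | Poseidon
  </a>
  <div class="item-desc"><small class="text-muted">Factory New</small></div>
  <img class="item-img" src="https://example.com/poseidon.png">
  <div class="item-amount">$195.00</div>
</div>
<div class="featured-item col-xs-12 center-block app_730_2">
  <a class="market-name market-search-icon" href="/?loc=shop_search" title="Search"></a>
  <a class="market-name market-link" href="?loc=shop_view_item&amp;item=9462120">
    Chroma 2 Case Key
  </a>
  <div class="item-desc"><small class="text-muted"></small></div>
  <img class="item-img" src="https://example.com/key.png">
  <div class="item-amount">$2.11</div>
</div>
</div>
'''


def parse_items(html_text):
    """Return one dict per featured-item div (hypothetical helper)."""
    tree = lxml.html.fromstring(html_text)
    items = []
    # contains() is a substring test on the class attribute (an assumption
    # made here for brevity; the answer above matches the full class string).
    for div in tree.xpath('//div[contains(@class, "featured-item")]'):
        link = div.xpath('.//a[contains(@class, "market-link")]')[0]
        # string(...) yields '' instead of an empty list when nothing matches
        wear = div.xpath('string(.//small[@class="text-muted"])').strip()
        price = div.xpath('string(.//div[@class="item-amount"])').strip()
        match = re.search(r'item=(\d+)', link.get('href'))
        items.append({
            'name': link.text_content().strip(),
            'wear': wear,
            'id': match.group(1) if match else None,
            'price': price.lstrip('$'),
            'image': div.xpath('string(.//img[@class="item-img"]/@src)'),
        })
    return items


items = parse_items(SAMPLE)
for item in items:
    print(item)
```

Using string(...) inside the XPath avoids the "index into a possibly empty list" problem, which is why the empty text-muted tag on the second item simply comes back as an empty string.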