Question

I have this XPath, which returns a list of selectors.

for i in response.xpath('//*[name()="h2" or name()="h3" or name()="p"]'):
    print(i)

Result:

<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<h3 class="fusion-header-tagline"><img s'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<h3 class="features-title role-element l'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<h2 style="text-align: center;">Sell you'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<p>We buy properties in any shape, any p'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<p>Attempting to sell your house in Marl'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<h2 style="text-align: center;"><span st'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<p><img class="aligncenter wp-image-1439'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<h3><span style="color: #000000;">No com'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<h3><span style="color: #000000;">You do'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<h3><span style="color: #000000;">We wil'>
<Selector xpath='//*[name()="h2" or name()="h3" or name()="p"]' data=u'<h3><span style="color: #000000;">No lis'>

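How do I get the tag name of each selector, i.e. h3, h3, h2, p, p, h2, and so on? I tried

print(name(i))
print(i.name())

but neither works. What is the correct way to use the XPath name() function to get the tag name?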

Answer 1

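Change your code to this:

for i in response.xpath('//*[name()="h2" or name()="h3" or name()="p"]'):
    print(i.xpath('name()').get())

This evaluates name() relative to each element matched by the first XPath. Calling .get() (extract_first() in older Scrapy versions) returns the tag name as a plain string rather than a list of selectors.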

Answer 2

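Selectors in Scrapy do not actually represent nodes in the HTML tree; they are better thought of as an abstraction wrapping the result of an XPath or CSS query. That is why they have no notion of a tag name or attributes of their own. You can, however, easily reach the node underlying a selector through its root attribute, which for an element match is an lxml element:

for i in response.xpath('//*[name()="h2" or name()="h3" or name()="p"]'):
    print(i.root.tag)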

XPath: how to get the tag name of a given selector (Scrapy)