Question

I have a list of image names, taken from images on a website:

image1.jpg
image2.jpg
image3.jpg

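I want to associate these images with the text in the <p> that follows them in the HTML below. So in the example below, I want to associate image1.jpg and image2.jpg with "The Federal Department of Fish".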

How can I do this with XPath (or something else)?

<td> 
    <p align = "center">
        <a href "http://imagessite.gov" target = "_blank">
            <img src = "image1.jpg" width = "100" height = "60" alt = "description">
            <img src = "image2.jpg" width = "100" height = "60" alt = "a purple ant">
        </a>
    </p>
    <p align = "center">
        <img src = "globe.gif">
        <a href = "http://imagesite.gov" target = "blank"> The Federal Department of Fish</a>
    </p>
</td>

Answer 1

a ='''<td> 
    <p align = "center">
        <a href "http://imagessite.gov" target = "_blank">
            <img src = "image1.jpg" width = "100" height = "60" alt = "description">
            <img src = "image2.jpg" width = "100" height = "60" alt = "a purple ant">
        </a>
    </p>
    <p align = "center">
        <img src = "globe.gif">
        <a href = "http://imagesite.gov" target = "blank"> The Federal Department of Fish</a>
    </p>
</td>'''

I have stored the HTML you gave us in a string; the rest of the code should look like this:

from bs4 import BeautifulSoup

soup = BeautifulSoup(a, 'lxml')
images = soup.find_all('img')  # find every <img> tag

for tag in images:  # loop over the <img> tags found above
    if tag['src'].endswith('.jpg'):  # keep only sources that end with .jpg
        print(tag['src'])

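As for the association part, I think this is what you mean, and I will add that part later. As I understand the question, if we look up image1.jpg, for example, we want the text 'The Federal Department of Fish' to be associated with it.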
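Since the question asks about XPath, that lookup could be sketched with lxml roughly as follows. This is only a sketch under a few assumptions: the HTML is a copy of the question's snippet with the missing `=` after the first `href` filled in, the text always sits in the first `<p>` sibling after the `<p>` that contains the image, and the helper name `text_after_image` is made up for illustration.

```python
from lxml import html

# Copy of the question's HTML (assumption: the first href gets its missing "=").
page = '''<td>
    <p align = "center">
        <a href = "http://imagessite.gov" target = "_blank">
            <img src = "image1.jpg" width = "100" height = "60" alt = "description">
            <img src = "image2.jpg" width = "100" height = "60" alt = "a purple ant">
        </a>
    </p>
    <p align = "center">
        <img src = "globe.gif">
        <a href = "http://imagesite.gov" target = "blank"> The Federal Department of Fish</a>
    </p>
</td>'''

tree = html.fromstring(page)


def text_after_image(tree, name):
    """Hypothetical helper: text of the first <p> after the <p> holding the image."""
    # string(...) yields '' when nothing matches, so unknown images give ''.
    expr = 'string(//p[.//img[@src=$src]]/following-sibling::p[1])'
    return tree.xpath(expr, src=name).strip()


for name in ['image1.jpg', 'image2.jpg', 'image3.jpg']:
    print(name, '->', repr(text_after_image(tree, name)))
```

Passing the image name as an XPath variable (`$src`) avoids building the expression by string concatenation, which would break on names that contain quotes.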

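I think a dict would be a good fit for that. However, I tried, for example, tag.parent.parent.next_sibling and it did not work. I will look into it and edit my answer later.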
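One likely reason `tag.parent.parent.next_sibling` does not give the expected result is that `.next_sibling` returns the very next node, which here is the whitespace text between the two `<p>` tags rather than the second `<p>` itself. A minimal sketch of the dict idea that uses `find_parent` and `find_next_sibling` instead is shown below. The names `image_names` and `associations` are made up for illustration, the HTML is the question's snippet with the missing `=` after the first `href` filled in, and the built-in `html.parser` is swapped in for `lxml` so that only BeautifulSoup is required; `lxml` should behave the same for this snippet.

```python
from bs4 import BeautifulSoup

# Copy of the question's HTML (assumption: the first href gets its missing "=").
page = '''<td>
    <p align = "center">
        <a href = "http://imagessite.gov" target = "_blank">
            <img src = "image1.jpg" width = "100" height = "60" alt = "description">
            <img src = "image2.jpg" width = "100" height = "60" alt = "a purple ant">
        </a>
    </p>
    <p align = "center">
        <img src = "globe.gif">
        <a href = "http://imagesite.gov" target = "blank"> The Federal Department of Fish</a>
    </p>
</td>'''

soup = BeautifulSoup(page, 'html.parser')

# Hypothetical list of image names we want to look up.
image_names = ['image1.jpg', 'image2.jpg', 'image3.jpg']

associations = {}
for img in soup.find_all('img', src=True):
    if img['src'] not in image_names:
        continue  # skip images we were not asked about, e.g. globe.gif
    paragraph = img.find_parent('p')  # the <p> that contains the image
    if paragraph is None:
        continue
    # find_next_sibling('p') skips whitespace text nodes between tags
    following = paragraph.find_next_sibling('p')
    if following is not None:
        associations[img['src']] = following.get_text(strip=True)

print(associations)
```

Images from the list that do not appear in the page, such as image3.jpg here, are simply absent from the resulting dict.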
