Finding all image links on Imgur with BeautifulSoup

Date: 2017-06-17 06:21:44

Tags: python beautifulsoup

I can't find all the links in an Imgur album.

This is the HTML from Imgur:

<div class="post-image">...
<a href="//i.imgur.com/P1VMco8.png" class="zoom"><img src="//i.imgur.com/P1VMco8.png" alt="" itemprop="contentURL" />

How do I extract only the href values from the page? With the code below I get the whole <a> elements, not just the links.

import urllib.request
from bs4 import BeautifulSoup

with urllib.request.urlopen('https://imgur.com/a/OmD1E') as f:
    r = f.read()
    soup = BeautifulSoup(r, 'lxml')
    result = soup.select(".post-image a")

1 answer:

Answer 0 (score: 1)

The following code prints all the image links:

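import urllib.request  # "import urllib" alone does not reliably expose urllib.request
from bs4 import BeautifulSoup

with urllib.request.urlopen('https://imgur.com/a/OmD1E') as f:
    soup = BeautifulSoup(f.read(), 'lxml')

# Each .post-image block wraps its image in an <a>; print that anchor's href.
for image in soup.select(".post-image"):
    print(image.a["href"])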
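
The hrefs in the question's HTML are protocol-relative (`//i.imgur.com/...`), and `image.a` is `None` for any `.post-image` block that has no anchor. A minimal sketch of one way to handle both, assuming the page markup matches the snippet in the question: select the anchors directly with `a[href]` and resolve each href with `urllib.parse.urljoin`. The inline `html` string, the second `.post-image` block, and the helper name `extract_image_links` are illustrative assumptions, not part of the original answer; `html.parser` is used only so the sketch does not depend on lxml.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Sample markup modeled on the snippet in the question (an assumption;
# the live page may differ). The second block is a made-up case with no <a>.
html = """
<div class="post-image">
  <a href="//i.imgur.com/P1VMco8.png" class="zoom">
    <img src="//i.imgur.com/P1VMco8.png" alt="" itemprop="contentURL" />
  </a>
</div>
<div class="post-image"><img src="//i.imgur.com/P1VMco8.png" alt="" /></div>
"""


def extract_image_links(markup, base_url="https://imgur.com/"):
    """Return absolute hrefs of anchors found inside .post-image elements."""
    soup = BeautifulSoup(markup, "html.parser")
    # "a[href]" matches only anchors that carry an href attribute,
    # so blocks without a link are skipped instead of raising an error.
    anchors = soup.select(".post-image a[href]")
    # urljoin fills in the scheme for protocol-relative URLs.
    return [urljoin(base_url, a["href"]) for a in anchors]


links = extract_image_links(html)
print(links)
```

Passing the bytes read from `urllib.request.urlopen(...)` to the same helper should work the same way, provided the fetched page actually contains this markup.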

If you only want the link from the first .post-image, use:

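import urllib.request  # "import urllib" alone does not reliably expose urllib.request
from bs4 import BeautifulSoup

with urllib.request.urlopen('https://imgur.com/a/OmD1E') as f:
    soup = BeautifulSoup(f.read(), 'lxml')

# [0] takes the first .post-image block; it raises IndexError if none matched.
print(soup.select(".post-image")[0].a["href"])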
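
Indexing with `[0]` raises `IndexError` when nothing matches. As a hedged alternative, Beautiful Soup's `select_one` returns the first match or `None`, which allows an explicit check. The sketch below runs on an inline HTML string modeled on the question's snippet; the helper name `first_image_link` is hypothetical and `html.parser` is chosen only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the snippet in the question (an assumption).
html = (
    '<div class="post-image">'
    '<a href="//i.imgur.com/P1VMco8.png" class="zoom"></a>'
    "</div>"
)


def first_image_link(markup):
    """Return the href of the first .post-image anchor, or None if absent."""
    soup = BeautifulSoup(markup, "html.parser")
    # select_one returns a single Tag, or None when the selector matches nothing.
    anchor = soup.select_one(".post-image a[href]")
    return anchor["href"] if anchor is not None else None


first = first_image_link(html)
missing = first_image_link("<p>no images here</p>")
print(first, missing)
```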