I can't find all the links in an Imgur album.
This is the HTML from Imgur:
<div class="post-image">...
<a href="//i.imgur.com/P1VMco8.png" class="zoom"><img src="//i.imgur.com/P1VMco8.png" alt="" itemprop="contentURL" />
How do I extract only the href from the page? The code below gives me everything.
import urllib.request
from bs4 import BeautifulSoup

with urllib.request.urlopen('https://imgur.com/a/OmD1E') as f:
    r = f.read()
soup = BeautifulSoup(r, 'lxml')
result = soup.select(".post-image a")  # list of whole <a> tags, not just their href
Answer 0: (score: 1)
The following code prints all the image links:
import urllib.request
from bs4 import BeautifulSoup

with urllib.request.urlopen('https://imgur.com/a/OmD1E') as f:
    soup = BeautifulSoup(f.read(), 'lxml')
# Each .post-image block wraps its image in an <a> tag; print that tag's href.
for image in soup.select(".post-image"):
    print(image.a["href"])
If you are only looking for the first .post-image, use:
import urllib.request
from bs4 import BeautifulSoup

with urllib.request.urlopen('https://imgur.com/a/OmD1E') as f:
    soup = BeautifulSoup(f.read(), 'lxml')
# select() returns a list; index 0 is the first .post-image block.
print(soup.select(".post-image")[0].a["href"])
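The href values in the question's HTML are protocol-relative (they start with //), and a .post-image block without an <a> child would make image.a["href"] fail. The sketch below is one possible way to handle both, not a definitive implementation. It runs offline against a small sample modelled on the question's snippet; SAMPLE_HTML, its second block, and the helper name extract_image_links are made up for illustration, and it assumes the live page still serves markup like the one quoted in the question. It uses the built-in "html.parser" backend so that lxml is not required.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical stand-in for the album page, modelled on the question's snippet.
# The second block has no <a> tag and is invented to show the edge case.
SAMPLE_HTML = """
<div class="post-image">
  <a href="//i.imgur.com/P1VMco8.png" class="zoom">
    <img src="//i.imgur.com/P1VMco8.png" alt="" itemprop="contentURL" />
  </a>
</div>
<div class="post-image">
  <img src="//i.imgur.com/no-link.png" alt="" />
</div>
"""


def extract_image_links(html, base_url):
    """Return absolute URLs for every <a href> found inside a .post-image block."""
    soup = BeautifulSoup(html, "html.parser")
    # "a[href]" only matches anchors that actually carry an href attribute,
    # so blocks without a link are skipped instead of raising an error.
    anchors = soup.select(".post-image a[href]")
    # urljoin resolves "//host/path" against the scheme of base_url.
    return [urljoin(base_url, a["href"]) for a in anchors]


links = extract_image_links(SAMPLE_HTML, "https://imgur.com/a/OmD1E")
print(links)
```

To use this on the real album, pass the bytes returned by f.read() as the html argument instead of SAMPLE_HTML.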