I have the following script, which finds an image on a page and downloads it:
from lxml import html
import urllib
import urllib2
url = 'http://www.example.com/pages/page0987/'
usock = urllib2.urlopen(url)
data = usock.read()
usock.close()
tree = html.fromstring(data)
src = tree.xpath('/html/body/div[2]/div[4]/div/div/img/@src')
urllib.urlretrieve(src, "local-filename.jpg")
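I fetch a web page, locate the <img> element on it with an XPath query, read that element's src attribute, and then try to download the image from that URL.
But something goes wrong; Python says: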
Traceback (most recent call last):
  File "C:\Users\Sergey\Desktop\dlImg.py", line 15, in <module>
    urllib.urlretrieve(src, "local-filename.jpg")
  File "C:\Python27\lib\urllib.py", line 94, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "C:\Python27\lib\urllib.py", line 228, in retrieve
    url = unwrap(toBytes(url))
  File "C:\Python27\lib\urllib.py", line 1060, in unwrap
    url = url.strip()
AttributeError: 'list' object has no attribute 'strip'
Answer 0 (score: 2)
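Your tree.xpath() query returns a list of matches, not a single match, so urlretrieve() receives a list where it expects a string. At a minimum, index the first item: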
urllib.urlretrieve(src[0], "local-filename.jpg")
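Or loop over the results. Keep in mind that the list can also be empty (no matches found), in which case src[0] raises an IndexError.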
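The loop and the empty-list case could be handled together along these lines. This is only a sketch: plan_downloads is a hypothetical helper name, not part of lxml or urllib, and resolving relative src values with urljoin is an extra assumption that the question's page may not need.

```python
import posixpath

try:                                   # Python 3
    from urllib.parse import urljoin, urlsplit
except ImportError:                    # Python 2, as used in the question
    from urlparse import urljoin, urlsplit


def plan_downloads(page_url, srcs, default_name="local-filename.jpg"):
    """Turn the list returned by tree.xpath(...) into (url, filename) pairs.

    Hypothetical helper: an empty list yields an empty result, so the
    caller never indexes into a list that has no matches.
    """
    jobs = []
    for src in srcs:
        # Assumption: src may be relative, so resolve it against the page URL.
        absolute = urljoin(page_url, src.strip())
        # Use the last path segment as the local name, or fall back.
        name = posixpath.basename(urlsplit(absolute).path) or default_name
        jobs.append((absolute, name))
    return jobs
```

In the original script this would be used as "for image_url, name in plan_downloads(url, src): urllib.urlretrieve(image_url, name)". When the XPath query matches nothing, the loop body simply never runs. Images that share a file name would overwrite each other, so a real script may want to make the names unique.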