I am having trouble trimming the URL I get back from BeautifulSoup. I used this code to retrieve it:
import urllib2
from bs4 import BeautifulSoup
url = 'http://192.168.0.184:88/cgi-bin/CGIProxy.fcgi?cmd=snapPicture&usr=USER&pwd=PASS'
html = urllib2.urlopen(url)
soup = BeautifulSoup(html, "html5lib")
imgs = soup.findAll("img")
print imgs
print imgs[1:]
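The output of print imgs is [<img src="../snapPic/Snap_20160401-110642.jpg"/>]
I want to strip the unwanted characters from this string, so I tried, for example, print imgs[1:], but the result is [].
Any hints or solutions?
I want to rebuild the imgs string into the correct image URL.
imgs string = <img src="../snapPic/Snap_20160401-110642.jpg"/>
Desired result = http://192.168.0.184:88/snapPic/Snap_20160401-110642.jpg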
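Answer 0 (score: 1)
imgs is a list of tag objects, not a string, so imgs[1:] slices the list and, since it holds only one element, returns []. Read the src attribute of each tag instead. Try this: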
import urllib2
from bs4 import BeautifulSoup
url = 'http://192.168.0.184:88/cgi-bin/CGIProxy.fcgi?cmd=snapPicture&usr=USER&pwd=PASS'
html = urllib2.urlopen(url)
soup = BeautifulSoup(html, "html5lib")
imgs = soup.findAll("img")
print imgs
# Each item is a tag; take its src attribute and swap the leading ".." for the host.
for img in imgs:
    print img["src"].replace("..", "http://192.168.0.184:88")
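The replace call above works for this particular page, but it hard-codes the host and depends on the src starting with exactly one "..". A more general option is to resolve the relative path against the page URL with the standard library's urljoin. The sketch below is only an illustration: resolve_image_urls is a hypothetical helper name, the src list is typed in by hand rather than read from imgs, and the page URL is written without the space after "?" on the assumption that the space is a transcription artifact.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2, as used in the code above


def resolve_image_urls(page_url, srcs):
    """Resolve each (possibly relative) src value against the page URL.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    return [urljoin(page_url, src) for src in srcs]


# Page URL from the question (assumed to contain no space after "?").
page_url = "http://192.168.0.184:88/cgi-bin/CGIProxy.fcgi?cmd=snapPicture&usr=USER&pwd=PASS"

# In the real script this list would be built as [img["src"] for img in imgs].
srcs = ["../snapPic/Snap_20160401-110642.jpg"]

image_urls = resolve_image_urls(page_url, srcs)
```

Here image_urls holds one entry, http://192.168.0.184:88/snapPic/Snap_20160401-110642.jpg, because urljoin drops the query string and the last path segment of the page URL before applying the "..".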