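I'm having a nightmare. I built a desktop scraper with Beautiful Soup and it works a treat, but adding it to the base code is a real nightmare: I keep getting invalid syntax errors, and frankly I don't know where to start with importing it.
This is my desktop code: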
import urllib.request

from bs4 import BeautifulSoup

# Assumption: the post never defines `opener`; a default urllib opener stands in for it.
opener = urllib.request.build_opener()

url = input("Enter the direct url for the Tv Show you wish to pull: ")
tvname = input("Enter the name of the TV Show: ")
ui = tvname + '.xml'
tv_urls = []
newfile = open(ui, "w")


def get_soup(url):
    # Download a page and parse it with the built-in HTML parser.
    response = opener.open(url)
    page = response.read()
    soup = BeautifulSoup(page, "html.parser")
    return soup


# Collect the links from the entry cells of the starting page.
soup = get_soup(url)
seasonepisode = (soup.find_all('td', {'width': '100%'})[-2].string)
cols = soup.find_all('td', {'width': '100%', 'class': 'entry'})
all_links = [col.find('a').get('href') for col in cols]
tv_urls.extend(all_links)

# Visit each collected page and write one <item> block for it.
for url in tv_urls:
    soup = get_soup(url)
    title = soup.title.string
    thumbnail = soup.select_one('td.summary img[src]')['src']
    cols = soup.find_all('td', {'width': '100%', 'class': 'entry'})
    # The first entry link on each page is skipped.
    all_links = [col.find('a').get('href') for col in cols][1:]
    string = '<item>\n<title>[COLOR lime]' + title + '[/COLOR]</title>\n'
    for link in all_links:
        string = string + '<link>' + link + '</link>\n'
    string = string + '<thumbnail>' + thumbnail + '</thumbnail>\n<fanart> </fanart>\n</item>\n\n'
    newfile.write(string)
    print((title + ' Tv links scraped'))

print('Done Master Nemzzy')
newfile.close()
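
One way to make the script easier to move into other code is to pull the string building out of the loop into a small function that needs no network access. The sketch below is not part of the original post: `build_item` is a hypothetical helper name, and the sample title and URLs are made up purely to show the output shape. It produces the same `<item>` layout that the loop writes to the file.

```python
def build_item(title, links, thumbnail):
    """Return one <item> block in the same layout the scraper writes."""
    parts = ['<item>\n<title>[COLOR lime]' + title + '[/COLOR]</title>\n']
    for link in links:
        parts.append('<link>' + link + '</link>\n')
    parts.append('<thumbnail>' + thumbnail + '</thumbnail>\n'
                 '<fanart> </fanart>\n</item>\n\n')
    return ''.join(parts)


# Hypothetical sample values, used only to illustrate the output.
example = build_item(
    'Some Show S01E01',
    ['http://example.com/a', 'http://example.com/b'],
    'http://example.com/thumb.jpg',
)
print(example)
```

Note that, like the original loop, this does not escape characters such as `&` or `<` in the title or links, so the output is only well-formed XML when the scraped values contain none of them.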
Answer 0 (score: 1)
You have to use Python 2 and import the dependencies through addon.xml.
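
To illustrate what running under Python 2 changes for the posted script, here is a minimal sketch that picks the right names for either interpreter. In Python 2 the URL opener lives in `urllib2` rather than `urllib.request`, and `input()` evaluates whatever is typed as a Python expression, so typing a URL raises a SyntaxError; `raw_input()` returns the text unchanged. Whether that is the error seen in the question is only a guess. The names `request_module` and `read_line` and the User-Agent header are assumptions for this example, not something from the original post.

```python
import sys

# Choose names that exist on the interpreter actually running the script.
if sys.version_info[0] == 2:
    import urllib2 as request_module          # Python 2 module name
    read_line = raw_input                     # returns the typed text as-is
else:
    import urllib.request as request_module   # Python 3 module name
    read_line = input                         # Python 3 input() returns text

# Assumption: the question's `opener` was a plain opener with a browser-like header.
opener = request_module.build_opener()
opener.addheaders = [('User-Agent', 'Mozilla/5.0')]
```

With this in place, `read_line("Enter the direct url ...")` and `opener.open(url)` can be used the same way on both versions.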
Answer 1 (score: 0)
Have you imported the bs4 module in your addon.xml? If not, you need an import for script.module.bs4 in addon.xml, as follows:
<requires>
    <import addon="script.module.beautifulSoup4" version="3.3.0"/>
</requires>
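
As a quick sanity check, the declared dependencies can be listed from an addon.xml with the standard library. This is only a sketch: `declared_imports` is a hypothetical helper, and the surrounding `<addon>` element with its id, name, and version is invented for the example. The import id presumably has to match the id of the module that is actually installed, and the text of this answer mentions script.module.bs4 while the XML uses script.module.beautifulSoup4, so it is worth checking which one is present.

```python
import xml.etree.ElementTree as ET


def declared_imports(addon_xml_text):
    """Return the addon ids listed in <import> elements of an addon.xml string."""
    root = ET.fromstring(addon_xml_text)
    return [imp.get('addon') for imp in root.iter('import')]


# Hypothetical addon.xml content; only the <requires> block comes from the answer.
sample = '''<addon id="plugin.video.example" name="Example" version="0.0.1">
  <requires>
    <import addon="script.module.beautifulSoup4" version="3.3.0"/>
  </requires>
</addon>'''

print(declared_imports(sample))
```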