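I am trying to run the same code over and over against three different websites. I would like to know how to change the website input and the Excel output for each of the three sites.
So I would fetch each website in the list, then export each result in the order of the list: Sports.xlsx, Entertainment.xlsx, News.xlsx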
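import re
import pandas as pd
import requests
import xlsxwriter  # Excel writer engine that pandas can use for .xlsx output
from bs4 import BeautifulSoup

websites = ["https://news.google.com/news/section?topic=s", "https://news.google.com/news/section?topic=e", "https://news.google.com/"]
for x in websites:
    website = requests.get(x)
    soup = BeautifulSoup(website.content, "lxml")
    # Join the text of the top-level children of <body>, skipping <script> tags
    text = ''.join([element.text for element in soup.body.find_all(lambda tag: tag.name != 'script', recursive=False)])
    # Keep only letters, spaces and newlines
    new = re.sub(r'[^a-zA-Z \n]', '', text)
    # A str has no to_excel method, so wrap the text in a DataFrame first
    new = pd.DataFrame({"text": [new]})
    if x == "https://news.google.com/news/section?topic=s":
        new.to_excel('sports.xlsx', index=False)
    elif x == "https://news.google.com/news/section?topic=e":
        new.to_excel('entertainment.xlsx', index=False)
    elif x == "https://news.google.com/":
        new.to_excel('news.xlsx', index=False)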
Answer 0 (score: 2)
Simply make your list a list of tuples in the following format:
websites = [(link, file_object)]  # one (link, output file) tuple per website
for link, file_object in websites:  # Unpacks the tuple for each element in the list
    # open the link, then write the website's text to file_object
    ...
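To make the tuple format concrete, here is a minimal sketch that pairs each URL from the question with its output file. The names `WEBSITES`, `clean`, `fetch_text`, `write_excel` and `export_all` are hypothetical, not part of the original answer. The sketch also assumes plain file names rather than open file objects, writes one spreadsheet row per line of text, and assumes an Excel engine such as xlsxwriter or openpyxl is installed for pandas. The third-party imports are placed inside the helpers so the pairing logic can be tried without them.

```python
import re

# (link, output file) pairs, in the order listed in the question
WEBSITES = [
    ("https://news.google.com/news/section?topic=s", "sports.xlsx"),
    ("https://news.google.com/news/section?topic=e", "entertainment.xlsx"),
    ("https://news.google.com/", "news.xlsx"),
]


def clean(text):
    """Keep only ASCII letters, spaces and newlines, as in the question."""
    return re.sub(r"[^a-zA-Z \n]", "", text)


def fetch_text(link):
    """Download a page and join the text of <body>'s top-level non-script tags."""
    # Imported here (an assumption of this sketch) so the loop below
    # can be exercised without requests/bs4 installed.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(link, timeout=10)
    soup = BeautifulSoup(response.content, "lxml")
    children = soup.body.find_all(lambda tag: tag.name != "script", recursive=False)
    return "".join(element.text for element in children)


def write_excel(text, filename):
    """Write the text to an .xlsx file, one row per line (an assumed layout)."""
    import pandas as pd

    pd.DataFrame({"text": text.splitlines()}).to_excel(filename, index=False)


def export_all(websites, fetch=fetch_text, write=write_excel):
    """Fetch every link and write its cleaned text to the paired file."""
    written = []
    for link, filename in websites:  # unpacks each (link, filename) tuple
        write(clean(fetch(link)), filename)
        written.append(filename)
    return written
```

Calling `export_all(WEBSITES)` would then fetch the three pages in order and write one file per page, with no `if`/`elif` chain. Passing `fetch` and `write` as parameters is only a convenience for trying the loop without network access; adding a fourth site means adding one more tuple to the list.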