I have written a Python script to scrape some table content from a webpage and write it to a csv file. What I want now is for the script to write the content to the csv file only when the table (shown as Top Mutual Fund Holders) is available on that page; otherwise it should delete the csv file it has already created.
The table is available on this webpage.
The table I'm looking for is not available on this webpage.
This is my attempt:
import os
import csv
import requests
from bs4 import BeautifulSoup

url = "https://finance.yahoo.com/quote/UBER/holders?p=UBER"

def get_mutual_fund(soup):
    datalist = []
    for items in soup.select_one("h3:contains('Top Mutual Fund Holders')").find_next_sibling().select("table tr"):
        data = [item.text for item in items.select("th,td")]
        datalist.append(data)
    return datalist

def get_records(link):
    r = requests.get(link)
    soup_obj = BeautifulSoup(r.text,"lxml")
    try:
        item_one = get_mutual_fund(soup_obj)
    except AttributeError:
        item_one = ""
    if item_one:
        writer.writerows(item_one)
    else:
        os.remove("mutual_fund.csv")
    return item_one

if __name__ == '__main__':
    with open("mutual_fund.csv","w",newline="") as f:
        writer = csv.writer(f)
        for elem in get_records(url):
            print(elem)
I tried it with a link that does not have the table. However, it raises the following error while deleting the csv file:
Traceback (most recent call last):
  File "C:\Users\WCS\AppData\Local\Programs\Python\Python37-32\demo.py", line 33, in <module>
    for elem in get_records(url):
  File "C:\Users\WCS\AppData\Local\Programs\Python\Python37-32\demo.py", line 27, in get_records
    os.remove("mutual_fund.csv")
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'mutual_fund.csv'
How can I delete the csv file when the table content is not present?
Answer 0 (score: 1)
You are trying to delete the file while it is still open. You should change your main function accordingly:
def get_records(link):
    r = requests.get(link)
    soup_obj = BeautifulSoup(r.text,"lxml")
    try:
        item_one = get_mutual_fund(soup_obj)
    except AttributeError:
        item_one = None
    return item_one

if __name__ == '__main__':
    delete_file = False
    with open("mutual_fund.csv","w",newline="") as f:
        writer = csv.writer(f)
        try:
            for elem in get_records(url):
                print(elem)
        except TypeError:
            delete_file = True
    if delete_file:
        os.remove("mutual_fund.csv")
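A variant of the same idea that sidesteps the PermissionError entirely is to collect the rows first and only create the csv file when there is something to write, so nothing ever has to be deleted while open. A minimal, stdlib-only sketch of that pattern (the scraping step is stubbed out with sample data; in the real script the rows would come from `get_mutual_fund()`):

```python
import os
import csv

def write_rows_if_any(rows, path):
    """Create the csv only when rows exist; otherwise remove any stale file."""
    if rows:
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(rows)
    elif os.path.exists(path):
        os.remove(path)

# Stand-in for scraped data; replace with the result of get_mutual_fund().
sample = [["Holder", "Shares"], ["Vanguard Total Stock Market Index Fund", "1000"]]
write_rows_if_any(sample, "mutual_fund.csv")  # file is created
write_rows_if_any([], "mutual_fund.csv")      # empty result: file is removed
```

Because the file is only opened inside `write_rows_if_any`, the handle is always closed before any `os.remove` call can run.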
Answer 1 (score: 1)
If you keep your existing logic unchanged and delete the file when the content for the csv is empty, you should do something like the following:
import os
import csv
import requests
from bs4 import BeautifulSoup

# url = "https://finance.yahoo.com/quote/fb/holders?p=FB"
url = "https://finance.yahoo.com/quote/UBER/holders?p=UBER"

def get_mutual_fund(soup):
    datalist = []
    for items in soup.select_one("h3:contains('Top Mutual Fund Holders')").find_next_sibling().select("table tr"):
        data = [item.text for item in items.select("th,td")]
        datalist.append(data)
    return datalist

def get_records(link):
    r = requests.get(link)
    soup_obj = BeautifulSoup(r.text,"lxml")
    try:
        item_one = get_mutual_fund(soup_obj)
    except AttributeError:
        item_one = ""
    if item_one:
        writer.writerows(item_one)
    else:
        f.close()
        os.remove('mutual_fund.csv')

if __name__ == '__main__':
    with open("mutual_fund.csv","w",newline="") as f:
        writer = csv.writer(f)
        get_records(url)
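A more defensive take on the same "write, then clean up" logic is to write to a temporary file and promote it to the final name only when data was actually found, so no half-written mutual_fund.csv is ever left behind. A stdlib-only sketch of that pattern (the scraping step is again stubbed out; `save_table` is an illustrative helper name, not part of the original script):

```python
import os
import csv
import tempfile

def save_table(rows, path):
    """Write rows to a temp file, then promote it only if rows exist."""
    if not rows:
        # Nothing scraped: make sure no stale file lingers.
        if os.path.exists(path):
            os.remove(path)
        return
    fd, tmp = tempfile.mkstemp(suffix=".csv", dir=".")
    with os.fdopen(fd, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    os.replace(tmp, path)  # atomic rename on the same filesystem

save_table([["Holder", "Shares"]], "mutual_fund.csv")  # creates the file
save_table([], "mutual_fund.csv")                      # removes it again
```

`os.replace` overwrites the destination even on Windows, which avoids the file-in-use problem the question ran into.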