I have a script that scrapes a web page and writes the data to Excel. However, there is more than one page. The code works, but it overwrites the data from the previously scraped pages. Any help?
for i in range(1, 4):
    url = "https:..."
    response = requests.get(url)
    soup = BeautifulSoup(response.text)
    data = soup.find_all("td", {"class"})
    results = []
    for item in data:
        results.append(item.text)
    writer = pd.ExcelWriter("test.xlsx", engine='xlsxwriter')
    df = pd.DataFrame(np.array(results).reshape(20, 7), columns=list("abcdefg"))
    df.to_excel(writer, sheet_name='Sheet1')
    writer.save()
Thanks in advance :)
Answer 0 (score: 1)
If you scrape all the pages during the same script execution, you can try it this way: declare the writer before the loop, along with a counter to keep track of the starting row at which to append the next dataframe:
import requests
import numpy as np
import pandas as pd
from bs4 import BeautifulSoup

writer = pd.ExcelWriter("test.xlsx", engine='xlsxwriter')
count = 0  # row offset where the next dataframe starts
for i in range(1, 4):
    url = "https:..."
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    data = soup.find_all("td", {"class"})
    results = []
    for item in data:
        results.append(item.text)
    df = pd.DataFrame(np.array(results).reshape(20, 7), columns=list('abcdefg'))
    df.to_excel(writer, 'Sheet1', startrow=count)
    count += len(df) + 1  # advance by the rows just written, +1 for the header row
writer.save()  # out of the loop to save only once at the end
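If you would rather not repeat the header row for every page, here is a minimal alternative sketch: collect one dataframe per page and write them all in a single call with pd.concat, so no startrow bookkeeping is needed. It assumes every page has the same 20x7 table from your question; the "data-cell" class and the URL are placeholders you would replace with your real selector and address.

import numpy as np
import pandas as pd
import requests
from bs4 import BeautifulSoup

frames = []  # one DataFrame per scraped page
for i in range(1, 4):
    url = "https:..."  # placeholder URL, as in the question
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    # hypothetical class name; use the selector that matches your table cells
    cells = [td.text for td in soup.find_all("td", class_="data-cell")]
    frames.append(pd.DataFrame(np.array(cells).reshape(20, 7), columns=list("abcdefg")))

# One concatenated DataFrame -> one header row, written once at the end.
pd.concat(frames, ignore_index=True).to_excel("test.xlsx", sheet_name="Sheet1", index=False)

Also note that in newer pandas releases ExcelWriter.save() is deprecated; calling writer.close(), or using the writer as a context manager, performs the final write instead.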