How do I reset the ExcelWriter to empty after each iteration?

Time: 2019-06-08 10:10:33

Tags: python pandas pandas.excelwriter

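I am using the pandas library to write data to a separate Excel file on each of several iterations. However, when an iteration finishes, the second Excel file contains both the data from the previous iteration and the data from the second iteration, and so on for every later file.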

My code:

from selenium import webdriver
import pandas as pd
import time
import bs4
import traceback
import requests


codes = []
f = open("codes.txt", "r")
for t in f:
   try:
      print(t)
      codes.append(t)
   except:
      break
profile = pd.DataFrame(columns=['Parcel','Year','Type','Name', 'Total Amt','Redeem Data'])

browser = webdriver.Chrome()

inc = 0
while inc<len(codes):
    code_list=codes[inc]
    inc = inc + 1
    myUrl = "#"
    browser.get(myUrl)
    time.sleep(3)

    txtField = browser.find_element_by_xpath("//input[@name='fparcel']")
    txtField.send_keys(code_list)


    try:
       srhBtn = browser.find_element_by_xpath("//input[@name='submit3']")
       srhBtn.click()
       time.sleep(5)
    except:
       print('Button error')
    try:
        soup = bs4.BeautifulSoup(browser.page_source, 'html.parser')
    except:
        break
    cols = []

    # Locate the results table and skip its header row
    page = soup.find('table', class_='tablstyle')
    rows = page.find_all('tr')
    del rows[0]
    print("-" * 109)
    for col in rows:
        cols = col.find_all('td')

        # Read each cell; fall back to "N/A" when a cell is missing
        try:
            pid = cols[0].text
            print("id" + pid)
        except:
            pid = "N/A"
            print("id not found")
        try:
            pyear = cols[1].text
            print("year" + pyear)
        except:
            pyear = "N/A"
            print("Year not found")
        try:
            ptype = cols[2].text
            print("type" + ptype)
        except:
            ptype = "N/A"
            print("Type not found")
        try:
            pname = cols[3].text
            print("name" + pname)
        except:
            pname = "N/A"
            print("Name not found")
        try:
            pamount = cols[4].text
            print("amount" + pamount)
        except:
            pamount = "N/A"
            print("Amount not found")
        try:
            pdate = cols[5].text
            print("date" + pdate)
        except:
            pdate = "N/A"
            print("Date not found")
        ser = pd.Series([pid, pyear, ptype, pname, pamount, pdate],
                        index=['Parcel', 'Year', 'Type', 'Name', 'Total Amt', 'Redeem Data'])
        profile = profile.append(ser, ignore_index=True)
    # Write the collected rows to one Excel file named after the current code
    code_name = str(code_list.replace("\n", ''))
    filename = code_name + '.xlsx'
    writer = pd.ExcelWriter(filename, engine='xlsxwriter')
    profile.to_excel(writer, index=False)
    writer.save()
    print('done')

Can anyone tell me the best way to reset this row to null after each iteration?

ser = pd.Series([pid,pyear,ptype,pname,pamount,pdate],
                index=['Parcel','Year','Type','Name', 'Total Amt','Redeem Data'])
profile = profile.append(ser, ignore_index=True)
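
One possible reading of the problem, offered as a sketch rather than a confirmed fix: the `Series` and the `ExcelWriter` are already recreated on every pass, but `profile` is created only once, before the `while` loop, so it keeps every row appended for earlier codes. Creating the DataFrame inside the loop (or building a fresh one per code) would keep each file limited to its own rows. The example below illustrates the idea without a browser. The helper names `build_profile` and `write_profile` and the `scraped` sample data are hypothetical and stand in for the Selenium and BeautifulSoup step. Rows are collected in a list and turned into a DataFrame once, because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

COLUMNS = ["Parcel", "Year", "Type", "Name", "Total Amt", "Redeem Data"]


def build_profile(rows):
    """Build a fresh DataFrame from the rows scraped for ONE parcel code."""
    # A new list and a new DataFrame are created on every call,
    # so nothing carries over from a previously processed code.
    records = [dict(zip(COLUMNS, row)) for row in rows]
    return pd.DataFrame(records, columns=COLUMNS)


def write_profile(profile, code):
    """Write one profile to <code>.xlsx (assumes xlsxwriter is installed)."""
    filename = code.strip() + ".xlsx"
    # The context manager saves and closes the file on exit.
    with pd.ExcelWriter(filename, engine="xlsxwriter") as writer:
        profile.to_excel(writer, index=False)
    return filename


# Hypothetical scraped data; keys mimic lines read from codes.txt.
scraped = {
    "A1\n": [("A1", "2018", "Tax", "Smith", "100.00", "2019-01-01")],
    "B2\n": [("B2", "2017", "Tax", "Jones", "250.00", "2019-02-01"),
             ("B2", "2018", "Tax", "Jones", "300.00", "2019-03-01")],
}

# One independent DataFrame per code; the second holds only its own rows.
profiles = {code.strip(): build_profile(rows) for code, rows in scraped.items()}
```

In the original loop, the equivalent minimal change would be to move the `profile = pd.DataFrame(columns=[...])` line to the top of the `while` body, and then call something like `write_profile(profile, code_list)` once per code.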

0 answers:

No answers yet