Unzipping a file whose name changes, using a variable

Time: 2018-08-17 19:53:12

Tags: python-3.x python-webbrowser

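Here is the problem. I am writing a program that pulls data from a website every 5 minutes. The program uses bs4 to parse the site and get a URL, then passes the URL to the web browser. All of that works.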

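The files are zipped, so each time the program runs (every 5 minutes) I also want to unzip the file, move it out of the folder it was downloaded to, and put it in a new folder. That worked until I made some changes in a different section. Now it doesn't work, and I think the problem is between lines 32 and 40.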
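As a point of comparison for the unzip step described above, here is a minimal sketch that extracts one archive into a target folder with the standard `zipfile` module. The helper name `unzip_to` is made up for illustration and is not part of the program in the question.

```python
import zipfile
from pathlib import Path


def unzip_to(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names.

    Hypothetical helper for illustration; not taken from the original program.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if it is missing
    # The context manager closes the archive even if extraction raises.
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

It could be called with the full path of the downloaded archive and the `Unzipped files` folder, which avoids the need for `os.chdir`.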

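On line 32 I take the title from the ercot page (line 16) and use .text to get the title, which is the name each downloaded file is saved under on every run. Line 33 takes that text and puts quotes around it, and the result is used on line 34. The problem is that the title of the file to unzip is different on every 5-minute run, so I use the tt variable to pass the file name to unzip.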
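One thing worth checking here: quote characters added to a string become part of the file name that `open()` looks for. The sketch below builds the path from the scraped text without adding quotes. It assumes the cell text really is the name the browser saves the file under, which the question states but which is not verified here. The function names are hypothetical.

```python
from pathlib import Path


def zip_path_from_title(folder, title_text):
    """Join the download folder with the scraped title, trimmed of stray whitespace."""
    # No quote characters are added: open() and zipfile.ZipFile take the name
    # itself, not something shaped like a Python string literal.
    return Path(folder) / title_text.strip()


def quoted(name):
    """Reproduce the question's approach, for comparison only."""
    return "'" + name + "'"
```

With a file `report_csv.zip` in the folder, `zip_path_from_title(folder, 'report_csv.zip')` points at it, while `quoted('report_csv.zip')` names a different file whose name starts and ends with a `'` character.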

Any help would be greatly appreciated.

from urllib.request import urlopen as u_req
from bs4 import BeautifulSoup as soup
from datetime import datetime
import webbrowser, os, time, bs4, schedule, openpyxl, zipfile, csv
my_url = 'http://mis.ercot.com/misapp/GetReports.do?reportTypeId=11485&reportTitle=LMPs%20by%20Electrical%20Bus&showHTMLView=&mimicKey/'
snooze = time.sleep(30)
batch_time = 0

def job():
    #opening up connection, grabbing the page
    uClient = u_req(my_url)
    page_soup = soup(uClient, "html.parser")

    #csv 5 minute data title, variable name is clean_csv_title
    title = page_soup.findAll('tr')[3]
    titles = (title.findAll('td')[0])
    clean_csv_title = titles.text[-23:-15]
    batch_time = titles.text[-14:-10]
    #print(clean_csv_title)

    #variable that contains the link for the first 5 minute data
    first_csv = (page_soup.findAll('a')[0])
    csv_str = str(first_csv).strip('<a href="/misdownload/servlets/mirDownload?mimic_duns=&amp;doclookupId=')
    csv_str_2 = csv_str.strip('">zip</a>')
    complete_link = "http://mis.ercot.com/misdownload/servlets/mirDownload?mimic_duns=&doclookupId=" + csv_str_2

    #opening link, timeout 30 seconds
    webbrowser.open(complete_link, new=0, autoraise=True)    
    snooze

    #take previously downloaded file, unzip, and put in holding folder called unzipped files
    os.chdir('C:\\Users\\Main\\Desktop\\ERCOT_Data\\Incoming ercot files')
    t = titles.text
    tt = str("'" + t + "'")
    unzipped = open(tt, 'rb')
    z = zipfile.ZipFile(unzipped)
    for name in z.namelist():
        outpath = 'C:\\Users\\Main\\Desktop\\ERCOT_Data\\Unzipped files'
        z.extract(name, outpath)
    unzipped.close()



    uClient.close()

schedule.every().day.at("00:01").do(job)
schedule.every().day.at("00:06").do(job)
schedule.every().day.at("00:11").do(job)
schedule.every().day.at("00:16").do(job)
#......n

while True:
    schedule.run_pending()
    time.sleep(1)
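A side note on how the code above builds the download link: `str.strip` removes any of the given characters from both ends of a string, not a literal prefix or suffix, so it can only be relied on while the ID happens not to begin or end with one of those characters. A possible alternative, sketched here with the standard `urllib.parse` module, is to read the link's `href` value (with BeautifulSoup, `first_csv['href']`) and parse it. The ID `123456789` in the usage note is a made-up placeholder and the function names are hypothetical.

```python
from urllib.parse import parse_qs, urljoin, urlparse

BASE_URL = "http://mis.ercot.com/"  # site root taken from the URLs in the question


def absolute_link(href):
    """Turn a site-relative href into a full download URL."""
    return urljoin(BASE_URL, href)


def doc_lookup_id(href):
    """Return the doclookupId query parameter from the href."""
    query = parse_qs(urlparse(href).query)
    return query["doclookupId"][0]
```

For example, `doc_lookup_id('/misdownload/servlets/mirDownload?mimic_duns=&doclookupId=123456789')` returns `'123456789'`, and `absolute_link` on the same string gives the full `http://mis.ercot.com/misdownload/...` URL.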

This is the error it throws:

Traceback (most recent call last):
  File "C:\Users\Main\Desktop\ERCOT_Data\total.py", line 358, in <module>
    schedule.run_pending()
  File "C:\Users\Main\AppData\Local\Programs\Python\Python37\lib\site-packages\schedule\__init__.py", line 493, in run_pending
    default_scheduler.run_pending()
  File "C:\Users\Main\AppData\Local\Programs\Python\Python37\lib\site-packages\schedule\__init__.py", line 78, in run_pending
    self._run_job(job)
  File "C:\Users\Main\AppData\Local\Programs\Python\Python37\lib\site-packages\schedule\__init__.py", line 131, in _run_job
    ret = job.run()
  File "C:\Users\Main\AppData\Local\Programs\Python\Python37\lib\site-packages\schedule\__init__.py", line 411, in run
    ret = self.job_func()
  File "C:\Users\Main\Desktop\ERCOT_Data\total.py", line 35, in job
    unzipped = open(tt, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: "'cdr.00011485.0000000000000000.20180817.144517347.LMPSELECTBUSNP6787_20180817_144513_csv.zip'"

When I compare the file name with the file in the Incoming ercot files folder, they are exactly the same.
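It may help to look closely at how the error prints the name. Python shows the file name in a `FileNotFoundError` using `repr`, so a name printed as `"'...zip'"` is a string that itself begins and ends with a single quote. The sketch below, with a hypothetical helper name, captures the name exactly as Python received it so it can be compared character by character with the entries returned by `os.listdir`.

```python
def describe_missing(path):
    """Try to open path; return (name as Python saw it, error text), or None if it opens."""
    try:
        with open(path, "rb"):
            return None
    except FileNotFoundError as err:
        # err.filename is the exact string passed to open(), including any
        # quote characters that were concatenated onto it.
        return err.filename, str(err)
```

If the returned name differs from the directory entry only by leading and trailing `'` characters, the quoting step is a likely cause of the mismatch.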

0 answers:

No answers