Verify that a file has downloaded before continuing the script

Date: 2016-03-10 17:50:53

Tags: python downloading

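I need to make sure a file has been downloaded before my script continues. I have done some research on the exists() function, but I could not find an example that does what I actually want.

I am trying to download several files, all of which go to one fixed destination folder. I need to make sure each file has been downloaded successfully into that folder before the script continues. Can someone help me with what that code would look like?

Here is what I am working with: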

import time
import os
import glob
import os.path
from selenium.webdriver.common.action_chains import ActionChains
from selenium.common.exceptions import MoveTargetOutOfBoundsException
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import NoAlertPresentException

for x in range(1, 100):      
    while True:    
        try:
            fp = webdriver.FirefoxProfile('C:/Users/me/Documents/FirefoxProfile')
            browser = webdriver.Firefox(fp)
            browser.get('https://reportcenter.com')

            time.sleep(8)

            browser.find_element_by_id("ctl00_PlaceHolderMain_login_UserName").clear()
            browser.find_element_by_id("ctl00_PlaceHolderMain_login_UserName").send_keys("usr")
            browser.find_element_by_id("ctl00_PlaceHolderMain_login_password").clear()
            browser.find_element_by_id("ctl00_PlaceHolderMain_login_password").send_keys("pwd")
            browser.find_element_by_id("ctl00_PlaceHolderMain_login_login").click()

#gets user to reporting front end

            ReportMgr= browser.find_element_by_partial_link_text('Report Manager')
            ReportMgr.click()

            time.sleep(5)

            CustomReport= browser.find_element_by_partial_link_text('Custom Report')
            CustomReport.click()

            time.sleep(5)

            ProgramManagement= browser.find_element_by_partial_link_text('Program Management')
            ProgramManagement.click()
            ProgramManagement= browser.find_element_by_partial_link_text('Program Management').send_keys(Keys.ARROW_LEFT)

#pulls reports

            browser.find_element_by_partial_link_text('Program Management').click()
            time.sleep(60)
            browser.find_element_by_partial_link_text('Program Management').send_keys(Keys.ARROW_DOWN * x, Keys.ENTER, Keys.ENTER)
            time.sleep(180)
            browser.find_element_by_css_selector("#ctl00_PlaceHolderMain_ReportViewer1_HtmlOutputReportResults2_CSVButton_ImageAnchor > img").click()
            time.sleep(180)
            ##THIS IS WHERE I NEED TO VERIFY THAT THE REPORT HAS DOWNLOADED BEFORE I CAN CONTINUE
            browser.find_element_by_partial_link_text('Program Management').click()
            time.sleep(60)
            browser.quit()

        except:
               browser.quit()           
               continue
        else:
               break

1 answer:

Answer 0 (score: 1)

One approach is to wait for a new file to appear in the target folder.

Usage example:

# take a snapshot of the folder
waiter = FileWaiter(r"C:\Downloads\*.pdf")

# trigger the download
browser.find_element_by_css_selector("...").click()    

# wait for a new file or timeout after 10 seconds
new_file = waiter.wait_new_file(10)

# display the new file
print(new_file)

The waiter class:

import os, time, glob

class FileWaiter:

  def __init__(self, path):
    self.path = path
    self.files = set(glob.glob(path))

  def wait_new_file(self, timeout):
    """
    Waits for a new file to be created and returns the new file path.
    """
    endtime = time.time() + timeout
    while True:
      diff_files = set(glob.glob(self.path)) - self.files
      if diff_files :
        new_file = diff_files.pop()
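        try:
          # Renaming the file onto itself typically fails on Windows while
          # another process still holds it open, so success suggests the
          # download has finished.
          os.rename(new_file, new_file)
          self.files = set(glob.glob(self.path))
          return new_file
        except OSError:
          pass  # file is still locked; try again on the next poll
      if time.time() > endtime:
        raise Exception("Timeout while waiting for a new file in %s" % self.path)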
      time.sleep(0.1)
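
The rename check above relies on Windows file locking. On other platforms a rename onto the same name usually succeeds even while the file is still being written. As a rough alternative, here is a minimal Python 3 sketch of the same snapshot-and-diff idea that instead waits until the new file's size stops changing between two polls. The names `snapshot` and `wait_for_new_file` are hypothetical helpers, not part of the answer above, and the approach assumes the browser writes the download under its final name (a temporary name such as Firefox's `.part` files will not match a pattern like `*.csv`).

```python
import glob
import os
import time


def snapshot(pattern):
    """Return the set of paths that currently match the glob pattern."""
    return set(glob.glob(pattern))


def wait_for_new_file(pattern, before, timeout=30.0, poll=0.2):
    """Wait for a file that matches `pattern` and is not in `before`.

    The file is returned once it is non-empty and its size is identical on
    two consecutive polls (a heuristic, not a guarantee, that writing is done).
    Raises TimeoutError if nothing qualifies within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    last_sizes = {}  # path -> size seen on the previous poll
    while True:
        for path in sorted(snapshot(pattern) - before):
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is not readable yet; retry later
            if size > 0 and last_sizes.get(path) == size:
                return path
            last_sizes[path] = size
        if time.monotonic() > deadline:
            raise TimeoutError("No new file matching %s" % pattern)
        time.sleep(poll)
```

In a script like the one in the question, `before = snapshot(r"C:\Downloads\*.csv")` would be taken just before clicking the CSV button, and `wait_for_new_file(r"C:\Downloads\*.csv", before, timeout=180)` would replace the fixed `time.sleep(180)`. The folder and timeout here are placeholders. A slow download that stalls for longer than one poll interval could be reported too early, so a larger `poll` value is the safer choice.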