How do I get the URL or file name of a file being downloaded?

Time: 2019-06-11 12:41:28

Tags: python selenium firefox

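After clicking a button, how can I find out the download URL (including the file name), or

the name of the file being downloaded (including its extension)? One problem is that some of the downloaded files have a .csv extension and some have none.

For example, I would like to rename them uniformly. (Please do not suggest going to the download directory, finding the file, and renaming it.)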

from selenium import webdriver
from selenium.webdriver.firefox.options import Options 
...
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'text/csv')
...
driver = webdriver.Firefox(profile, options=opts, executable_path=FIREFOX_GOCKO_DRIVER_PATH)

driver.get(url)
driver.find_element_by_id(Button).click()

print("The file being downloaded is... ", ??? )
print("File is being downloaded from...", ?url?)

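1 answer:

Answer 0 (score: 4)

Here is a simple way to get the name and URL of the most recently downloaded file. The snippet below is written for Chrome; a Firefox version is given further down.

Note: make sure the file download has completed before running the code below.

If you want the script to wait for the download to complete, see the getDownLoadedFileName method at the end of this answer.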

# open a new tab
driver.execute_script("window.open()")
# switch to new tab
driver.switch_to.window(driver.window_handles[-1])
# navigate to chrome downloads
driver.get('chrome://downloads')
# get the latest downloaded file name
fileName = driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content  #file-link').text")
# get the latest downloaded file url
sourceURL = driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content  #file-link').href")
# print the details
print(fileName)
print(sourceURL)
# close the downloads tab
driver.close()
# switch back to main window
driver.switch_to.window(driver.window_handles[0])

If you like, you can wrap this in a method and call it wherever you need it.
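
As a rough sketch of such a method: `get_latest_download` and `FILE_LINK_JS` are names made up for this example, the driver is passed in as an argument, and the selector is the one from the snippet above. That selector depends on the internal layout of chrome://downloads, so it may stop working in other Chrome versions.

```python
# Shadow-DOM path to the newest entry on chrome://downloads, copied from the
# snippet above. Assumption: the page layout matches the one the answer used.
FILE_LINK_JS = (
    "return document.querySelector('downloads-manager').shadowRoot"
    ".querySelector('#downloadsList downloads-item').shadowRoot"
    ".querySelector('div#content #file-link')"
)


def get_latest_download(driver):
    """Return (file_name, source_url) for the newest entry in Chrome's list."""
    original = driver.current_window_handle
    # Open a scratch tab so the page under test is left untouched.
    driver.execute_script("window.open()")
    driver.switch_to.window(driver.window_handles[-1])
    try:
        driver.get("chrome://downloads")
        file_name = driver.execute_script(FILE_LINK_JS + ".text")
        source_url = driver.execute_script(FILE_LINK_JS + ".href")
    finally:
        # Always close the scratch tab and return to the original window.
        driver.close()
        driver.switch_to.window(original)
    return file_name, source_url
```

Passing the driver in, rather than relying on a global, keeps the function usable from any script; the `finally` block makes sure the extra tab is closed even if the lookup fails.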

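Edit: waiting until the download completes

You can rely on Chrome's download status; see the method below.

Just call the following method in your code when you need the file name.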

import time

def getDownLoadedFileName(waitTime):
    # assumes the current tab is already showing chrome://downloads
    endTime = time.time()+waitTime
    while True:
        try:
            # read the file name of the newest entry in the downloads list
            fileName = driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content  #file-link').text")
            if fileName:
                return fileName
        except Exception:
            # the entry is not available yet; try again on the next pass
            pass
        time.sleep(1)
        if time.time() > endTime:
            # timed out: leave the loop, so the method returns None
            break

You can call this method as shown below.

# wait until the download completes and get the file name
fileName = getDownLoadedFileName(180)
print(fileName)
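
One thing to keep in mind: when waitTime runs out, the method leaves the loop and returns None, so the caller cannot tell a timeout apart from a missing name. A minimal sketch of an alternative that raises instead; `poll_until` and its parameters are hypothetical names, not part of Selenium.

```python
import time


def poll_until(probe, timeout, interval=1.0):
    """Call probe() until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = probe()
        except Exception:
            # Treat a failing probe as "not ready yet", like the loop above.
            result = None
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no result after {timeout} seconds")
        time.sleep(interval)
```

With a Selenium driver, the probe could be a lambda wrapping the same `driver.execute_script(...)` call used in the method above, for example `poll_until(lambda: driver.execute_script(script), 180)`, where `script` holds that JavaScript string.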

Firefox: use the following method for Firefox.

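import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def getDownLoadedFileName(waitTime):
    # remember the open windows so the wait below can detect the new one
    handlesBefore = driver.window_handles
    driver.execute_script("window.open()")
    WebDriverWait(driver,10).until(EC.new_window_is_opened(handlesBefore))
    driver.switch_to.window(driver.window_handles[-1])
    driver.get("about:downloads")

    endTime = time.time()+waitTime
    while True:
        try:
            # read the name shown for the first entry in the downloads view
            fileName = driver.execute_script("return document.querySelector('#contentAreaDownloadsView .downloadMainArea .downloadContainer description:nth-of-type(1)').value")
            if fileName:
                return fileName
        except Exception:
            # the entry is not available yet; try again on the next pass
            pass
        time.sleep(1)
        if time.time() > endTime:
            # timed out: leave the loop, so the method returns None
            break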
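
Once the file name is known, the uniform naming the question asks about comes down to checking for the extension. A small sketch; `with_extension` is a hypothetical helper, and it only computes the target name without touching the file on disk.

```python
def with_extension(file_name, extension=".csv"):
    """Return file_name, appending `extension` only when it is missing."""
    # Compare case-insensitively so "REPORT.CSV" is left unchanged.
    if file_name.lower().endswith(extension.lower()):
        return file_name
    return file_name + extension
```

For instance, `with_extension("report")` gives `"report.csv"`, while a name that already ends in `.csv` is returned as-is.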

Java + Chrome: in case you are looking for a Java implementation.

Here is the method in Java.

public String waitUntilDonwloadCompleted(WebDriver driver) throws InterruptedException {
      // Store the current window handle
      String mainWindow = driver.getWindowHandle();

      // open a new tab
      JavascriptExecutor js = (JavascriptExecutor)driver;
      js.executeScript("window.open()");
      // switch to the newly opened tab
      for(String winHandle : driver.getWindowHandles()){
          driver.switchTo().window(winHandle);
      }
     // navigate to chrome downloads
      driver.get("chrome://downloads");

      JavascriptExecutor js1 = (JavascriptExecutor)driver;
      // wait until the file is downloaded
      Long percentage = (long) 0;
      while ( percentage!= 100) {
          try {
              percentage = (Long) js1.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('#progress').value");
              //System.out.println(percentage);
          }catch (Exception e) {
            // Nothing to do just wait
        }
          Thread.sleep(1000);
      }
     // get the latest downloaded file name
      String fileName = (String) js1.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content #file-link').text");
     // get the latest downloaded file url
      String sourceURL = (String) js1.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content #file-link').href");
      // file downloaded location
      String donwloadedAt = (String) js1.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div.is-active.focus-row-active #file-icon-wrapper img').src");
      // print the details
      System.out.println("Download details");
      System.out.println("File Name :-" + fileName);
      System.out.println("Downloaded path :- " + donwloadedAt);
      System.out.println("Downloaded from url :- " + sourceURL);
      // close the downloads tab
      driver.close();
     // switch back to main window
      driver.switchTo().window(mainWindow);
      return fileName;
  }

Here is how to call it from your Java code.

// download triggering step 
downloadExe.click();
// now wait until the download finishes, then get the file name
System.out.println(waitUntilDonwloadCompleted(driver));