python-requests cannot download an image; the downloaded image is 0 bytes

Posted: 2018-09-28 06:20:52

Tags: python-3.x web-scraping python-requests

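I am trying to download a news e-paper (each page of the e-paper is an image). I use Selenium to log in and read each image's src, and the requests module to download the image.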

Here is the code I am using (the requests part):

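import requests
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# currentDT and download_path are defined elsewhere in the script.
def download(driver, pageNumber):
    page, filename = pageNumber, ""
    # Zero-pad single-digit page numbers, e.g. "..._kompas_01.jpg".
    if page in range(1, 10):
        filename = str(currentDT) + "_kompas_{}" + str(page) + ".jpg"
        filename = filename.format(0)
    else:
        filename = str(currentDT) + "_kompas_" + str(page) + ".jpg"
    print("Downloading Page " + str(pageNumber) + " ...")
    # Locate the medium-size image for this page and click it to open the preview.
    div = driver.find_element_by_xpath("//div[@class='page-wrapper' and @page='" + str(pageNumber) + "']")
    img = div.find_element_by_tag_name("img")
    imgsrc = img.get_attribute("src")
    imgsrc2 = imgsrc.replace("getmedium", "getpreview")
    img.click()
    # Wait until the preview image is visible.
    WebDriverWait(driver, 200).until(EC.visibility_of_element_located((By.XPATH, "//img[@src = '" + imgsrc2 + "']")))
    div2 = driver.find_element_by_xpath("//div[@class='page-wrapper' and @page='" + str(pageNumber) + "']")
    img2 = div2.find_element_by_tag_name("img")
    url = img2.get_attribute("src")
    url = url.replace("https", "http")
    print(url)
    # Re-read src; this overwrites the http variant printed above.
    url = img2.get_attribute("src")
    r = requests.get(url)
    if r.status_code == 200:
        # Saved under a fixed name; the filename built above is not used here.
        with open(download_path + "1.jpg", 'wb') as f:
            f.write(r.content)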

After running the code, the downloaded image is 0 bytes. When I inspect the response headers with print(r.headers), I get something like this:

  

{'Date': 'Fri, 28 Sep 2018 06:14:29 GMT', 'Content-Type': 'text/html; charset=UTF-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Set-Cookie': '__cfduid=d2770acf5454bb72630a1936eda1930561538115268; expires=Sat, 28-Sep-19 GMT; path=/; domain=.epaper.id; HttpOnly, ci_session=db77e070cbe346e0ac183d686efae9989e8f2096; path=/; HttpOnly', 'X-Powered-By': 'PHP/5.6.37', 'Expires': 'Thu, 19 Nov 1981 08:52:00 GMT', 'Cache-Control': 'no-store, no-cache, must-revalidate, post-check=0, pre-check=0', 'Pragma': 'no-cache', 'Expect-CT': 'max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"', 'Server': 'cloudflare', 'CF-RAY': '461411eeef1c31aa-SIN', 'Content-Encoding': 'gzip'}
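The Content-Type in these headers is text/html rather than an image type, which suggests the server returned a web page instead of the JPEG. As a minimal diagnostic sketch, a check like the following could be run on the response before anything is written to disk. The helper name `looks_like_image` is hypothetical and not part of requests; it only needs a header mapping (such as `r.headers`) and the body bytes (such as `r.content`).

```python
def looks_like_image(headers, body):
    """Return True if the headers and body plausibly describe an image.

    `headers` is any mapping with a "Content-Type" key (for example r.headers);
    `body` is the raw response bytes (for example r.content).
    """
    content_type = headers.get("Content-Type", "")
    # Drop parameters such as "; charset=UTF-8" and normalise case.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type.startswith("image/") and len(body) > 0


# A text/html response, like the one in the question, is rejected.
html_headers = {"Content-Type": "text/html; charset=UTF-8"}
print(looks_like_image(html_headers, b"<html>...</html>"))

# An image/jpeg response with a non-empty body is accepted.
jpeg_headers = {"Content-Type": "image/jpeg"}
print(looks_like_image(jpeg_headers, b"\xff\xd8\xff\xe0"))
```

In the download function this would sit next to the `r.status_code == 200` test, so that an HTML page is never saved with a .jpg extension.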

What should I do to fix this? Please help me...
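One possible explanation, which the post does not confirm, is that the image URL is only served to the logged-in browser session. The login happens in Selenium, but `requests.get(url)` is sent without the browser's cookies, and the response sets a new `ci_session` cookie, as a server might for an unauthenticated visitor. Under that assumption, a minimal sketch is to pass the browser's cookies along with the request. The helper name `cookies_for_requests` is hypothetical; `driver.get_cookies()` is the Selenium call that returns a list of cookie dictionaries with `name` and `value` keys, and the sample values below are placeholders.

```python
def cookies_for_requests(selenium_cookies):
    """Convert Selenium cookie dicts into a plain name -> value mapping.

    The result can be passed as the `cookies=` argument of requests.get().
    """
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


# Same shape as the list returned by driver.get_cookies(); placeholder values.
sample_cookies = [
    {"name": "ci_session", "value": "abc123", "domain": ".example.com", "path": "/"},
    {"name": "__cfduid", "value": "def456", "domain": ".example.com", "path": "/"},
]
print(cookies_for_requests(sample_cookies))

# In the real script (not executed here), the call would look like:
#   r = requests.get(url, cookies=cookies_for_requests(driver.get_cookies()))
```

If the server also checks other request details, such as the User-Agent or Referer header, copying the cookies alone may not be enough.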

0 answers:

No answers