I am getting the following error:
Traceback (most recent call last):
  File "test.gyp", line 37, in <module>
    for x in url_list["pdf"]:
KeyError: 'pdf'
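The code ran fine before. I have not changed it, apart from temporarily moving the actual .gyp file to a different directory. Any clue as to why this has suddenly become a problem?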
#!/usr/bin/env python3
import os
import glob
import pdfx
import wget
import urllib.parse
import requests

## Accessing and Creating Six Digit File Code
pdf_dir = "./"
pdf_files = glob.glob("%s/*.pdf" % pdf_dir)
for file in pdf_files:
    ## Identify File Name and Limit to Digits
    filename = os.path.basename(file)
    newname = filename[0:6]
    ## Run PDFX to identify and download links
    pdf = pdfx.PDFx(filename)
    url_list = pdf.get_references_as_dict()
    attachment_counter = (1)
    for x in url_list["url"]:
        if x[0:4] == "http":
            parsed_url = urllib.parse.quote(x)
            extension = os.path.splitext(x)[1]
            r = requests.get(x)
            with open('temporary', 'wb') as f:
                f.write(r.content)
            ##Concatenate File Name Once Downloaded
            os.rename('./temporary', str(newname) + '_attach' + str(attachment_counter) + str(extension))
            ##Increase Attachment Count
            attachment_counter += 1
    for x in url_list["pdf"]:
        if x[0:4] == "http":
            parsed_url = urllib.parse.quote(x)
            extension = os.path.splitext(x)[1]
            r = requests.get(x)
            with open('temporary', 'wb') as f:
                f.write(r.content)
            ##Concatenate File Name Once Downloaded
            os.rename('./temporary', str(newname) + '_attach' + str(attachment_counter) + str(extension))
            ##Increase Attachment Count
            attachment_counter += 1
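Here is a small snippet of what I get when I print the whole url_list. You can see that items are being added to the dictionary (redacted here for privacy) under the "pdf" key, so I am at a loss as to why it ends up giving me this error.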
'pdf': ['URLSHOWSHERE.pdf']}
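Answer 0 (score: 1)
This error occurs because the dictionary url_list has no key named "pdf". At the very least, print the dictionary explicitly to check what it actually contains.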
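One possible explanation, offered as an assumption rather than a confirmed diagnosis: url_list is rebuilt for every PDF in the loop, so the dictionary printed for one file may contain "pdf" while the dictionary for another file in the new directory does not. The sketch below shows one way to tolerate a missing key with dict.get. The helper name iter_http_links and the two sample dictionaries are hypothetical and stand in for what get_references_as_dict() might return; pdfx itself is not called here.

```python
def iter_http_links(url_list, keys=("url", "pdf")):
    """Yield every http(s) link stored under the given keys.

    Keys that are absent from the dictionary are skipped instead of
    raising KeyError.
    """
    for key in keys:
        # dict.get returns the default (an empty list) when the key is missing
        for link in url_list.get(key, []):
            if link.startswith("http"):
                yield link


# Hypothetical reference dictionaries for two different PDF files
refs_with_pdf = {
    "url": ["http://example.com/page"],
    "pdf": ["http://example.com/a.pdf"],
}
refs_without_pdf = {"url": ["http://example.com/other"]}

for name, refs in [("first.pdf", refs_with_pdf), ("second.pdf", refs_without_pdf)]:
    # Printing the keys per file shows which file lacks the "pdf" entry
    print(name, sorted(refs.keys()), list(iter_http_links(refs)))
```

In the original script, the two download loops could iterate over a helper like this one instead of indexing url_list["url"] and url_list["pdf"] directly, and printing the keys next to each file name would show which PDF triggers the KeyError.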