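Thanks in advance for your help.
I am writing some code to go through multiple PDF files in different folders and look for specific words. My Python knowledge is basic at best, as I am only learning it for my bachelor's dissertation.
The code works fine when I run it inside a single folder, but I am now trying to make it run automatically for every subfolder of a given folder.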
import PyPDF2
import os

rootdir = r"C:\Users\Tim Knickmann\Documents\LUBS\(3300) Dissertation\Data\Python Scripts for Earnigns Calls\Germany Transcripts"
extensions = ('.pdf')
pronoun_file = r"C:\Users\Tim Knickmann\Documents\LUBS\(3300) Dissertation\Data\Python Scripts for Earnigns Calls\pronoun_use.txt"
first_person_pronoun_file = r"C:\Users\Tim Knickmann\Documents\LUBS\(3300) Dissertation\Data\Python Scripts for Earnigns Calls\first_per_pronoun_use.txt"

def average_use(lst):
    return sum(lst) / float(len(lst))

# running it for every file
for subdirs_1, dirs_1, files_1 in os.walk(rootdir):
    for subdirs_1 in dirs_1:
        working_folder_directory = os.path.join(rootdir, subdirs_1)
        # reading in file into a separate text document
        for subdirs_2, dirs_2, files_2 in os.walk(working_folder_directory):
            list_first_person_usage = []
            pdfFileObj = open(subdirs_2, 'rb')
            pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
            with open('working_doc.txt', 'w', encoding="utf-8") as f:
                for i in range(0, pdfReader.numPages):
                    pageObj = pdfReader.getPage(i)
                    f.write(pageObj.extractText())
Whenever I run the code, it returns the following error log:
runfile('C:/Users/Tim Knickmann/Documents/LUBS/(3300) Dissertation/Data/Python Scripts for Earnigns Calls/Germany Transcripts/190319 v10 Script for Earnings Calls.py', wdir='C:/Users/Tim Knickmann/Documents/LUBS/(3300) Dissertation/Data/Python Scripts for Earnigns Calls/Germany Transcripts')
Traceback (most recent call last):
File "<ipython-input-66-a9a93e480b59>", line 1, in <module>
runfile('C:/Users/Tim Knickmann/Documents/LUBS/(3300) Dissertation/Data/Python Scripts for Earnigns Calls/Germany Transcripts/190319 v10 Script for Earnings Calls.py', wdir='C:/Users/Tim Knickmann/Documents/LUBS/(3300) Dissertation/Data/Python Scripts for Earnigns Calls/Germany Transcripts')
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 704, in runfile
execfile(filename, namespace)
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/Tim Knickmann/Documents/LUBS/(3300) Dissertation/Data/Python Scripts for Earnigns Calls/Germany Transcripts/190319 v10 Script for Earnings Calls.py", line 24, in <module>
pdfFileObj = open(subdirs_2, 'rb')
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\Tim Knickmann\\Documents\\LUBS\\(3300) Dissertation\\Data\\Python Scripts for Earnigns Calls\\Germany Transcripts\\Deutsche Wohnen'
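I have gone through everything I could find, but nothing seems to apply to this situation.
I am fairly sure that I am trying to open a file that is already open, but I cannot find another way to do it.
Any help is much appreciated. Thanks again.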
Answer 0 (score: 1)
As the error indicates, on the line:
pdfFileObj = open(orginial_file_directory, 'rb')
the value of orginial_file_directory is
C:\\Users\\Tim Knickmann\\Documents\\LUBS\\(3300) Dissertation\\Data\\Python Scripts for Earnigns Calls\\Germany Transcripts
This makes sense, because you set it with:
orginial_file_directory = os.path.dirname(os.path.realpath(file))
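As the variable name suggests, you already know that this is a directory, and a directory cannot be opened as a file. The same applies to subdirs_2 in the code shown in the question: the first element of each tuple yielded by os.walk is a directory path, not a file path.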
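To make the failure concrete, here is a minimal sketch of what happens when open() is given a directory path. The helper name open_error_name is hypothetical and only for illustration. The exact exception depends on the platform: Windows reports PermissionError (as in the traceback above), while Linux and macOS typically report IsADirectoryError. Both are subclasses of OSError.

```python
def open_error_name(path):
    """Try to open `path` for binary reading.

    Returns None if the open succeeds, otherwise the name of the
    OSError subclass that was raised (hypothetical helper, for
    illustration only).
    """
    try:
        with open(path, "rb"):
            return None
    except OSError as exc:
        # PermissionError on Windows, IsADirectoryError on POSIX systems
        return type(exc).__name__
```

Calling it on a folder such as the "Deutsche Wohnen" directory from the traceback would return an error name, whereas calling it on a readable PDF inside that folder would return None.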
I think what you want to do is:
pdfFileObj = open(file, 'rb')
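For the subfolder case in the question, one possible approach is a single os.walk pass that joins each directory path with each file name, so that open() always receives a full file path. This is only a sketch under that assumption, not the asker's final script; the function name find_pdfs and its parameters are hypothetical.

```python
import os


def find_pdfs(rootdir, extensions=(".pdf",)):
    """Yield the full path of every matching file under rootdir.

    os.walk already descends into all subfolders, so no nested
    walk is needed. `extensions` is a tuple of lowercase suffixes.
    """
    for dirpath, dirnames, filenames in os.walk(rootdir):
        for name in filenames:
            # Compare in lowercase so "REPORT.PDF" is matched as well
            if name.lower().endswith(extensions):
                # dirpath is a directory; joining it with the file
                # name gives a path that open() can actually read
                yield os.path.join(dirpath, name)
```

Each yielded path could then be passed to open(path, 'rb') and on to the PDF reader, in place of the directory path used in the question.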