Python: walking a directory and opening a txt file

Time: 2016-12-20 23:06:27

Tags: python list file os.walk

I am trying to open and process thousands of text files that I downloaded from Wikipedia using SPARQL queries. I am using the following code:

list_words=[]
for roots, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".txt"):
           with open(file, 'r') as f:
                content= f.read()

                #remove the punct
                table=string.maketrans(string.punctuation,' '*len(string.punctuation)) 
                s= content.translate(table)


                #remove the stopwords
                text= ' '.join([word for word in s.split() if word not in stopwords])
                alfa= " ".join(text.split())

                #remove the verbs
                for word, pos in tag(alfa): # find all the verbs
                    if pos != "VB": 
                        lower= word.lower()
                        lower_2= unicode(lower, 'utf-8', errors='ignore')
                        list_words.append(lower_2)

                #remove numbers 
                testo_2 = [item for item in list_words if not item.isdigit()]

print set(list_words)           

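The problem is that the script opens some of the text files, but for others it gives me the error: "No such file or directory: blablabla.txt"

Does anyone know why this happens and how I can deal with it?

Thanks!

1 answer:

Answer 0 (score: 2)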

file is relative: os.walk yields bare file names, so open(file) only succeeds for files that happen to be in the current working directory. You have to join the root and the file to get the absolute file name, like this:

absolute_filename = os.path.join(roots, file)
with open(absolute_filename, 'r') as f:
    .... rest of code

(The variable should be named root rather than roots.)
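
For reference, here is a minimal, self-contained sketch of the same idea. It assumes Python 3 (the question's code is Python 2) and UTF-8 text files; the helper names find_txt_files and read_all are made up for illustration and do not come from the question.

```python
import os


def find_txt_files(top):
    """Return the full path of every .txt file under top, sorted."""
    matches = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.endswith(".txt"):
                # name is only the file name; root is the directory that
                # contains it, so the two must be joined before opening.
                matches.append(os.path.join(root, name))
    return sorted(matches)


def read_all(top, encoding="utf-8"):
    """Map each .txt path under top to its decoded text."""
    texts = {}
    for full_path in find_txt_files(top):
        # errors="ignore" silently drops undecodable bytes (an assumption
        # made here to mirror the errors='ignore' used in the question).
        with open(full_path, "r", encoding=encoding, errors="ignore") as f:
            texts[full_path] = f.read()
    return texts
```

Calling read_all(path) with the same path that was passed to os.walk in the question should then open files in subdirectories as well. The returned paths are absolute only if path itself is absolute.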