Python: folder of long .txt files, search each .txt file for various substrings and append them to a list

Time: 2018-07-12 15:50:25

Tags: python list for-loop directory

So I have a directory of .txt files:

~~~
c:\users\Admin\Documents\Exm
~~~

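Each .txt file is long, and I need to find certain substrings ("Apple", "Pear", "Orange", etc.) contained in each file.

My understanding is that I need one for loop to iterate over the directory and another for loop to iterate over each .txt file.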

Goal: a list of the first substring found in each .txt file. So if "Apple" is found first in 1.txt and "Pear" is found first in 2.txt, the list would look like this:

~~~
['Apple', 'Pear', ..., 'n fruit']
~~~

This is what I have so far:

~~~

import glob
import os

names = os.listdir('c:\\users\Admin\Documents\Exm')
lst = []
lst_2 =[]

#Appends lst with all newly converted .txt Files (from .SIF files)
for name in names:
    if name.endswith(".txt"):
        lst.append(name)

#PROBLEM STARTS HERE - Intended to open each .txt file in directory 
for file in names:

    #Quick file verification (not each file is a .txt
    if file.endswith('.txt.'):

        #Opens a single .txt file in directory
        with open(os.path.join('c:\\users\Admin\Documents\Exm', file)) as f:

             #Iterates through each line in .txt file
             for line in f:

                 #Temporary String placeheader for further checking
                 content = f.readlines()

                 if content = 'Apple' :
                     lst_2.append('Apple')

                 elif content = 'Pear' :
                     lst_2.append('Pear')

                 elif content = 'Orange' :
                     lst_2.append('Orange')

                 else:
                     lst_2.append('Fruit not Found')

print(lst_2)

~~~

Thanks for your help.

1 Answer:

Answer 0: (score: 0)

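Hi, I split each line into space-separated words and handle them one at a time instead of comparing the whole line. The comparisons also use == rather than =, and the extension check looks for '.txt' rather than '.txt.'. The changes: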

~~~
import os

names = os.listdir(r'c:\users\Admin\Documents\Exm')
lst = []
lst_2 = []

# Collect the names of all newly converted .txt files (from .SIF files)
for name in names:
    if name.endswith('.txt'):
        lst.append(name)

# Open each .txt file in the directory
for file in names:
    # Quick file verification (not every file is a .txt)
    if file.endswith('.txt'):
        # Open a single .txt file in the directory
        with open(os.path.join(r'c:\users\Admin\Documents\Exm', file)) as f:
            # Iterate through each line in the .txt file
            for line in f:
                # Split the line into words; split() with no argument
                # also drops the trailing newline
                c = line.split()
                for content in c:  # for each word in the line
                    if content == 'Apple':
                        lst_2.append('Apple')
                    elif content == 'Pear':
                        lst_2.append('Pear')
                    elif content == 'Orange':
                        lst_2.append('Orange')
                    # else:
                    #     lst_2.append('Fruit Not Found')

print(lst_2)
~~~
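
Note that the loop above appends every matching word it meets, while the question asks for only the first match in each file. A minimal sketch of stopping at the first hit, assuming whole-word matches are what is wanted; the function name `first_word_match` and the sample lines are made up for illustration:

```python
def first_word_match(lines, targets=('Apple', 'Pear', 'Orange')):
    """Return the first word in `lines` equal to one of `targets`, else None."""
    for line in lines:
        for word in line.split():
            if word in targets:
                return word  # stop at the first hit
    return None


# Any iterable of lines works, including an open file object.
sample = ['nothing here\n', 'a Pear and an Apple\n', 'Orange\n']
print(first_word_match(sample))
```

Because the comparison is on whole words, a line such as "Apples" would not count as a match for "Apple".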

As for the code as a whole, why not use a single loop, like this:

~~~
import os

lst = []
lst_2 = []
for f_name in os.listdir('path_here'):
    if f_name.endswith('.txt'):
        lst.append(f_name)  # append the names
        with open(os.path.join('path_here', f_name)) as file:
            for line in file:
                pass  # do the word search by line
~~~
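
One possible way to fill in the placeholder in the single-loop version, offered as a sketch rather than a definitive implementation. It assumes substring matching with `in` (so "Apples" counts as "Apple"), reports the target that appears earliest in each file, and falls back to the question's 'Fruit not Found' string. The function names and the `TARGETS` tuple are hypothetical, and file names are sorted only to make the order of the result predictable:

```python
import os

TARGETS = ('Apple', 'Pear', 'Orange')  # substrings named in the question


def first_target_in_file(path, targets=TARGETS):
    """Return the target that occurs earliest in the file, or None."""
    with open(path) as handle:
        for line in handle:
            # (position, target) for every target present in this line
            hits = [(line.find(t), t) for t in targets if t in line]
            if hits:
                return min(hits)[1]  # smallest position wins
    return None


def first_targets_in_folder(folder, targets=TARGETS):
    """Collect the first target found in each .txt file of `folder`."""
    results = []
    for f_name in sorted(os.listdir(folder)):
        if f_name.endswith('.txt'):
            found = first_target_in_file(os.path.join(folder, f_name), targets)
            results.append(found if found is not None else 'Fruit not Found')
    return results
```

Calling `first_targets_in_folder(r'c:\users\Admin\Documents\Exm')` would then return one entry per .txt file. Reading line by line and returning at the first hit means a long file is not read past its first match.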