Question

So I have a directory of .txt files:

~~~
c:\users\Admin\Documents\Exm
~~~

Each .txt file is very long, and I need to find certain substrings ("Apple", "Pear", "Orange", etc.) contained in each file.

My understanding is that I need one for loop to iterate over the directory and another for loop to iterate through each .txt file.

Goal: a list of the first substring found in each .txt file. So if "Apple" is found first in 1.txt and "Pear" is found first in 2.txt, the list would look like this:

~~~
['Apple', 'Pear', ..., 'n fruit']
~~~

Here is what I have so far:

~~~

import glob
import os

names = os.listdir('c:\\users\Admin\Documents\Exm')
lst = []
lst_2 =[]

#Appends lst with all newly converted .txt Files (from .SIF files)
for name in names:
    if name.endswith(".txt"):
        lst.append(name)

#PROBLEM STARTS HERE - Intended to open each .txt file in directory 
for file in names:

    #Quick file verification (not each file is a .txt
    if file.endswith('.txt.'):

        #Opens a single .txt file in directory
        with open(os.path.join('c:\\users\Admin\Documents\Exm', file)) as f:

             #Iterates through each line in .txt file
             for line in f:

                 #Temporary String placeheader for further checking
                 content = f.readlines()

                 if content = 'Apple' :
                     lst_2.append('Apple')

                 elif content = 'Pear' :
                     lst_2.append('Pear')

                 elif content = 'Orange' :
                     lst_2.append('Orange')

                 else:
                     lst_2.append('Fruit not Found')

print(lst_2)

~~~

Thank you for your help.

Answer 1

Hi, I split each line into words separated by spaces and check them one at a time instead of comparing the whole line. The difference:

~~~
import os

# Raw strings keep the backslashes in the Windows path literal
names = os.listdir(r'c:\users\Admin\Documents\Exm')
lst = []
lst_2 = []

# Appends lst with all newly converted .txt files (from .SIF files)
for name in names:
    if name.endswith(".txt"):
        lst.append(name)

# Open each .txt file in the directory
for file in names:
    # Quick file verification (not every file is a .txt)
    if file.endswith('.txt'):
        # Opens a single .txt file in the directory
        with open(os.path.join(r'c:\users\Admin\Documents\Exm', file)) as f:
            # Iterates through each line in the .txt file
            for line in f:
                # split() with no argument splits on any whitespace,
                # so the trailing newline does not stick to the last word
                c = line.split()
                for content in c:  # for each word in the line
                    if content == 'Apple':
                        lst_2.append('Apple')
                    elif content == 'Pear':
                        lst_2.append('Pear')
                    elif content == 'Orange':
                        lst_2.append('Orange')
                    # else:
                    #     lst_2.append('Fruit Not Found')

print(lst_2)
~~~
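One caveat with comparing whole words: the question asks about substrings, and a token such as `Apples,` or `Pear.` is not equal to `Apple` or `Pear`. A minimal sketch of the difference, using a made-up sample line (not taken from the asker's files) and the `in` operator plus `str.find` to see which target starts earliest:

```python
# Hypothetical sample line, only for illustration.
line = "Fresh Apples, then a Pear.\n"
targets = ["Apple", "Pear", "Orange"]

# Word-by-word equality misses "Apples," and "Pear." because of the extra characters.
word_hits = [word for word in line.split() if word in targets]

# Substring test: map each target that occurs in the line to its start position.
positions = {t: line.find(t) for t in targets if t in line}

# The target with the smallest start position is the first one found in this line.
first = min(positions, key=positions.get) if positions else None

print(word_hits)  # []
print(positions)  # {'Apple': 6, 'Pear': 21}
print(first)      # Apple
```

Whether substring matching or exact word matching is the right choice depends on what the files actually contain, so treat this as one option and not the only way to do it.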

As for the code as a whole, why not use a single loop, like:

~~~
import os

lst = []
lst_2 = []
for f_name in os.listdir('path_here'):
    if f_name.endswith(".txt"):
        lst.append(f_name)  # append the names
        with open(os.path.join('path_here', f_name)) as file:
            for line in file:
                pass  # do the word search by line
~~~
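To fill in the "word search by line" step, here is one possible sketch that returns only the first fruit found in each file, which is what the question's goal describes. The function names `first_fruit` and `first_fruits_in_dir` are my own, the UTF-8 encoding is an assumption about the files, and "first" is taken to mean the earliest position on the earliest matching line:

```python
from pathlib import Path

TARGETS = ("Apple", "Pear", "Orange")  # substrings named in the question


def first_fruit(path, targets=TARGETS):
    """Return the target that appears first in the file, or None if none appears."""
    # Assumption: the files are UTF-8 text; undecodable bytes are replaced.
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:  # read line by line so long files are not loaded whole
            hits = [(line.find(t), t) for t in targets if t in line]
            if hits:
                return min(hits)[1]  # smallest start position in this line wins
    return None


def first_fruits_in_dir(directory, targets=TARGETS):
    """Collect one result per .txt file, in sorted file-name order."""
    results = []
    for path in sorted(Path(directory).glob("*.txt")):
        found = first_fruit(path, targets)
        results.append(found if found is not None else "Fruit not Found")
    return results
```

Calling `first_fruits_in_dir(r"c:\users\Admin\Documents\Exm")` would then give one entry per .txt file. Returning from `first_fruit` as soon as a line matches is what stops the search after the first hit, so the rest of a long file is never read.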

Python - folder of long .txt files, search each .txt file for various substrings, and append to a list
