So I have a directory of .txt files:
~~~
c:\users\Admin\Documents\Exm
~~~
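Each .txt file is quite large, and I need to find which of certain substrings ("Apple", "Pear", "Orange", etc.) each file contains.
My understanding is that I need one for loop to iterate over the directory and another for loop to iterate through each .txt file.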
~~~
Goal: a list of the first substring found in each .txt file. So if "Apple" is found first in 1.txt and "Pear" is found first in 2.txt, the list would look like this:
['Apple', 'Pear', ..., 'nth fruit']
~~~
Here is what I have so far:
~~~
import glob
import os
names = os.listdir('c:\\users\Admin\Documents\Exm')
lst = []
lst_2 =[]
#Appends lst with all newly converted .txt Files (from .SIF files)
for name in names:
    if name.endswith(".txt"):
        lst.append(name)
#PROBLEM STARTS HERE - Intended to open each .txt file in directory
for file in names:
    #Quick file verification (not each file is a .txt
    if file.endswith('.txt.'):
        #Opens a single .txt file in directory
        with open(os.path.join('c:\\users\Admin\Documents\Exm', file)) as f:
            #Iterates through each line in .txt file
            for line in f:
                #Temporary String placeheader for further checking
                content = f.readlines()
                if content = 'Apple' :
                    lst_2.append('Apple')
                elif content = 'Pear' :
                    lst_2.append('Pear')
                elif content = 'Orange' :
                    lst_2.append('Orange')
                else:
                    lst_2.append('Fruit not Found')
print(lst_2)
~~~
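Thanks for your help.
Answer 0 (score: 0)
Hi, I split each line into its whitespace-separated words and check them one by one instead of comparing the whole line. The differences: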
~~~
import os

# Raw string so the backslashes in the Windows path are not treated as escapes
directory = r'c:\users\Admin\Documents\Exm'
names = os.listdir(directory)
lst = []
lst_2 = []
# Appends lst with all newly converted .txt files (from .SIF files)
for name in names:
    if name.endswith(".txt"):
        lst.append(name)
# Open each .txt file in the directory
for file in names:
    # Quick file verification (not every file is a .txt)
    if file.endswith('.txt'):
        # Opens a single .txt file in the directory
        with open(os.path.join(directory, file)) as f:
            # Iterates through each line in the .txt file
            for line in f:
                # split() with no argument splits on any whitespace and drops the trailing newline
                c = line.split()
                for content in c:  # for each word in the line
                    if content == 'Apple':
                        lst_2.append('Apple')
                    elif content == 'Pear':
                        lst_2.append('Pear')
                    elif content == 'Orange':
                        lst_2.append('Orange')
                    # else:
                    #     lst_2.append('Fruit Not Found')
print(lst_2)
~~~
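The question asks for the first fruit that appears, and it speaks of substrings rather than whole words. A minimal sketch of that check on a piece of text, assuming a match means a case-sensitive substring hit. The helper name `first_fruit` is hypothetical and not part of the original code:

```python
def first_fruit(text, fruits=("Apple", "Pear", "Orange")):
    """Return the fruit whose first occurrence comes earliest in text, or None."""
    best_name = None
    best_pos = None
    for fruit in fruits:
        pos = text.find(fruit)  # str.find returns -1 when the substring is absent
        if pos != -1 and (best_pos is None or pos < best_pos):
            best_name, best_pos = fruit, pos
    return best_name
```

Unlike an exact word comparison, this also matches "Apples" or "Apple," with trailing punctuation. Whether that is desirable depends on what the files actually contain.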
As for the code as a whole, why not use a single loop, like:
~~~
import os

lst = []
lst_2 = []
for f_name in os.listdir('path_here'):
    if f_name.endswith(".txt"):
        lst.append(f_name)  # append the names
        with open(os.path.join('path_here', f_name)) as file:
            for line in file:
                pass  # do the word search by line
~~~
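One way the single-loop idea could be filled in is sketched below. It is only an illustration under some assumptions: the files are read as UTF-8, a match is a case-sensitive substring, and reading stops at the first line that contains any fruit. The names `FRUITS`, `first_fruit_in_file` and `first_fruits_in_dir` are made up for this example.

```python
import os

FRUITS = ("Apple", "Pear", "Orange")  # assumed search terms, taken from the question


def first_fruit_in_file(path, fruits=FRUITS):
    """Return the fruit that appears first in the file, or None if none appears."""
    # Assumption: UTF-8 text; undecodable bytes are replaced instead of raising
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # (position, fruit) for every fruit present on this line
            hits = [(line.find(fruit), fruit) for fruit in fruits if fruit in line]
            if hits:
                return min(hits)[1]  # smallest position on the first matching line
    return None


def first_fruits_in_dir(directory, fruits=FRUITS):
    """Map each .txt file name in directory to the first fruit found in it."""
    results = {}
    for f_name in sorted(os.listdir(directory)):
        if f_name.endswith(".txt"):
            found = first_fruit_in_file(os.path.join(directory, f_name), fruits)
            results[f_name] = found if found is not None else "Fruit not Found"
    return results
```

Calling `list(first_fruits_in_dir(r'c:\users\Admin\Documents\Exm').values())` would give the plain list form described in the question. Returning as soon as a line matches means the rest of a large file is never read.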