I have a list of file names, like this:
file_names = ['file1', 'file2']
I also have lists of keywords that I am trying to extract from those files. The keyword lists (list_1, list_2) and the text strings from file1 and file2 are shown below:
## list_1 keywords
list_1 = ['hi', 'hello']
## list_2 keywords
list_2 = ['I', 'am']
## Text strings from file_1 and file_2
big_list = ['hi I am so and so how are you', 'hello hope all goes well by the way I can help you']
My function for extracting text:
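import re

def my_func(text_string, key_words):
    # Split the text into sentences, each ending with a period
    sentences = re.findall(r"([^.]*\.)", text_string)
    for sentence in sentences:
        # Return the first sentence that contains every keyword
        if all(word in sentence for word in key_words):
            return sentence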
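Now I pass these lists through two nested for loops (shown below) and the function. At the end of each iteration of these loops, I would like to save the result under the corresponding file name from the file_names list.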
for a, b in zip(list_1, list_2):
    for item in big_list:
        sentence_1 = my_func(item, a.split(' '))
        sentence_2 = my_func(item, b.split(' '))
        ## Here I would like to add the file name i.e (print(filename))
        print(sentence_1)
        print(sentence_2)
I need output that looks like this:
file1 is:
None
file2 is:
None
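You can ignore the None in my output for now, since my main concern is iterating over the list of file names and adding them to my output. I would appreciate any help with this.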
Answer 0 (score: 0)
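You can access the index inside a Python for loop with enumerate and use that index to look up the file name that corresponds to each string. That lets you print the current file.
Here is an example of how to do this: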
for a, b in zip(list_1, list_2):
    # idx is the position of item in big_list
    for idx, item in enumerate(big_list):
        sentence_1 = my_func(item, a.split(' '))
        sentence_2 = my_func(item, b.split(' '))
        prefix = file_names[idx] + " is: "  # Use idx to get the file from the file list
        if sentence_1 is not None:
            print(prefix + sentence_1)
        if sentence_2 is not None:
            print(prefix + sentence_2)
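As an alternative to indexing with idx, zip can pair each file name with its text directly. The following is a minimal sketch, assuming file_names and big_list have the same length and order; report is a hypothetical helper name, not something from the question. It produces the header-then-result layout the question asks for, including None when nothing matches.

```python
import re

file_names = ['file1', 'file2']
list_1 = ['hi', 'hello']
big_list = [
    'hi I am so and so how are you',
    'hello hope all goes well by the way I can help you',
]


def my_func(text_string, key_words):
    """Return the first period-terminated sentence containing every keyword."""
    sentences = re.findall(r"([^.]*\.)", text_string)
    for sentence in sentences:
        if all(word in sentence for word in key_words):
            return sentence
    return None


def report(names, texts, key_words):
    """Hypothetical helper: one 'name is:' line plus one result line per file."""
    lines = []
    # zip pairs each file name with the text at the same position
    for name, text in zip(names, texts):
        lines.append(name + " is:")
        lines.append(str(my_func(text, key_words)))
    return lines


for line in report(file_names, big_list, list_1):
    print(line)
```

The sample strings contain no period, so the regular expression finds no sentences and every result line is None, which matches the output shown in the question.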
Update
If you want to print the results after the iteration, you can temporarily store them in a dictionary and then loop over it:
for a, b in zip(list_1, list_2):
    # Results for this pair of keywords, keyed by file name
    resMap = {}
    # idx is the position of item in big_list
    for idx, item in enumerate(big_list):
        sentence_1 = my_func(item, a.split(' '))
        sentence_2 = my_func(item, b.split(' '))
        if sentence_1 is not None:
            resMap[file_names[idx]] = sentence_1
        if sentence_2 is not None:
            # A match for sentence_2 overwrites sentence_1 for the same file
            resMap[file_names[idx]] = sentence_2
    for k in resMap:
        prefix = k + " is: "  # k is the file name used as the dictionary key
        print(prefix + resMap[k])
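Because a plain dictionary keeps only one value per file name, a later match replaces an earlier one. If every match should be kept, one option is to store a list per file. This is a sketch under assumptions: collect_matches is a hypothetical helper, and the sample texts and keywords below are made up (with periods added) so that the regular expression actually finds sentences.

```python
import re
from collections import defaultdict


def my_func(text_string, key_words):
    """Return the first period-terminated sentence containing every keyword."""
    sentences = re.findall(r"([^.]*\.)", text_string)
    for sentence in sentences:
        if all(word in sentence for word in key_words):
            return sentence
    return None


def collect_matches(names, texts, keyword_lists):
    """Hypothetical helper: map each file name to all of its matching sentences."""
    results = defaultdict(list)
    for name, text in zip(names, texts):
        for key_words in keyword_lists:
            sentence = my_func(text, key_words)
            if sentence is not None:
                # Strip the space left over after the previous period
                results[name].append(sentence.strip())
    return dict(results)


# Assumed sample data: the question's strings with periods added
names = ['file1', 'file2']
texts = [
    'hi I am so and so. how are you.',
    'hello hope all goes well. by the way I can help you.',
]
matches = collect_matches(names, texts, [['hi'], ['how'], ['help']])

for name, sentences in matches.items():
    for sentence in sentences:
        print(name + " is: " + sentence)
```

Files with no matches simply do not appear in the result, so the printing loop needs no None check.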