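I have a list of words in a text file, and I have a set of text files containing text.
The word-list file contains start words: for each text file, I want to keep the lines below the line that contains one of these words.
A second, similar word-list file contains words for the other end: there I want to keep the lines above the line that contains one of those words.
In short, I want to trim my text files using these words.
Here is the code: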
import os

# Load the text files
Textfile=[]
lines=[]
path="Images for summarization _# CTScan/"
for text in sorted(os.listdir(path+'text/'),key=lambda x: os.path.splitext(x)[0]):
    # Read each file once, lowercased; keep both the full text and its lines
    with open(path+"text/"+text,'r') as f:
        content=f.read().lower()
    Textfile.append(content)
    lines.append(content.splitlines())
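## Trimming part
# Load both word lists, lowercased, one entry per line. Blank lines are
# dropped because an empty string is a substring of every line.
with open(path+"Words for Trimming above.txt",'r') as f:
    Trimed_words_top=[w for w in f.read().lower().splitlines() if w.strip()]
with open(path+"Words for Trimming below.txt",'r') as f:
    Trimed_words_below=[w for w in f.read().lower().splitlines() if w.strip()]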
Word_index_top=[]
data=[]
Trimmed_text=[]
"""
for line in lines:
    for word in Trimed_words_top:
        if word
            data=lines[Word_index_top[0]+1:]
            Trimmed_text=' '.join(word for word in data)
"""
# This works for a single file; it needs to be done for all files
SinglesFile=lines[0]
Word_index_top=[i for i, s in enumerate(SinglesFile) if 'ncct' in s]
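## Logic for word trimming: for each file, collect the indices of the
## lines that contain any of the top trimming words
cnt=0
Word_index_top=[]
for line in lines:
    # `line` is the list of lines of one file; any() avoids duplicate
    # indices when a line contains more than one trimming word
    Word_index_top.append([i for i, s in enumerate(line) if any(word in s for word in Trimed_words_top)])
    cnt+=1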
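
One possible way to apply the trimming to every file is sketched below. It is only a sketch under a few assumptions that the question does not settle: the first matching line is used as the cut point, the matching line itself is dropped (as in `Word_index_top[0]+1` above), the "below" words are searched only after the top cut, and a file with no match is left untrimmed at that end. The helper names `find_first` and `trim_lines` and the sample data are hypothetical.

```python
def find_first(file_lines, words):
    """Return the index of the first line containing any word, or None."""
    for i, line in enumerate(file_lines):
        # `w and ...` skips empty words, which would match every line
        if any(w and w in line for w in words):
            return i
    return None


def trim_lines(file_lines, top_words, below_words):
    """Keep the lines after the first top-word line and before the
    first below-word line that follows it (assumed semantics)."""
    top = find_first(file_lines, top_words)
    start = 0 if top is None else top + 1
    body = file_lines[start:]

    end = find_first(body, below_words)
    if end is not None:
        body = body[:end]
    return body


# Hypothetical sample data standing in for `lines`,
# `Trimed_words_top` and `Trimed_words_below`
sample_lines = [
    ["patient id 123", "ncct brain", "no acute bleed",
     "impression: normal", "signed dr x"],
    ["header", "findings are unremarkable"],
]
top_words = ["ncct"]
below_words = ["impression"]

# One trimmed list of lines per file, then one joined string per file
trimmed = [trim_lines(f, top_words, below_words) for f in sample_lines]
trimmed_text = [" ".join(t) for t in trimmed]
print(trimmed_text)
```

With the variables from the question, the same call would be `[trim_lines(f, Trimed_words_top, Trimed_words_below) for f in lines]`. If the last match should be used instead of the first, or the matching line should be kept, only `find_first` and the `+ 1` offset need to change.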