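I have a list of tokenized words. I search it for certain words and store the 3 elements on either side of each word that is found. The code is:
words_to_find - the list of words to look for
tokens - the large list in which I have to find the words from words_to_find
for x in words_to_find: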
    if x in tokens:
        print "Matched word is", x
        indexing = tokens.index(x)
        print "This is index :", indexing
        count = 0
        lower_limit = indexing - 3
        upper_limit = indexing + 3
        print "Limits are", lower_limit,upper_limit
        for i in tokens:
            if count >= lower_limit and count <= upper_limit:
                print "I have entered the if condition"
                print "Count is : ",count
                wording = tokens[count]
                neighbours.append(wording)
            else:
                count +=1
                break
            count +=1
        final_neighbour.append(neighbours)
        print "I am in access here", final_neighbour
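I can't find the error in this code. I take the lower and upper limits, try to save the words between them in a list, and then build a list of those lists (final_neighbour). Please help me find the problem. Thanks in advance.
Answer 0 (score: 1)
Instead of iterating with a counter, we can use slicing to get the neighbours.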
# -*- coding: utf-8 -*-
tokens = [u'प्रीमियम',u'एंड',u'गिव',u'फ्रॉम',u'महाराष्ट्रा',u'मुंबई',u'इंश्योरेंस',u'कंपनी',u'फॉर',u'दिस']
words_to_find = [u'फ्रॉम',u'महाराष्ट्रा']
final_neighbours = {}
for i in words_to_find:
    if i in tokens:
        print "Matched word : ",i
        idx = tokens.index(i)
        print "this is index : ",idx
        idx_lb = max(idx-3, 0)  # clamp: a negative start would be counted from the end of the list
        idx_ub = idx+4          # the slice end is exclusive, so this covers idx+1 .. idx+3
        print "Limits : ",idx_lb,idx_ub
        only_neighbours = tokens[idx_lb : idx_ub]
        only_neighbours.remove(i)  # drop the matched word itself
        final_neighbours[i]= only_neighbours
for k,v in final_neighbours.items():
    print "\nKey:",k
    print "Values:"
    for i in v:
        print i,
Output:
Matched word : फ्रॉम
this is index : 3
Limits : 0 7
Matched word : महाराष्ट्रा
this is index : 4
Limits : 1 8
Key: महाराष्ट्रा
Values:
एंड गिव फ्रॉम मुंबई इंश्योरेंस कंपनी
Key: फ्रॉम
Values:
प्रीमियम एंड गिव महाराष्ट्रा मुंबई इंश्योरेंस
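The slicing idea can also be wrapped in a small function. The following is only a sketch in Python 3 (the snippets in this thread use Python 2 print statements); the name neighbours_by_slice and the window parameter are made up for illustration, and the English sample tokens are arbitrary test data.

```python
def neighbours_by_slice(tokens, word, window=3):
    """Return up to `window` tokens on each side of the first occurrence of `word`."""
    if word not in tokens:
        return []
    idx = tokens.index(word)
    start = max(idx - window, 0)  # clamp: a negative start would count from the end
    before = tokens[start:idx]
    after = tokens[idx + 1:idx + 1 + window]  # slicing past the end of a list is safe
    return before + after


tokens = "hi this is hi keerthana hello world hey hi hello".split()
result = {w: neighbours_by_slice(tokens, w) for w in ["hi", "hello"]}
print(result)
```

Taking the part before and the part after the match separately avoids having to remove the matched word from the slice afterwards.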
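Answer 1 (score: 0)
The neighbours are different for each word, so the neighbours list has to be reset to empty for every word. In addition, count should start at indexing - 3 (that is, lower_limit) if that value is >= 0, and at 0 otherwise, because what you need are the three words before and the three words after the found word.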
final_neighbour = []
for x in words_to_find:
    neighbours=[] # the neighbours differ for each word, so reset the list
    if x in tokens:
        print "Matched word is", x
        indexing = tokens.index(x)
        print "This is index :", indexing
        lower_limit = indexing - 3
        upper_limit = min(indexing + 3, len(tokens) - 1) # clamp so tokens[count] never runs past the end
        count = lower_limit if lower_limit >=0 else 0 # start at index-3 of the found word, but never below 0
        print "Limits are", lower_limit,upper_limit,count
        for i in tokens:
            if count >= lower_limit and count <= upper_limit:
                print "I have entered the if condition"
                print "Count is : ",count
                wording = tokens[count]
                neighbours.append(wording)
            else:
                count +=1
                break
            count +=1
        final_neighbour.append(neighbours)
        print "I am in access here", final_neighbour
Sample I/O (some random tokens and words_to_find for testing purposes):
tokens='hi this is hi keerthana hello world hey hi hello'.split()
words_to_find=['hi','hello']
I am in access here [['hi', 'this', 'is', 'hi'], ['is', 'hi', 'keerthana', 'hello', 'world', 'hey', 'hi']]
Suggestion
You can use list slicing to get the 3 words before and after the matched word. This gives the required output as well!
lower_limit = lower_limit if lower_limit >=0 else 0
neighbours.extend(tokens[lower_limit:upper_limit+1])
That is,
final_neighbour=[]
for x in words_to_find:
    neighbours=[] # the neighbours differ for each word, so reset the list
    if x in tokens:
        print "Matched word is", x
        indexing = tokens.index(x)
        print "This is index :", indexing
        lower_limit = indexing - 3
        upper_limit = indexing + 3
        lower_limit = lower_limit if lower_limit >=0 else 0 # start at index-3 of the found word, but never below 0
        print "Limits are", lower_limit,upper_limit
        neighbours.extend(tokens[lower_limit:upper_limit+1]) # extend keeps neighbours a flat list of words
        final_neighbour.append(neighbours)
        print "I am in access here", final_neighbour
Hope it helps!
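To see why the lower limit has to be clamped to 0 before slicing, here is a short Python 3 illustration (the thread's own snippets are Python 2). The position 1 is simply picked by hand as an example of a match near the start of the list.

```python
tokens = "hi this is hi keerthana hello world hey hi hello".split()

indexing = 1  # position of 'this', close to the start of the list

# Without clamping, the start is -2, which Python counts from the END of the list.
# tokens[-2:5] therefore starts after the stop position and yields nothing.
unclamped = tokens[indexing - 3:indexing + 4]

# With clamping, the window simply begins at the first token.
clamped = tokens[max(indexing - 3, 0):indexing + 4]

print(unclamped)
print(clamped)
```

The upper limit needs no such guard when slicing, because a slice end beyond the length of the list is silently truncated.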
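Answer 2 (score: -1)
You have the following line inside the for loop:
neighbours.append(wording)
What is "neighbours"?
You should initialize it before the append statement (outside the loop, preferably at the start of the code where you define tokens and words_to_find), like this:
neighbours = []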
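Where the list is initialized matters. The sketch below (Python 3, with a hypothetical helper collect_windows and a made-up reset_per_word flag) compares creating neighbours once before the loop with creating a fresh list for every word; it is meant as an illustration, not as the asker's actual code.

```python
def collect_windows(tokens, words_to_find, reset_per_word, window=3):
    final_neighbour = []
    neighbours = []  # created once, before the loop
    for word in words_to_find:
        if reset_per_word:
            neighbours = []  # fresh list for each word
        if word in tokens:
            idx = tokens.index(word)
            start = max(idx - window, 0)
            neighbours.extend(tokens[start:idx + window + 1])
            final_neighbour.append(neighbours)
    return final_neighbour


tokens = "hi this is hi keerthana hello world hey hi hello".split()
shared = collect_windows(tokens, ["hi", "hello"], reset_per_word=False)
fresh = collect_windows(tokens, ["hi", "hello"], reset_per_word=True)

print(shared[0] is shared[1])  # both entries are the same accumulated list
print(fresh)
```

With a single list created outside the loop, every entry of the result refers to the same list, which keeps growing. Resetting the list for each word gives one separate window per word.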