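Get the indexes of a split sentence from a list of strings

Time: 2017-05-09 11:08:14

Tags: python string text-mining

The desired result is a function or method that finds the positions of a sentence within a list of strings.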

sentence = 'The cat went to the pool yesterday'

structure = ['The cat went,', 'to the pool yesterday.','I wonder if you realize the effect you are having on me. It hurts. A lot.']

For example:

def findsentence(sentence, list_of_strings):
    # do something to get the output: the positions at which the sentence is found in the string list
    return output

findsentence(sentence, structure)
> (0, 1) # because the sentence is split across the list...

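Note!!

The challenge is not to find the exact sentence. Look at the example: part of the sentence is at position 0 of structure and the rest is at position 1.

So this is not a simple string-manipulation problem.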
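To illustrate why a direct lookup is not enough, here is a minimal sketch using the data from the question. The names `exact_matches` and `contained` are only illustrative and are not part of the original post.

```python
sentence = 'The cat went to the pool yesterday'
structure = ['The cat went,', 'to the pool yesterday.',
             'I wonder if you realize the effect you are having on me. It hurts. A lot.']

# An exact lookup finds nothing, because no single item equals the sentence.
exact_matches = [i for i, s in enumerate(structure) if s == sentence]
print(exact_matches)  # []

# Searching for the whole sentence inside each item fails as well,
# because the sentence is spread over items 0 and 1.
contained = [i for i, s in enumerate(structure) if sentence in s]
print(contained)  # []
```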

3 Answers:

Answer 0: (score: 3)

Use the following:

sentence = "foo sam bar go"
structure = ["rq", "foo sam", "bar go", "ca", "da"]

def findsentencelist(sentence, list_of_strings):
    positions = []
    # enumerate yields each item's own index, so duplicate items are reported correctly
    for index, item in enumerate(list_of_strings):
        if item in sentence:
            positions.append(index)
    return positions

print(findsentencelist(sentence, structure))

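Hope this helps, Yahli.

Edit:

There is a problem with your variables. Your sentence must be a string, not a list. Fix the variable and try this function again :)

Second edit: I think I finally understand what you are trying to do. Let me know if this works better.

Third edit: Jesus, I hope this one solves your problem. Let me know if it does the trick :))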

Answer 1: (score: 2)

I just remove the punctuation from structure to make it work:

sentence = 'The cat went to the pool yesterday'

structure = ['The cat went,', 'to the pool yesterday.','I wonder if you realize the effect you are having on me. It hurts. A lot.','Life is too short as it is. In short, she had a cushion job.']

import string

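def findsentence(sentence, list_of_strings):
    # Python 3: build a table that deletes punctuation (s.translate(None, ...) only works in Python 2)
    table = str.maketrans('', '', string.punctuation)
    return tuple(i for i, s in enumerate(list_of_strings) if s.translate(table) in sentence)


print(findsentence(sentence, structure))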
# (0, 1)
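Substring checks like the one above can also report short items that merely occur inside the sentence (for example, 'he' is contained in 'The'). A stricter variant, sketched here under the assumption that the sentence starts and ends exactly at item boundaries, compares whole words over a contiguous run of items. The names `words` and `find_sentence_run` are hypothetical helpers, not part of the original answer.

```python
import string

_PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def words(text):
    # Remove ASCII punctuation, then split on whitespace.
    return text.translate(_PUNCT_TABLE).split()

def find_sentence_run(sentence, list_of_strings):
    """Return the indexes of the first contiguous run of items whose words,
    taken in order, equal the words of the sentence; () if there is none."""
    target = words(sentence)
    for start in range(len(list_of_strings)):
        collected = []
        for end in range(start, len(list_of_strings)):
            collected.extend(words(list_of_strings[end]))
            if collected == target:
                return tuple(range(start, end + 1))
            if collected != target[:len(collected)]:
                break  # this run can no longer match the sentence
    return ()

sentence = 'The cat went to the pool yesterday'
structure = ['The cat went,', 'to the pool yesterday.',
             'I wonder if you realize the effect you are having on me. It hurts. A lot.']
print(find_sentence_run(sentence, structure))  # (0, 1)
```

With this version, `find_sentence_run('The cat went', ['he', 'The cat', 'went'])` gives `(1, 2)`, whereas a plain substring test would also report index 0.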

Answer 2: (score: 0)

After removing the punctuation, you can use this code to get the indexes:

for i, j in enumerate(structure):
    if j in sentence:
        print(i)
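The punctuation-removal step is not shown in this answer. One possible way to do it in Python 3, as a rough sketch using the data from the question, is given here. The names `table`, `cleaned` and `indexes` are assumptions introduced for illustration.

```python
import string

sentence = 'The cat went to the pool yesterday'
structure = ['The cat went,', 'to the pool yesterday.',
             'I wonder if you realize the effect you are having on me. It hurts. A lot.']

# Build a translation table that deletes every ASCII punctuation character.
table = str.maketrans('', '', string.punctuation)
cleaned = [s.translate(table) for s in structure]

# Same loop as above, written as a list comprehension over the cleaned items.
indexes = [i for i, j in enumerate(cleaned) if j in sentence]
print(indexes)  # [0, 1]
```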

Hope this solves your problem. There are other solutions too, since Python is flexible.