
Time: 2013-08-16 16:47:55

Tags: python nltk


The Large Hadron Collider (LHC) is the world’s largest and most powerful particle accelerator. This site includes the latest news from the project, accessible explanations of how the LHC works, how it is funded, who works there and what benefits it brings us. You can access a wide range of resources for the public, journalists and teachers and students, there are also many links to other sources of information. The Large Hadron Collider at CERN near Geneva, Switzerland is opening new vistas on the deepest secrets of the universe, stretching the imagination with newly discovered forms of matter, forces of nature, and dimensions of space.


['large', 'big', 'heavy']

I'm not sure how to pick up a few words before and after a keyword stored in a variable. For example:

keyword = 'large'


The Large Hadron


3 Answers:

Answer 0 (score: 3)

test_word = 'large'
my_string = 'The Large Hadron Collider (LHC) is the world’s largest and most powerfulparticle accelerator.This site includes the latest news from the project, accessible explanations of how the LHC works, how it is funded, who works there and what benefits it brings us' 
# I truncated your sentence

test_words = my_string.lower().split()
correct_case = my_string.split() # this will preserve the case of the original words
# and it will be identical in length to test words with each word in the same position
position = test_words.index(test_word)

# max() keeps the slice start from going negative when the keyword is the first word
my_new_string = ' '.join(correct_case[max(position - 1, 0):position + 2])
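
The slice above takes exactly one word on each side of the keyword. As a minimal sketch of the same idea with a configurable window, the function below wraps it up; `context_window`, `before` and `after` are names made up here for illustration and are not part of the answer. Like the original, it relies on `list.index`, so it finds only the first match and raises `ValueError` if the keyword is absent.

```python
def context_window(text, keyword, before=1, after=1):
    """Return the first match of keyword plus its neighbouring words.

    Matching is case-insensitive on whole whitespace-separated tokens.
    Raises ValueError (from list.index) if the keyword does not occur.
    """
    words = text.split()                    # original case, used for the output
    lowered = [w.lower() for w in words]    # same positions, used for matching
    i = lowered.index(keyword.lower())
    start = max(i - before, 0)              # clamp so the slice never wraps around
    return ' '.join(words[start:i + after + 1])


sample = 'The Large Hadron Collider (LHC) is the largest accelerator.'
print(context_window(sample, 'large'))      # The Large Hadron
print(context_window(sample, 'is', 2, 1))   # Collider (LHC) is the
```

Because the tokens come from a plain `split()`, punctuation stays attached to words, so a keyword such as `accelerator` would not match the token `accelerator.` here.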


Answer 1 (score: 0)


In [1]: s = 'The Large Hadron Collider (LHC) is the world’s largest and most powerfulparticle accelerator.'
In [2]: words = s.split() 
In [3]: words_lower = s.lower().split() #lowercase words so keyword matching is easy.
In [4]: keyword = 'large'
In [5]: i = words_lower.index(keyword)
In [6]: phrase = ' '.join(words[i-1:i+2])
In [7]: phrase
Out[7]: 'The Large Hadron'
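
Note that `index` returns only the first occurrence, and `split()` leaves punctuation glued to words. One possible way around both, sketched under the assumption that a simple regular-expression tokenizer is good enough for this text, is to scan every token; `all_contexts` and `width` are hypothetical names introduced for this example.

```python
import re


def all_contexts(text, keyword, width=1):
    """Return every occurrence of keyword with `width` words on each side."""
    # Word characters plus straight/curly apostrophes; other punctuation is dropped.
    words = re.findall(r"[\w'’]+", text)
    target = keyword.lower()
    hits = []
    for i, word in enumerate(words):
        if word.lower() == target:
            start = max(i - width, 0)       # clamp at the start of the text
            hits.append(' '.join(words[start:i + width + 1]))
    return hits


s = 'The Large Hadron Collider is large. A large ring, the LHC.'
print(all_contexts(s, 'large'))
# ['The Large Hadron', 'is large A', 'A large ring']
```

The window here ignores sentence boundaries, which is why the second hit reads `is large A`; splitting the text into sentences first would avoid that.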

Answer 2 (score: 0)

text = "The Large Hadron Collider (LHC) is the world’s largest and most powerfulparticle accelerator.This site includes the latest news from the project, accessible explanations of how the LHC works, how it is funded, who works there and what benefits it brings us.You can access a wide range of resources for the public, journalists and teachers and students, there are also many links to other sources of information.The Large Hadron Collider atCERNnear Geneva, Switzerland is opening new vistas on the deepest secrets of the universe, stretching the imagination with newly discovered forms of matter, forces of nature, and dimensions of space."
keywords = ['large', 'is', 'most']
text = text.lower().split(' ')
results = []
for word in keywords:
    indx = text.index(word)
    results.append(" ".join(text[indx-1:indx+2]))

print(results)  # parenthesized form works on both Python 2 and Python 3
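
With the keyword list from the question, `['large', 'big', 'heavy']`, the loop above would stop with a `ValueError` at `'big'`, since `list.index` raises when the word is missing. A hedged sketch that records `None` for absent keywords and keeps the original capitalization follows; `contexts_for` is a made-up helper name, and the shortened `sample_text` is only for illustration.

```python
def contexts_for(text, keywords):
    """Map each keyword to its first three-word context, or None if absent."""
    words = text.split()                    # original case for the output
    lowered = [w.lower() for w in words]    # lowercase copy for matching
    results = {}
    for keyword in keywords:
        try:
            i = lowered.index(keyword.lower())
        except ValueError:                  # keyword does not occur in the text
            results[keyword] = None
            continue
        results[keyword] = ' '.join(words[max(i - 1, 0):i + 2])
    return results


sample_text = 'The Large Hadron Collider (LHC) is the largest accelerator.'
print(contexts_for(sample_text, ['large', 'big', 'heavy']))
# {'large': 'The Large Hadron', 'big': None, 'heavy': None}
```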