Remove stop words, but return the result as a single line

Time: 2013-11-06 21:35:40

Tags: python

My question may seem silly, but I am new to Python, so please bear with me.

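I have to pass a line to a stop-word removal function. It works fine, but my problem is that the function returns the words appended to a list. I want it to behave as follows:

line = " I am feeling good , but I cant talk"

Here "I", "but", and "cant" are the stop words.

After passing the line to the function, my output should be "am feeling good , talk". What I get now is [['am','feeling','good','talk']].

Please help.

3 answers: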

Answer 0 (score: 0)

To get that list as a string, you can do the following:

>>> out = [['am','feeling','good','talk']]
>>> " ".join(out[0])
'am feeling good talk'
>>>

However, I think this is closer to what you want:

>>> line = " I am feeling good , but I cant talk"
>>> [word for word in line.split() if word not in ("I", "but", "cant")]
['am', 'feeling', 'good', ',', 'talk']
>>> lst = [word for word in line.split() if word not in ("I", "but", "cant")]
>>> " ".join(lst)
'am feeling good , talk'
>>>

The key pieces here are str.join, str.split, and a list comprehension.
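Putting those pieces together, a minimal sketch of a function that returns a single string might look like the following. The name `remove_stop_words` and its parameters are made up for illustration, since the asker's own function isn't shown.

```python
def remove_stop_words(line, stop_words):
    """Return `line` with the stop words removed, as one space-joined string."""
    # split() with no arguments splits on any whitespace and drops empty strings
    kept = [word for word in line.split() if word not in stop_words]
    return " ".join(kept)


line = " I am feeling good , but I cant talk"
print(remove_stop_words(line, ("I", "but", "cant")))
```

Note that the match is case-sensitive, and the comma survives only because it is surrounded by spaces in the input.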

Answer 1 (score: 0)

line = " I am feeling good , but I cant talk"
stop_words = {'I', 'but', 'cant'}
li = [word for word in line.split() if word not in stop_words]
print(li)
# prints ['am', 'feeling', 'good', ',', 'talk']
print(' '.join(li))
# prints am feeling good , talk

Answer 2 (score: 0)

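You can use a list comprehension to achieve this:

def my_function(line, stopwords):
    # Keep every whitespace-separated word that is not a stop word
    return [word for word in line.split() if word not in stopwords]

# The comparison is case-sensitive, so "I" must be capitalized here
stopwords = ['I', 'but', 'cant']
line = " I am feeling good , but I cant talk"
print(my_function(line, stopwords))

This is roughly equivalent to the following code:

def my_function(line, stopwords):
    result = []
    for word in line.split():  # loop through the words in the line
        if word not in stopwords:  # keep only words that are not stop words
            result.append(word)
    return result

Result:

['am', 'feeling', 'good', ',', 'talk']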
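If the stop words should match regardless of case (an assumption; the question does not say so), one option is to lowercase both sides before comparing. The sketch below uses a hypothetical helper name, `remove_stop_words_ci`, and joins the result so that it comes back as a single string.

```python
def remove_stop_words_ci(line, stop_words):
    """Case-insensitive variant: drop words whose lowercase form is a stop word."""
    # Lowercase the stop words once; a set keeps membership tests fast
    lowered = {w.lower() for w in stop_words}
    return " ".join(w for w in line.split() if w.lower() not in lowered)


print(remove_stop_words_ci(" I am feeling good , but I cant talk", ["i", "but", "cant"]))
```

The words that are kept retain their original capitalization; only the comparison is done in lowercase.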

Hope this helps!