I have some code that uses NLTK to find nouns and verbs.
from nltk.corpus import wordnet as wn
from nltk import pos_tag
import nltk
sentence = "Hello my name is Abhishek Mitra"
sentence = nltk.word_tokenize(sentence)
sent = pos_tag(sentence)
print(sent)
It returns:
[('Hello', 'NNP'), ('my', 'PRP$'), ('name', 'NN'), ('is', 'VBZ'), ('Abhishek', 'NNP'), ('Mitra', 'NNP')]
How can I remove only the entries tagged 'NN' from the list?
Answer 0 (score: 3)
You can use a list comprehension to remove the 'NN' elements:
from nltk.corpus import wordnet as wn
from nltk import pos_tag
import nltk
sentence = "Hello my name is Abhishek Mitra"
sentence = nltk.word_tokenize(sentence)
sent = pos_tag(sentence)
print([s for s in sent if s[1] != 'NN'])
Answer 1 (score: 0)
a = [('Hello', 'NNP'), ('my', 'PRP$'), ('name', 'NN'), ('is', 'VBZ'), ('Abhishek', 'NNP'), ('Mitra', 'NNP')]
c = [b for b in a if b[-1] != 'NN']
Answer 2 (score: 0)
I would use the filter function:
>>> list(filter(lambda pair: pair[1] != 'NN', sent))
[('Hello', 'NNP'), ('my', 'PRP$'), ('is', 'VBZ'), ('Abhishek', 'NNP'), ('Mitra', 'NNP')]
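In Python 3, filter returns a lazy iterator rather than a list, and lambdas can no longer unpack a tuple in their parameter list. The following is a minimal sketch of the same filtering step, assuming the tagged list from the question is hard-coded as `sent` so that it runs without NLTK's tagger models; the names `without_nn` and `without_nn_lc` are chosen here for illustration only.

```python
# Tagged output copied from the question, hard-coded so no NLTK data is needed.
sent = [('Hello', 'NNP'), ('my', 'PRP$'), ('name', 'NN'),
        ('is', 'VBZ'), ('Abhishek', 'NNP'), ('Mitra', 'NNP')]

# filter() is lazy in Python 3; wrap it in list() to get the results at once.
without_nn = list(filter(lambda pair: pair[1] != 'NN', sent))

# Equivalent list comprehension, unpacking each (word, tag) tuple in the for clause.
without_nn_lc = [(word, tag) for word, tag in sent if tag != 'NN']

print(without_nn)
```

Both forms keep the (word, tag) pairs intact and drop only the pair whose tag is exactly 'NN'.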
Answer 3 (score: 0)
Here is another approach (taking advantage of tuple unpacking):
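from nltk.corpus import wordnet as wn
from nltk import pos_tag
import nltk
sentence = "Hello my name is Abhishek Mitra"
sentence = nltk.word_tokenize(sentence)
sent = pos_tag(sentence)
sent_clean = [x for (x,y) in sent if y not in ('NN',)]
print(sent_clean)
Output:
['Hello', 'my', 'is', 'Abhishek', 'Mitra']
Explanation: the key line in the code is:
sent_clean = [x for (x,y) in sent if y not in ('NN',)]
After POS-tagging each word in the sentence, you extract the word from each (word, tag) tuple that the tagger produced. The condition for keeping a word is on the second element of the tuple, the tag, which must not be 'NN'. Note the trailing comma in ('NN',): without it, ('NN') is just the string 'NN', and "not in" becomes a substring test rather than a membership test.
Similarly, if you want to eliminate multiple POS tags:
sent_clean2 = [x for (x,y) in sent if y not in ('PRP$', 'VBZ', 'NN')]
print(sent_clean2)
Output:
['Hello', 'Abhishek', 'Mitra']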
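The single-tag and multi-tag comprehensions above can be folded into one small helper. This is only a sketch: `remove_tags` is a hypothetical name, not an NLTK function, and the tagged list is hard-coded from the output shown in the question so the example runs without downloading tagger models.

```python
def remove_tags(tagged, excluded):
    """Return the words whose POS tag is not in `excluded`.

    Hypothetical helper for illustration; not part of NLTK.
    """
    # Accept a single tag given as a bare string, e.g. 'NN', so it is not
    # treated as a sequence of characters.
    if isinstance(excluded, str):
        excluded = [excluded]
    excluded = set(excluded)
    return [word for word, tag in tagged if tag not in excluded]


# Tagged output copied from the question.
sent = [('Hello', 'NNP'), ('my', 'PRP$'), ('name', 'NN'),
        ('is', 'VBZ'), ('Abhishek', 'NNP'), ('Mitra', 'NNP')]

print(remove_tags(sent, 'NN'))                    # drop one tag
print(remove_tags(sent, ['PRP$', 'VBZ', 'NN']))   # drop several tags
```

Converting the excluded tags to a set keeps the membership test exact, so 'NNP' is never dropped just because 'NN' is excluded.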