I am using the NLTK lemmatizer as follows.
from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()
mystring = "the sand rock needed to be mixed and shaked well before using it for construction works"
splits = mystring.split()
mystring = " ".join(lemmatizer.lemmatize(w) for w in splits)
print(mystring)

I expect the output:

sand rock need to be mix and shake well before use it for construction work

However, in the output I get, words such as needed, mixed, shaked, and using do not seem to be changed to their base form.

Is there a way to fix this?
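Answer 0: (score: 0)
You can replace the line that builds mystring with this:
mystring = " ".join(lemmatizer.lemmatize(w, pos='v') for w in splits)
pos is the part-of-speech tag. lemmatize() defaults to pos='n' (noun), so verb forms such as needed or using are left unchanged unless you pass pos='v'.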
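Passing pos='v' treats every word as a verb. If you would rather choose the part of speech per word, one option is to tag the tokens first and translate each tag into the letter that lemmatize() expects. The sketch below is not part of the original answer: penn_to_wordnet and lemmatize_tagged are hypothetical helper names, and the mapping assumes Penn Treebank tags such as those returned by nltk.pos_tag.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (e.g. 'VBD') to a WordNet POS letter.

    Hypothetical helper for illustration; not an NLTK function.
    """
    if tag.startswith("VB"):
        return "v"  # verb
    if tag.startswith("JJ"):
        return "a"  # adjective
    if tag.startswith("RB"):
        return "r"  # adverb
    return "n"      # noun, which is also lemmatize()'s default


def lemmatize_tagged(tagged_tokens, lemmatize):
    """Lemmatize (word, penn_tag) pairs.

    `lemmatize` is any callable taking (word, pos), for example
    WordNetLemmatizer().lemmatize. It is passed in as a parameter
    so this sketch does not itself import NLTK.
    """
    return [lemmatize(word, penn_to_wordnet(tag)) for word, tag in tagged_tokens]
```

With NLTK installed and its WordNet and tagger data downloaded, this could be called as lemmatize_tagged(nltk.pos_tag(splits), lemmatizer.lemmatize). The result then depends on how the tagger labels each word, so it may differ from the pos='v' version.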