Question

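How can I use the NLTK module to write both the singular and plural forms of a noun, or tell it not to distinguish between singular and plural when searching a txt file for a word? Can I also use NLTK to make the program case-insensitive?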

Answer 1

You can do this with pattern.en, though I don't know much about NLTK:

>>> from pattern.en import pluralize, singularize
>>> print(pluralize('child'))     # children
>>> print(singularize('wolves'))  # wolf

See more.

Answer 2

As currently written, pattern does not support Python 3 (although this is being discussed at https://github.com/clips/pattern/issues/62).

TextBlob (https://textblob.readthedocs.io) is built on top of pattern and NLTK and also includes pluralization. It seems to do a good job, though it is not perfect.

Answer 3

Here is one possible way to do this with NLTK. Imagine you are searching for the word "feature":

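from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

# Note: word_tokenize and WordNetLemmatizer need the NLTK tokenizer and
# WordNet data, which are installed separately with nltk.download().
wnl = WordNetLemmatizer()
text = "This is a small text, a very small text with no interesting features."
# Lowercase every token so the comparison is case-insensitive.
tokens = [token.lower() for token in word_tokenize(text)]
# Lemmatize so that "features" and "feature" compare equal.
lemmatized_words = [wnl.lemmatize(token) for token in tokens]
print('feature' in lemmatized_words)  # True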

Calling str.lower() on every word takes care of case sensitivity, and of course you also have to lemmatize the search term if necessary.
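
The same idea can be applied to a txt file. The sketch below is only an illustration: contains_word, file_contains_word and naive_singular are hypothetical names, and naive_singular is a crude stand-in (it just strips a trailing "s") so the example runs without NLTK data. In practice you would pass a real lemmatizer such as wnl.lemmatize as the normalize argument.

```python
import re


def naive_singular(token):
    """Hypothetical placeholder for a lemmatizer: strip one trailing 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def contains_word(text, term, normalize=naive_singular):
    """Return True if term occurs in text, ignoring case and number."""
    # Lowercase first so matching is case-insensitive, then split into words.
    tokens = re.findall(r"[a-z']+", text.lower())
    normalized = {normalize(token) for token in tokens}
    # Normalize the search term the same way as the text.
    return normalize(term.lower()) in normalized


def file_contains_word(path, term, normalize=naive_singular):
    """Apply contains_word to the contents of a text file."""
    with open(path, encoding="utf-8") as handle:
        return contains_word(handle.read(), term, normalize)


print(contains_word("No interesting Features here.", "feature"))
```

The key design point is that the text and the search term go through the same normalization, so it does not matter which of them is plural or capitalized.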

Answer 4

This answer may be a bit late, but in case anyone is still looking for something similar:

There is inflect (also available on GitHub), which supports Python 2.x and 3.x. You can find the singular or plural form of a given word:

import inflect
p = inflect.engine()

words = "cat dog child goose pants"
print([p.plural(word) for word in words.split(' ')])
# ['cats', 'dogs', 'children', 'geese', 'pant']

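It is worth noting that calling p.plural on a word that is already plural gives you the singular form. In addition, you can provide a POS (part-of-speech) tag or a number, and the library determines whether the word needs to be plural or singular: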

p.plural('cat', 4)   # cats
p.plural('cat', 1)   # cat
# but also...
p.plural('cat', 0)   # cats

Python - generating the plural noun of a singular noun

4 answers: