I don't know which branch of AI would solve my problem

Asked: 2018-11-10 09:11:59

Tags: database android-studio artificial-intelligence

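If I have a set of words and want to find a pattern among them, and then look for that pattern in a long text, what should I use: machine learning, text analysis, or pattern recognition?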

1 answer:

Answer 0 (score: 0)

I would build character n-grams for all of the words and count which ones occur most often.

from nltk import ngrams
from collections import Counter

words = ["aim", "aid", "bail", "bait"]


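def build_ngrams(words, from_size, to_size):
    """Collect the character n-grams of every word for sizes from_size..to_size."""
    word_ngrams = []

    for word in words:
        for ngram_size in range(from_size, to_size + 1):
            # a string is a sequence of characters, so each n-gram is a tuple of characters
            ng = ngrams(word, ngram_size)
            word_ngrams.extend(ng)

    return word_ngrams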


# construct all bigrams and trigrams
word_ngrams = build_ngrams(words, 2, 3)

# find the most common n-grams
counter = Counter(word_ngrams)
print(counter.most_common(3))

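This gives you the most common patterns (here, the character sequences shared by the most words), which you can later use for searching.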
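The search step itself is not shown above, so here is a minimal sketch of one way it might look. It is an assumption-laden illustration, not part of the original answer: the helper names `char_ngrams`, `top_patterns` and `find_pattern_hits` and the sample text are hypothetical, and plain string slicing is swapped in for `nltk.ngrams` so that patterns are strings that can be passed directly to `str.find`.

```python
from collections import Counter


def char_ngrams(word, size):
    """Return all contiguous character n-grams of `word` as strings."""
    return [word[i:i + size] for i in range(len(word) - size + 1)]


def top_patterns(words, from_size, to_size, top_k):
    """Count n-grams over all words and return the top_k most frequent ones."""
    counter = Counter()
    for word in words:
        for size in range(from_size, to_size + 1):
            counter.update(char_ngrams(word, size))
    return [pattern for pattern, _ in counter.most_common(top_k)]


def find_pattern_hits(text, patterns):
    """Map each pattern to the start offsets where it occurs in `text`."""
    hits = {}
    for pattern in patterns:
        positions = []
        start = text.find(pattern)
        while start != -1:
            positions.append(start)
            # advance by one character so overlapping matches are also found
            start = text.find(pattern, start + 1)
        hits[pattern] = positions
    return hits


words = ["aim", "aid", "bail", "bait"]
patterns = top_patterns(words, 2, 3, top_k=3)

# hypothetical sample text, used only for illustration
text = "the maid paid the bail and waited"
hits = find_pattern_hits(text, patterns)

for pattern, positions in hits.items():
    print(pattern, positions)
```

This is exact substring matching, so it only finds the patterns verbatim. If the patterns need to tolerate variation, a regular expression or a learned model would be the next thing to consider.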