
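I am trying to map unigram, bigram, and trigram vectors into the same embedding space so that I can compare the similarity between phrases and individual words. To do this, I build the training data as follows:

For example: Text = "Can I solve this problem?"

I have the unigrams, bigrams, and trigrams for this sentence.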

unigram_list = ["Can", "I", "solve", "this", "problem"]
bigram_list = [("Can", "I"), ("I", "solve"), ("solve", "this"), ("this", "problem")]

Is it possible to construct the sentence using all possible combinations of unigrams and bigrams?

For example:

sentence_combo_1 = ["Can", ("I", "solve"), "this", "problem"]
sentence_combo_2 = ["Can", "I", ("solve", "this"), "problem"]
sentence_combo_3 = [("Can", "I"), ("solve", "this"), "problem"]

and so on.

How can I construct sentences from unigrams and bigrams like this?
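To make the goal concrete, here is a minimal sketch of one way the combinations above could be enumerated. It assumes each combination is a left-to-right split of the sentence into consecutive, non-overlapping chunks of one or two words. The function name `ngram_segmentations` and the `max_n` parameter are hypothetical names chosen for illustration, not part of any library.

```python
from typing import Iterator, List, Tuple, Union

# A chunk is either a single word (unigram) or a tuple of words (bigram, trigram, ...).
Chunk = Union[str, Tuple[str, ...]]


def ngram_segmentations(words: List[str], max_n: int = 2) -> Iterator[List[Chunk]]:
    """Yield every split of `words` into consecutive chunks of length 1..max_n.

    Hypothetical helper: unigrams stay plain strings, longer chunks become tuples,
    matching the sentence_combo_* examples in the question.
    """
    if not words:
        # An empty remainder has exactly one segmentation: the empty one.
        yield []
        return
    for n in range(1, min(max_n, len(words)) + 1):
        # Take the first n words as one chunk, then segment the rest recursively.
        head: Chunk = words[0] if n == 1 else tuple(words[:n])
        for rest in ngram_segmentations(words[n:], max_n):
            yield [head] + rest


unigram_list = ["Can", "I", "solve", "this", "problem"]
combos = list(ngram_segmentations(unigram_list, max_n=2))

for combo in combos:
    print(combo)
```

The result includes the all-unigram sentence as well as mixed ones such as sentence_combo_1 and sentence_combo_3. Passing `max_n=3` would also allow trigram chunks. The number of combinations grows quickly with sentence length, so for long texts it may be more practical to sample combinations than to enumerate them all.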

0 answers: