Question

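I have been working on a problem for a machine learning course and can't seem to figure it out.

If I understand it correctly, the gist of the algorithm is:

Expectation: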

• For each sentence s in S:
    ○ For each word/tag pair (w,t):
        § For every occurrence of w (at position i) in s:
            □ EmissionCounts(w,t) += (forward[t][i]*backward[t][i])/(sum of forward[tag][N] for all tags)
    ○ For every tag/tag pair:
        § For every adjacent pair of words (starting at position i):
            □ TransitionCounts(t1,t2) += forward[t1][i]*P(t2|t1)*P(w[i+1]|t2)*backward[t2][i+1] / (sum of forward[tag][N] for all tags)
    ○ For every tag:
        § For the first word in the sentence:
            □ InitialCounts(t) = pi(t)*P(w[1]|t)*backward[t][1] / (sum of forward[tag][N] for all tags)
• For each tag t:
    ○ For every word w:
        § TagCounts(t) += EmissionCounts(w,t)
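
The expectation step above can be sketched in plain Python as follows. This is only a minimal illustration, not the code from the linked gist: the function names `forward_backward` and `expected_counts`, the dictionary layout of `pi`, `trans` and `emit`, and the `fwd[i][t]` index order (position first, zero-based, rather than `forward[t][i]`) are all assumptions made for the example. It accumulates every count with `+=` across sentences, and it works with raw probabilities, which can underflow on long sentences; scaling or log-space arithmetic is the usual remedy.

```python
from collections import defaultdict


def forward_backward(words, tags, pi, trans, emit):
    """Return (fwd, bwd, likelihood) for one non-empty sentence.

    fwd[i][t] = P(words[0..i], tag at i is t)
    bwd[i][t] = P(words[i+1..] | tag at i is t)
    """
    n = len(words)
    fwd = [{t: 0.0 for t in tags} for _ in range(n)]
    bwd = [{t: 0.0 for t in tags} for _ in range(n)]

    # Base cases: first position for forward, last position for backward.
    for t in tags:
        fwd[0][t] = pi[t] * emit[t].get(words[0], 0.0)
        bwd[n - 1][t] = 1.0

    # Forward recursion, left to right.
    for i in range(1, n):
        for t2 in tags:
            incoming = sum(fwd[i - 1][t1] * trans[t1][t2] for t1 in tags)
            fwd[i][t2] = incoming * emit[t2].get(words[i], 0.0)

    # Backward recursion, right to left.
    for i in range(n - 2, -1, -1):
        for t1 in tags:
            bwd[i][t1] = sum(
                trans[t1][t2] * emit[t2].get(words[i + 1], 0.0) * bwd[i + 1][t2]
                for t2 in tags
            )

    # Sentence likelihood: sum of forward values at the last position.
    likelihood = sum(fwd[n - 1][t] for t in tags)
    return fwd, bwd, likelihood


def expected_counts(sentences, tags, pi, trans, emit):
    """Accumulate expected emission, transition and initial counts."""
    emission = defaultdict(float)    # keyed by (word, tag)
    transition = defaultdict(float)  # keyed by (tag1, tag2)
    initial = defaultdict(float)     # keyed by tag

    for words in sentences:
        if not words:
            continue  # nothing to count in an empty sentence
        fwd, bwd, z = forward_backward(words, tags, pi, trans, emit)
        if z == 0.0:
            continue  # sentence is impossible under the current model

        for i, w in enumerate(words):
            for t in tags:
                gamma = fwd[i][t] * bwd[i][t] / z
                emission[(w, t)] += gamma
                if i == 0:
                    # Equals pi(t) * P(w[0]|t) * bwd[0][t] / z.
                    initial[t] += gamma

        for i in range(len(words) - 1):
            for t1 in tags:
                for t2 in tags:
                    transition[(t1, t2)] += (
                        fwd[i][t1] * trans[t1][t2]
                        * emit[t2].get(words[i + 1], 0.0)
                        * bwd[i + 1][t2] / z
                    )

    return emission, transition, initial
```

A useful sanity check on counts like these: the initial counts should sum to the number of sentences, the emission counts to the total number of words, and the transition counts to the total number of adjacent word pairs.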

Maximization:

• PI(t) = InitialCounts(t)/(# sentences)
• P(t2|t1) = TransitionCounts(t1,t2)/TagCounts(t1)
• P(w|t) = EmissionCounts(w,t)/TagCounts(t)
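
A minimal sketch of the maximization step, assuming the counts are stored in dictionaries keyed by `(word, tag)`, `(tag1, tag2)` and `tag`; the function name `maximize` and that layout are hypothetical. Note one deliberate difference from the formulas above: the transition probabilities are normalized by the total transition count leaving `t1` rather than by `TagCounts(t1)`. `TagCounts(t1)` also includes the last word of each sentence, which has no outgoing transition, so dividing by it would leave each row of P(t2|t1) summing to slightly less than 1.

```python
from collections import defaultdict


def maximize(emission, transition, initial, num_sentences):
    """Turn expected counts into probability tables.

    emission:   {(word, tag): count}
    transition: {(tag1, tag2): count}
    initial:    {tag: count}
    """
    # Initial distribution: each sentence contributes total mass 1.
    pi = {t: c / num_sentences for t, c in initial.items()}

    # Transitions: normalize by the total count of transitions out of t1.
    out_totals = defaultdict(float)
    for (t1, _t2), c in transition.items():
        out_totals[t1] += c
    trans = defaultdict(dict)
    for (t1, t2), c in transition.items():
        if out_totals[t1] > 0.0:
            trans[t1][t2] = c / out_totals[t1]

    # Emissions: normalize by the total expected count of the tag.
    tag_totals = defaultdict(float)
    for (_w, t), c in emission.items():
        tag_totals[t] += c
    emit = defaultdict(dict)
    for (w, t), c in emission.items():
        if tag_totals[t] > 0.0:
            emit[t][w] = c / tag_totals[t]

    return pi, dict(trans), dict(emit)
```

After this step, `pi`, every row `trans[t1]` and every row `emit[t]` should each sum to 1 (up to floating-point error); if they do not, the counts or the denominators are a good place to look.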

Check for convergence:
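
The post does not say how convergence is tested. One common choice, shown here only as an assumption, is to stop when the corpus log-likelihood changes by less than a small tolerance between iterations. The names `corpus_log_likelihood` and `has_converged` and the default `tol` value are hypothetical.

```python
import math


def corpus_log_likelihood(sentence_likelihoods):
    """Sum of log P(sentence) over sentences with non-zero likelihood."""
    return sum(math.log(z) for z in sentence_likelihoods if z > 0.0)


def has_converged(prev_log_likelihood, log_likelihood, tol=1e-6):
    """True once the log-likelihood changes by less than tol."""
    if prev_log_likelihood is None:
        return False  # first iteration: nothing to compare against
    return abs(log_likelihood - prev_log_likelihood) < tol
```

With exact arithmetic, the log-likelihood never decreases from one EM iteration to the next, so a clear drop between iterations usually points to a bug in the expectation or maximization step.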

Here is a link to my Baum-Welch implementation. Does anyone have any idea what I might be doing wrong?

https://gist.github.com/dmcquillan314/4058b9048799e3488a05

Here is also a link to the entire repository: https://github.com/dmcquillan314/HW6

Baum-Welch implementation

0 answers: