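I have been working on a problem from a machine learning course and can't seem to figure it out.
If I understand it correctly, the gist of the algorithm is:
Expectation: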
• For each sentence s in S:
○ For each word/tag pair (w,t):
§ For every occurrence of w (at position i) in s:
□ EmissionCounts(w,t) += (forward[t][i]*backward[t][i])/(sum of forward[tag][N] for all tags)
○ For every tag/tag pair:
§ For every adjacent pair of words (starting at position i):
□ TransitionCounts(t1,t2) += forward[t1][i]*P(t2|t1)*P(w[i+1]|t2)*backward[t2][i+1] / (sum of forward[tag][N] for all tags)
○ For every tag:
§ For the first word in the sentence:
□ InitialCounts(t) += pi(t)*P(w[1]|t)*backward[t][1] / (sum of forward[tag][N] for all tags)
• For each tag t:
○ For every word w:
§ TagCounts(t) += EmissionCounts(w,t)
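For comparison, here is a minimal NumPy sketch of the expectation step above for a single sentence. It is not the code from my gist. The names `forward_backward` and `expected_counts`, and the matrices `pi`, `A` (`A[t1, t2]` = P(t2|t1)) and `B` (`B[t, w]` = P(w|t)), are assumptions made for illustration. The arrays are indexed `[position][tag]` with 0-based positions, the reverse of `forward[t][i]` above, and no scaling is applied, so it is only suitable for short sentences.

```python
import numpy as np

def forward_backward(obs, pi, A, B):
    """Unscaled forward and backward tables for one sentence.

    obs: list of word indices; pi: (T,); A: (T, T); B: (T, V).
    """
    n, num_tags = len(obs), len(pi)
    fwd = np.zeros((n, num_tags))
    bwd = np.zeros((n, num_tags))
    fwd[0] = pi * B[:, obs[0]]
    for i in range(1, n):
        # Sum over the previous tag, then emit word i.
        fwd[i] = (fwd[i - 1] @ A) * B[:, obs[i]]
    bwd[n - 1] = 1.0
    for i in range(n - 2, -1, -1):
        # Sum over the next tag, including its emission of word i + 1.
        bwd[i] = A @ (B[:, obs[i + 1]] * bwd[i + 1])
    return fwd, bwd

def expected_counts(obs, pi, A, B):
    """Expected initial, transition and emission counts for one sentence."""
    fwd, bwd = forward_backward(obs, pi, A, B)
    likelihood = fwd[-1].sum()  # sum of forward[tag][N] over all tags
    gamma = fwd * bwd / likelihood  # gamma[i, t] = P(tag at i is t | sentence)

    emission = np.zeros(B.shape)
    for i, w in enumerate(obs):
        emission[:, w] += gamma[i]

    transition = np.zeros(A.shape)
    for i in range(len(obs) - 1):
        # forward[t1][i] * P(t2|t1) * P(w[i+1]|t2) * backward[t2][i+1]
        xi = fwd[i][:, None] * A * (B[:, obs[i + 1]] * bwd[i + 1])[None, :]
        transition += xi / likelihood

    initial = gamma[0]
    return initial, transition, emission, likelihood
```

For a corpus, the three count arrays would be summed over all sentences before the maximization step. A quick sanity check is that, per sentence, the initial counts sum to 1, the emission counts sum to the sentence length, and the transition counts sum to the sentence length minus one.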
Maximization:
• PI(t) = InitialCounts(t)/(# sentences)
• P(t2|t1) = TransitionCounts(t1,t2)/TagCounts(t1)
• P(w|t) = EmissionCounts(w,t)/TagCounts(t)
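The maximization step can be sketched in NumPy as below. The function name `maximize` and the array layout (rows are tags) are assumptions made for illustration. One deliberate difference from the formulas above: this sketch divides each transition row by its own row sum rather than by TagCounts(t1). TagCounts also includes the last word of each sentence, which has no outgoing transition, so dividing by it leaves the rows of P(t2|t1) summing to slightly less than 1 unless the model has an explicit end-of-sentence state.

```python
import numpy as np

def maximize(initial_counts, transition_counts, emission_counts, num_sentences):
    """Turn expected counts into new parameter estimates.

    initial_counts: (T,); transition_counts: (T, T); emission_counts: (T, V).
    Assumes every tag has a non-zero expected count (no smoothing here).
    """
    pi = initial_counts / num_sentences
    # Normalize each row so that it is a probability distribution.
    A = transition_counts / transition_counts.sum(axis=1, keepdims=True)
    # The row sum of the emission counts is TagCounts(t).
    B = emission_counts / emission_counts.sum(axis=1, keepdims=True)
    return pi, A, B
```

After each iteration, `pi` and every row of `A` and `B` should sum to 1. If they do not, the counts from the expectation step are a likely place to look.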
Check for convergence:
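One common way to check convergence is to compare the corpus log-likelihood between iterations. The sketch below assumes that approach; the helper names `corpus_log_likelihood` and `has_converged` and the tolerance `tol` are hypothetical. The per-sentence likelihood is the same "sum of forward[tag][N] for all tags" used as the denominator in the expectation step.

```python
import math

def corpus_log_likelihood(sentence_likelihoods):
    """Sum of log P(sentence) over the corpus.

    sentence_likelihoods: one value per sentence, each the sum of
    forward[tag][N] over all tags.
    """
    return sum(math.log(p) for p in sentence_likelihoods)

def has_converged(prev_log_likelihood, log_likelihood, tol=1e-4):
    """True once the log-likelihood changes by less than tol.

    prev_log_likelihood is None on the first iteration.
    """
    if prev_log_likelihood is None:
        return False
    return abs(log_likelihood - prev_log_likelihood) < tol
```

With exact EM updates the log-likelihood should never decrease from one iteration to the next, so a drop is a useful sign that one of the count formulas or normalizers is wrong.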
Here is a link to my Baum-Welch implementation. Does anyone have an idea of what I might be doing wrong?
https://gist.github.com/dmcquillan314/4058b9048799e3488a05
Here is also a link to the whole repository: https://github.com/dmcquillan314/HW6