A hidden Markov model (HMM) is a model in which you observe a sequence of observations, but you do not know the sequence of states the model went through to generate those observations. Analysis of a hidden Markov model tries to recover the sequence of hidden states from the observed data.
I have data with observations and hidden states (the observations are continuous values), where the hidden states were tagged by an expert. I would like to train an HMM that would be able, based on a (previously unseen) sequence of observations, to recover the corresponding hidden states.
Is there any R package to do that? Studying the existing packages (depmixS4, HMM, seqHMM - the last one for categorical data only) only allows you to specify the number of hidden states.
EDIT
Example:
data.tagged.by.expert = data.frame(
  hidden.state = c("Wake", "REM", "REM", "NonREM1", "NonREM2", "REM", "REM", "Wake"),
  sensor1 = c(1, 1.2, 1.2, 1.3, 4, 2, 1.78, 0.65),
  sensor2 = c(7.2, 5.3, 5.1, 1.2, 2.3, 7.5, 7.8, 2.1),
  sensor3 = c(0.01, 0.02, 0.08, 0.8, 0.03, 0.01, 0.15, 0.45)
)
data.newly.measured = data.frame(
  sensor1 = c(2, 3, 4, 5, 2, 1, 2, 4, 5, 8, 4, 6, 1, 2, 5, 3, 2, 1, 4),
  sensor2 = c(2.1, 2.3, 2.2, 4.2, 4.2, 2.2, 2.2, 5.3, 2.4, 1.0, 2.5, 2.4, 1.2, 8.4, 5.2, 5.5, 5.2, 4.3, 7.8),
  sensor3 = c(0.23, 0.25, 0.23, 0.54, 0.36, 0.85, 0.01, 0.52, 0.09, 0.12, 0.85, 0.45, 0.26, 0.08, 0.01, 0.55, 0.67, 0.82, 0.35)
)
I would like to create an HMM with discrete time t, where the random variable x(t) represents the hidden state at time t, x(t) ∈ {"Wake", "REM", "NonREM1", "NonREM2"}, and 3 continuous random variables sensor1(t), sensor2(t), sensor3(t) represent the observations at time t.
model.hmm = learn.model(data.tagged.by.expert)
Then I would like to use the created model to estimate the hidden states responsible for the newly measured observations
hidden.states = estimate.hidden.states(model.hmm, data.newly.measured)
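For reference, an unsupervised fit with depmixS4 (one of the packages mentioned above) looks roughly like the sketch below; it only takes the number of hidden states and ignores the expert labels, which is exactly the limitation described in the question. This is just an illustration of the package interface on the toy data, not a solution to the supervised-training problem (and with only 8 rows such a fit would hardly be meaningful):
library(depmixS4)
# Gaussian emission model for each sensor, 4 anonymous hidden states;
# the expert labels in data.tagged.by.expert$hidden.state are NOT used
mod <- depmix(response = list(sensor1 ~ 1, sensor2 ~ 1, sensor3 ~ 1),
              family = list(gaussian(), gaussian(), gaussian()),
              nstates = 4,
              data = data.tagged.by.expert)
fitted.mod <- fit(mod)
# most likely state (1..4) per time point, plus state probabilities
head(posterior(fitted.mod))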
Answer (score: 1)
To be able to run the learning method for the Naive Bayes classifier, we need a longer data set.
states = c("NonREM1", "NonREM2", "NonREM3", "REM", "Wake")
artificial.hypnogram = rep(c(5,4,1,2,3,4,5), times = c(40,150,200,300,50,90,30))
data.tagged.by.expert = data.frame(
  hidden.state = states[artificial.hypnogram],
  sensor1 = log(artificial.hypnogram) + runif(n = length(artificial.hypnogram), min = 0.2, max = 0.5),
  sensor2 = 10*artificial.hypnogram + sample(c(-8:8), size = length(artificial.hypnogram), replace = T),
  sensor3 = sample(1:100, size = length(artificial.hypnogram), replace = T)
)
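A quick check of how many training examples each sleep stage gets in this artificial set (in contrast to the 8-row example from the question):
# number of training examples per sleep stage
table(data.tagged.by.expert$hidden.state)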
hidden.hypnogram = rep(c(5,4,1,2,4,5), times = c(10,10,15,10,10,3))
data.newly.measured = data.frame(
  sensor1 = log(hidden.hypnogram) + runif(n = length(hidden.hypnogram), min = 0.2, max = 0.5),
  sensor2 = 10*hidden.hypnogram + sample(c(-8:8), size = length(hidden.hypnogram), replace = T),
  sensor3 = sample(1:100, size = length(hidden.hypnogram), replace = T)
)
In the solution, the Viterbi algorithm is used in combination with a Naive Bayes classifier.
At each clock time t, the Hidden Markov Model consists of
an unobservable state (in this case denoted as hidden.state) taking one of a finite number of values
states = c("NonREM1", "NonREM2", "NonREM3", "REM", "Wake")
a set of observed variables (in this case sensor1, sensor2, sensor3)
The system moves into a new state according to a transition probability distribution (transition matrix). This can easily be estimated from data.tagged.by.expert, e.g. using:
library(markovchain)
# transition probability matrix estimated from the expert-tagged state sequence
trans_p <- markovchainFit(data.tagged.by.expert$hidden.state)$estimate
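If you prefer to avoid the markovchain dependency, the same maximum-likelihood estimate of the transition matrix can be obtained in base R by counting consecutive state pairs and normalising the rows; a minimal sketch (the result should agree numerically with markovchainFit):
# count transitions between consecutive expert-tagged states
h <- data.tagged.by.expert$hidden.state
trans.counts <- table(from = h[-length(h)], to = h[-1])
# normalise each row to obtain transition probabilities
trans.matrix <- prop.table(trans.counts, margin = 1)
trans.matrix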
After each transition, an observation (sensor_i) is produced according to a conditional probability distribution (emission matrix) that depends only on the current value H of hidden.state. We will replace the emission matrix with a Naive Bayes classifier.
library(caret)
library(klaR)
library(e1071)
model = train(hidden.state ~ .,
              data = data.tagged.by.expert,
              method = 'nb',
              trControl = trainControl(method = 'cv', number = 10))
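caret's method = 'nb' is backed by the NaiveBayes implementation in klaR (hence the library calls above); the cross-validation only tunes the classifier's hyperparameters. If that tuning is not needed, an equivalent class-conditional model can be fitted directly with e1071, which is already loaded; a minimal sketch, noting that e1071's predict returns the posterior matrix via type = "raw" rather than the $posterior element used later in this answer:
# alternative: plain Naive Bayes fit with e1071, without resampling
model_nb_e1071 <- naiveBayes(hidden.state ~ ., data = data.tagged.by.expert)
# per-state posterior probabilities for the first newly measured observation
predict(model_nb_e1071, newdata = data.newly.measured[1, ], type = "raw")
The rest of the answer keeps using model$finalModel, whose predict method provides the $posterior matrix consumed in the Viterbi step below.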
To solve the problem, we use the Viterbi algorithm with an initial probability of 1 for the "Wake" state and 0 otherwise (we expect the patient to be awake at the beginning of the experiment).
# we expect the patient to be awake in the beginning
start_p = c(NonREM1 = 0,NonREM2 = 0,NonREM3 = 0, REM = 0, Wake = 1)
# Naive Bayes model
model_nb = model$finalModel
# the observations
observations = data.newly.measured
nObs <- nrow(observations) # number of observations
nStates <- length(states) # number of states
# T1, T2 initialization
T1 <- matrix(0, nrow = nStates, ncol = nObs) #define two 2-dimensional tables
row.names(T1) <- states
T2 <- T1
# Naive Bayes posterior P(state | observation) for the first observation
Byj <- predict(model_nb, newdata = observations[1,])$posterior
# init first column of T1
for(s in states)
  T1[s,1] = start_p[s] * Byj[1,s]
# fill T1 and T2 tables
for(j in 2:nObs) {
  # Naive Bayes posterior P(state | observation) for observation j
  Byj <- predict(model_nb, newdata = observations[j,])$posterior
  for(s in states) {
    # probability of the best path that ends in state s at time j
    res <- (T1[,j-1] * trans_p[,s]) * Byj[1,s]
    T2[s,j] <- states[which.max(res)]
    T1[s,j] <- max(res)
  }
}
# backtrack the best path
result <- rep("", times = nObs)
result[nObs] <- names(which.max(T1[,nObs]))
for (j in nObs:2) {
  result[j-1] <- T2[result[j], j]
}
# show the result
result
# show the original artificial data
states[hidden.hypnogram]
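Since the true stages of the artificial test sequence are known here, the decoded path can be checked directly; a small sanity check comparing the Viterbi output with the generating hypnogram:
# fraction of time points where the decoded stage matches the true stage
truth <- states[hidden.hypnogram]
mean(result == truth)
# confusion table: rows = true stage, columns = decoded stage
table(truth, result)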
To learn more about the problem, see Vomlel Jiří, Kratochvíl Václav: "Dynamic Bayesian Networks for the Classification of Sleep Stages", Proceedings of the 11th Workshop on Uncertainty Processing (WUPES'18), pp. 205-215, Eds: Kratochvíl Václav, Vejnarová Jiřina (Třeboň, CZ, 2018/06/06), 2018.