Question

We use the setup from that post:

#Please use the setup in the following **EDIT** section.
#df = data.frame(cusip = paste("A", 1:10, sep = ""), xt = c(1,2,3,2,3,5,2,4,5,1), xt1 = c(1,4,2,1,1,4,2,2,2,5))
   cusip xt xt1
1     A1  1   1
2     A2  2   4
3     A3  3   2
4     A4  2   1
5     A5  3   1
6     A6  5   4
7     A7  2   2
8     A8  4   2
9     A9  5   2
10   A10  1   5

Following the answer in that post, we can get the transition matrix as follows:

res <- with(df, table(xt, xt1)) ## table() to form transition matrix
res/rowSums(res)                ## /rowSums() to normalize by row
#    xt1
# xt          1         2         4         5
#   1 0.5000000 0.0000000 0.0000000 0.5000000
#   2 0.3333333 0.3333333 0.3333333 0.0000000
#   3 0.5000000 0.5000000 0.0000000 0.0000000
#   4 0.0000000 1.0000000 0.0000000 0.0000000
#   5 0.0000000 0.5000000 0.5000000 0.0000000

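Notice that there is no column 3, because state 3 never occurs at time t + 1. Mathematically, though, a transition matrix must be square. In this case we still need a column 3 with [3,3] = 1 and the other elements = 0 (the rule: for any missing column n or missing row n, set [n,n] = 1 and the other elements in that row/column = 0), like this: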

#    xt1
# xt          1         2         3         4         5
#   1 0.5000000 0.0000000 0.0000000 0.0000000 0.5000000
#   2 0.3333333 0.3333333 0.0000000 0.3333333 0.0000000
#   3 0.5000000 0.5000000 1.0000000 0.0000000 0.0000000
#   4 0.0000000 1.0000000 0.0000000 0.0000000 0.0000000
#   5 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000

Can I achieve this without writing a messy for loop? Thanks.

EDIT: Please use this data set instead:

df = data.frame(cusip = paste("A", 1:10, sep = ""), xt = c(2,2,3,2,3,5,2,4,5,4), xt1 = c(1,4,2,1,1,4,2,3,2,5))
   cusip xt xt1
1     A1  2   1
2     A2  2   4
3     A3  3   2
4     A4  2   1
5     A5  3   1
6     A6  5   4
7     A7  2   2
8     A8  4   3
9     A9  5   2
10   A10  4   5

Now we have the following transition matrix:

res <- with(df, table(xt, xt1)) 
res/rowSums(res)                
   xt1
xt     1    2    3    4    5
  2 0.50 0.25 0.00 0.25 0.00
  3 0.50 0.50 0.00 0.00 0.00
  4 0.00 0.00 0.50 0.00 0.50
  5 0.00 0.50 0.00 0.50 0.00

Note that row 1 is missing. I now want a new row 1 with [1,1] = 1 and the other elements = 0 (so that the row sums to 1), giving something like this:

   xt1
xt     1    2    3    4    5
  1 1.00 0.00 0.00 0.00 0.00
  2 0.50 0.25 0.00 0.25 0.00
  3 0.50 0.50 0.00 0.00 0.00
  4 0.00 0.00 0.50 0.00 0.50
  5 0.00 0.50 0.00 0.50 0.00

How can I achieve this (adding the missing row)?

Answer 1

Here is one way (looking only at the second question asked):

# setup
df = data.frame(
  cusip = paste("A", 1:10, sep = ""), 
  xt = c(2,2,3,2,3,5,2,4,5,4), 
  xt1 = c(1,4,2,1,1,4,2,3,2,5)
)

df$xt   = factor(df$xt, levels=1:5)
df$xt1  = factor(df$xt1, levels=1:5)

# making the transition frequency table
tab = with(df, prop.table(table(xt,xt1), 1))

#    xt1
# xt     1    2    3    4    5
#   1                         
#   2 0.50 0.25 0.00 0.25 0.00
#   3 0.50 0.50 0.00 0.00 0.00
#   4 0.00 0.00 0.50 0.00 0.50
#   5 0.00 0.50 0.00 0.50 0.00
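
Row 1 is blank in the printout because state 1 never occurs in xt: its row of counts is all zeros, so prop.table() divides 0 by 0 and the entries are NaN. A quick check, as a sketch that rebuilds the same table so it runs on its own (the name counts is only illustrative):

```r
# Rebuild the table from the setup above
xt  <- factor(c(2,2,3,2,3,5,2,4,5,4), levels = 1:5)
xt1 <- factor(c(1,4,2,1,1,4,2,3,2,5), levels = 1:5)
counts <- table(xt, xt1)
tab    <- prop.table(counts, 1)

rowSums(counts)["1"]      # number of observed transitions out of state 1
all(is.nan(tab["1", ]))   # TRUE when every entry in row 1 is 0/0
```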

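This is the correct table for describing the transition frequencies observed in the data df. However, if you want to impute transition rules where no data are available, there are some options. The OP wants to impute that any unobserved states are "absorbing states":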

r = rowSums(tab,na.rm=TRUE)==0

tab[r, ] <- diag(nrow(tab))[r,,drop=FALSE]

#    xt1
# xt     1    2    3    4    5
#   1 1.00 0.00 0.00 0.00 0.00
#   2 0.50 0.25 0.00 0.25 0.00
#   3 0.50 0.50 0.00 0.00 0.00
#   4 0.00 0.00 0.50 0.00 0.50
#   5 0.00 0.50 0.00 0.50 0.00
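
The steps above can be wrapped into a small function if the same treatment is needed for several data sets. This is only a sketch: transition_matrix and its arguments are hypothetical names, not part of base R or of the answer above, and it assumes the full set of states is known in advance.

```r
# Hypothetical helper: square, row-normalised transition matrix over a
# fixed set of states, with unobserved states made absorbing.
transition_matrix <- function(from, to, states) {
  from   <- factor(from, levels = states)
  to     <- factor(to,   levels = states)
  counts <- table(from, to)
  tab    <- prop.table(counts, 1)        # rows with no data become NaN
  unobserved <- rowSums(counts) == 0
  if (any(unobserved)) {
    # Replace each unobserved row with the matching identity-matrix row
    tab[unobserved, ] <- diag(length(states))[unobserved, , drop = FALSE]
  }
  tab
}

m <- transition_matrix(
  from   = c(2,2,3,2,3,5,2,4,5,4),
  to     = c(1,4,2,1,1,4,2,3,2,5),
  states = 1:5
)
dim(m)       # one row and one column per state
rowSums(m)   # every row sums to 1
```

Because both factors share the same levels, a state that is missing at time t + 1 (as in the first data set) still gets a column, filled with zeros.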

I don't think imputing rows like this is a good idea, because it hides a feature of the real data.
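
If a valid stochastic matrix is still needed downstream, one option is to keep the observed table untouched and store the filled-in copy separately, together with the states whose rows were imputed. A minimal sketch, where observed, filled, and imputed_states are illustrative names only:

```r
xt  <- factor(c(2,2,3,2,3,5,2,4,5,4), levels = 1:5)
xt1 <- factor(c(1,4,2,1,1,4,2,3,2,5), levels = 1:5)

counts   <- table(xt, xt1)
observed <- prop.table(counts, 1)          # row 1 stays NaN: no data for state 1

no_data        <- rowSums(counts) == 0
imputed_states <- names(no_data)[no_data]  # states with no observed transitions

# Filled-in copy, kept apart from the observed frequencies
filled <- observed
filled[no_data, ] <- diag(nrow(filled))[no_data, , drop = FALSE]

imputed_states   # "1"
```

That way the imputation stays explicit, and anything reported from filled can be flagged for the states listed in imputed_states.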
