Question

Consider the following data frame:

 df = data.frame(cusip = paste("A", 1:10, sep = ""), xt = c(1,2,3,2,3,5,2,4,5,1), xt1 = c(1,4,2,1,1,4,2,2,2,5))

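The data is divided into five states, which are in fact quantiles: 1, 2, 3, 4, 5. The column xt gives the state at time t, and the column xt1 gives the state at time t + 1.

I would like to compute a kind of transition matrix for the five states. The entries of the matrix would mean the following:

(Row, Col) = (1,1): the percentage of cusips that were in quantile 1 at time t and stayed in quantile 1 at time t + 1
(Row, Col) = (1,2): the percentage of cusips that were in quantile 1 at time t and moved to quantile 2 at time t + 1
etc...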
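
Read this way, and assuming "percentage" is meant as a row-wise share (so that each row sums to 1), entry (i, j) could be written as below. The notation is introduced here only for illustration: N is the number of cusips and x_t^{(n)} is the state of cusip n at time t.

```latex
P_{ij} = \frac{\sum_{n=1}^{N} \mathbf{1}\{x_t^{(n)} = i,\ x_{t+1}^{(n)} = j\}}
              {\sum_{n=1}^{N} \mathbf{1}\{x_t^{(n)} = i\}},
\qquad i, j \in \{1, \dots, 5\}.
```

For the data above, two cusips are in state 1 at time t (A1 and A10); one stays in state 1 and the other moves to state 5, so P_{11} = P_{15} = 1/2.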

I am really not sure how to do this efficiently. I feel the answer is trivial, but I cannot get my head around it.

Could anyone help?

Answer 1

res <- with(df, table(xt, xt1)) ## table() to form transition matrix
res/rowSums(res)                ## /rowSums() to normalize by row
#    xt1
# xt          1         2         4         5
#   1 0.5000000 0.0000000 0.0000000 0.5000000
#   2 0.3333333 0.3333333 0.3333333 0.0000000
#   3 0.5000000 0.5000000 0.0000000 0.0000000
#   4 0.0000000 1.0000000 0.0000000 0.0000000
#   5 0.0000000 0.5000000 0.5000000 0.0000000

## As an alternative to  2nd line above, use sweep(), which won't rely on 
## implicit recycling of vector returned by rowSums(res)
sweep(res, MARGIN = 1, STATS = rowSums(res), FUN = `/`)
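
The remark about implicit recycling can be checked directly. The sketch below rebuilds the example data, computes the row-normalized table both ways, and compares the results. It also uses prop.table(), a base R helper for proportions that the answer does not mention; it is added here as a third cross-check.

```r
# Rebuild the example data and the count table from the answer above.
df <- data.frame(cusip = paste("A", 1:10, sep = ""),
                 xt  = c(1, 2, 3, 2, 3, 5, 2, 4, 5, 1),
                 xt1 = c(1, 4, 2, 1, 1, 4, 2, 2, 2, 5))
res <- with(df, table(xt, xt1))

# Division by a vector: rowSums(res) is recycled down the columns,
# so element [i, j] is divided by the total of row i.
p_div <- res / rowSums(res)

# Explicit row-wise division, no reliance on recycling.
p_sweep <- sweep(res, MARGIN = 1, STATS = rowSums(res), FUN = `/`)

# Base R helper for row proportions (not part of the original answer).
p_prop <- prop.table(res, margin = 1)

# The three results should agree, and every row should sum to 1.
all.equal(as.vector(p_div), as.vector(p_sweep))
all.equal(as.vector(p_div), as.vector(p_prop))
rowSums(p_div)
```

The plain division works because R stores matrices column by column and the vector from rowSums() has exactly one entry per row. Dividing by colSums(res) in the same way would not normalize by column, which is why sweep() with an explicit MARGIN is the safer habit.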

Answer 2

If you want the columns of the transition matrix to include all states (1..5), you can try the following:
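
A minimal sketch of one way to do this, assuming the full state space is exactly 1 through 5: convert both columns to factors with fixed levels, so that table() keeps states that never occur (here state 3 at time t + 1). The names states, counts, and trans are illustrative.

```r
df <- data.frame(cusip = paste("A", 1:10, sep = ""),
                 xt  = c(1, 2, 3, 2, 3, 5, 2, 4, 5, 1),
                 xt1 = c(1, 4, 2, 1, 1, 4, 2, 2, 2, 5))

states <- 1:5  # assumption: the complete set of states is 1..5

# Factors with fixed levels make table() report zero counts
# for states that do not appear in the data.
counts <- table(xt  = factor(df$xt,  levels = states),
                xt1 = factor(df$xt1, levels = states))

# Normalize each row so that it sums to 1.
trans <- counts / rowSums(counts)
trans
```

The result is a 5 x 5 table whose column for state 3 is all zeros. If some state never occurred at time t, its row total would be 0 and the division would produce NaN for that row; that does not happen with this data, since every state appears in xt.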

