Question

I have a data frame with a column of values (a randomly assigned treatment) that are 1, 2, or 3.

i,treatment
1,1
2,3
3,2
4,2
5,1
6,3
7,3
8,2
9,1
...

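Each block of 3 rows in the data frame contains a permutation of the three available values, e.g. (1,3,2) for rows 1-3 above, (2,1,3) for rows 4-6, (3,2,1) for rows 7-9, and so on. The number of rows in the data frame is divisible by 3.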

I need to count how many times each permutation occurs. How can I do that?

Answer 1

In what follows, treatment is that column of the data frame (its length is a multiple of 3). Using just your example data, we have treatment <- c(1, 3, 2, 2, 1, 3, 3, 2, 1). Then

M <- matrix(treatment, ncol = 3, byrow = TRUE)
radix <- 10 ^ (2:0)
ID <- M %*% radix
table(ID)

#132 213 321 
#  1   1   1
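The reason this works is that multiplying a row (a, b, c) by c(100, 10, 1) gives 100*a + 10*b + c, which is a distinct three-digit number for each permutation as long as every value is a single digit. Here is a minimal sketch on a slightly longer vector; the values are invented for illustration and are not from the question:

```r
# Hypothetical treatment vector: five blocks of three (made-up values)
treatment <- c(1, 3, 2,  2, 1, 3,  1, 3, 2,  3, 2, 1,  1, 3, 2)

# One row per block of three consecutive values
M <- matrix(treatment, ncol = 3, byrow = TRUE)

# Row (a, b, c) is encoded as 100*a + 10*b + c
ID <- as.vector(M %*% 10 ^ (2:0))

# Tally how often each encoded permutation appears
counts <- table(ID)
counts
```

With this input, permutation 132 is counted three times and 213 and 321 once each. Permutations that never occur do not appear in the table at all.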

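A perhaps more accessible version builds the permutation ID with paste0, as in ID <- apply(M, 1L, paste0, collapse = ""), but for a very long treatment vector this will be much less efficient than the matrix-vector multiplication I used above.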
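As a rough sketch of that string-based variant, using the same example data, the two encodings can be placed side by side (the names ID_chr and ID_num are only illustrative):

```r
treatment <- c(1, 3, 2, 2, 1, 3, 3, 2, 1)
M <- matrix(treatment, ncol = 3, byrow = TRUE)

# String ID: paste the three values of each row together
ID_chr <- apply(M, 1L, paste0, collapse = "")

# Numeric ID from the matrix-vector product, for comparison
ID_num <- as.vector(M %*% 10 ^ (2:0))

# Both encodings label the blocks identically
all(ID_chr == as.character(ID_num))

table(ID_chr)
```

If the treatment values could have more than one digit, the string version would need a separator (for example collapse = ","), since otherwise different rows could collapse to the same string.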

Answer 2

A count-based solution using dplyr could be:

library(dplyr)

# Group of every 3 rows
df %>% group_by(grp = (row_number()-1)%/%3) %>%
  #use paste with argument 'collapse' to find distinct permutations. 
  summarise(Permutation = paste(treatment, collapse=",")) %>%
  count(Permutation)

# # A tibble: 3 x 2
#   Permutation     n
#   <chr>       <int>
# 1 1,3,2           1
# 2 2,1,3           1
# 3 3,2,1           1
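The grouping key in the pipeline relies on integer division: subtracting 1 from the row number and dividing by 3 with %/% gives the same index for every three consecutive rows. A small base R sketch of just that step, with tapply standing in for the group_by/summarise pair (an illustrative substitute, not part of the dplyr answer):

```r
treatment <- c(1, 3, 2, 2, 1, 3, 3, 2, 1)

# Block index: rows 1-3 -> 0, rows 4-6 -> 1, rows 7-9 -> 2
grp <- (seq_along(treatment) - 1) %/% 3
grp

# Collapse each block into a comma-separated permutation label
perms <- tapply(treatment, grp, paste, collapse = ",")

# Count how often each label occurs
table(perms)
```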

Data:

df <- read.table(text=
"i,treatment
1,1
2,3
3,2
4,2
5,1
6,3
7,3
8,2
9,1",
header = TRUE, sep=",")

How do I count the occurrences of a sequence of values in a data frame?