Summarizing frequencies into a table of classes

Time: 2019-03-23 08:39:32

Tags: r dataframe

I am using R and have a table xy like this:

View(xy)
X             Y
21           A
33           B
24           B
16           A
25           B
31           A
17           B
14           A

Now I would like to group X into classes with a step of 10 and end up with the frequency of each Y value per class, like this:


Class                  A          B
I (1-10)               0          0
II (11-20)             2          1
III (21-30)            1          2

And so on

2 answers:

Answer 0 (score: 3)

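First create the labels, using either the commented-out hard-coded labels or the computed labels lab. Then use cut and table to create the result table.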

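# Hard-coded alternative:
# lab <- c("I (1-10)", "II (11-20)", "III (21-30)", "IV (31-40)")
n <- ceiling(max(DF$X) / 10)  # number of classes: 4
bounds <- seq(0, 10*n, 10)    # class boundaries: c(0, 10, 20, 30, 40)
# labels such as "I (1-10)": Roman numeral plus lower and upper bound
lab <- sprintf("%s (%d-%d)", as.roman(1:n), head(bounds, -1) + 1, bounds[-1])

# cut() uses right-closed intervals by default, so 10 falls in (0, 10]
Class <- cut(DF$X, bounds, labels = lab)
table(Class, Y = DF$Y)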

giving:

             Y
Class         A B
  I (1-10)    0 0
  II (11-20)  2 1
  III (21-30) 1 2
  IV (31-40)  1 1
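
As a quick check of how cut assigns values that sit exactly on a class boundary, here is a minimal sketch. The values in x are made up for illustration and are not part of the question's data; it relies on cut's default right-closed intervals.

```r
# Illustrative boundary values (assumed, not from the question)
x <- c(1, 10, 11, 20, 21, 30)
bounds <- seq(0, 30, 10)                        # c(0, 10, 20, 30)
lab <- c("I (1-10)", "II (11-20)", "III (21-30)")

# Default right = TRUE gives the intervals (0,10], (10,20], (20,30]
cls <- cut(x, bounds, labels = lab)
data.frame(x, Class = cls)

# A value of 0 lies outside (0,10] and becomes NA
zero_cls <- cut(0, bounds, labels = lab)
is.na(zero_cls)
```

So 10, 20 and 30 stay in the class whose label ends with them, while a value of 0 would be left out of the table unless it is handled separately.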

Note

The input data frame DF is assumed to be the following, shown in reproducible form:

Lines <- "
X            Y
21           A
33           B
24           B
16           A
25           B
31           A
17           B
14           A"
DF <- read.table(text = Lines, header = TRUE)

Answer 1 (score: 1)

One tidyverse possibility could be (with the data in a data frame df):

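library(dplyr)
library(tidyr)

df %>%
 mutate(Class = (X - 1) %/% 10) %>%  # zero-based class index; X - 1 keeps 10, 20, ... in the lower class
 count(Y, Class) %>%
 group_by(Y) %>%
 complete(Class = seq(0, max(Class), 1)) %>%  # add classes with no observations
 spread(Y, n, fill = 0)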

  Class     A     B
  <dbl> <dbl> <dbl>
1     0     0     0
2     1     2     1
3     2     1     2
4     3     1     1
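
To see how integer division assigns class indices at the edges of a bin, the following sketch compares X %/% 10 with (X - 1) %/% 10. The values in x are chosen for illustration only and do not come from the question's data.

```r
# Illustrative values around the bin edges (assumed, not from the question)
x <- c(1, 9, 10, 11, 20, 21)

plain   <- x %/% 10        # 10 and 20 move up into the next class
shifted <- (x - 1) %/% 10  # 10 and 20 stay with 1-10 and 11-20

data.frame(x, plain, shifted)
```

With plain %/%, the values 10 and 20 get indices 1 and 2, which would be labelled 11-20 and 21-30. The shifted version gives them indices 0 and 1, matching classes that run from 1 to 10, 11 to 20, and so on. The question's sample data contains no multiples of 10, so both variants give the same table there.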

Or, if you also want the ranges:

df %>%
 mutate(Class = (X - 1) %/% 10) %>%  # zero-based class index
 count(Y, Class) %>%
 group_by(Y) %>%
 complete(Class = seq(0, max(Class), 1)) %>%
 spread(Y, n, fill = 0) %>%
 # label each class with its range; the upper bound is taken from the next row
 mutate(Class = paste(Class * 10 + 1,
                      lead(Class * 10, default = ((last(Class) + 1) * 10)),
                      sep = "-"))

   Class     A     B
  <chr> <dbl> <dbl>
1 1-10      0     0
2 11-20     2     1
3 21-30     1     2
4 31-40     1     1

Or, if you want the exact output given in the question:

df %>%
 mutate(Class = (X - 1) %/% 10) %>%  # zero-based class index
 count(Y, Class) %>%
 group_by(Y) %>%
 complete(Class = seq(0, max(Class), 1)) %>%
 spread(Y, n, fill = 0) %>%
 # first build the "(lower-upper)" range, then prefix the Roman numeral
 mutate(Class = paste0("(",
                       Class * 10 + 1,
                       "-",
                       lead(Class * 10, default = ((last(Class) + 1) * 10)),
                       ")"),
        Class = paste(as.roman(row_number()), Class, sep = " "))

  Class           A     B
  <chr>       <dbl> <dbl>
1 I (1-10)        0     0
2 II (11-20)      2     1
3 III (21-30)     1     2
4 IV (31-40)      1     1
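
The same labels can also be computed directly from the zero-based class index, without lead. This is a minimal base R sketch; the index vector k and the helper name class_label are hypothetical and only used for illustration.

```r
# Hypothetical helper: turn a zero-based class index into a label like "II (11-20)"
class_label <- function(k, width = 10) {
  lower <- k * width + 1       # first value in the class
  upper <- (k + 1) * width     # last value in the class
  sprintf("%s (%d-%d)", as.character(as.roman(k + 1)), lower, upper)
}

k <- 0:3                       # assumed class indices for the ranges 1-10 up to 31-40
labels <- class_label(k)
labels
```

Because each label depends only on its own index, this version does not rely on the rows being complete and in order, which the lead-based version does.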

Or, a possibility for handling the case X == 0, where such rows are dropped before binning:

df %>%
 filter(X > 0) %>%                   # drop rows with X == 0, which belong to no class
 mutate(Class = (X - 1) %/% 10) %>%  # zero-based class index
 count(Y, Class) %>%
 group_by(Y) %>%
 complete(Class = seq(0, max(Class), 1)) %>%
 spread(Y, n, fill = 0) %>%
 mutate(Class = paste0("(",
                       Class * 10 + 1,
                       "-",
                       lead(Class * 10, default = ((last(Class) + 1) * 10)),
                       ")"),
        Class = paste(as.roman(row_number()), Class, sep = " "))