I am using R and have a table xy like this:
View(xy)
X Y
21 A
33 B
24 B
16 A
25 B
31 A
17 B
14 A
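Now I want to group X into classes with a step of 10 and, for each class, count how often each value of Y occurs, so that I end up with something like this: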
Class        A B
I (1-10)     0 0
II (11-20)   2 1
III (21-30)  1 2
And so on
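Answer 0 (score: 3)
First create the labels, using either the hard-coded labels that are commented out or the computed labels in lab. Then use cut and table to build the result table.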
# lab <- c("I (1-10)", "II (11-20)", "III (21-30)", "IV (31-40)")
n <- ceiling(max(DF$X) / 10) # 4
bounds <- seq(0, 10*n, 10) # c(0, 10, 20, 30, 40)
lab <- sprintf("%s (%d-%d)", as.roman(1:n), head(bounds, -1) + 1, bounds[-1])
Class <- cut(DF$X, bounds, labels = lab)
table(Class, Y = DF$Y)
giving:
             Y
Class         A B
  I (1-10)    0 0
  II (11-20)  2 1
  III (21-30) 1 2
  IV (31-40)  1 1
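To see why the labels line up with the bins, here is a minimal base R sketch of how cut treats the break points. The vector x is made up for illustration and is not the question's data. By default cut uses right-closed intervals, so 10 falls into the first class and 11 into the second, while 0 falls outside (0, 10] and becomes NA.

```r
# Illustrative values only; not taken from the question's data
x <- c(1, 10, 11, 20, 21)

bounds <- seq(0, 30, 10)          # 0 10 20 30
n <- length(bounds) - 1           # 3 classes

# Same label recipe as above, with as.roman() converted explicitly
lab <- sprintf("%s (%d-%d)",
               as.character(as.roman(1:n)),
               head(bounds, -1) + 1,
               bounds[-1])

# Intervals are (0,10], (10,20], (20,30]
cls <- cut(x, bounds, labels = lab)
print(data.frame(x = x, Class = cls))

# A value of 0 is not covered by (0,10] and is returned as NA
zero_class <- cut(0, bounds, labels = lab)
print(is.na(zero_class))
```

If 0 should count as part of the first class, cut also accepts include.lowest = TRUE, which closes the lowest interval on the left.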
We assume the input data frame DF, shown here in reproducible form:
Lines <- "
X Y
21 A
33 B
24 B
16 A
25 B
31 A
17 B
14 A"
DF <- read.table(text = Lines, header = TRUE)
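Answer 1 (score: 1)
One tidyverse possibility could be:
library(dplyr)
library(tidyr)

# df: the data frame from the question, with columns X and Y
df %>%
  mutate(Class = X %/% 10) %>%                     # tens digit as class index
  count(Y, Class) %>%
  group_by(Y) %>%
  complete(Class = seq(0, max(Class), 1)) %>%      # add classes with no rows
  spread(Y, n, fill = 0)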
  Class     A     B
  <dbl> <dbl> <dbl>
1     0     0     0
2     1     2     1
3     2     1     2
4     3     1     1
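As a cross-check of these counts without the tidyverse, the same integer-division idea can be sketched in base R. This is only an illustration, not part of the answer's pipeline: the data frame DF below is retyped from the question, and a factor with explicit levels stands in for complete, so that the empty class 0 still shows up.

```r
# The question's data, retyped here so the sketch is self-contained
DF <- data.frame(
  X = c(21, 33, 24, 16, 25, 31, 17, 14),
  Y = c("A", "B", "B", "A", "B", "A", "B", "A")
)

# Integer division by 10 gives the class index (0 for 0-9, 1 for 10-19, ...)
idx <- DF$X %/% 10

# Explicit levels keep classes that have no observations
cls <- factor(idx, levels = 0:max(idx))

tab <- table(Class = cls, Y = DF$Y)
print(tab)
```

The rows of tab should agree with the A and B columns shown above.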
Or, if you also want the ranges:
df %>%
  mutate(Class = X %/% 10) %>%
  count(Y, Class) %>%
  group_by(Y) %>%
  complete(Class = seq(0, max(Class), 1)) %>%
  spread(Y, n, fill = 0) %>%
  mutate(Class = paste(Class * 10 + 1,
                       lead(Class * 10, default = ((last(Class) + 1) * 10)),
                       sep = "-"))
  Class     A     B
  <chr> <dbl> <dbl>
1 1-10      0     0
2 11-20     2     1
3 21-30     1     2
4 31-40     1     1
Or, if you want the exact output shown in the question:
df %>%
  mutate(Class = X %/% 10) %>%
  count(Y, Class) %>%
  group_by(Y) %>%
  complete(Class = seq(0, max(Class), 1)) %>%
  spread(Y, n, fill = 0) %>%
  mutate(Class = paste0("(",
                        Class * 10 + 1,
                        "-",
                        lead(Class * 10, default = ((last(Class) + 1) * 10)),
                        ")"),
         Class = paste(as.roman(row_number()), Class, sep = " "))
  Class           A     B
  <chr>       <dbl> <dbl>
1 I (1-10)        0     0
2 II (11-20)      2     1
3 III (21-30)     1     2
4 IV (31-40)      1     1
Or, a possibility for the case where the data contains X == 0 (such rows are dropped first, since 0 lies outside the 1-10 class):
df %>%
  filter(X > 0) %>%
  mutate(Class = X %/% 10) %>%
  count(Y, Class) %>%
  group_by(Y) %>%
  complete(Class = seq(0, max(Class), 1)) %>%
  spread(Y, n, fill = 0) %>%
  mutate(Class = paste0("(",
                        Class * 10 + 1,
                        "-",
                        lead(Class * 10, default = ((last(Class) + 1) * 10)),
                        ")"),
         Class = paste(as.roman(row_number()), Class, sep = " "))
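One thing to watch with X %/% 10 is the edge of each class. The labels (1-10, 11-20, ...) put 10 in the first class, but integer division puts 10 in the second. The question's data contains no multiples of 10, so the results above are unaffected. The base R sketch below uses made-up values to show the difference, together with one hypothetical adjustment, (x - 1) %/% 10, which is an assumption for illustration and not part of the answer.

```r
# Made-up values that sit on the class boundaries
x <- c(0, 9, 10, 11, 20, 21)

# Plain integer division: 10 and 20 move up into the next class
by_div <- x %/% 10
print(by_div)

# Hypothetical adjustment: drop non-positive values, then shift by one
# so that 1-10 maps to class 0, 11-20 to class 1, and so on
kept <- x[x > 0]
by_shift <- (kept - 1) %/% 10
print(data.frame(x = kept, Class = by_shift))
```

With the shift, the class index matches the printed ranges for every positive integer X, which is also the binning that cut with breaks at 0, 10, 20, ... produces.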