I have a data.frame like this:
24.8
23.2
22.8
22.5
22.5
22.4
22.4
22.4
22.3
22.2
22.2
22.2
22
21.9
21.9
21.8
I want to add a column holding each value's frequency, to get the following output:
24.8 1
23.2 1
22.8 1
22.5 2
22.5 2
22.4 3
22.4 3
22.4 3
22.3 1
22.2 3
22.2 3
22.2 3
22 1
21.9 2
21.9 2
21.8 1
How can I do this? In other words, since 24.8 occurs once, it gets the value 1; since 22.5 occurs twice, it gets the value 2, and so on.
Answer 0: (score: 11)
You can use ave() as follows:
myData <- data.frame(x = c(24.8, 23.2, 22.8, 22.5, 22.5, 22.4, 22.4, 22.4,
22.3, 22.2, 22.2, 22.2, 22, 21.9, 21.9, 21.8))
myData$Index <- ave(myData$x, myData$x, FUN = length)
myData
# x Index
# 1 24.8 1
# 2 23.2 1
# 3 22.8 1
# 4 22.5 2
# 5 22.5 2
# 6 22.4 3
# 7 22.4 3
# 8 22.4 3
# 9 22.3 1
# 10 22.2 3
# 11 22.2 3
# 12 22.2 3
# 13 22.0 1
# 14 21.9 2
# 15 21.9 2
# 16 21.8 1
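One detail worth noting: because `myData$x` is numeric, the call above stores the counts as doubles. The following is a minimal sketch, assuming integer counts are preferred, that passes `seq_along()` as the first argument instead. The column name `Index` is simply carried over from the answer.

```r
myData <- data.frame(x = c(24.8, 23.2, 22.8, 22.5, 22.5, 22.4, 22.4, 22.4,
                           22.3, 22.2, 22.2, 22.2, 22, 21.9, 21.9, 21.8))

# ave() splits its first argument by the grouping vector, applies FUN to each
# piece, and writes the results back in the original row order. Feeding it an
# integer vector (seq_along) keeps the group sizes as integers.
myData$Index <- ave(seq_along(myData$x), myData$x, FUN = length)

is.integer(myData$Index)
```

The row order of `myData` is unchanged, since `ave()` only fills in a vector of the same length as its input.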
You can also use the data.table package, as follows:
library(data.table)
myData2 <- data.table(x = c(24.8, 23.2, 22.8, 22.5, 22.5, 22.4, 22.4, 22.4,
22.3, 22.2, 22.2, 22.2, 22, 21.9, 21.9, 21.8),
key = "x")
# A `data.table` beginner's approach
# myData2[, Index := lapply(.SD, length), by = key(myData2)][]
# Or a better approach, as suggested by @Roland
myData2[, Index := .N, by = key(myData2)]
print(myData2)
# x Index
# 1: 21.8 1
# 2: 21.9 2
# 3: 21.9 2
# 4: 22.0 1
# 5: 22.2 3
# 6: 22.2 3
# 7: 22.2 3
# 8: 22.3 1
# 9: 22.4 3
# 10: 22.4 3
# 11: 22.4 3
# 12: 22.5 2
# 13: 22.5 2
# 14: 22.8 1
# 15: 23.2 1
# 16: 24.8 1
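Setting `key = "x"` sorts the table by `x`, which is why the output above is in ascending order rather than the order given in the question. A minimal sketch, assuming the original row order should be kept, is to skip the key and group by the column directly. The name `myData3` is made up for this illustration.

```r
library(data.table)

# No key is set, so the rows stay in the order they were supplied.
myData3 <- data.table(x = c(24.8, 23.2, 22.8, 22.5, 22.5, 22.4, 22.4, 22.4,
                            22.3, 22.2, 22.2, 22.2, 22, 21.9, 21.9, 21.8))

# .N is the number of rows in the current group; := adds the column by
# reference without reordering the table.
myData3[, Index := .N, by = x]
print(myData3)
```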
Answer 1: (score: 6)
You can use merge and table:
dat <- data.frame(V1 = c(24.8, 23.2, 22.8, 22.5, 22.5, 22.4, 22.4, 22.4,
22.3, 22.2, 22.2, 22.2, 22, 21.9, 21.9, 21.8))
merge(dat, as.data.frame(table(dat$V1)), by.x = "V1", by.y = "Var1", sort = FALSE)
# V1 Freq
# 1 24.8 1
# 2 23.2 1
# 3 22.8 1
# 4 22.5 2
# 5 22.5 2
# 6 22.4 3
# 7 22.4 3
# 8 22.4 3
# 9 22.3 1
# 10 22.2 3
# 11 22.2 3
# 12 22.2 3
# 13 22.0 1
# 14 21.9 2
# 15 21.9 2
# 16 21.8 1
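The documentation for `merge` describes the row order with `sort = FALSE` as unspecified, so the ordering shown above may not be something to rely on. As a hedged alternative, here is a small sketch that looks up each row's count with `match()`, which leaves `dat` in its original order by construction. The intermediate name `freq` is an assumption made for this example.

```r
dat <- data.frame(V1 = c(24.8, 23.2, 22.8, 22.5, 22.5, 22.4, 22.4, 22.4,
                         22.3, 22.2, 22.2, 22.2, 22, 21.9, 21.9, 21.8))

# Frequency table as a data.frame; Var1 is kept as character rather than factor.
freq <- as.data.frame(table(dat$V1), stringsAsFactors = FALSE)

# For each row, find the matching distinct value and take its count.
dat$Freq <- freq$Freq[match(as.character(dat$V1), freq$Var1)]
dat
```

The lookup compares values as character strings, which is how `table()` labels them, so no floating-point comparison is involved.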
Answer 2: (score: 3)
Or use the plyr package:
a <- c(24.8,23.2,22.8,22.5,22.5,22.4,22.4,22.4,22.3,22.2,22.2,22.2,22,21.9,21.9,21.8)
df <- data.frame(a)
library(plyr)
ddply(df,~a,transform,freq = length(a))
# a freq
# 1 21.8 1
# 2 21.9 2
# 3 21.9 2
# 4 22.0 1
# 5 22.2 3
# 6 22.2 3
# 7 22.2 3
# 8 22.3 1
# 9 22.4 3
# 10 22.4 3
# 11 22.4 3
# 12 22.5 2
# 13 22.5 2
# 14 22.8 1
# 15 23.2 1
# 16 24.8 1