我有这样一个数据框(df),它只是一个样本:
group value condition
1 12 1
1 14 1
1 18 1
1 10 0
1 7 1
2 12 1
2 9 0
2 12 1
2 16 1
2 15 0
即;
df<-data.frame(group=c(1,1,1,1,1,2,2,2,2,2), value=c(12,14,18,10,7,12,9,12,16,15), condition=c(1,1,1,0,1,1,0,1,1,0))
我想创建3个名为&#34; rank1&#34;,&#34; rank2&#34;和&#34; rank3&#34;
即,所需的输出是:
group rank1 rank2 ran3
1 7 12 14
2 12 12 16
我怎么能用R做到这一点?我会很高兴得到任何帮助。非常感谢。
答案 0 :(得分:5)
使用 data.table :
library(data.table)
setDT(df)[condition == 1,
setNames(as.list(sort(value)[1:3]), paste0("rank", 1:3)),
by = group]
# group rank1 rank2 rank3
# 1: 1 7 12 14
# 2: 2 12 12 16
答案 1 :(得分:2)
以下是使用dplyr/tidyr
library(dplyr)
library(tidyr)
df %>%
group_by(group) %>%
filter(condition!=0)
arrange(value) %>%
slice(1:3) %>%
mutate(n=paste0('rank', row_number())) %>%
select(-condition)
spread(n, value)
# group rank1 rank2 rank3
#1 1 7 12 14
#2 2 12 12 16
或使用data.table
library(data.table)
dcast.data.table(setkey(setDT(df), value)[condition!=0,
list(rank=paste0('rank', 1:3), value[1:3]), group],
group~rank, value.var='V2')
# group rank1 rank2 rank3
#1: 1 7 12 14
#2: 2 12 12 16
或使用base R
df1 <- subset(df[order(df$value),], condition!=0 , select=1:2)
df2 <- subset(transform(df1, .id=ave(group, group, FUN=seq_along)), .id<4)
reshape(df2, idvar='group', timevar='.id', direction='wide')
# group value.1 value.2 value.3
#5 1 7 12 14
#6 2 12 12 16
答案 2 :(得分:1)
又一个dplyr回答......
myData <- read.csv(text=" group,value
1,12
1,14
1,18
1,10
1,7
2,12
2,9
2,12
2,16 ")
library(dplyr)
myData %>% filter(condition==1) %>% group_by(group) %>% summarise(rank1=nth(sort(value),1),
rank2=nth(sort(value),2),
rank3=nth(sort(value),3))