I need to find the minimum value for each class in the example data (ignoring NAs) and mark it as 'min', using data.table.
Example data:
df = structure(list(class = c("apple", "apple", "apple", "banana",
"banana", "berry", "berry", "grape", "grape", "grape", "grape",
"grape", "melon", "melon", "melon"), value = c(108816872, 108851837,
108890411, 108784778, NA, 108784778, 108816872, 108816872, 108850460,
NA, NA, NA, NA, NA, NA)), .Names = c("class", "value"), class = "data.frame", row.names = c(NA,
-15L))
Desired output:
# class value anno
#1 apple 108816872 min
#2 apple 108851837 NA
#3 apple 108890411 NA
#4 banana 108784778 min
#5 banana NA NA
#6 berry 108784778 min
#7 berry 108816872 NA
#8 grape 108816872 min
#9 grape 108850460 NA
#10 grape NA NA
#11 grape NA NA
#12 grape NA NA
#13 melon NA NA
#14 melon NA NA
#15 melon NA NA
Answer 0 (score: 7)
I was going to suggest @eddies approach, but here's an alternative:
setDT(df)[order(value), min := c("min", rep(NA, .N - 1)), by = class]
Edit: if you want the actual value instead of "min", you could modify it to:
setDT(df)[order(value), min := c(value[1L], rep(NA, .N - 1L)), by = class]
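One caveat worth checking: because `order(value)` places NAs last, a class whose values are all NA still has a first row, so the first one-liner appears to tag that row as "min" too (the `melon` rows in the question). Below is a minimal sketch of one possible guard, assuming the data.table package is installed. The `toy` table is hypothetical, and the column is named `anno` here only to match the desired output.

```r
library(data.table)

# Hypothetical toy data (not the question's df): class "b" is all NA
toy <- data.table(class = c("a", "a", "b"),
                  value = c(2, 1, NA))

# Within each class the rows arrive sorted by value (NAs last), so the
# first row is the minimum unless the whole class is NA; guard that case
toy[order(value),
    anno := c(if (is.na(value[1L])) NA_character_ else "min",
              rep(NA_character_, .N - 1L)),
    by = class]
```

With this guard the single row of class "b" stays NA, while the row holding value 1 in class "a" is tagged.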
Answer 1 (score: 6)
dt = as.data.table(df) # or convert in place using setDT
dt[dt[, .I[which.min(value)], by = class]$V1, anno := 'min']
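To see why this works, it may help to look at the intermediate row indices. The following is a minimal sketch, assuming the data.table package is installed. The `toy` table and the `idx` variable are hypothetical names used for illustration and are not part of the question's data.

```r
library(data.table)

# Hypothetical toy data: class "c" contains only NA
toy <- data.table(class = c("a", "a", "b", "b", "c"),
                  value = c(3, 1, NA, 2, NA))

# .I holds row numbers of the full table. which.min() ignores NAs and
# returns integer(0) for an all-NA group, so that group yields no row.
idx <- toy[, .I[which.min(value)], by = class]$V1

# Assign by reference on the selected rows only; all other rows stay NA
toy[idx, anno := "min"]
```

Here `idx` contains rows 2 and 4, so class "c" is never tagged, which matches the all-NA `melon` rows in the desired output.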