I want to add a new column holding the first feature of each customer in the following dataset:
mydf <- data.frame(customer = c(1, 2, 1, 2, 2, 1, 1),
                   feature = c("other", "a", "b", "c", "other", "b", "c"))
  customer feature
1        1   other
2        2       a
3        1       b
4        2       c
5        2   other
6        1       b
7        1       c
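I would like to do this with dplyr. However, I want my code to ignore the "other" value in feature and pick the first feature that comes after it.
So the following code is not enough: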
library(dplyr)
new <- mydf %>%
  group_by(customer) %>%
  mutate(firstfeature = first(feature))
How can I ignore "other" so that I get the following desired output?
  customer feature firstfeature
1        1   other            b
2        2       a            a
3        1       b            b
4        2       c            a
5        2   other            a
6        1       b            b
7        1       c            b
Answer 0 (score: 3)
Using dplyr, we can group by customer and, within each group, take the first feature that is not "other".
library(dplyr)
mydf %>%
  group_by(customer) %>%
  mutate(firstfeature = feature[feature != "other"][1])
#  customer feature firstfeature
#     <dbl> <chr>   <chr>
#1        1 other   b
#2        2 a       a
#3        1 b       b
#4        2 c       a
#5        2 other   a
#6        1 b       b
#7        1 c       b
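To see why the subsetting works, here is a minimal base R sketch on a plain vector. The names x, keep and first_feature are only illustrative; x is customer 1's feature values copied from the question in row order.

```r
# Customer 1's features, in the order they appear in mydf
x <- c("other", "b", "b", "c")

# Logical mask that is FALSE wherever the value is "other"
keep <- x != "other"

# Drop the "other" entries, then take the first remaining element
first_feature <- x[keep][1]
first_feature
```

Inside mutate(), a length-one result like this is recycled across every row of the group, which is how each row of a customer ends up with the same firstfeature.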
Similarly, we can use ave from base R:
mydf$firstfeature <- ave(mydf$feature, mydf$customer,
                         FUN = function(x) x[x != "other"][1])
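One edge case worth knowing about: if a customer has nothing but "other" rows, the filtered vector is empty and indexing it with [1] yields NA. The sketch below shows this with a small made-up data frame (df2 and customer 3 are hypothetical and not part of the question's data).

```r
# Hypothetical data: customer 3 has only "other" rows
df2 <- data.frame(customer = c(1, 1, 3, 3),
                  feature = c("other", "b", "other", "other"),
                  stringsAsFactors = FALSE)

# Same ave() approach as above: first non-"other" feature per customer
df2$firstfeature <- ave(df2$feature, df2$customer,
                        FUN = function(x) x[x != "other"][1])

# Customer 1 gets "b"; customer 3 gets NA because nothing is left after filtering
df2
```

If NA is not the value you want in that situation, you would need to decide on a fallback (for example "other" itself) and handle it explicitly.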
Answer 1 (score: 1)
Another option is data.table:
library(data.table)
setDT(mydf)[, firstfeature := feature[feature != "other"][1], customer]