I have a question about lists.
This is my dataset:
# DATA
mydat <- data.frame(EAN=c(rep(250, 4), rep(251, 3), rep(252, 6)),
NO = c(rep(0.5, 5), 3, 4, 1, 1, 1, 2, 1, 1),
VAR = 0)
Some work on the dataset:
# SPLIT BY "EAN"
sp <- split(mydat, mydat$EAN)
# INDICES OF DUPLICATED ROWS
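fkt <- function(x) {
  # A row counts as duplicated if an identical row exists before OR after it;
  # the second term checks the reversed data frame and flips the result back.
  which(duplicated(x) | duplicated(x[nrow(x):1, ])[nrow(x):1])
}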
ldup <- lapply(sp, fkt)
# SET VALUES ACCORDING TO RULE
sp$`250`$VAR[ldup$`250`] <- 1
sp$`252`$VAR[ldup$`252`] <- 1
sp$`251`$VAR[ldup$`251`] <- 1
Is there a nice R way to do this without spelling out each name "250", "251" and "252"?
Answer 0 (score: 0)
How about using names():
> sp[ names(sp)[1] ]
$`250`
EAN NO VAR
1 250 0.5 0
2 250 0.5 0
3 250 0.5 0
4 250 0.5 0
> sp[ names(sp)[2] ]
$`251`
EAN NO VAR
5 251 0.5 0
6 251 3.0 0
7 251 4.0 0
> sp[ names(sp)[3] ]
$`252`
EAN NO VAR
8 252 1 0
9 252 1 0
10 252 1 0
11 252 2 0
12 252 1 0
13 252 1 0
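Then you can do the assignments as follows, using [[ ]] rather than [ ] so that you get the data frame itself and not a one-element list: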
sp[[ names(sp)[1] ]]$VAR[ ldup[[ names(sp)[1] ]] ] <- 1
sp[[ names(sp)[2] ]]$VAR[ ldup[[ names(sp)[2] ]] ] <- 1
sp[[ names(sp)[3] ]]$VAR[ ldup[[ names(sp)[3] ]] ] <- 1
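The three assignments differ only in the index into names(sp), so they can probably be folded into a loop over the names. Below is a minimal sketch, assuming the mydat, sp, fkt and ldup objects from the question; the loop variable nm and the final result object are names introduced here for illustration only, and the unsplit() step is an optional extra that the question did not ask for.

```r
# Rebuild the example data and helper objects from the question
mydat <- data.frame(EAN = c(rep(250, 4), rep(251, 3), rep(252, 6)),
                    NO  = c(rep(0.5, 5), 3, 4, 1, 1, 1, 2, 1, 1),
                    VAR = 0)
sp <- split(mydat, mydat$EAN)

# Same helper as in the question: indices of rows that have a duplicate
fkt <- function(x) {
  which(duplicated(x) | duplicated(x[nrow(x):1, ])[nrow(x):1])
}
ldup <- lapply(sp, fkt)

# Loop over the list names instead of hard-coding "250", "251", "252".
# For a group without duplicates ldup[[nm]] is integer(0), so nothing changes.
for (nm in names(sp)) {
  sp[[nm]]$VAR[ldup[[nm]]] <- 1
}

# Optional (assumption: a single data frame is wanted again at the end)
result <- unsplit(sp, mydat$EAN)
```

This keeps working if more EAN groups are added later, because the names are read from sp itself rather than typed in by hand.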