I have the following data frame:
data.frame(id = c("a", "a", "a", "d", "d"),
value = c(5, 46, 12, 14, 32),
low = c(46, 8, NA, 0, 34),
high = c(56, 20, NA, 12, 60))
  id value low high
1  a     5  46   56
2  a    46   8   20
3  a    12  NA   NA
4  d    14   0   12
5  d    32  34   60
I need to set a new variable to TRUE if value falls outside the intervals defined by low and high for every row with the same id.
The data frame I want is:
id
How can I do this in base R? I am working in a restricted environment where I only have access to base R.
Answer 0: (score: 1)
Without the apply, sapply, or map functions:
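# Flag each row whose value lies outside every [from, to] interval sharing its id.
isInDataframe <- function(data, value = "value", from = "low", to = "high", id = "id") {
  result <- logical(nrow(data))
  for (i in seq_len(nrow(data))) {
    # All rows that share the id of row i
    deeta <- data[data[[id]] == data[[id]][i], ]
    subresult <- logical(nrow(deeta))
    for (j in seq_len(nrow(deeta))) {
      # NA when the bounds of interval j are missing
      subresult[j] <- data[[value]][i] >= deeta[[from]][j] & data[[value]][i] <= deeta[[to]][j]
    }
    # TRUE only if the value is inside none of the intervals
    result[i] <- !any(subresult, na.rm = TRUE)
  }
  data$result <- result
  data
}
# Assumes the question's data frame has been assigned to `data`
isInDataframe(data = data, value = "value", from = "low", to = "high", id = "id")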
  id value low high result
1  a     5  46   56   TRUE
2  a    46   8   20  FALSE
3  a    12  NA   NA  FALSE
4  d    14   0   12   TRUE
5  d    32  34   60   TRUE
Answer 1: (score: 0)
I came up with an ugly and unoptimized solution, but it works! Here is the code:
df <- data.frame(id = c("a", "a", "a", "d", "d"),
                 value = c(5, 46, 12, 14, 32),
                 low = c(46, 8, NA, 0, 34),
                 high = c(56, 20, NA, 12, 60))
# Expand each interval into the whole numbers it contains
# (so this only matches integer-valued data)
list.inter <- list()
for (i in 1:nrow(df)) {
  if (is.na(df$low[i]) | is.na(df$high[i])) {
    list.inter[[i]] <- NA
  } else {
    list.inter[[i]] <- seq(from = df$low[i], to = df$high[i])
  }
}
# A value is flagged when it appears in none of the sequences for its id
result <- c()
for (i in 1:nrow(df)) {
  result[i] <- !df$value[i] %in% unlist(list.inter[which(df$id[i] == df$id)])
}
df$result <- result
I hope this helps, and I am curious to see some optimized code from other users!
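As one possible tightening of the loop above, the interval test can be done by direct comparison instead of expanding each interval with seq(), which also covers non-integer values. This is only a sketch: the helper name outside_all is made up for illustration, and it assumes closed intervals and that a row with NA bounds contributes no interval.

```r
df <- data.frame(id = c("a", "a", "a", "d", "d"),
                 value = c(5, 46, 12, 14, 32),
                 low = c(46, 8, NA, 0, 34),
                 high = c(56, 20, NA, 12, 60))

# Hypothetical helper: TRUE when x lies in none of the closed intervals [low, high]
outside_all <- function(x, low, high) {
  inside <- x >= low & x <= high   # NA when a bound is missing
  !any(inside, na.rm = TRUE)       # missing intervals are ignored
}

result <- logical(nrow(df))
for (i in seq_len(nrow(df))) {
  same_id <- df$id == df$id[i]     # rows sharing the id of row i
  result[i] <- outside_all(df$value[i], df$low[same_id], df$high[same_id])
}
df$result <- result
```

Preallocating result with logical() avoids growing the vector inside the loop, and seq_len() stays safe if the data frame happens to have zero rows.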
Answer 2: (score: 0)
For this analysis, I ended up splitting the data into two data frames: id and value in one, and id, low, and high in the other.
However, here is a solution inspired by the solutions suggested for this new approach:
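df <- data.frame(id = c("a", "a", "a", "d", "d"),
                 value = c(5, 46, 12, 14, 32),
                 low = c(46, 8, NA, 0, 34),
                 high = c(56, 20, NA, 12, 60))
# Pair every value with every interval that shares its id
temp <- merge(x = df[c("id", "value")],
              y = df[c("id", "low", "high")])
# TRUE when the value is outside this interval, NA when the bounds are missing
temp$result <- temp$value < temp$low | temp$value > temp$high
# The formula is passed unnamed because the name of that argument
# differs between R versions; rows with an NA result are dropped by default
merge(x = df,
      y = aggregate(result ~ id + value,
                    data = temp,
                    FUN = all))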
  id value low high result
1  a    12  NA   NA  FALSE
2  a    46   8   20  FALSE
3  a     5  46   56   TRUE
4  d    14   0   12   TRUE
5  d    32  34   60   TRUE
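Note that the merged output above comes back in a different row order than the input. If the original order matters, one alternative is to compare every value with every interval inside each id group using outer() and write the flags back by row index. This is a sketch under the same assumptions as the answers here (closed intervals, NA bounds ignored), not the approach the linked solutions use.

```r
df <- data.frame(id = c("a", "a", "a", "d", "d"),
                 value = c(5, 46, 12, 14, 32),
                 low = c(46, 8, NA, 0, 34),
                 high = c(56, 20, NA, 12, 60))

result <- logical(nrow(df))
# Loop over groups of row indices, one group per id
for (rows in split(seq_len(nrow(df)), df$id)) {
  # inside[i, j]: is value i of this group within interval j of this group?
  inside <- outer(df$value[rows], df$low[rows], ">=") &
            outer(df$value[rows], df$high[rows], "<=")
  # A value is flagged when it falls inside no interval; NA entries are skipped
  result[rows] <- rowSums(inside, na.rm = TRUE) == 0
}
df$result <- result
```

Because the flags are assigned through the original row indices, df keeps its input order and no final merge is needed.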