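I have a data frame with 800 columns. I want to select rows from it using a condition on each column. How can I do this without a huge which expression like the following?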
data[which(data$V_1 < bound_1 & ...& data$V_n<bound_n),]
Here is a snippet of my data frame:
type_Browser os_name_Windows XP ua_family_Chrome ua_name_Chrome0
[1,] 0.06453172 0.09318651 0.09849316 0.1962756
[2,] 0.06453172 0.09318651 0.09849316 0.1962756
[3,] 0.06453172 0.09318651 0.00000000 0.0000000
[4,] 0.06453172 0.00000000 0.00000000 0.0000000
[5,] 0.06453172 0.00000000 0.09849316 0.1962756
[6,] 0.06453172 0.09318651 0.00000000 0.0000000
[7,] 0.06453172 0.00000000 0.00000000 0.0000000
[8,] 0.06453172 0.09318651 0.00000000 0.0000000
[9,] 0.06453172 0.00000000 0.09849316 0.1962756
[10,] 0.06453172 0.09318651 0.00000000 0.0000000
Here is a snippet of the cluster centers after kmeans:
type_Browser os_name_Windows XP ua_family_Chrome ua_name_Chrome 0
1 0.9973870 0.9014791 0.8885468 0.9162910
2 0.1370203 0.9323763 0.3940263 0.8250081
3 0.7121533 0.9541988 0.1418068 0.6568214
4 0.9998909 0.9881944 0.9959341 0.3181853
5 0.9278844 0.9796447 0.9247542 0.9510941
6 0.9784205 0.8586415 0.8902691 0.8210114
7 0.7115432 0.9930360 0.9652756 0.9735471
8 0.9907865 0.9896360 0.9910279 0.9781258
9 0.9967735 0.9919486 0.9921240 0.9702438
10 0.9998825 0.9940538 0.9970676 0.9839453
Then I built two bounds:
lowerBound = centers - eps;
upperBound = centers + eps;
Then I want to select the rows that lie in [centers - eps, centers + eps].
for(i in 1:k){
ithLB = lowerBound[i,];
ithUB = upperBound[i,];
ithKernel <- data[which(data[,1] <= lowerBound[1] & ... & data[,812] <= lowerBound[812]), ] # I want to replace this expression with something more reasonable.
}
Answer 0 (score: 1)
You can try
data[Reduce(`&`,Map('<', data, bound)),]
If instead there are separate "bound_1", "bound_2", ..., "bound_N" objects, collect them into a list first
bound <- mget(paste('bound', 1:ncol(data), sep="_"))
and then use the same code as above.
Another, less desirable option is to combine paste with eval(parse(...)) (not recommended):
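To see why the Reduce/Map line works, here is a minimal sketch on made-up toy data. The names df, bnd, per_col and keep are only for illustration and do not come from the question. Map compares each column with its own bound and returns one logical vector per column; Reduce then combines those vectors with `&`, so a row is kept only if it passes every column's test.

```r
# Toy data: 3 columns, 5 rows (values are invented for illustration)
df <- data.frame(V_1 = c(1, 5, 2, 7, 3),
                 V_2 = c(4, 1, 9, 2, 3),
                 V_3 = c(2, 2, 8, 1, 0))
bnd <- c(4, 5, 3)  # one upper bound per column

# Map() pairs column j with bnd[j] and returns a list of logical vectors
per_col <- Map(`<`, df, bnd)

# Reduce() ANDs the list elementwise: TRUE only where every column passes
keep <- Reduce(`&`, per_col)

res <- df[keep, ]

# The same selection with every condition written out by hand
manual <- df[df$V_1 < 4 & df$V_2 < 5 & df$V_3 < 3, ]
```

Because the number of columns never appears in the expression, the same line should work unchanged for 800 columns, as long as there is exactly one bound per column and the bounds are in column order.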
str1 <- paste(paste(paste0('data$',paste('V', 1:ncol(data), sep="_")),
paste('bound', 1:ncol(data), sep="_"), sep=" < "), collapse=" & ")
data[eval(parse(text=str1)),]
Data used in the examples above:
set.seed(153)
data <- as.data.frame(matrix(sample(0:8, 5*20, replace=TRUE), ncol=5))
colnames(data) <- paste('V', 1:ncol(data), sep="_")
bound <- sample(1:15, 5, replace=TRUE)
If you have separate objects "bound_1", "bound_2", etc. instead of a single vector:
bound_1 <- 6
bound_2 <- 8
bound_3 <- 7
bound_4 <- 7
bound_5 <- 14
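The question actually asks for a two-sided test, [centers - eps, centers + eps], once per cluster center. The same Reduce/Map idea can be extended to that case. The following is only a sketch under assumptions: the tiny data frame, the hand-written centers matrix (in the question it would come from kmeans) and the helper name in_box are all hypothetical.

```r
# Hypothetical stand-ins for the question's objects
data <- data.frame(a = c(0.1, 0.5, 0.9, 0.52),
                   b = c(0.2, 0.5, 0.8, 0.45))
centers <- rbind(c(a = 0.5, b = 0.5),
                 c(a = 0.9, b = 0.8))   # one row per cluster center
eps <- 0.1
k <- nrow(centers)

lowerBound <- centers - eps
upperBound <- centers + eps

# in_box (hypothetical helper): TRUE for rows where every column lies
# between its lower and upper bound, inclusive
in_box <- function(df, lb, ub) {
  Reduce(`&`, Map(function(x, lo, hi) x >= lo & x <= hi, df, lb, ub))
}

# One data frame of selected rows per cluster center
kernels <- lapply(seq_len(k), function(i) {
  data[in_box(data, lowerBound[i, ], upperBound[i, ]), , drop = FALSE]
})
```

Here kernels[[i]] plays the role of ithKernel from the question's loop. The drop = FALSE argument keeps the result a data frame even when only one column is present.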