Subsetting data.frames with more than one condition

Date: 2013-05-10 03:34:54

Tags: r

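I found this question, which is similar to mine, but I need a way to do the same thing with mathematical conditions, such as less than or equal to, greater than or equal to, greater than, or less than (<=, >=, >, or <):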

Subsetting a dataframe in R by multiple conditions

Any ideas on how to do this?

Here is the dput output of my data.frame:
structure(list(COLE_CODIGO_COLEGIO = c(99002L, 98640L, 125211L, 
3665L, 66845L, 80127L), Utilidad12.2 = c(1.74224925711619e+86, 
9.34905360125358e+85, 2.19766267893102e+86, 1.15601939883399e+86, 
3.57148506747902e+85, 4.46252885275553e+85), genero.x = c(0.473684210526316, 
0.34, 0.555555555555556, 0.529411764705882, 0.724137931034483, 
0.717557251908397), matricula = c(11, 12, 11, 0, 12, 11.9541984732824
), Utilidad11.2 = c(1.08554442429751e+85, 1.23904539384648e+86, 
1.41625733346334e+83, 2.98513077553705e+85, 2.85572879144527e+84, 
2.62027294905256e+85), genero.y = c(0.611111111111111, 0.547945205479452, 
0.333333333333333, 0.454545454545455, 0.529411764705882, 0.754545454545455
), Utilidad10.2 = c(5.29036316240969e+86, 3.60281682723794e+85, 
7.98683180997873e+83, 3.69815598910933e+85, 9.03907414307155e+85, 
9.93153119186974e+84), genero.x.1 = c(0.4, 0.493670886075949, 
0.461538461538462, 0.571428571428571, 0.666666666666667, 0.789915966386555
), Utilidad07.2 = c(3.60312171478886e+84, 6.71597127543324e+82, 
2.15239495295691e+81, 3.28836948865607e+84, 5.54815355937579e+84, 
3.56877032811493e+84), genero.y.1 = c(0.625, 0.423728813559322, 
0.6, 0.5, 0.631578947368421, 0.746835443037975), Utilidad06.2 = c(3.2313581771241e+84, 
3.60816597030472e+83, 2.28482656850331e+81, 1.6337350886429e+84, 
1.581561168503e+85, 1.3090089847287e+84), genero = c(0.631578947368421, 
0.53125, 0.444444444444444, 0.388888888888889, 0.5625, 1), UIdeal = c(2.71828182845905e+94, 
2.71828182845905e+94, 2.71828182845905e+94, 2.71828182845905e+94, 
2.71828182845905e+94, 2.71828182845905e+94), UProm = c(1.44190233217495e+86, 
5.07702439958696e+85, 4.41422028057935e+85, 3.74713824214323e+85, 
3.00650172282713e+85, 1.71274657045589e+85), GENProm = c(0.54827485380117, 
0.467318981022945, 0.478974358974359, 0.488854935913759, 0.622859061955091, 
0.801770823175676), LnUProm = c(198.388281303249, 197.344458246354, 
197.204564116063, 197.040725317712, 196.820510089025, 196.257831168032
), LnUIdeal = c(217.44299874144, 217.44299874144, 217.44299874144, 
217.44299874144, 217.44299874144, 217.44299874144), Distancia = c(19.0547174381914, 
20.0985404950863, 20.2384346253776, 20.4022734237286, 20.622488652415, 
21.1851675734087)), .Names = c("COLE_CODIGO_COLEGIO", "Utilidad12.2", 
"genero.x", "matricula", "Utilidad11.2", "genero.y", "Utilidad10.2", 
"genero.x.1", "Utilidad07.2", "genero.y.1", "Utilidad06.2", "genero", 
"UIdeal", "UProm", "GENProm", "LnUProm", "LnUIdeal", "Distancia"
), row.names = c(5149L, 5119L, 6631L, 230L, 3477L, 4131L), class = "data.frame")

I want to filter the data frame so that I keep only the observations where the variable "GENProm" < 0.5 and "matricula" > 10 (just a random example).

2 answers:

Answer 0: (score: 0)

temp is the original dataset, and temp.1 is the dataset after subsetting.

temp.1 <- temp[temp[,"GENProm"]<0.5 & temp[,"matricula"]>10,]
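
The same pattern works with the other comparison operators the question asks about. A minimal sketch, assuming a small made-up data frame called toy (hypothetical values, not the asker's data):

```r
# Hypothetical toy data for illustration only (not the data from the question)
toy <- data.frame(GENProm   = c(0.55, 0.47, 0.48, 0.49, 0.62, 0.80),
                  matricula = c(11, 12, 11, 0, 12, 11.95))

# Strict comparisons: GENProm below 0.5 AND matricula above 10
strict <- toy[toy[, "GENProm"] < 0.5 & toy[, "matricula"] > 10, ]

# Inclusive comparisons: GENProm at most 0.48 AND matricula at least 11
inclusive <- toy[toy[, "GENProm"] <= 0.48 & toy[, "matricula"] >= 11, ]

# OR keeps a row when at least one of the two conditions holds
either <- toy[toy[, "GENProm"] < 0.5 | toy[, "matricula"] > 10, ]
```

Note the trailing comma inside the brackets: the logical vector selects rows, and the empty column position keeps every column.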

Answer 1: (score: 0)

Just combine the expressions with & (AND) or | (OR).

Using subset:

 subset(DF, GENProm < 0.5 & matricula > 10)

Or, more safely, using [:

DF[DF[['GENProm']] < 0.5 & DF[['matricula']] > 10, ]
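
The two forms can give different results when the condition columns contain missing values. A minimal sketch, assuming a made-up data frame DF2 (hypothetical, not the asker's data) with one NA in GENProm:

```r
# Hypothetical toy data with a missing value, for illustration only
DF2 <- data.frame(GENProm = c(0.4, NA, 0.6), matricula = c(11, 12, 12))

# subset() treats an NA condition as FALSE, so the NA row is dropped
by_subset <- subset(DF2, GENProm < 0.5 & matricula > 10)

# `[` returns an all-NA row wherever the condition evaluates to NA
by_bracket <- DF2[DF2[["GENProm"]] < 0.5 & DF2[["matricula"]] > 10, ]

# Wrapping the condition in which() makes `[` drop the NA rows as well
by_which <- DF2[which(DF2[["GENProm"]] < 0.5 & DF2[["matricula"]] > 10), ]
```

If the data may contain NA, it is worth checking which of these behaviours is wanted before choosing a form.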