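How can I filter a data set by a specific value that may appear anywhere in the data frame, not necessarily in a particular column or row?
Suppose I have a data frame like this.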
id gender group Student_Math_1 Student_Math_2 Student_Read_1 Student_Read_2
46 M Red 23 45 37 56
46 M Red 34 36 33 78
46 M Red 56 63 58 NA
62 F Blue 59 NA NA 68
62 F Blue NA 68 87 73
38 M Red 78 57 NA 65
38 M Red NA 75 54 NA
17 F Blue 74 NA 56 72
17 F Blue 75 61 NA 79
17 F Blue NA 74 43 81
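I am trying to subset this data frame so as to keep every row that contains the value 68 (with all of its columns), regardless of where the value appears in the data frame.
The final output would be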
id gender group Student_Math_1 Student_Math_2 Student_Read_1 Student_Read_2
62 F Blue 59 NA NA 68
62 F Blue NA 68 87 73
Any tips or suggestions are welcome. Thanks in advance.
df = structure(list(id = c(46, 46, 46, 62, 62, 38, 38, 17, 17, 17),
gender = structure(c(2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
1L), .Label = c("F", "M"), class = "factor"), group = structure(c(2L,
2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L), .Label = c("Blue", "Red"
), class = "factor"), Student_Math_1 = c(23, 34, 56, 59,
NA, 78, NA, 74, 75, NA), Student_Math_2 = c(45, 36, 63, NA,
68, 57, 75, NA, 61, 74), Student_Read_1 = c(37, 33, 58, NA,
87, NA, 54, 56, NA, 43), Student_Read_2 = c(56, 78, NA, 68,
73, 65, NA, 72, 79, 81)), .Names = c("id", "gender", "group",
"Student_Math_1", "Student_Math_2", "Student_Read_1", "Student_Read_2"
), row.names = c(NA, -10L), class = "data.frame")
Answer 0 (score: 3)
How about:
## use data from "Student_Math_1" column to "Student_Read_2" column
df[rowSums(df[4:7] == 68, na.rm = TRUE) > 0, ]
# id gender group Student_Math_1 Student_Math_2 Student_Read_1 Student_Read_2
#4 62 F Blue 59 NA NA 68
#5 62 F Blue NA 68 87 73
Note that df[4:7] == 68 returns a logical matrix (containing NA), so we use rowSums() with na.rm = TRUE. In such arithmetic, TRUE is treated as 1 and FALSE as 0.
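To make the coercion visible, here is a minimal sketch on a small made-up data frame. The name `toy` and its values are illustrative assumptions, not part of the question's data.

```r
# Toy data (hypothetical values) containing one NA and two 68s
toy <- data.frame(a = c(68, 1, NA), b = c(2, 3, 68))

# Element-wise comparison gives a logical matrix; NA cells stay NA
m <- toy == 68

# TRUE counts as 1 and FALSE as 0; na.rm = TRUE skips the NA cells
hits <- rowSums(m, na.rm = TRUE)

# Keep the rows with at least one match
res <- toy[hits > 0, ]
```

Without na.rm = TRUE, the third row's sum would be NA, and indexing with an NA condition would produce a row of NAs instead of the matching row.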
Follow-up
Thanks to Ben Bolker for the reminder about this more readable solution, which you certainly need to master if you are learning R:
df[apply(df[4:7] == 68, 1L, any, na.rm = TRUE), ]
This applies any() (with na.rm = TRUE) to each row. I don't remember where I compared the performance of the two, so here is a quick experiment:
library(microbenchmark)
## For simplicity / neatness, I generate a logical matrix `X` without `NA`
X <- matrix(sample(c(TRUE, FALSE), 2000 * 10, replace = TRUE), 2000)
## also measuring 989's solution
microbenchmark(ZL = rowSums(X) > 0,
Ben = apply(X, 1L, any),
"989" = unique(which(X, arr.ind = TRUE)[, 1]))
#Unit: microseconds
# expr min lq mean median uq max neval cld
# ZL 144.24 149.76 183.3516 164.86 172.48 2077.80 100 a
# Ben 5610.08 5730.78 6003.0660 5779.20 5861.46 8021.72 100 c
# 989 1571.72 1639.58 2033.4224 1664.78 1721.18 5339.48 100 b
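Timings only matter if the three expressions pick the same rows. The sketch below is a sanity check on a small random logical matrix, not part of the original benchmark. The seed, the matrix size, and the names `by_sum`, `by_any` and `by_which` are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Small logical matrix without NA, in the spirit of the benchmark's `X`
X <- matrix(sample(c(TRUE, FALSE), 20 * 3, replace = TRUE), 20)

# Row indices selected by each approach
by_sum   <- which(rowSums(X) > 0)
by_any   <- which(apply(X, 1L, any))
# which(arr.ind = TRUE) walks the matrix column by column, so sort the rows
by_which <- sort(unique(which(X, arr.ind = TRUE)[, 1]))

stopifnot(identical(by_sum, by_any), identical(by_sum, by_which))
```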
Answer 1 (score: 2)
Alternatively,
df[unique(which(df == 68, arr.ind = TRUE)[, 1]), ]
In this case, you do not need to care about the position of the columns or where NA occurs. unique() is used in case the value 68 occurs more than once in a row.
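One thing to keep in mind with the which(..., arr.ind = TRUE) approach is that matches are reported column by column, so the selected rows may not come back in their original order. The sketch below uses a made-up data frame `toy` (an assumption for illustration) and adds sort(), which is an optional extra step rather than part of the answer above.

```r
# Toy data (hypothetical values): row 2 has 68 twice, row 1 has it once
toy <- data.frame(a = c(1, 68, 3), b = c(68, 68, 4))

# Row index of every match, in column-major order
rows <- which(toy == 68, arr.ind = TRUE)[, 1]

# unique() drops the repeated row 2; sort() restores the original row order
res <- toy[sort(unique(rows)), ]
```

Here the raw row indices come out as 2, 1, 2, so without sort() row 2 would be listed before row 1.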