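I want to subset my main data frame (df_main) based on a list and a condition held in a separate data frame (df_keep), so that I end up with a data frame like df_goal.
A variable should be kept if it appears in the list of variable names (df_keep$keep_var) and its value in df_keep$othvar is "r" or NA.
My approach seems to work up to the last line, and I don't know why it fails there. Thanks for your help!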
Answer 0 (score: 1)
You can find the rows of df_keep that meet the condition like this:
conditions_met <- df_keep$othvar == "r" | is.na(df_keep$othvar)
> conditions_met
[1] TRUE FALSE TRUE FALSE TRUE
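The is.na() term matters because comparing NA with == yields NA rather than FALSE, and an NA in a logical index produces an NA entry when subsetting. Here is a minimal sketch of that behaviour, using a stand-alone vector othvar as a stand-in for df_keep$othvar (the names eq_only and conditions_in are only illustrative). It also shows %in% as a possible alternative, since %in% never returns NA:

```r
# Stand-in for df_keep$othvar (same values as in the question's data)
othvar <- c("r", "w", "r", "w", NA)

# `==` propagates NA, so the last element is NA rather than TRUE or FALSE
eq_only <- othvar == "r"

# Adding is.na() turns that NA into TRUE
conditions_met <- othvar == "r" | is.na(othvar)

# %in% matches NA against NA and always returns TRUE or FALSE
conditions_in <- othvar %in% c("r", NA)
```

Both conditions_met and conditions_in select the first, third and fifth rows; which form to use is mostly a matter of taste.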
You can then use that to pick the matching entries of df_keep$keep_var:
kept_rows <- df_keep$keep_var[conditions_met]
> kept_rows
[1] coat book bottle
Now return only the columns of df_main whose names match those in kept_rows:
df_main[, as.character(kept_rows)]
coat book bottle
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
Or in one line:
> df_main[, as.character(df_keep$keep_var[df_keep$othvar == "r" |
+ is.na(df_keep$othvar)])]
coat book bottle
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
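One caveat with the df_main[, cols] form: if only one name survives the filter, base R simplifies the result to a plain vector instead of a data frame. The sketch below assumes a reduced df_main and a hypothetical single-name selection one_col, and shows two ways to get a data frame back regardless:

```r
# Reduced version of df_main, just for illustration
df_main <- data.frame(coat = 1:5, hanger = 1:5, book = 1:5)

# Hypothetical case: only one variable meets the condition
one_col <- "coat"

as_vector <- df_main[, one_col]                # simplified to an integer vector
as_df     <- df_main[, one_col, drop = FALSE]  # stays a data frame
as_df2    <- df_main[one_col]                  # list-style indexing also keeps a data frame
```

If downstream code expects a data frame, adding drop = FALSE (or using the single-bracket list form) is probably the safer choice.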
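Note that your example data set was not created with stringsAsFactors = FALSE, so keep_var is a factor and as.character is needed: indexing with a factor uses its integer codes rather than its labels. If your actual data stores these names as character rather than factor, you should be able to drop the as.character call (since R 4.0.0, data.frame() defaults to stringsAsFactors = FALSE). For example: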
df_main <-
data.frame(
coat = c(1:5),
hanger = c(1:5),
book = c(1:5),
dvd = c(1:5),
bookcase = c(1:5),
clock = c(1:5),
bottle = c(1:5),
curtains = c(1:5),
wall = c(1:5),
stringsAsFactors = FALSE
)
df_keep <-
data.frame(
keep_var = c("coat", "hanger", "book", "wall", "bottle"),
othvar = c("r", "w", "r", "w", NA),
stringsAsFactors = FALSE
)
df_goal <- data.frame(coat = c(1:5),
book = c(1:5),
bottle = c(1:5))
df_main[, df_keep$keep_var[df_keep$othvar == "r" |
is.na(df_keep$othvar)]]
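To confirm that the subset really matches df_goal, something along these lines could be used. It is only a sketch: the helper names keep, cols, missing_cols, result and matches_goal are illustrative, and the guard against names missing from df_main is an extra precaution that is not part of the original answer.

```r
df_main <- data.frame(coat = 1:5, hanger = 1:5, book = 1:5, dvd = 1:5,
                      bookcase = 1:5, clock = 1:5, bottle = 1:5,
                      curtains = 1:5, wall = 1:5)
df_keep <- data.frame(
  keep_var = c("coat", "hanger", "book", "wall", "bottle"),
  othvar = c("r", "w", "r", "w", NA),
  stringsAsFactors = FALSE
)
df_goal <- data.frame(coat = 1:5, book = 1:5, bottle = 1:5)

# Rows of df_keep whose othvar is "r" or NA
keep <- df_keep$othvar == "r" | is.na(df_keep$othvar)
cols <- df_keep$keep_var[keep]

# Stop early if df_keep lists a name that df_main does not have
missing_cols <- setdiff(cols, names(df_main))
stopifnot(length(missing_cols) == 0)

# Subset and compare with the expected result
result <- df_main[, cols, drop = FALSE]
matches_goal <- isTRUE(all.equal(result, df_goal))
```

Here matches_goal is TRUE when the selected columns equal df_goal in names, order and values.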
Answer 1 (score: 0)
Here is a dplyr solution:
library(dplyr)
# Filter based on 'othvar' and convert factor to string.
keep.vec <- as.character(
(df_keep %>% dplyr::filter(is.na(othvar) | othvar == 'r'))$keep_var
)
# all_of() marks keep.vec as an external character vector of column names
df_main %>% dplyr::select(dplyr::all_of(keep.vec))
## coat book bottle
## 1 1 1 1
## 2 2 2 2
## 3 3 3 3
## 4 4 4 4
## 5 5 5 5