Filtering on two non-numeric values in dplyr

Time: 2020-01-24 22:26:02

Tags: r dataframe dplyr data-cleaning

I am filtering a dataset, and this block works fine:

    dwell <- rail %>%
      filter(Measure == "Average Terminal Dwell Time (Excluding Cars on Run Through Trains) (Hours)",
             Variable == "System") %>%
      gather(Date, Hrs, -("railroad":"Sub-Variable"))

But I would like to run the following code, which adds a second option for Variable:

    dwell <- rail %>%
      filter(Measure == "Average Terminal Dwell Time (Excluding Cars on Run Through Trains) (Hours)",
             Variable == "System" & "System (U.S.)") %>%
      gather(Date, Hrs, -("railroad":"Sub-Variable"))

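However, when I do this I get the following error: "operations are possible only for numeric, logical or complex types." I tried swapping & for | and that did not work either. I suspect this will be a simple fix once someone points it out. Thanks!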

2 answers:

Answer 0: (score: 1)

Try changing Variable == "System" & "System (U.S.)" to Variable == "System" | Variable == "System (U.S.)". That should work.
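As a minimal sketch of that change, here is the `|` version run on a small made-up data frame. The `toy` data frame, its values, and the `kept` name are assumptions for illustration only, not the OP's `rail` data.

```r
library(dplyr)

# Hypothetical stand-in for the OP's `rail` data
toy <- data.frame(
  Variable = c("System", "System (U.S.)", "Chicago", "System"),
  Hrs = c(24.1, 25.3, 30.2, 23.8),
  stringsAsFactors = FALSE
)

# Each side of `|` must be a complete comparison that yields a logical vector
kept <- toy %>%
  filter(Variable == "System" | Variable == "System (U.S.)")

kept$Variable
```

The key point is that `|` combines two logical vectors, so `Variable` has to appear in both comparisons.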

Answer 1: (score: 0)

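If we want a fixed match against more than one value, we can use %in% on Variable. %in% accepts any number of elements, supplied as a vector:

    library(dplyr)
    library(tidyr)

    rail %>%
      filter(Measure == "Average Terminal Dwell Time (Excluding Cars on Run Through Trains) (Hours)",
             Variable %in% c("System", "System (U.S.)")) %>%
      gather(Date, Hrs, -("railroad":"Sub-Variable"))

If the values share a common pattern, matching on that pattern may be easier. Here both values start with "System", so a prefix match with startsWith is enough and no regex is needed. Note that this also keeps any other value that begins with "System":

    rail %>%
      filter(Measure == "Average Terminal Dwell Time (Excluding Cars on Run Through Trains) (Hours)",
             startsWith(Variable, "System")) %>%
      gather(Date, Hrs, -("railroad":"Sub-Variable"))

In the OP's code, Variable == "System" & "System (U.S.)" fails because the right-hand side of &, "System (U.S.)", is a bare string rather than a comparison; Variable has to be specified twice. Even then, & would still be wrong, because a column cannot hold two different values at the same position, so no row would match.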
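The behaviour of these operators can be checked on a plain character vector, without dplyr. The sketch below uses a made-up `Variable` vector (an assumption for illustration) and captures the error message instead of stopping.

```r
# Hypothetical values standing in for the OP's column
Variable <- c("System", "System (U.S.)", "Chicago")

# `&` applied to two character strings fails before any comparison happens
err <- tryCatch(
  "System" & "System (U.S.)",
  error = function(e) conditionMessage(e)
)

# With `Variable` written twice, `&` requires both values in the same element
both <- Variable == "System" & Variable == "System (U.S.)"

# `|` and `%in%` keep elements that match either value
either <- Variable == "System" | Variable == "System (U.S.)"
via_in <- Variable %in% c("System", "System (U.S.)")

identical(either, via_in)
```

Since `either` and `via_in` select the same elements, `%in%` is mainly a convenience that scales better as the list of accepted values grows.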