Question

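I am new to R.

Using a table named SE_CSVLinelist_clean, I want to extract the rows where the variable named where_case_travelled_1 does not contain the strings "Outside Canada" or "Outside province/territory of residence but within Canada", and then create a new table named SE_CSVLinelist_filtered.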

SE_CSVLinelist_filtered <- filter(SE_CSVLinelist_clean, 
where_case_travelled_1 %in% -c('Outside Canada','Outside province/territory of residence but within Canada'))

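The code above works when I use "c" instead of "-c".
So how do I specify the above when I actually want to exclude the rows containing outside the country or outside the province?

Many thanks.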

Answer 1

Note that %in% returns a logical vector of TRUE and FALSE values. To negate it, put ! in front of the logical statement:

SE_CSVLinelist_filtered <- filter(SE_CSVLinelist_clean, 
 !where_case_travelled_1 %in% 
   c('Outside Canada','Outside province/territory of residence but within Canada'))

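As for your original approach with -c(...): - is a unary operator that "perform[s] arithmetic on numeric or complex vectors (or objects which can be coerced to them)" (from help("-")). Since you are working with a character vector, which cannot be coerced to numeric or complex, you cannot use - here.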
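To see both points in isolation, here is a minimal base R sketch. The vector x is a made-up stand-in for the where_case_travelled_1 column, and "Within province" is a hypothetical value, not one taken from the question's data.

```r
# Hypothetical toy vector standing in for where_case_travelled_1
x <- c("Outside Canada", "Within province", NA)
excluded <- c("Outside Canada",
              "Outside province/territory of residence but within Canada")

is_excluded <- x %in% excluded   # logical vector: TRUE FALSE FALSE
keep <- !is_excluded             # negated:        FALSE TRUE TRUE

# Unary minus on a character vector raises an error,
# which is why -c(...) cannot be used to negate the set
minus_result <- tryCatch(-excluded, error = function(e) "error")
```

Note that %in% never returns NA, so a missing value counts as "not in the set" and is kept after negation.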

Answer 2

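Try wrapping the search condition in parentheses, as shown below. The parentheses return the result of the %in% test. Then compare that result to FALSE to test whether it is negative (i.e. the value is not any of the options in the vector).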

SE_CSVLinelist_filtered <- filter(SE_CSVLinelist_clean, 
(where_case_travelled_1 %in% c('Outside Canada','Outside province/territory of residence but within Canada')) == FALSE)

Answer 3

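Be careful with the previous solutions, because they require you to type exactly the strings you want to detect.

For example, ask yourself whether the word "Outside" alone is enough. If so, then: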

library(dplyr)
library(stringr)
data_filtered <- data %>% 
  filter(!str_detect(where_case_travelled_1, "Outside"))

A reprex version:

iris

iris %>% 
  filter(!str_detect(Species, "versicolor"))
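One difference between the exact-match and partial-match approaches is worth checking on your own data: how missing values are handled. The sketch below uses a made-up data frame. Only the column name comes from the question, and the "Within province/territory of residence" value is an assumption for illustration.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; only the column name comes from the question
toy <- data.frame(
  id = 1:4,
  where_case_travelled_1 = c(
    "Outside Canada",
    "Outside province/territory of residence but within Canada",
    "Within province/territory of residence",
    NA
  ),
  stringsAsFactors = FALSE
)

# Exact matching: NA is not in the set, so the NA row is kept
exact <- toy %>%
  filter(!where_case_travelled_1 %in% c(
    "Outside Canada",
    "Outside province/territory of residence but within Canada"
  ))

# Partial matching: str_detect() returns NA for NA input,
# and filter() drops rows where the condition is NA
partial <- toy %>%
  filter(!str_detect(where_case_travelled_1, fixed("Outside")))

exact$id    # 3 4
partial$id  # 3
```

If rows with a missing travel value should be kept under the partial-match approach, the condition would need to allow for them explicitly, for example by adding `is.na(where_case_travelled_1) |` in front of it.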

How to specify "does not contain" in R
