Question

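My data consists of frequency tables listed one under another for different models and scenarios (i.e. variables). I want to take selections from this data set to make graphs for each subset. Most of my variables are categorical and stored as text (e.g. weather, scenario). I can't find a way to allow multiple values of a categorical variable (most examples of %in% c() use numbers). I tried the following: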

ThisSelection <- subset (Hist, all_seeds==0 & weather == "normal" & scenario %in% c("intact","depauperate"))

which doesn't work, and

ThisSelection <- subset (Hist, all_seeds==0 & scenario =="intact" | scenario =="depauperate")

which returns only the "intact" scenario.

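I apologize if the answer is simple. I searched online but could not find where I went wrong, and I assume there is a way to do this other than converting the string values to numeric values. I am a beginner in R...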

Answer 1

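Your first attempt should work. I hesitate to ask, but is your spelling of "depauperate" consistent throughout (including case)? For example: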

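# Example data; sampling is unseeded, so the rows you get will differ from the output shown
Hist <- data.frame(all_seeds = 0, weather = sample(c("normal", "odd"), 20, replace = TRUE), scenario = sample(c("intact", "depauperate"), 20, replace = TRUE))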
ThisSelection <- subset (Hist, all_seeds==0 & weather == "normal" & scenario %in% c("intact","depauperate"))
ThisSelection


   all_seeds weather    scenario
1          0  normal      intact
3          0  normal      intact
4          0  normal      intact
5          0  normal depauperate
6          0  normal      intact
10         0  normal depauperate
14         0  normal      intact
15         0  normal      intact
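
If the filter returns fewer rows than expected, one way to check for inconsistent spelling or case is to list the distinct values in the column and normalise them before matching. The following is only a sketch: the data frame and its misspelled values are invented for illustration and are not the asker's data.

```r
# Hypothetical data with inconsistent case, a trailing space, and a typo
Hist <- data.frame(
  all_seeds = 0,
  weather   = c("normal", "normal", "Normal", "normal", "odd"),
  scenario  = c("intact", "Intact ", "depauperate", "inatct", "intact"),
  stringsAsFactors = FALSE
)

# Show the distinct values actually stored, so typos and case variants stand out
unique(Hist$scenario)

# Normalise case and strip surrounding whitespace before filtering
Hist$scenario <- trimws(tolower(Hist$scenario))
Hist$weather  <- trimws(tolower(Hist$weather))

# %in% works on character vectors as well as on numbers
ThisSelection <- subset(Hist, all_seeds == 0 & weather == "normal" &
                          scenario %in% c("intact", "depauperate"))
ThisSelection
```

Normalising fixes case and whitespace differences only; a genuine typo such as "inatct" still has to be corrected by hand.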

Answer 2

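Don't forget logical operator precedence. In R, & binds more tightly than |, so a condition of the form A & B | C is evaluated as (A & B) | C. Add parentheses to group the alternatives: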

set.seed(3099627)
Hist <- data.frame(first = sample(letters[1:3], 20, replace = TRUE), second = sample(letters[4:6], 20, replace = TRUE))
subset (Hist, first=="a" & (second=="d" | second=="e"))

   first second
1      a      e
4      a      d
15     a      e
20     a      d

subset (Hist, first=="a" & (second %in% c("d", "e")))

   first second
1      a      e
4      a      d
15     a      e
20     a      d
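
To see the effect of precedence on a condition shaped like the one in the question, here is a small deterministic sketch. The data frame is made up for illustration; only the column names are borrowed from the question.

```r
# Hypothetical data: two rows with all_seeds == 0 and a wanted scenario
Hist <- data.frame(
  all_seeds = c(0, 0, 1, 1, 0),
  scenario  = c("intact", "depauperate", "intact", "depauperate", "other"),
  stringsAsFactors = FALSE
)

# No parentheses: evaluated as
# (all_seeds == 0 & scenario == "intact") | scenario == "depauperate",
# so depauperate rows are kept even when all_seeds is 1
loose <- subset(Hist, all_seeds == 0 & scenario == "intact" | scenario == "depauperate")

# Parentheses make the all_seeds condition apply to both scenarios
strict <- subset(Hist, all_seeds == 0 & (scenario == "intact" | scenario == "depauperate"))

# %in% expresses the same grouped condition without needing parentheses
same <- subset(Hist, all_seeds == 0 & scenario %in% c("intact", "depauperate"))

nrow(loose)   # 3 rows: includes the depauperate row with all_seeds == 1
nrow(strict)  # 2 rows
nrow(same)    # 2 rows
```

Using %in% is usually less error-prone once more than two values are allowed, since there is no | to group.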

Select a subset where a categorical variable (column) can take either of 2 values
