Is there a way to filter on multiple conditions across multiple levels of a column (factor)?
Individual<-c("a1.2", "a1.2","bd3.d","bd3.d", "k20.d","k20.d", "dfd.2","dfd.2", "d3.d","d3.d", "df3.1","df3.1")
Treat <- c('hot','hot','hot','hot','hot','hot','cold',"cold",'cold',"cold",'cold',"cold")
Time <- c("T1", "T9", "T1", "T9","T1", "T9","T1", "T9","T1", "T9","T1", "T9")
Area<- c("0.1", "0.5", "0", "0.645","0.1", "0","0.1", "0.587","0", "0.78","0.23", "0.78")
df.Area22 <- data.frame(Individual, Treat,Time,Area)
head(df.Area22, n=20)
   Individual Treat Time  Area
1        a1.2   hot   T1   0.1
2        a1.2   hot   T9   0.5
3       bd3.d   hot   T1     0
4       bd3.d   hot   T9 0.645
5       k20.d   hot   T1   0.1
6       k20.d   hot   T9     0
7       dfd.2  cold   T1   0.1
8       dfd.2  cold   T9 0.587
9        d3.d  cold   T1     0
10       d3.d  cold   T9  0.78
11      df3.1  cold   T1  0.23
12      df3.1  cold   T9  0.78
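For example, I only want to select the individuals from the Individual column whose Area values are greater than zero at both Time T1 and T9.
The function would therefore remove rows 3, 6 and 9.
Thank you!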
Answer 0 (score: 1)
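First, you have to convert the Area variable to numeric, because R has interpreted it as a factor (before R 4.0.0, data.frame() converted character columns to factors by default). Calling as.numeric directly on a factor returns the internal integer level codes rather than the original values, so you have to combine as.numeric with levels to recover the actual decimal values.
Next, you filter the Time variable for T1 and T9, and the Area variable for values greater than 0.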
library(dplyr)
df.Area22$Area <- as.numeric(levels(df.Area22$Area))[df.Area22$Area]
df <- df.Area22 %>%
filter((Time == "T1" | Time == "T9") & Area > 0)
The final result is what you need (rows 3, 6 and 9 removed).
df
  Individual Treat Time  Area
1       a1.2   hot   T1 0.100
2       a1.2   hot   T9 0.500
3      bd3.d   hot   T9 0.645
4      k20.d   hot   T1 0.100
5      dfd.2  cold   T1 0.100
6      dfd.2  cold   T9 0.587
7       d3.d  cold   T9 0.780
8      df3.1  cold   T1 0.230
9      df3.1  cold   T9 0.780
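To see why the conversion in this answer goes through levels, here is a minimal base-R sketch. The vector name area and its four values are illustrative only (taken from the first rows of the question's data); it assumes Area is stored as a factor.

```r
# Area stored as a factor, as data.frame() did by default before R 4.0.0
area <- factor(c("0.1", "0.5", "0", "0.645"))

# Direct conversion returns the internal level codes, not the values
codes <- as.numeric(area)
print(codes)   # 2 3 1 4

# Indexing the numeric levels by the factor recovers the original values
values <- as.numeric(levels(area))[area]
print(values)

# Going through character gives the same values and also works
# when the column is already character rather than factor
values_chr <- as.numeric(as.character(area))
print(identical(values, values_chr))
```

The as.character route is a little slower on long vectors but does not break if the column turns out not to be a factor.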
Answer 1 (score: 1)
I guess the trick is to set stringsAsFactors=FALSE in the data.frame function.
library(dplyr)
Individual<-c("a1.2", "a1.2","bd3.d","bd3.d", "k20.d","k20.d", "dfd.2","dfd.2", "d3.d","d3.d", "df3.1","df3.1")
Treat <- c('hot','hot','hot','hot','hot','hot','cold',"cold",'cold',"cold",'cold',"cold")
Time <- c("T1", "T9", "T1", "T9","T1", "T9","T1", "T9","T1", "T9","T1", "T9")
Area<- c("0.1", "0.5", "0", "0.645","0.1", "0","0.1", "0.587","0", "0.78","0.23", "0.78")
df.Area22 <- data.frame(Individual, Treat,Time,Area, stringsAsFactors=FALSE)
head(df.Area22, n=20)
# Area is character here, so convert it before comparing numerically
df.Area22 %>%
  filter(Time %in% c('T1', 'T9'),
         as.numeric(Area) > 0)
Within filter, you can add multiple conditions simply by separating them with a comma; they are combined as if joined with &.
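As a quick check of the comma behaviour, the sketch below compares the two spellings on a small data frame. The names toy, with_comma and with_and are illustrative and not part of the answer; it assumes dplyr is installed.

```r
library(dplyr)

# Small illustrative data frame with a numeric Area column
toy <- data.frame(
  Time = c("T1", "T9", "T1", "T9"),
  Area = c(0.1, 0.5, 0, 0.645)
)

# Conditions separated by commas
with_comma <- toy %>% filter(Time %in% c("T1", "T9"), Area > 0)

# The same conditions joined explicitly with &
with_and <- toy %>% filter(Time %in% c("T1", "T9") & Area > 0)

# Both spellings keep the same rows
print(identical(with_comma, with_and))
```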
Answer 2 (score: 0)
A base-R solution. As @demariod suggested, you need to set stringsAsFactors=FALSE in the data.frame function.
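# Select rows where Time is T1 or T9 and Area > 0
# (Area is character when stringsAsFactors=FALSE, so convert it before comparing)
df.Area22[(df.Area22$Time == 'T1' | df.Area22$Time == 'T9') & as.numeric(df.Area22$Area) > 0, ]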
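The question's wording can also be read more strictly: keep only the individuals whose Area is positive at both T1 and T9, which drops whole individuals rather than just rows 3, 6 and 9. That reading is an assumption about the asker's intent, not something the answers above implement. A base-R sketch of it, with Area entered as numeric for brevity and all_positive and kept as illustrative names:

```r
# Rebuild the question's data with Area as numeric
df.Area22 <- data.frame(
  Individual = rep(c("a1.2", "bd3.d", "k20.d", "dfd.2", "d3.d", "df3.1"), each = 2),
  Treat = rep(c("hot", "cold"), each = 6),
  Time = rep(c("T1", "T9"), times = 6),
  Area = c(0.1, 0.5, 0, 0.645, 0.1, 0, 0.1, 0.587, 0, 0.78, 0.23, 0.78),
  stringsAsFactors = FALSE
)

# For each individual, TRUE on all of its rows only if every Area value is > 0
all_positive <- as.logical(
  ave(df.Area22$Area > 0, df.Area22$Individual, FUN = all)
)

# Keep both rows of the individuals that pass the check
kept <- df.Area22[all_positive, ]
print(kept)
```

With this data, only a1.2, dfd.2 and df3.1 remain, because each of the other three individuals has one zero measurement.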