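I'd like to filter the data frame shown in Table1 so that it looks like Table2, by removing any row that has "Pathogenic" in the Class column and 0 in the Validated column. However, I'm not sure which tool I should use to achieve this.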
Table1
Class              Validated
Pathogenic         1
Pathogenic         1
Pathogenic         0
Pathogenic         0
Likely Pathogenic  1
Likely Pathogenic  0
Likely Pathogenic  1
Uncertain          0
Uncertain          1
Table2
Class              Validated
Pathogenic         1
Pathogenic         1
Likely Pathogenic  1
Likely Pathogenic  0
Likely Pathogenic  1
Uncertain          0
Uncertain          1
Answer 0 (score: 3)
Assuming the Validated column is of type numeric:
table2 <- table1[!(table1$Class == "Pathogenic" & table1$Validated == 0),]
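For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same base R filter. It rebuilds the example data under the name `df1` (the name used in the dput at the end of this thread) and introduces a helper variable `drop` purely for illustration. Note that `==` tests for an exact match, so "Likely Pathogenic" rows are left alone even though they contain the word "Pathogenic".

```r
# Rebuild the example data from the question (same values as Table1)
df1 <- data.frame(
  Class = c("Pathogenic", "Pathogenic", "Pathogenic", "Pathogenic",
            "Likely Pathogenic", "Likely Pathogenic", "Likely Pathogenic",
            "Uncertain", "Uncertain"),
  Validated = c(1, 1, 0, 0, 1, 0, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Rows to drop: Class is exactly "Pathogenic" AND Validated is 0
# (`drop` is a hypothetical helper name, not part of the original answer)
drop <- df1$Class == "Pathogenic" & df1$Validated == 0

# Keep every row that is not flagged
table2 <- df1[!drop, ]

# Optional: renumber the rows after filtering
rownames(table2) <- NULL

print(table2)
```

If either column could contain NA, the logical index would contain NA as well and produce NA rows, so this sketch assumes complete data, as in the example.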
Answer 1 (score: 0)
One option, based on the OP's clarification in the comments, is to use data.table:
library(data.table)
setDT(Table1)[!(Class == "Pathogenic" & Validated == 0) ]
# Class Validated
#1: Pathogenic 1
#2: Pathogenic 1
#3: Likely Pathogenic 1
#4: Likely Pathogenic 0
#5: Likely Pathogenic 1
#6: Uncertain 0
#7: Uncertain 1
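Or, after setting the key (note that setting the key sorts the rows by Class and Validated, so the row order differs from the original):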
setDT(Table1, key = c("Class", "Validated"))[!.("Pathogenic", 0)]
# Class Validated
#1: Likely Pathogenic 0
#2: Likely Pathogenic 1
#3: Likely Pathogenic 1
#4: Pathogenic 1
#5: Pathogenic 1
#6: Uncertain 0
#7: Uncertain 1
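If the original row order matters, a possible alternative is an anti-join with the `on` argument instead of a key. This is only a sketch: it assumes a data.table version that supports `on` (1.9.6 or later, as far as I know), and it rebuilds the example data inline so that it runs on its own.

```r
library(data.table)

# Rebuild the example data (same values as Table1 in the question)
Table1 <- data.table(
  Class = c("Pathogenic", "Pathogenic", "Pathogenic", "Pathogenic",
            "Likely Pathogenic", "Likely Pathogenic", "Likely Pathogenic",
            "Uncertain", "Uncertain"),
  Validated = c(1, 1, 0, 0, 1, 0, 1, 0, 1)
)

# Not-join on both columns: drop rows matching ("Pathogenic", 0).
# No key is set, so the remaining rows keep their original order.
res <- Table1[!.("Pathogenic", 0), on = c("Class", "Validated")]

print(res)
```

Because no key is set, `Table1` itself is not re-sorted, which is the main difference from the keyed version.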
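Edit: Previously, I followed a different logic, because the OP's initial post read: "I'd like to filter the data frame shown in Table1 so that it looks like Table2. However, I'm not sure which tool I should use to achieve this."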
df1 <- structure(list(Class = c("Pathogenic", "Pathogenic", "Pathogenic",
"Pathogenic", "Likely Pathogenic", "Likely Pathogenic", "Likely Pathogenic",
"Uncertain", "Uncertain"), Validated = c(1, 1, 0, 0, 1, 0, 1,
0, 1)), .Names = c("Class", "Validated"), row.names = c(NA, -9L
), class = "data.frame")