I have a data frame like this:
ID <- c("ABC_10","AZM_11","ABC_11","ABC_12",
"ABC_13","AZM_12","ABC_14","ABC_15",
"CZX_10","CZX_11","CZX_12","CZX_13",
"FIN_10","FIN_11","FIN_12","FIN_13",
"FNM_10","FNM_11","FXS_10","FXS_11")
Id.n <- c(345,380,339,361,
245,390,639,661,
545,580,539,261,
345,180,139,261,
1045,1580,39,161)
df <- data.frame(ID,Id.n)
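I am trying to subset this data frame, keeping only rows whose Id.n exceeds a threshold that depends on the ID prefix:
Threshold of Id.n for FXS: 100
Threshold of Id.n for FIN: 200
Threshold of Id.n for all other IDs: 300
The desired output is: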
ID Id.n
ABC_10 345
AZM_11 380
ABC_11 339
ABC_12 361
AZM_12 390
ABC_14 639
ABC_15 661
CZX_10 545
CZX_11 580
CZX_12 539
FIN_10 345
FIN_13 261
FNM_10 1045
FNM_11 1580
FXS_11 161
I tried the following, but could not get it right:
df <- subset(df,ifelse(grepl("FXS",df$ID), df$ID.n > 100,))
Could someone point me in the right direction?
Answer 0 (score: 4)
Using dplyr:
library(dplyr)
df2 <- df %>%
filter((grepl("FXS", ID) & Id.n > 100) |
(grepl("FIN", ID) & Id.n > 200) |
(!grepl("FXS|FIN", ID) & Id.n > 300))
df2
# ID Id.n
# ABC_10 345
# AZM_11 380
# ABC_11 339
# ABC_12 361
# AZM_12 390
# ABC_14 639
# ABC_15 661
# CZX_10 545
# CZX_11 580
# CZX_12 539
# FIN_10 345
# FIN_13 261
# FNM_10 1045
# FNM_11 1580
# FXS_11 161
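As a variation on the filter above, the three conditions can be folded into a single per-row threshold. This is only a sketch: the `threshold` helper column and the name `df3` are made up for illustration, and the patterns are anchored with `^` and `_` on the assumption that every ID has the form PREFIX_number.

```r
library(dplyr)

# Same sample data as in the question
df <- data.frame(
  ID = c("ABC_10","AZM_11","ABC_11","ABC_12",
         "ABC_13","AZM_12","ABC_14","ABC_15",
         "CZX_10","CZX_11","CZX_12","CZX_13",
         "FIN_10","FIN_11","FIN_12","FIN_13",
         "FNM_10","FNM_11","FXS_10","FXS_11"),
  Id.n = c(345,380,339,361,
           245,390,639,661,
           545,580,539,261,
           345,180,139,261,
           1045,1580,39,161)
)

df3 <- df %>%
  # Hypothetical helper column: the threshold that applies to each row
  mutate(threshold = case_when(
    grepl("^FXS_", ID) ~ 100,
    grepl("^FIN_", ID) ~ 200,
    TRUE               ~ 300   # every other prefix
  )) %>%
  filter(Id.n > threshold) %>%
  select(-threshold)           # drop the helper column again
```

Adding a new prefix then means adding one line to `case_when()` rather than editing both a positive and a negated pattern.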
Answer 1 (score: 4)
This is simpler with cleaned data. With data.table, it looks like...
library(data.table)
setDT(df)
df[, c("x", "y") := tstrsplit(ID, "_")][, ID := NULL ]
xDT = data.table(x = unique(df$x))
xDT[, th := 300 ]
xDT[.(x = c("FXS", "FIN"), th = c(100, 200)), on=.(x), th := i.th ]
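Then a non-equi join does the filtering (note that in the result, the Id.n column shows the threshold taken from xDT, not the original value):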
df[xDT, on=.(x, Id.n > th)]
Id.n x y
1: 300 ABC 11
2: 300 ABC 10
3: 300 ABC 12
4: 300 ABC 14
5: 300 ABC 15
6: 300 AZM 11
7: 300 AZM 12
8: 300 CZX 12
9: 300 CZX 10
10: 300 CZX 11
11: 200 FIN 13
12: 200 FIN 10
13: 300 FNM 10
14: 300 FNM 11
15: 100 FXS 11
As for using grepl here, I think it is a bad idea.
Answer 2 (score: 1)
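One way to avoid pattern matching on the whole ID, sketched here in base R, is to extract the prefix once and look up its threshold in a named vector. The names `prefix`, `special`, `threshold` and `kept` are illustrative, and the code assumes every ID has the form PREFIX_number.

```r
# Same sample data as in the question
df <- data.frame(
  ID = c("ABC_10","AZM_11","ABC_11","ABC_12",
         "ABC_13","AZM_12","ABC_14","ABC_15",
         "CZX_10","CZX_11","CZX_12","CZX_13",
         "FIN_10","FIN_11","FIN_12","FIN_13",
         "FNM_10","FNM_11","FXS_10","FXS_11"),
  Id.n = c(345,380,339,361,
           245,390,639,661,
           545,580,539,261,
           345,180,139,261,
           1045,1580,39,161)
)

# Everything before the first underscore is the prefix
prefix <- sub("_.*$", "", df$ID)

# Thresholds for the special prefixes; indexing by name gives NA
# for prefixes that are not listed
special <- c(FXS = 100, FIN = 200)
threshold <- unname(special[prefix])
threshold[is.na(threshold)] <- 300   # default for all other prefixes

# Keep rows whose Id.n exceeds the threshold for their prefix
kept <- df[df$Id.n > threshold, ]
```

Because the comparison is on the exact prefix, an ID that merely contains "FIN" or "FXS" somewhere else in the string would not be picked up by mistake.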
df[(grepl("FXS",df$ID) & df$Id.n >= 100) |
(grepl("FIN",df$ID) & df$Id.n >= 200) |
(!(grepl("FXS",df$ID) | grepl("FIN", df$ID)) & df$Id.n >= 300),]
# ID Id.n
#1 ABC_10 345
#2 AZM_11 380
#3 ABC_11 339
#4 ABC_12 361
#6 AZM_12 390
#7 ABC_14 639
#8 ABC_15 661
#9 CZX_10 545
#10 CZX_11 580
#11 CZX_12 539
#13 FIN_10 345
#16 FIN_13 261
#17 FNM_10 1045
#18 FNM_11 1580
#20 FXS_11 161