Selecting in R

Time: 2018-09-10 23:58:11

Tags: r dataframe filter dplyr subset

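I have a data frame that looks like the one below, but it consists of hundreds of rows and columns, which makes routine filtering in R a challenge.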

A simplified picture is shown below:

The rows represent values from a test and the columns represent different treatments

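For each "treatment" column, how can I select all rows (i.e. tests) whose value lies between -0.5 and 1, and produce them as output? Thank you very much for your ideas!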

1 answer:

Answer 0 (score: 3)

Create example data:

df <- data.frame(
    test = paste0("test", 1:18),
    d1 = c(rep(-57, 7), 0, rep(-99, 10)),
    d2 = c(rep(-4, 14), 1, 0.1, -99, -99),
    d3 = c(rep(-89, 3), 0.99, -47, 0.8, rep(-55, 8), -1.56, 0.1, 1, 0),
    d4 = c(rep(-99, 6), rep(-57, 5), 0.7, -3, -13, -99, 0.98, -99, 0.99),
    d5 = c(rep(-57, 2), 0.4, rep(-99, 14), -57),
    stringsAsFactors = FALSE
)

If you only need the elements themselves:

# get TRUE/FALSE matrix of whether element meets your criteria
meets_criteria <- sapply(df[,-1], function(x) x >= -0.5 & x <= 1)

# "extract" elements that meet your criteria; result is a vector
df[,-1][meets_criteria]
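The bare vector above no longer says which test or treatment each value came from. If what you need is the list of qualifying tests per treatment, one possible base R sketch is shown here. The helper `in_range` and the name `tests_by_treatment` are illustrative names, not part of the original answer, and the example data is repeated so the snippet runs on its own:

```r
# Example data, repeated from above so this snippet is self-contained
df <- data.frame(
    test = paste0("test", 1:18),
    d1 = c(rep(-57, 7), 0, rep(-99, 10)),
    d2 = c(rep(-4, 14), 1, 0.1, -99, -99),
    d3 = c(rep(-89, 3), 0.99, -47, 0.8, rep(-55, 8), -1.56, 0.1, 1, 0),
    d4 = c(rep(-99, 6), rep(-57, 5), 0.7, -3, -13, -99, 0.98, -99, 0.99),
    d5 = c(rep(-57, 2), 0.4, rep(-99, 14), -57),
    stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where x lies in [lower, upper], both ends inclusive
in_range <- function(x, lower = -0.5, upper = 1) x >= lower & x <= upper

# For every treatment column, keep the names of the tests whose value qualifies;
# the result is a named list with one character vector per treatment
tests_by_treatment <- lapply(df[-1], function(x) df$test[in_range(x)])

tests_by_treatment$d3
```

Because `lapply` walks the columns one at a time, this should work unchanged with hundreds of treatment columns, as long as `test` stays the first column.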

If you also want to keep the row/column values associated with each element

(following @thelatemail's approach from the comments above):

# reshape to long
dflong <- tidyr::gather(df, dvar, dvalue, d1:d5)

# subset to meet your criteria
dflong[dflong$dvalue >= -0.5 & dflong$dvalue <= 1, ]
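If you would rather not depend on tidyr, the same long-format result can probably be obtained in base R with `which(..., arr.ind = TRUE)`, which returns the row and column index of every matching cell. This is only a sketch of an alternative, not the approach from the comments; the names `vals`, `hits` and `result` are illustrative, and the example data is repeated so the snippet runs on its own:

```r
# Example data, repeated from above so this snippet is self-contained
df <- data.frame(
    test = paste0("test", 1:18),
    d1 = c(rep(-57, 7), 0, rep(-99, 10)),
    d2 = c(rep(-4, 14), 1, 0.1, -99, -99),
    d3 = c(rep(-89, 3), 0.99, -47, 0.8, rep(-55, 8), -1.56, 0.1, 1, 0),
    d4 = c(rep(-99, 6), rep(-57, 5), 0.7, -3, -13, -99, 0.98, -99, 0.99),
    d5 = c(rep(-57, 2), 0.4, rep(-99, 14), -57),
    stringsAsFactors = FALSE
)

# Numeric matrix of the treatment columns only (drops the "test" column)
vals <- as.matrix(df[-1])

# Two-column matrix ("row", "col") with the position of every qualifying cell
hits <- which(vals >= -0.5 & vals <= 1, arr.ind = TRUE)

# Rebuild a long data frame: test name, treatment name, and the value itself
result <- data.frame(
    test   = df$test[hits[, "row"]],
    dvar   = colnames(vals)[hits[, "col"]],
    dvalue = vals[hits],
    stringsAsFactors = FALSE
)

result
```

This assumes every treatment column is numeric; if the columns were of mixed types, `as.matrix` would coerce them to character and the comparison would no longer be meaningful.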