Using R

Date: 2018-03-21 18:20:08

Tags: r dataframe subset

I have this dataset and want to keep only the rows that have at least one value below 0.001 in any of the padj columns (padj, padj.1, padj.2, etc.).

           log2FoldChange         padj log2FoldChange.1        padj.1 log2FoldChange.2       padj.2 log2FoldChange.3       padj.3
AT1G01010     -0.006572090 9.998981e-01     -0.309669318  9.999662e-01      0.438877524 8.309090e-01      0.561092027 5.682979e-01
AT1G01020      0.017360429 9.998981e-01      0.073469652  9.999662e-01     -0.069928545 9.563338e-01      0.125792938 8.207208e-01
AT1G01030     -0.085871384 9.998981e-01     -0.325335794  9.999662e-01     -0.399593376 6.736412e-01      0.441444783 4.456086e-01
AT1G01040     -0.060245264 9.998981e-01     -0.232347149  7.785949e-01      0.122877554 7.833058e-01     -0.069458761 8.334858e-01
AT1G01050     -0.052917601 9.998981e-01      0.014403301  9.999662e-01      0.148566949 8.249185e-01     -0.282904657 2.909831e-01
AT1G01060      0.047175423 9.998981e-01     -0.144865979  8.393350e-01      0.164981232 2.485552e-01     -0.349949093 4.481910e-04
AT1G01080      0.008088133 9.998981e-01     -0.076365069  9.999662e-01      0.074804359 8.919594e-01     -0.129912325 5.543105e-01
AT1G01090      0.056330307 9.998981e-01     -0.014582717  9.999662e-01      0.034778584 9.455045e-01     -0.024959676 9.186005e-01
AT1G01100     -0.060240311 9.998981e-01      0.142628251  9.999662e-01      0.012350145 9.886096e-01      0.214901315 3.117450e-01
AT1G01110      0.013833213 9.998981e-01      1.153241053  3.916769e-01      0.612459109 5.970020e-01     -1.326637272 1.308788e-02
AT1G01120      0.103551581 9.998981e-01     -0.318902875  3.272385e-01      0.234447384 3.078595e-01     -0.683988096 1.880000e-08
AT1G01130      0.189351439 9.998981e-01     -0.462637322  9.999662e-01     -0.170228839 9.550798e-01      0.434854000 7.426886e-01
AT1G01140     -0.005671191 9.998981e-01     -0.313300814  2.749158e-01     -1.053242873 2.890000e-18      1.185748730 3.650000e-26
AT1G01160     -0.074044311 9.998981e-01      0.099783450  9.999662e-01     -0.082539415 9.512494e-01      0.093031718 8.912551e-01
AT1G01170      0.037769428 9.998981e-01     -0.013651764  9.999662e-01     -0.117681977 8.691103e-01      0.068321467 8.790404e-01
AT1G01180      0.248586221 9.998981e-01     -0.659122042  2.585560e-01      0.018832015 9.916358e-01      0.027246808 9.725611e-01
AT1G01210     -0.082339205 9.998981e-01      0.337750634  9.999662e-01      0.108632580 9.549792e-01     -0.255164413 7.487148e-01
AT1G01220      0.074423950 9.998981e-01     -0.184146594  9.999662e-01      0.108415967 8.895097e-01     -0.109752150 7.787131e-01
AT1G01225     -0.157080073 9.998981e-01      0.088379552  9.999662e-01      0.368176470 7.468810e-01     -0.051841400 9.582530e-01
AT1G01230      0.124919482 9.998981e-01      0.038120336  9.999662e-01     -0.038279670 9.724470e-01      0.156677162 6.437230e-01
AT1G01240      0.101444575 9.998981e-01      0.041297975  9.999662e-01     -0.020448865 9.855355e-01      0.105263537 7.626743e-01
AT1G01250      0.065966313 9.998981e-01     -0.091808753  9.999662e-01      0.141797906 9.673330e-01      1.565726558 4.506212e-03
AT1G01260      0.471038145 9.998981e-01      1.528887763  1.200000e-18     -0.346919177 1.656233e-01     -0.810355958 2.440000e-07
AT1G01290      0.118170716 9.998981e-01     -0.047075401  9.999662e-01     -0.056431451 9.684373e-01      0.246362831 5.652973e-01

This is the desired output:

           log2FoldChange         padj log2FoldChange.1        padj.1 log2FoldChange.2       padj.2 log2FoldChange.3       padj.3
AT1G01060      0.047175423 9.998981e-01     -0.144865979  8.393350e-01      0.164981232 2.485552e-01     -0.349949093 4.481910e-04
AT1G01120      0.103551581 9.998981e-01     -0.318902875  3.272385e-01      0.234447384 3.078595e-01     -0.683988096 1.880000e-08
AT1G01140     -0.005671191 9.998981e-01     -0.313300814  2.749158e-01     -1.053242873 2.890000e-18      1.185748730 3.650000e-26
AT1G01260      0.471038145 9.998981e-01      1.528887763  1.200000e-18     -0.346919177 1.656233e-01     -0.810355958 2.440000e-07

Regards, and thanks.

2 answers:

Answer 0 (score: 1)

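We can use filter_at from dplyr. If the data frame has row names, it is best to add them as a separate column before running the pipeline, because the row names are dropped.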

library(tidyverse)
df1 %>%
    rownames_to_column('rn') %>%
    filter_at(vars(matches("padj")), any_vars(. < 0.001))
#     rn log2FoldChange      padj log2FoldChange.1       padj.1 log2FoldChange.2       padj.2 log2FoldChange.3      padj.3
#1 AT1G01060    0.047175423 0.9998981       -0.1448660 8.393350e-01        0.1649812 2.485552e-01       -0.3499491 4.48191e-04
#2 AT1G01120    0.103551581 0.9998981       -0.3189029 3.272385e-01        0.2344474 3.078595e-01       -0.6839881 1.88000e-08
#3 AT1G01140   -0.005671191 0.9998981       -0.3133008 2.749158e-01       -1.0532429 2.890000e-18        1.1857487 3.65000e-26
#4 AT1G01260    0.471038145 0.9998981        1.5288878 1.200000e-18       -0.3469192 1.656233e-01       -0.8103560 2.44000e-07
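
For readers on a recent dplyr, the same filter can be sketched with if_any(), since filter_at() is superseded as of dplyr 1.0.0 and if_any() is available from dplyr 1.0.4. This is a minimal sketch rather than part of the original answer: df1 below is a small stand-in built from four rows of the question's data, and starts_with("padj") assumes every adjusted p-value column name begins with "padj".

```r
library(dplyr)
library(tibble)

# Stand-in for the question's data: four rows, one log2FoldChange column kept
# to show that it is not matched by the column selection.
df1 <- data.frame(
  log2FoldChange = c(-0.006572090, 0.047175423, 0.013833213, -0.005671191),
  padj           = c(9.998981e-01, 9.998981e-01, 9.998981e-01, 9.998981e-01),
  padj.1         = c(9.999662e-01, 8.393350e-01, 3.916769e-01, 2.749158e-01),
  padj.2         = c(8.309090e-01, 2.485552e-01, 5.970020e-01, 2.890000e-18),
  padj.3         = c(5.682979e-01, 4.481910e-04, 1.308788e-02, 3.650000e-26),
  row.names      = c("AT1G01010", "AT1G01060", "AT1G01110", "AT1G01140")
)

result <- df1 %>%
  rownames_to_column("rn") %>%                        # keep gene IDs as a column
  filter(if_any(starts_with("padj"), ~ .x < 0.001))   # TRUE if any padj column is below the cutoff

result$rn
```

Here the rows for AT1G01060 and AT1G01140 are kept; AT1G01110 is dropped because its smallest padj value (1.308788e-02) is still above 0.001.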

Answer 1 (score: 0)

I think you can do this with the which function.

Try this; it worked for me:

# Indices of rows where at least one padj column is below 0.001
i <- which(data$padj < 0.001 | data$padj.1 < 0.001 | data$padj.2 < 0.001 | data$padj.3 < 0.001)
data[i, ]
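
Listing every column by hand gets tedious as more padj columns are added. A base R sketch that selects the columns by name might look like the following. It is an illustration under assumptions, not the answerer's code: the data frame is a small stand-in built from four rows of the question's data, and the pattern "^padj" assumes all adjusted p-value columns start with "padj".

```r
# Stand-in for the question's data (four rows, padj columns plus one other column)
data <- data.frame(
  log2FoldChange = c(-0.006572090, 0.047175423, 0.013833213, -0.005671191),
  padj           = c(9.998981e-01, 9.998981e-01, 9.998981e-01, 9.998981e-01),
  padj.1         = c(9.999662e-01, 8.393350e-01, 3.916769e-01, 2.749158e-01),
  padj.2         = c(8.309090e-01, 2.485552e-01, 5.970020e-01, 2.890000e-18),
  padj.3         = c(5.682979e-01, 4.481910e-04, 1.308788e-02, 3.650000e-26),
  row.names      = c("AT1G01010", "AT1G01060", "AT1G01110", "AT1G01140")
)

# Names of all columns that start with "padj"
padj_cols <- grep("^padj", names(data), value = TRUE)

# Count, per row, how many padj values fall below the cutoff;
# na.rm = TRUE means a missing value never counts as a hit.
keep <- rowSums(data[, padj_cols, drop = FALSE] < 0.001, na.rm = TRUE) > 0

filtered <- data[keep, ]
rownames(filtered)
```

Unlike the dplyr pipeline, plain row indexing keeps the row names, so no extra column is needed for the gene IDs.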