I have a list of 3 data frames (DvE, DvS, EvS):
> str(Table.list2)
List of 3
$ DvE:'data.frame': 18482 obs. of 4 variables:
..$ gene : Factor w/ 18482 levels "c10000_g1_i3|m.32237",..: 1 2 3 4 5 6 7 8 9 10 ...
..$ FDR : num [1:18482] 0.502 0.982 0.936 0.411 0.461 ...
..$ log2FC : num [1:18482] 0.415 -0.245 0.728 -0.384 0.474 ...
..$ annotation: Factor w/ 4939 levels "","[Genbank](myosin heavy-chain) kinase [Calothrix sp. PCC 6303] ",..: 1 2204 2980 2204 1 2204 4622 2980 1 241 ...
$ DvS:'data.frame': 18482 obs. of 4 variables:
..$ gene : Factor w/ 18482 levels "c10000_g1_i3|m.32237",..: 1 2 3 4 5 6 7 8 9 10 ...
..$ FDR : num [1:18482] 1.25e-01 7.18e-01 2.02e-01 2.72e-13 6.02e-01 ...
..$ log2FC : num [1:18482] -0.417 0.583 2.148 1.689 -0.167 ...
..$ annotation: Factor w/ 4939 levels "","[Genbank](myosin heavy-chain) kinase [Calothrix sp. PCC 6303] ",..: 1 2204 2980 2204 1 2204 4622 2980 1 241 ...
$ EvS:'data.frame': 18482 obs. of 4 variables:
..$ gene : Factor w/ 18482 levels "c10000_g1_i3|m.32237",..: 1 2 3 4 5 6 7 8 9 10 ...
..$ FDR : num [1:18482] 1.78e-03 6.04e-01 4.09e-01 3.42e-19 3.20e-02 ...
..$ log2FC : num [1:18482] -0.832 0.828 1.42 2.073 -0.641 ...
..$ annotation: Factor w/ 4939 levels "","[Genbank](myosin heavy-chain) kinase [Calothrix sp. PCC 6303] ",..: 1 2204 2980 2204 1 2204 4622 2980 1 241 ...
All 3 data frames have a similar structure, e.g.:
> head(Table.list2$DvE)
gene FDR log2FC annotation
1 c10000_g1_i3|m.32237 0.5024600 0.4149066
2 c10000_g1_i4|m.32240 0.9818297 -0.2449509 [Pfam]Calcium-activated chloride channel
3 c10000_g1_i4|m.32242 0.9361868 0.7277203 [Pfam]LSM domain
4 c10000_g1_i5|m.32244 0.4114795 -0.3835745 [Pfam]Calcium-activated chloride channel
5 c10000_g1_i6|m.32245 0.4605157 0.4739777
6 c10000_g1_i6|m.32246 0.4965353 -0.4607749 [Pfam]Calcium-activated chloride channel
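What I would like to do is, in each data frame, take the rows with FDR < 0.05 and log2FC > 0 and put them into a new data frame, then take the rows with FDR < 0.05 and log2FC < 0 and put them into another data frame.
So from the list of 3 data frames, I will get 6 new data frames named: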
DvE.+
DvE.-
DvS.+
DvS.-
EvS.+
EvS.-
Example output of DvE.+:
gene FDR log2FC annotation
47 c10010_g1_i4|m.32346 8.609296e-15 1.9188013 [Genbank]conserved unknown protein [Ectocarpus siliculosus]
48 c10010_g1_i4|m.32348 5.625766e-09 1.8240089 [Genbank]hypothetical protein THAOC_07134 [Thalassiosira oceanica]
155 c10037_g1_i4|m.32582 2.666894e-02 0.6669399 [Pfam]LETM1-like protein
211 c10050_g2_i2|m.32706 8.154555e-03 1.6900611 [Genbank]hypothetical protein SELMODRAFT_84252 [Selaginella moellendorffii]
243 c10057_g1_i1|m.32812 1.936893e-02 0.8141790 [Pfam]Fibrinogen alpha/beta chain family
265 c10061_g4_i2|m.32861 3.614401e-02 1.7059034 [Pfam]Maf1 regulator
I wonder whether there is a more elegant way (such as a loop) to do all of this, rather than repeatedly writing out similar commands?
Update
I tried doing this:
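but I got the following warnings:
Warning messages:
1: In assign(paste(i, ".+", sep = ""), value = pos) :
  only the first element is used as variable name
2: In assign(paste(i, ".-", sep = ""), value = neg) :
  only the first element is used as variable name
3: In assign(paste(i, ".+", sep = ""), value = pos) :
  only the first element is used as variable name
4: In assign(paste(i, ".-", sep = ""), value = neg) :
  only the first element is used as variable name
5: In assign(paste(i, ".+", sep = ""), value = pos) :
  only the first element is used as variable name
6: In assign(paste(i, ".-", sep = ""), value = neg) :
  only the first element is used as variable name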
Answer 0 (score: 0)
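Not tested:
library(dplyr)  # provides filter()
# lapply() keeps the list names (DvE, DvS, EvS), so results can be indexed by name
alldf <- lapply(Table.list2, function(df) {
  # each element of the result is a list of two filtered data frames
  list(
    pos = filter(df, FDR < 0.05 & log2FC > 0),  # significant, positive log2FC
    neg = filter(df, FDR < 0.05 & log2FC < 0)   # significant, negative log2FC
  )
})
# e.g. alldf$DvE$pos corresponds to "DvE.+" and alldf$DvE$neg to "DvE.-"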
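A note on the warnings in the question: assign() reports "only the first element is used as variable name" whenever its first argument has length greater than one. That suggests (this is a guess, since it depends on how the loop was written) that i held a whole data frame rather than its name, so paste() returned a vector. A minimal base R sketch that loops over the names instead and collects the six results in one flat named list; the toy Table.list2, the helper make_df and the name split.list are made up for illustration:

```r
# Toy stand-in for Table.list2 (values are invented for illustration only)
make_df <- function(fdr, lfc) {
  data.frame(gene = paste0("g", seq_along(fdr)), FDR = fdr, log2FC = lfc,
             stringsAsFactors = FALSE)
}
Table.list2 <- list(
  DvE = make_df(c(0.50, 0.01, 0.03), c( 0.4, 1.9, -0.7)),
  DvS = make_df(c(0.01, 0.02, 0.90), c(-0.4, 2.1,  1.7)),
  EvS = make_df(c(0.04, 0.60, 0.03), c(-0.8, 0.8,  1.4))
)

# Loop over the *names*, so paste0() always receives a single string
split.list <- list()
for (nm in names(Table.list2)) {
  df <- Table.list2[[nm]]
  # subset() drops rows where the condition is FALSE or NA
  split.list[[paste0(nm, ".+")]] <- subset(df, FDR < 0.05 & log2FC > 0)
  split.list[[paste0(nm, ".-")]] <- subset(df, FDR < 0.05 & log2FC < 0)
}

names(split.list)      # the six names: DvE.+, DvE.-, DvS.+, DvS.-, EvS.+, EvS.-
split.list[["DvE.+"]]  # rows of DvE with FDR < 0.05 and log2FC > 0
```

Keeping the results in a list is usually easier than creating six separate variables, because names such as DvE.+ are not syntactic in R and would need backticks every time they are used. If separate variables are really wanted, list2env(split.list, envir = globalenv()) would create them.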