Select columns based on a row's values

Time: 2014-05-15 16:43:42

Tags: r syntax subset

I want to split the columns of a data frame in R into 2 data frames (a and b), based on the number in the first row.

My input, dat:

    NE001457   NE001458   NE001494   NE001532   NE001533   NE001542
        1          0          1          1          1          1
 0.22951556  0.39403959  0.11835655  0.40844591 0.28442078 0.24236144
 0.04566349  0.14491037  0.03184582  0.0819644  0.18298301 -0.0339111
 0.04543686  0.05876655  0.23562874 -0.0717529 -0.0320517  -0.0851137

My expected output, a:

     NE001457     NE001494   NE001532   NE001533   NE001542
        1             1          1          1          1
  0.22951556    0.11835655  0.40844591 0.28442078 0.24236144
  0.04566349    0.03184582  0.0819644  0.18298301 -0.0339111
  0.04543686   0.23562874  -0.0717529 -0.0320517  -0.0851137

My expected output, b:

   NE001458
       0          
  0.39403959  
  0.14491037  
  0.05876655  

I have tried something like a <- subset(dat, dat[1,] == 1) and b <- subset(dat, dat[1,] == 0), but it doesn't work.

Any ideas?

Cheers!

1 answer:

Answer 0: (score: 1)

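You can select the columns with, for example, a <- dat[, (dat[1,]) == 1]. The only trick is that when the selection ends up extracting a single column, the result is no longer a data frame, so you have to convert it back and reset the column name.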

datatxt <- "
NE001457   NE001458   NE001494   NE001532   NE001533   NE001542
 1          0          1          1          1          1
0.22951556  0.39403959  0.11835655  0.40844591 0.28442078 0.24236144 
0.04566349  0.14491037  0.03184582  0.0819644  0.18298301 -0.0339111
0.04543686  0.05876655  0.23562874 -0.0717529 -0.0320517  -0.0851137"

df <- read.table(text=datatxt,header=TRUE)

df
#    NE001457   NE001458   NE001494   NE001532   NE001533   NE001542
#1 1.00000000 0.00000000 1.00000000  1.0000000  1.0000000  1.0000000
#2 0.22951556 0.39403959 0.11835655  0.4084459  0.2844208  0.2423614
#3 0.04566349 0.14491037 0.03184582  0.0819644  0.1829830 -0.0339111
#4 0.04543686 0.05876655 0.23562874 -0.0717529 -0.0320517 -0.0851137

# Logical mask of the columns whose first-row value is 1, and its complement
acols <- (df[1,] == 1)
bcols <- !acols

acols
#  NE001457 NE001458 NE001494 NE001532 NE001533 NE001542
#1     TRUE    FALSE     TRUE     TRUE     TRUE     TRUE

bcols
#  NE001457 NE001458 NE001494 NE001532 NE001533 NE001542
#1    FALSE     TRUE    FALSE    FALSE    FALSE    FALSE

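# Selecting a single column returns a plain vector, so wrap the result in
# as.data.frame() and restore the column names from the original data frame
a <- as.data.frame(df[,acols])
colnames(a) <- colnames(df)[acols]
b <- as.data.frame(df[,bcols])
colnames(b) <- colnames(df)[bcols]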

a
#    NE001457   NE001494   NE001532   NE001533   NE001542
#1 1.00000000 1.00000000  1.0000000  1.0000000  1.0000000
#2 0.22951556 0.11835655  0.4084459  0.2844208  0.2423614
#3 0.04566349 0.03184582  0.0819644  0.1829830 -0.0339111
#4 0.04543686 0.23562874 -0.0717529 -0.0320517 -0.0851137

b
#    NE001458
#1 0.00000000
#2 0.39403959
#3 0.14491037
#4 0.05876655
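
As a possible alternative to the as.data.frame() and colnames() steps above, here is a minimal sketch that relies on the drop = FALSE argument of data frame indexing, which keeps the result a data frame (with its column name) even when only one column is selected. It also shows why the subset() call in the question did not behave as hoped: the second argument of subset() filters rows, while columns are chosen through its select argument. The names is_one and b2 are illustrative and not part of the original answer.

```r
# Same example data as in the answer, built inline so the snippet is self-contained.
df <- read.table(header = TRUE, text = "
NE001457 NE001458 NE001494 NE001532 NE001533 NE001542
1 0 1 1 1 1
0.22951556 0.39403959 0.11835655 0.40844591 0.28442078 0.24236144
0.04566349 0.14491037 0.03184582 0.0819644 0.18298301 -0.0339111
0.04543686 0.05876655 0.23562874 -0.0717529 -0.0320517 -0.0851137")

# Logical vector: TRUE for columns whose first-row value is 1.
# (is_one is an illustrative name, not from the original answer.)
is_one <- unlist(df[1, ]) == 1

# drop = FALSE keeps a data frame, with its column name, even for one column.
a <- df[, is_one, drop = FALSE]
b <- df[, !is_one, drop = FALSE]

# subset() filters rows with its second argument; columns go through `select`.
b2 <- subset(df, select = !is_one)

names(a)
names(b)
names(b2)
```

With this approach there is no need to reset the column names afterwards, because the single-column result never collapses to a vector.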