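R - How to subset based on grepping row names

Time: 2012-01-22 06:07:18

Tags: r dataframe rows

I have a data frame, creatively named "stats", that looks like the one shown below.

In pseudo-R code, I would like to do this: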

justMeans<-stats[rowname(stats)=="CD.Mean*",] 

where * is a wildcard.

I have also tried the following:

justMeans<-stats[substr(names(stats),1,7)=="CD.Mean"),]

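...which not only does not work, but also makes me realize that I lack a basic understanding of what is going on. I have been at this for hours! Please help! ;O)

Bob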


>stats

           icntr     iexpt angle overlap        stat0
CD.Mean        1   1R50_50     0       0  100.0074705
CD.Max         1   1R50_50     0       0   102.265565
CD.Min         1   1R50_50     0       0    97.540612
CD.Sigma       1   1R50_50     0       0   1.44676377
CD.Mean1       2   1R50_50    30       0   99.9647655
CD.Max1        2   1R50_50    30       0  102.1616205
CD.Min1        2   1R50_50    30       0   97.6584145
CD.Sigma1      2   1R50_50    30       0   1.43740901
CD.Mean2       3   1R50_50    45       0    99.966388
CD.Max2        3   1R50_50    45       0    106.46566
CD.Min2        3   1R50_50    45       0   94.2393295
CD.Sigma2      3   1R50_50    45       0   3.59254625
CD.Mean3       4   1R50_40     0      10   100.012901
CD.Max3        4   1R50_40     0      10    101.82303
CD.Min3        4   1R50_40     0      10   98.1111155
CD.Sigma3      4   1R50_40     0      10  1.109652465
CD.Mean4       5   1R50_40    30      10    99.999638
CD.Max4        5   1R50_40    30      10   101.840065
CD.Min4        5   1R50_40    30      10   98.0084015
CD.Sigma4      5   1R50_40    30      10  1.170049515
CD.Mean5       6   1R50_40    45      10   99.9709865
CD.Max5        6   1R50_40    45      10   102.388835
CD.Min5        6   1R50_40    45      10     97.63445
CD.Sigma5      6   1R50_40    45      10  1.340972695
CD.Mean6       7   1R50_30     0      20  100.0440445
CD.Max6        7   1R50_30     0      20   101.311025
CD.Min6        7   1R50_30     0      20    98.697445
CD.Sigma6      7   1R50_30     0      20  0.785208705
CD.Mean7       8   1R50_30    30      20  100.0201235
CD.Max7        8   1R50_30    30      20   101.538165
CD.Min7        8   1R50_30    30      20    98.417954
CD.Sigma7      8   1R50_30    30      20   0.94661223
CD.Mean8       9   1R50_30    45      20  100.0167915
CD.Max8        9   1R50_30    45      20  101.5269425
CD.Min8        9   1R50_30    45      20   98.4979645
CD.Sigma8      9   1R50_30    45      20  0.940915119
CD.Mean9      10  1R100_75     0      25  100.0645345
CD.Max9       10  1R100_75     0      25    104.51514
CD.Min9       10  1R100_75     0      25   95.8851895
CD.Sigma9     10  1R100_75     0      25    2.6710193
CD.Mean10     11  1R100_75    30      25  100.0337035
CD.Max10      11  1R100_75    30      25     104.5674
CD.Min10      11  1R100_75    30      25   93.5928325
CD.Sigma10    11  1R100_75    30      25    3.5593778
CD.Mean11     12  1R100_75    45      25  100.1049655
CD.Max11      12  1R100_75    45      25   118.187185
CD.Min11      12  1R100_75    45      25    83.948139
CD.Sigma11    12  1R100_75    45      25    11.668272
CD.Mean12     13 1R100_100     0       0  100.0499555
CD.Max12      13 1R100_100     0       0   101.648892
CD.Min12      13 1R100_100     0       0    98.417499
CD.Sigma12    13 1R100_100     0       0 1.0151079265
CD.Mean13     14 1R100_100    30       0  100.1393825
CD.Max13      14 1R100_100    30       0   123.641395
CD.Min13      14 1R100_100    30       0    80.930049
CD.Sigma13    14 1R100_100    30       0    14.127094
CD.Mean14     15 1R100_140     0      60   100.079064
CD.Max14      15 1R100_140     0      60   100.753091
CD.Min14      15 1R100_140     0      60    99.389116
CD.Sigma14    15 1R100_140     0      60  0.423668595
CD.Mean15     16 1R100_140    30      60  100.0650495
CD.Max15      16 1R100_140    30      60   101.310065
CD.Min15      16 1R100_140    30      60   98.7794605
CD.Sigma15    16 1R100_140    30      60   0.76266793
CD.Mean16     17 1R100_150     0      50  100.0795465
CD.Max16      17 1R100_150     0      50   100.868755
CD.Min16      17 1R100_150     0      50   99.2802315
CD.Sigma16    17 1R100_150     0      50 0.5030329375
CD.Mean17     18 1R100_150    30      50   100.060051
CD.Max17      18 1R100_150    30      50   101.919065
CD.Min17      18 1R100_150    30      50   98.4232085
CD.Sigma17    18 1R100_150    30      50   0.99587342
CD.Mean18     19 1R100_150    45      50  100.0583935
CD.Max18      19 1R100_150    45      50   103.077655
CD.Min18      19 1R100_150    45      50    95.523467
CD.Sigma18    19 1R100_150    45      50    2.1692677
CD.Mean19     20 1R100_160     0      40  100.0773445
CD.Max19      20 1R100_160     0      40   101.637125
CD.Min19      20 1R100_160     0      40     98.18457
CD.Sigma19    20 1R100_160     0      40  0.948741865
CD.Mean20     21 1R100_160    30      40  100.0551155
CD.Max20      21 1R100_160    30      40   101.796255
CD.Min20      21 1R100_160    30      40   98.4833945
CD.Sigma20    21 1R100_160    30      40  0.985182275
CD.Mean21     22 1R100_160    45      40    99.982039
CD.Max21      22 1R100_160    45      40    107.18366
CD.Min21      22 1R100_160    45      40    90.728452
CD.Sigma21    22 1R100_160    45      40    5.4489308
CD.Mean22     23 1R100_200     0       0  100.0499555
CD.Max22      23 1R100_200     0       0   101.648892
CD.Min22      23 1R100_200     0       0    98.417499
CD.Sigma22    23 1R100_200     0       0 1.0151079265
.
.
.

3 Answers:

Answer 0: (score: 9)

stats[grepl("CD.Mean*", rownames(stats)), ]
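A note on the pattern above: grepl() treats its first argument as a regular expression, not a shell-style wildcard. In "CD.Mean*" the dot matches any character and the * applies only to the preceding "n", so the call works here mainly because grepl() matches anywhere in the string. The sketch below uses a toy data frame (only two columns and four rows copied from the question, purely for illustration) and shows an anchored pattern, plus glob2rx() from the utils package for converting a wildcard into a regular expression.

```r
# Toy subset of the question's data; the real 'stats' has more rows and columns.
stats <- data.frame(
  icntr = c(1, 1, 2, 2),
  stat0 = c(100.0074705, 102.265565, 99.9647655, 102.1616205),
  row.names = c("CD.Mean", "CD.Max", "CD.Mean1", "CD.Max1")
)

# "^" anchors the match at the start of the row name;
# "\\." matches a literal dot instead of any character.
justMeans <- stats[grepl("^CD\\.Mean", rownames(stats)), ]

# Alternative: let glob2rx() translate the wildcard form into a regex.
pattern <- utils::glob2rx("CD.Mean*")
sameMeans <- stats[grepl(pattern, rownames(stats)), ]

print(justMeans)
```

Both subsets should contain only the rows whose names begin with "CD.Mean".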

Answer 1: (score: 3)

Here is an example:

> head(d)
         icntr   iexpt angle overlap      stat0
CD.Mean      1 1R50_50     0       0 100.007470
CD.Max       1 1R50_50     0       0 102.265565
CD.Min       1 1R50_50     0       0  97.540612
CD.Sigma     1 1R50_50     0       0   1.446764
CD.Mean1     2 1R50_50    30       0  99.964765
CD.Max1      2 1R50_50    30       0 102.161620

> d[grep("^CD\\.Mean.*", rownames(d)), ]
          icntr     iexpt angle overlap     stat0
CD.Mean       1   1R50_50     0       0 100.00747
CD.Mean1      2   1R50_50    30       0  99.96477
CD.Mean2      3   1R50_50    45       0  99.96639
CD.Mean3      4   1R50_40     0      10 100.01290
CD.Mean4      5   1R50_40    30      10  99.99964
CD.Mean5      6   1R50_40    45      10  99.97099
CD.Mean6      7   1R50_30     0      20 100.04404
CD.Mean7      8   1R50_30    30      20 100.02012
CD.Mean8      9   1R50_30    45      20 100.01679
CD.Mean9     10  1R100_75     0      25 100.06453
CD.Mean10    11  1R100_75    30      25 100.03370
CD.Mean11    12  1R100_75    45      25 100.10497
CD.Mean12    13 1R100_100     0       0 100.04996
CD.Mean13    14 1R100_100    30       0 100.13938
CD.Mean14    15 1R100_140     0      60 100.07906
CD.Mean15    16 1R100_140    30      60 100.06505
CD.Mean16    17 1R100_150     0      50 100.07955
CD.Mean17    18 1R100_150    30      50 100.06005
CD.Mean18    19 1R100_150    45      50 100.05839
CD.Mean19    20 1R100_160     0      40 100.07734
CD.Mean20    21 1R100_160    30      40 100.05512
CD.Mean21    22 1R100_160    45      40  99.98204
CD.Mean22    23 1R100_200     0       0 100.04996

substr takes a subset of a string, while grep is, grep is..., what does grep stand for?
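To make the difference concrete, here is a small sketch on a plain character vector (the vector rn is made up for illustration and uses row names of the same shape as in the question). substr() extracts characters by position, grep() returns the integer positions of the matching elements, and grepl() returns a logical vector; either of the last two can be used as a row index.

```r
# Hypothetical vector of row names, shaped like those in the question.
rn <- c("CD.Mean", "CD.Max", "CD.Mean1", "CD.Max1")

# substr(): positional extraction, characters 1 through 7 of each element.
firstSeven <- substr(rn, 1, 7)

# grep(): integer indices of the elements that match the pattern.
idx <- grep("^CD\\.Mean", rn)

# grepl(): one TRUE/FALSE per element.
hit <- grepl("^CD\\.Mean", rn)

print(firstSeven)  # "CD.Mean" "CD.Max" "CD.Mean" "CD.Max1"
print(idx)         # 1 3
print(hit)         # TRUE FALSE TRUE FALSE
```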

Answer 2: (score: 1)

You were almost there: names(stats) gives the column names of the data frame, not the row names. What you want is

justMeans <- stats[substr(row.names(stats), 1, 7) == "CD.Mean", ]
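
As a quick check of the names() versus row.names() point, here is a minimal sketch on a toy data frame (three rows and two columns taken from the question, for illustration only). It also uses startsWith(), which is available in base R from version 3.3.0 onward, so it postdates this thread and is offered only as an alternative to the substr() comparison.

```r
# Toy subset of the question's data, for illustration only.
stats <- data.frame(
  icntr = c(1, 1, 2),
  angle = c(0, 0, 30),
  row.names = c("CD.Mean", "CD.Max", "CD.Mean1")
)

colNames <- names(stats)      # column names: "icntr" "angle"
rowNames <- row.names(stats)  # row names: "CD.Mean" "CD.Max" "CD.Mean1"

# startsWith() tests a fixed prefix, so no regex escaping is needed
# (requires R >= 3.3.0).
justMeans <- stats[startsWith(rowNames, "CD.Mean"), ]

print(justMeans)
```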