Run calculations on specific columns in a dataframe by looping

Time: 2018-01-23 19:41:55

Tags: r loops dataframe subset

Here is an example of the data:

  Test.Statistic          P     FDR_P Bonferroni_P Control_mean NH4._mean
1       8.203199 0.01654619 0.7405529            1         0.00  0.000000
2       7.622793 0.02211727 0.7405529            1         0.00  1.095238
3       7.501205 0.02350357 0.7405529            1         2.10  1.761905
4       6.510000 0.03858082 0.7405529            1         0.85  0.000000
5       6.149339 0.04620490 0.7405529            1         0.65  5.095238
6       6.052381 0.04850005 0.7405529            1         0.00  1.428571
  NO3._mean
1 0.4285714
2 1.1904762
3 1.1428571
4 0.0000000
5 3.4285714
6 0.0000000

I want to apply the formula (trt_mean / control_mean) - 1 to each treatment column (NH4._mean and NO3._mean). I incorporated some comments, but I am still having trouble referring to column 1 (Control_mean) in dt.

dt <- as.data.frame.table(kw_res)
cols <- grep("_mean", colnames(dt))
rel_abund_function <- function(z) {
  return((z / z[, 1])-1)
}

dt[, lapply(cols, rel_abund_function)]

Any suggestions?

1 answer:

Answer 0 (score: 0)

Something like this, perhaps:

Data

> head(dt)
   ctrl1 ctrl2 ctrl3 ctrl4 ctrl5 treatment1_mean treatment2_mean treatment3_mean treatment4_mean treatment5_mean rawval
1:  21.0     6   160   110  3.90           2.620           16.46               0               1               4      4
2:  21.0     6   160   110  3.90           2.875           17.02               0               1               4      4
3:  22.8     4   108    93  3.85           2.320           18.61               1               1               4      1
4:  21.4     6   258   110  3.08           3.215           19.44               1               0               3      1
5:  18.7     8   360   175  3.15           3.440           17.02               0               0               3      2
6:  18.1     6   225   105  2.76           3.460           20.22               1               0               3      1

Code

Get the columns with a specific name format (_mean in this example) and apply a custom function:

cols <- grep("_mean", colnames(dt))
my_mean_func <- function(z) {
  return((z - mean(z)) / 100)
}
dt[, lapply(.SD, my_mean_func), .SDcols = cols]

Output

> head(dt[, lapply(.SD, my_mean_func), .SDcols = cols])
   treatment1_mean treatment2_mean treatment3_mean treatment4_mean treatment5_mean
1:      -0.0059725      -0.0138875       -0.004375       0.0059375        0.003125
2:      -0.0034225      -0.0082875       -0.004375       0.0059375        0.003125
3:      -0.0089725       0.0076125        0.005625       0.0059375        0.003125
4:      -0.0000225       0.0159125        0.005625      -0.0040625       -0.006875
5:       0.0022275      -0.0082875       -0.004375      -0.0040625       -0.006875
6:       0.0024275       0.0237125        0.005625      -0.0040625       -0.006875
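The same column-selection idea can be pointed at the formula from the question, (trt_mean / control_mean) - 1. In the question's attempt, lapply(cols, ...) passes each column index (an integer) to the function, so z[, 1] has nothing to index. A minimal base R sketch follows, assuming the control column is named Control_mean as in the printed data. The helper rel_change and the "_rel" suffix are hypothetical names chosen for illustration, and the data frame is rebuilt from the six rows shown in the question.

```r
# Sample rows copied from the question's printed data (only the *_mean columns).
dt <- data.frame(
  Control_mean = c(0.00, 0.00, 2.10, 0.85, 0.65, 0.00),
  NH4._mean    = c(0.000000, 1.095238, 1.761905, 0.000000, 5.095238, 1.428571),
  NO3._mean    = c(0.4285714, 1.1904762, 1.1428571, 0.0000000, 3.4285714, 0.0000000)
)

# Treatment columns: every column ending in "_mean" except the control.
trt_cols <- setdiff(grep("_mean$", colnames(dt), value = TRUE), "Control_mean")

# Relative change versus control. The control vector is passed in explicitly,
# so the function never has to index into the data frame itself.
rel_change <- function(trt, control) (trt / control) - 1

# Add one new column per treatment ("_rel" suffix is a hypothetical naming choice).
dt[paste0(trt_cols, "_rel")] <- lapply(dt[trt_cols], rel_change,
                                       control = dt$Control_mean)

print(dt)
```

Rows where Control_mean is 0 come out as Inf or NaN, because R does not raise an error on division by zero; whether to filter those rows or replace them with NA depends on the analysis.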
