My data frame looks like this:
quant_final_means <- data.frame( exposure_time_factor = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("200ms", "500ms"), class = "factor"),
protein_factor = c("background", "background", "EpQ_11_prot_0.25", "EpQ_11_prot_0.25", "EpQ_11_prot_0.5", "EpQ_11_prot_0.5", "EpQ_11_prot_1", "EpQ_11_prot_1", "rK39_prot_0.01", "rK39_prot_0.01", "rK39_prot_0.1", "rK39_prot_0.1", "serum", "serum", "background", "background", "EpQ_11_prot_0.25", "EpQ_11_prot_0.25", "EpQ_11_prot_0.5", "EpQ_11_prot_0.5", "EpQ_11_prot_1", "EpQ_11_prot_1", "rK39_prot_0.01", "rK39_prot_0.01", "rK39_prot_0.1", "rK39_prot_0.1", "serum", "serum"),
serum_factor = c("NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL"),
avg_fluorescence = c(24139.615, 25796.83875, 24242.2557142857, 26019.7985714286, 25369.1971428571, 30682.4342857143, 26148.9542857143, 29101.9914285714, 24121.2328571429, 32350.1428571429, 24142.0014285714, 62122.6628571429, 57192.968, 53372.702, 40067.6985714286, 38922.4814285714, 40243.0528571429, 38932.78, 42290.35, 48867.015, 43334.3925, 46181.4542857143, 40383.8257142857, 57257.7614285714, 40378.8071428571, 65535, 65535, 65524.968) )
Basically, what I want to do is create another column (called avg_fluorescence_minus_background) in which I subtract the background value (which depends on exposure_time_factor and serum_factor) from each row's avg_fluorescence.
For example, for the third row (exposure_time_factor=="200ms" and serum_factor=="NEHC") I would get 24242.26 - 24139.62 = 102.64. For the fourth row (exposure_time_factor=="200ms" and serum_factor=="VL") I would get 26019.80 - 25796.84 = 222.96, and so on for all rows of the table.
Starting with exposure_time_factor=="200ms", I tried the following code:
quant_final_means %>% filter(exposure_time_factor=="200ms") %>% mutate(avg_fluorescence_minus_background = ifelse(test = serum_factor=="NEHC", yes = avg_fluorescence - (filter(protein_factor=="background", serum_factor=="NEHC")) %>% select(avg_fluorescence)), no = avg_fluorescence - (filter(protein_factor=="background", serum_factor=="VL")) %>% select(avg_fluorescence))
But when I try to run this code I get the following error message:
Error in mutate_impl(.data, dots) :
no applicable method for 'filter_' applied to an object of class "logical"
A solution using dplyr or data.table would be welcome.
Answer 0 (score: 2)
We can group by serum_factor and then create the column, subtracting each group's background value:
library(dplyr)
quant_final_means %>%
filter(exposure_time_factor=="200ms") %>%
group_by(serum_factor) %>%
mutate(avg_fluorescence_minus_background = avg_fluorescence -
avg_fluorescence[protein_factor=='background'])
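The same grouping idea can probably be extended to cover both exposure times at once, instead of filtering to 200ms first, by also grouping on exposure_time_factor. This is a minimal sketch, not part of the original answer. It assumes each (exposure_time_factor, serum_factor) group contains exactly one background row, as in the data above. The data frame `df` is a hypothetical small subset of quant_final_means, with values copied from the question, used only to keep the example self-contained.

```r
library(dplyr)

# Hypothetical small subset of quant_final_means (values copied from the question)
df <- data.frame(
  exposure_time_factor = factor(c("200ms", "200ms", "200ms", "200ms",
                                  "500ms", "500ms", "500ms", "500ms")),
  protein_factor = c("background", "background",
                     "EpQ_11_prot_0.25", "EpQ_11_prot_0.25",
                     "background", "background",
                     "EpQ_11_prot_0.25", "EpQ_11_prot_0.25"),
  serum_factor = c("NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL"),
  avg_fluorescence = c(24139.615, 25796.83875, 24242.2557142857, 26019.7985714286,
                       40067.6985714286, 38922.4814285714, 40243.0528571429, 38932.78)
)

result <- df %>%
  # One group per exposure time / serum combination
  group_by(exposure_time_factor, serum_factor) %>%
  # Subtract that group's background value from every row in the group
  # (assumes exactly one background row per group)
  mutate(avg_fluorescence_minus_background =
           avg_fluorescence - avg_fluorescence[protein_factor == "background"]) %>%
  ungroup()
```

With this grouping the background rows themselves come out as 0, and the third row matches the 102.64 worked out in the question.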
Or spread to 'wide' format, where the subtraction is straightforward, and finally convert back to 'long' format with gather:
library(dplyr)
library(tidyr)
quant_final_means %>%
filter(exposure_time_factor=="200ms") %>%
spread(serum_factor, avg_fluorescence) %>%
mutate_at(vars('NEHC', 'VL'), funs(. - .[protein_factor=='background'])) %>%
gather(serum_factor, avg_fluorescence, NEHC:VL)
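Since the question also mentions data.table, a grouped by-reference update might look roughly like the following. This is a sketch only, not part of the original answer. It assumes exactly one background row per (exposure_time_factor, serum_factor) group. The table `dt` is a hypothetical small subset of quant_final_means with values copied from the question.

```r
library(data.table)

# Hypothetical small subset of quant_final_means (values copied from the question)
dt <- data.table(
  exposure_time_factor = factor(c("200ms", "200ms", "200ms", "200ms",
                                  "500ms", "500ms", "500ms", "500ms")),
  protein_factor = c("background", "background",
                     "EpQ_11_prot_0.25", "EpQ_11_prot_0.25",
                     "background", "background",
                     "EpQ_11_prot_0.25", "EpQ_11_prot_0.25"),
  serum_factor = c("NEHC", "VL", "NEHC", "VL", "NEHC", "VL", "NEHC", "VL"),
  avg_fluorescence = c(24139.615, 25796.83875, 24242.2557142857, 26019.7985714286,
                       40067.6985714286, 38922.4814285714, 40243.0528571429, 38932.78)
)

# Add the new column by reference, computing it within each
# exposure time / serum group (assumes one background row per group)
dt[, avg_fluorescence_minus_background :=
       avg_fluorescence - avg_fluorescence[protein_factor == "background"],
   by = .(exposure_time_factor, serum_factor)]
```

Because `:=` modifies `dt` in place, no reassignment is needed; the row order of the table is left unchanged.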