How to subtract a mean from each number and then average a subset of rows

Time: 2018-02-15 23:18:34

Tags: r

I have data like this

df<- structure(list(X1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), X2 = structure(c(1L, 2L, 3L, 
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 
18L, 19L, 20L, 21L, 22L, 23L, 24L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 
19L, 20L, 21L, 22L, 23L, 24L), .Label = c("B02", "B03", "B04", 
"B05", "B06", "B07", "C02", "C03", "C04", "C05", "C06", "C07", 
"D02", "D03", "D04", "D05", "D06", "D07", "G02", "G03", "G04", 
"G05", "G06", "G07"), class = "factor"), X3 = c(0.005648642, 
0.005876389, 0.00592532, 0.006244456, 0.005987075, 0.006075874, 
0.006198667, 0.006003758, 0.006041885, 0.006186987, 0.006041323, 
0.006071594, 0.005902391, 0.005976096, 0.00593805, 0.005866524, 
0.0059831, 0.005902586, 0.005914309, 0.005887304, 0.006054509, 
0.005931266, 0.005936195, 0.005895191, 0.005840959, 0.005849247, 
0.005808851, 0.005833586, 0.005825153, 0.00584873, 0.005983976, 
0.00598669, 0.006011548, 0.005997747, 0.005851022, 0.005919044, 
0.005854566, 0.0058226, 0.00578052, 0.005784874, 0.005933198, 
0.005996407, 0.005898848, 0.00595775, 0.005918857, 0.005882898, 
0.005877808, 0.005803604, 0.006235161, 0.005808725)), .Names = c("X1", 
"X2", "X3"), class = "data.frame", row.names = c(NA, -50L))

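I am trying to take the mean of a few values, subtract it from every value in this data, and then take the mean of specific values

This is what I did

First I tried to get the mean of "G05", "G06", "G07" for each set (X1), and then subtract it from every value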

library(dplyr)
df2 <- df %>%
  filter(X2 %in% paste0("G0", 5:7)) %>%
  group_by(X1) %>%
  summarise_at(vars(-X2), mean)

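which should give me two numbers, one for group 1 and one for group 2 (based on X1)

mean(c(0.005931266,0.005936195,0.005895191)) [1] 0.005920884

mean(c(0.005803604,0.006235161,0.005808725)) [1] 0.005949163

Then I want to subtract this value from every number in group 1 and group 2, according to its group. For example,

0.005648642 - 0.005920884 ... 0.005840959 - 0.005949163

Put simply

1- We get the mean of G05, G06 and G07 for both groups, where X1 is 1 or 2

For example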

mean(c(0.005931266,0.005936195,0.005895191)) [1] 0.005920884

mean(c(0.005803604,0.006235161,0.005808725)) [1] 0.005949163

2- We subtract these means from every number, for example

0.005648642- 0.005920884 
.
.
.
.
0.005840959- 0.005949163
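
Steps 1 and 2 could be sketched in base R roughly as follows. This is only an illustration under stated assumptions: `toy` is a hypothetical ten-row subset of the rows shown above (B02, B03, G05, G06, G07 for each X1), and `is_ref`, `ref` and `X3c` are names introduced here, not part of the original data.

```r
# Hypothetical subset of the question's data: B02, B03, G05-G07 for each X1.
toy <- data.frame(
  X1 = rep(1:2, each = 5),
  X2 = rep(c("B02", "B03", "G05", "G06", "G07"), times = 2),
  X3 = c(0.005648642, 0.005876389, 0.005931266, 0.005936195, 0.005895191,
         0.005808851, 0.005833586, 0.005803604, 0.006235161, 0.005808725)
)

# Step 1: mean of the G05, G06, G07 rows within each X1 group.
is_ref <- toy$X2 %in% paste0("G0", 5:7)
ref <- tapply(toy$X3[is_ref], toy$X1[is_ref], mean)  # named "1" and "2"

# Step 2: subtract each row's own group mean; X3c is an illustrative column name.
toy$X3c <- toy$X3 - as.numeric(ref[as.character(toy$X1)])
```

Looking the reference mean up by group name keeps the subtraction correct even if the rows are not sorted by X1.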

3- After this correction, I want to average specific rows for both groups

For example

B02 and B03 for both groups

mean(c(0.005648642 - 0.005920884, 0.005876389 - 0.005920884))

mean(c(0.005808851 - 0.005949163, 0.005833586 - 0.005949163))

1 answer:

Answer 0: (score: 1)

Is this what you're after?

  1. Steps 1 and 2:

    Split on X1 (i.e. group by X1) and mean-centre the values in X3 based on the mean of G05, G06, G07:

    lst <- lapply(split(df, df$X1), function(w) {
        w.G0567 <- subset(w, grepl("G0[567]", w$X2));
        print(mean(w.G0567$X3));
        w$X3 <- w$X3 - mean(w.G0567$X3);
        return(w); })
    #[1] 0.005920884
    #[1] 0.005949163

    lst;
    #$`1`
    #   X1  X2           X3
    #1   1 B02 -0.000272242
    #2   1 B03 -0.000044495
    #3   1 B04  0.000004436
    #4   1 B05  0.000323572
    #5   1 B06  0.000066191
    #6   1 B07  0.000154990
    #7   1 C02  0.000277783
    #8   1 C03  0.000082874
    #9   1 C04  0.000121001
    #10  1 C05  0.000266103
    #11  1 C06  0.000120439
    #12  1 C07  0.000150710
    #13  1 D02 -0.000018493
    #14  1 D03  0.000055212
    #15  1 D04  0.000017166
    #16  1 D05 -0.000054360
    #17  1 D06  0.000062216
    #18  1 D07 -0.000018298
    #19  1 G02 -0.000006575
    #20  1 G03 -0.000033580
    #21  1 G04  0.000133625
    #22  1 G05  0.000010382
    #23  1 G06  0.000015311
    #24  1 G07 -0.000025693
    #
    #$`2`
    #   X1  X2            X3
    #25  2 C02 -1.082043e-04
    #26  2 C03 -9.991633e-05
    #27  2 B02 -1.403123e-04
    #28  2 B03 -1.155773e-04
    #29  2 B04 -1.240103e-04
    #30  2 B05 -1.004333e-04
    #31  2 B06  3.481267e-05
    #32  2 B07  3.752667e-05
    #33  2 C02  6.238467e-05
    #34  2 C03  4.858367e-05
    #35  2 C04 -9.814133e-05
    #36  2 C05 -3.011933e-05
    #37  2 C06 -9.459733e-05
    #38  2 C07 -1.265633e-04
    #39  2 D02 -1.686433e-04
    #40  2 D03 -1.642893e-04
    #41  2 D04 -1.596533e-05
    #42  2 D05  4.724367e-05
    #43  2 D06 -5.031533e-05
    #44  2 D07  8.586667e-06
    #45  2 G02 -3.030633e-05
    #46  2 G03 -6.626533e-05
    #47  2 G04 -7.135533e-05
    #48  2 G05 -1.455593e-04
    #49  2 G06  2.859977e-04
    #50  2 G07 -1.404383e-04

  2. Step 3:

    For every group, the mean of the mean-centred X3 values of B02 and B03.
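
One way step 3 might be written, as a minimal sketch rather than the answerer's own code: it assumes a hypothetical ten-row subset of the question's data called `toy`, rebuilds a centred list in the same split-by-X1 style, and then averages the B02 and B03 rows per group. The names `toy` and `res` are introduced here for illustration.

```r
# Hypothetical subset of the question's data: B02, B03, G05-G07 for each X1.
toy <- data.frame(
  X1 = rep(1:2, each = 5),
  X2 = rep(c("B02", "B03", "G05", "G06", "G07"), times = 2),
  X3 = c(0.005648642, 0.005876389, 0.005931266, 0.005936195, 0.005895191,
         0.005808851, 0.005833586, 0.005803604, 0.006235161, 0.005808725)
)

# Steps 1 and 2: split by X1 and centre X3 on the group's G05-G07 mean.
lst <- lapply(split(toy, toy$X1), function(w) {
  w$X3 <- w$X3 - mean(w$X3[grepl("G0[567]", w$X2)])
  w
})

# Step 3: per group, mean of the centred X3 values for B02 and B03.
res <- sapply(lst, function(w) mean(w$X3[grepl("B0[23]", w$X2)]))
res
```

`res` is a named numeric vector with one entry per X1 group, which can be compared against the two hand-written expressions for B02 and B03 in the question.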