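I hope someone here can help me. I am trying to learn R for my work.
I have been tracking the growth over time of plants (called ecotypes) treated with either a mock solution or Xcc bacteria. I have 2 different experiments (done at different times), and after image processing I obtained the Area.
For each ecotype, and for each treatment within each experiment (Manip), I would like to compute Normalized_Area = Area(t1) / Area(t0), that is, the area at each time point divided by that ecotype's area at the start of the experiment (t0). Each plant has a different area at time 0, and the experiments have different start dates. (The Normalized_Area column shows an example of the expected result.)
Please find my df below: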
# A tibble: 24 x 6
Manip Traitment Ecotype Date Area Normalized_Area
<dbl> <chr> <chr> <dttm> <dbl> <dbl>
1 1 mock a1-2 2017-12-12 00:00:00 17699 1
2 1 mock a1-2 2017-12-13 00:00:00 24538 1.39
3 1 mock a1-2 2017-12-14 00:00:00 27958 1.58
4 1 xcc a1-2 2017-12-12 00:00:00 19857 1
5 1 xcc a1-2 2017-12-13 00:00:00 27973 1.41
6 1 xcc a1-2 2017-12-14 00:00:00 35875 1.81
7 2 mock a1-2 2018-03-20 00:00:00 18177 1
8 2 mock a1-2 2018-03-21 00:00:00 20251 1.11
9 2 mock a1-2 2018-03-23 00:00:00 36679 2.02
10 2 xcc a1-2 2018-03-20 00:00:00 17261 1
11 2 xcc a1-2 2018-03-21 00:00:00 18697 1.08
12 2 xcc a1-2 2018-03-23 00:00:00 35345 2.05
13 1 mock a1-10 2017-12-12 00:00:00 22853 1
14 1 mock a1-10 2017-12-13 00:00:00 34641 1.52
15 1 mock a1-10 2017-12-14 00:00:00 40311 1.76
16 1 xcc a1-10 2017-12-12 00:00:00 23754 1
17 1 xcc a1-10 2017-12-13 00:00:00 33247 1.40
18 1 xcc a1-10 2017-12-14 00:00:00 40603 1.71
19 2 mock a1-10 2018-03-20 00:00:00 28201 1
20 2 mock a1-10 2018-03-21 00:00:00 30306 1.07
21 2 mock a1-10 2018-03-23 00:00:00 49086 1.74
22 2 xcc a1-10 2018-03-20 00:00:00 27217 1
23 2 xcc a1-10 2018-03-21 00:00:00 29844 1.10
24 2 xcc a1-10 2018-03-23 00:00:00 46540 1.71
I wrote some code using for loops, but it raises some errors, and I would like to turn it into more readable code using dplyr:
date_debut=c("2017-12-12", "2018-03-20") # starting_time
data$Normalized_Area = NA
for (manips in levels(as.factor(data$Manip))) {            # for each experiment (Manip)
  for (ecoty in levels(as.factor(data$Ecotype))) {         # for each ecotype
    for (traity in levels(as.factor(data$Traitement))) {   # for each treatment
      for (dd in levels(as.factor(date_debut))) {          # for each start date
        # temporary subset for this treatment / ecotype / experiment
        tmp = subset(data, subset = c(Traitement == traity & Ecotype == ecoty & Manip == manips))
        if (dim(tmp)[1] != 0) {
          #tmp = ordered(tmp$date[1:length(tmp$date-1)])
          # compute Area mean at D=0 for each Experiment
          if (dd %in% as.character(tmp$Date) != F) {
            A0 = tmp$Area[as.character(tmp$Date) == dd]    # select A0 in tmp$Area corresponding to dd
            Norm_Area = tmp$Area / A0
            data$Normalized_Area[data$Traitement == traity & data$Ecotype == ecoty & data$Manip == manips] = Norm_Area
          }
        }
      }
    }
  }
}
Here is the start of my new code, but I am stuck:
gpeData %>%
group_by(Traitement, Ecotype, Manip ) %>%
mutate_( Normalized_Area = Area / Area[wich(Date %in% date_debut)] )
Does anyone know how to do this? I apologize for the hard-to-read code, but I am learning on my own. Best regards
Answer 0 (score: 0)
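You were very close to solving the problem yourself. Here is my solution: I used which.min
to find the index of the earliest date within each group, and then used that index in the calculation.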
gpeData<-structure(list(Manip = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
Traitment = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L
), .Label = c("mock", "xcc"), class = "factor"), Ecotype = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("a1-10", "a1-2"
), class = "factor"), Date = structure(c(1513036800, 1513123200,
1513209600, 1513036800, 1513123200, 1513209600, 1521504000,
1521590400, 1521763200, 1521504000, 1521590400, 1521763200,
1513036800, 1513123200, 1513209600, 1513036800, 1513123200,
1513209600, 1521504000, 1521590400, 1521763200, 1521504000,
1521590400, 1521763200), class = c("POSIXct", "POSIXt"), tzone = "GMT"),
Area = c(17699L, 24538L, 27958L, 19857L, 27973L, 35875L,
18177L, 20251L, 36679L, 17261L, 18697L, 35345L, 22853L, 34641L,
40311L, 23754L, 33247L, 40603L, 28201L, 30306L, 49086L, 27217L,
29844L, 46540L), Normalized_Area = c(1, 1.39, 1.58, 1, 1.41,
1.81, 1, 1.11, 2.02, 1, 1.08, 2.05, 1, 1.52, 1.76, 1, 1.4,
1.71, 1, 1.07, 1.74, 1, 1.1, 1.71)), row.names = c(NA, -24L
), class = "data.frame")
library(dplyr)

ans <- gpeData %>%
  group_by(Traitment, Ecotype, Manip) %>%
  # NormArea: area on the earliest date of each group (t0); Normalized: Area / NormArea
  mutate(NormArea = Area[which.min(Date)], Normalized = Area / NormArea)
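
If the start of an experiment is not guaranteed to be the earliest date in every group, a variant that looks up t0 from the question's date_debut vector might look like the sketch below. The toy data frame is a hypothetical cut-down copy of the question's data (one ecotype, one treatment), and taking only the first match with [1] is an assumption that each group contains at most one start date.

```r
library(dplyr)

# Hypothetical miniature of the question's data: one ecotype, one treatment, two experiments.
toy <- data.frame(
  Manip     = c(1, 1, 1, 2, 2, 2),
  Traitment = "mock",
  Ecotype   = "a1-2",
  Date      = as.POSIXct(c("2017-12-12", "2017-12-13", "2017-12-14",
                           "2018-03-20", "2018-03-21", "2018-03-23"), tz = "GMT"),
  Area      = c(17699, 24538, 27958, 18177, 20251, 36679)
)

# Start dates of the two experiments, as given in the question.
date_debut <- as.Date(c("2017-12-12", "2018-03-20"))

res <- toy %>%
  group_by(Traitment, Ecotype, Manip) %>%
  # Logical indexing picks the row whose date is a known start date;
  # [1] keeps a single value (NA if the group has no start date).
  mutate(Normalized_Area = Area / Area[as.Date(Date) %in% date_debut][1]) %>%
  ungroup()

print(round(res$Normalized_Area, 2))
```

With this toy data the rounded values should agree with the Normalized_Area column shown in the question for the mock / a1-2 rows. A group that contains none of the listed start dates ends up with NA rather than an error.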