I tried to create reproducible data following the advice in How to make a great R reproducible example?:
structure(list(ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("ANK.26.1",
"ANK.35.10"), class = "factor"), DAY = c(2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L,
18L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L), carbon = c(1684.351094778, 3514.451339358,
6635.877888654, 10301.700591252, 11361.360992769, 11891.934331254,
12772.885869486, 13545.127224369, 14022.00520767, 14255.045990397,
14479.813468278, 14611.749542181, 14746.382638335, 14942.733363567,
14961.338739162, 15049.433738817, 15047.197961499, 1705.361701104,
3293.593040601, 4788.872254899, 6025.622715999, 6670.80499518,
7150.526272512, 7268.955557607, 7513.61998338, 7896.202773246,
8017.953574608, 8146.09464786, 8286.148260324, 8251.229520243,
8384.244997158, 8413.034235219, 8461.066691601, 8269.360979031
), g.rate.perc = c(NA, 1.08653133557123, 0.888168948119852,0.55242467750436,
0.102862667394628, 0.0466998046116733, 0.0740797513417739, 0.060459426536321,
0.0352066079115925, 0.0166196474238596, 0.0157675729725753, 0.00911172469120847,
0.00921402983026387, 0.0133151790542558, 0.00124511193115184,
0.00588817626489591, -0.000148562222127446, NA, 0.931316411333049,
0.45399634862756, 0.258255053647507, 0.107073129133681, 0.0719135513148148,
0.0165623173150578, 0.0336588143694119, 0.0509185706373581,0.0154189051191185,
0.0159817679236518, 0.0171927308137518, -0.00421410998016991,
0.0161206856006937, 0.00343373053515927, 0.00570929049366353,
-0.0226573929218994), max.carb = c(15049.433738817, 15049.433738817,
15049.433738817, 15049.433738817, 15049.433738817, 15049.433738817,
15049.433738817, 15049.433738817, 15049.433738817, 15049.433738817,
15049.433738817, 15049.433738817, 15049.433738817, 15049.433738817,
15049.433738817, 15049.433738817, 15049.433738817, 8461.066691601,
8461.066691601, 8461.066691601, 8461.066691601, 8461.066691601,
8461.066691601, 8461.066691601, 8461.066691601, 8461.066691601,
8461.066691601, 8461.066691601, 8461.066691601, 8461.066691601,
8461.066691601, 8461.066691601, 8461.066691601, 8461.066691601
)), .Names = c("ID", "DAY", "carbon", "g.rate.perc", "max.carb"
), row.names = c(NA, 34L), class = "data.frame")
'data.frame': 34 obs. of 5 variables:
$ ID : Factor w/ 150 levels "ANK.26.1","ANK.35.10",..: 1 1 1 1 1 1 1 1 1 1 ...
$ DAY : int 2 3 4 5 6 7 8 9 10 11 ...
$ carbon : num 1684 3514 6636 10302 11361 ...
$ g.rate.perc: num NA 1.087 0.888 0.552 0.103 ...
$ max.carb : num 15049 15049 15049 15049 15049 ...
In the sample data, ID has only two levels, not the 150 indicated in the str() output.
My nls call looks like this:
res.sample <- ddply (
d, .(ID),
function(x){
mod <- nls(carbon~phi1/(1+exp(-(phi2 + phi3 * DAY))),
start=list(
phi1 = x$max.carb,
phi2 = int[1],
phi3 = mean(x$g.rate.perc)),
data=x,trace=TRUE)
return(coef(mod))
}
)
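phi2 is actually the intercept taken from
int <- coef(lm(DAY~carbon,data=sample))
Unfortunately it stopped working once I tried to wrap it in ddply, and I cannot go through all 150 levels of the original ID by hand.
On top of that, I would like to store all three output values phi1-phi3 in a data frame/list together with the corresponding ID. I intended to do that with
return(coef(mod))
The cherry on top would be a plot of the actual data with the fitted curve overlaid. I can do this by subsetting manually, but that is far too time-consuming. My reduced ggplot code is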
ggplot(data=n, aes(x = DAY, y = carbon))+
geom_point(stat="identity", size=2) +
geom_line( aes(DAY,predict(logMod) ))+
ggtitle("ID")
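If the ID, which packs three pieces of information into one string, is not very useful as it is, here is how to split it into a separate version
sep_sample <- sample %>% separate(ID, c("algae", "id", "nutrient"))
I feel like this is a lot to ask, but I have really tried, and I can only spend so many days on this.
Here is the summary:
I need to run the model on each level of ID, i.e. on each combination of algae and nutrient if ID is separated.
The output phis should be stored in some kind of frame/list/table, each identified by the group it belongs to.
Ideally there is a way to include the ggplots in all this, so that they are also generated and stored automatically.
As I said, the model itself already works, but when I put it into the ddply structure I get the following error message: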
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model
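A possible reading of this error, offered as a sketch rather than a confirmed diagnosis: nls() expects every start value to be a single finite number. In the call above, phi1 = x$max.carb passes the whole column, and mean(x$g.rate.perc) is NA because the first g.rate.perc of each ID is NA. The toy data frame below is made up and only mirrors the column names from the question.

```r
# Toy data standing in for one ID; the values are invented for illustration.
x <- data.frame(
  DAY         = 2:6,
  g.rate.perc = c(NA, 1.09, 0.89, 0.55, 0.10),
  max.carb    = 15049.43
)

mean(x$g.rate.perc)                # NA, because the first entry is NA
mean(x$g.rate.perc, na.rm = TRUE)  # a single finite number
length(x$max.carb)                 # 5: one value per row, not a scalar

# Scalar, finite start values that nls() can work with
start <- list(
  phi1 = max(x$max.carb),
  phi2 = 0,                        # placeholder; the question uses int[1]
  phi3 = mean(x$g.rate.perc, na.rm = TRUE)
)
stopifnot(all(is.finite(unlist(start))), all(lengths(start) == 1))
```

Whether these particular start values let the fit converge on the real data is a separate matter; the point is only that each one is a single non-missing number.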
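I hope this is something you can work with, and that it is a reasonable question. If there are pages that already offer a similar solution that I did not find, I would be happy to take a look.
Thanks a lot!
Answer 0 (score: 0)
So I came up with this solution. It is not quite what I wanted, but I think it is a step closer, since it runs.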
coef_list <- list()
curve_list <- list()
for(i in levels(d$ALGAE)) {
for(j in levels(d$NUTRIENT)) {
dat = d[d$ALGAE == i & d$NUTRIENT == j,]
#int <- coef(lm(DAY~carbon,data=dat))
mod <- nls(carbonlog~phi1/(1+exp(-(phi2+phi3*DAY))),
start=list(
phi1=9.364,
phi2=0,
phi3= 0.135113),
data=dat,trace=TRUE)
coef_list[[paste(i, j, sep = "_")]] = coef(mod)
plt <- ggplot(data = dat, aes(x = DAY, y = carbonlog)) + geom_point()+
geom_line( aes(DAY,predict(mod) ))+
ggtitle(paste(i,"RATIO",j,sep=" ")) +
theme.plot
curve_list[[paste(i, j, sep = "_")]] = plt
}
}
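Unfortunately the start parameters are static and do not depend on the respective factor combination. I assume the latter would be more helpful for finding a good fit.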
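One way to make the start values depend on each group, sketched under assumptions: the self-starting model SSlogis() in the stats package uses the same logistic curve with a different parameterisation, Asym / (1 + exp((xmid - DAY) / scal)), so its initial estimates convert to phi1 = Asym, phi2 = -xmid / scal and phi3 = 1 / scal. This swaps in getInitial() for the lm() intercept used in the question. The data are synthetic, and make_group, start_values and fit_group are hypothetical helper names.

```r
# Synthetic two-group data generated from the logistic model;
# group names and parameter values are invented for illustration.
set.seed(1)
make_group <- function(id, phi1, phi2, phi3) {
  DAY <- 2:18
  mu  <- phi1 / (1 + exp(-(phi2 + phi3 * DAY)))
  data.frame(ID = id, DAY = DAY,
             carbon = mu * exp(rnorm(length(DAY), sd = 0.02)))
}
d <- rbind(make_group("A", 15000, -3, 0.8), make_group("B", 8400, -2, 0.6))

# Hypothetical helper: start values estimated from the group's own data
start_values <- function(x) {
  ini <- getInitial(carbon ~ SSlogis(DAY, Asym, xmid, scal), data = x)
  list(phi1 = ini[["Asym"]],
       phi2 = -ini[["xmid"]] / ini[["scal"]],
       phi3 = 1 / ini[["scal"]])
}

# Fit one group; return NA coefficients instead of aborting on a failed fit
fit_group <- function(x) {
  tryCatch(
    coef(nls(carbon ~ phi1 / (1 + exp(-(phi2 + phi3 * DAY))),
             data = x, start = start_values(x))),
    error = function(e) c(phi1 = NA_real_, phi2 = NA_real_, phi3 = NA_real_)
  )
}

# One row of coefficients per ID
coefs <- do.call(rbind, lapply(split(d, d$ID), fit_group))
coef_table <- data.frame(ID = rownames(coefs), coefs, row.names = NULL)
coef_table
```

The tryCatch() wrapper is a design choice for running many groups: a group that fails to converge shows up as a row of NA values rather than stopping the whole run.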
If I call
curve_list[["ANK_1"]]
I get an error message:
Error: Aesthetics must be either length 1 or the same as the data (17): x, y
I only get this message when I use the log-transformed carbon values. When I use carbon in its original form, everything is plotted.
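One possible cause, assuming carbonlog contains NA or NaN in some rows: nls() drops incomplete rows under the default na.action, so predict(mod) returns fewer values than dat has rows, and the aesthetic lengths no longer match. The sketch below reproduces the length mismatch with made-up data and base R only.

```r
set.seed(42)
# Made-up data for one group; one response value is set to NA on purpose.
dat <- data.frame(DAY = 2:18)
dat$carbonlog <- 9.4 / (1 + exp(-(-2 + 0.5 * dat$DAY))) + rnorm(17, sd = 0.05)
dat$carbonlog[3] <- NA

mod <- nls(carbonlog ~ phi1 / (1 + exp(-(phi2 + phi3 * DAY))),
           data = dat, start = list(phi1 = 9.4, phi2 = -2, phi3 = 0.5))

length(predict(mod))                 # 16: the NA row was dropped during fitting
length(predict(mod, newdata = dat))  # 17: one prediction per row of dat

# Storing the predictions as a column keeps them aligned with the data
dat$fit <- predict(mod, newdata = dat)
```

With the predictions stored in dat, the line layer could map y to that column, for example geom_line(aes(DAY, fit)), instead of calling predict(mod) inside aes().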