Fitting a custom function through a set of data points in R

Time: 2017-04-12 13:20:27

Tags: r ggplot2

I have some data that fits the function y = a / (b + x) + c well, as shown in the plot below, which was made with Python.

[Figure: the CV2 vs. Mean data plotted with Python]

The plotted values are the CV2 and Mean columns of the following data:

peptide,glycan,Step,CV,CV2,Mean
MIgGI,H3N4F1,A,0.0202,2.02,38.7611
MIgGI,H4N4F1,A,0.0204,2.04,32.366
MIgGI,H4N4F1G1,A,0.0591,5.91,9.4399
MIgGI,H5N4F1,A,0.0233,2.33,6.8238
MIgGI,H5N4F1G1,A,0.0567,5.67,7.145
MIgGI,H5N4F1G2,A,0.1202,12.02,2.2624
MIgGI,H6N4F1,A,0.0815,8.15,0.9127
MIgGI,H6N4F1G1,A,0.1017,10.17,2.289
MIgGII,H3N4F1,A,0.0213,2.13,27.5121
MIgGII,H4N4F1,A,0.012,1.2,50.7153
MIgGII,H4N4F1G1,A,0.1161,11.61,4.7289
MIgGII,H5N4F1,A,0.057,5.7,9.7925
MIgGII,H5N4F1G1,A,0.0775,7.75,7.2511
MIgGIII,H3N4F1,A,0.0109,1.09,38.7333
MIgGIII,H4N4F1,A,0.0108,1.08,33.4383
MIgGIII,H4N4F1G1,A,0.0289,2.89,7.5136
MIgGIII,H5N4F1,A,0.019,1.9,9.7476
MIgGIII,H5N4F1G1,A,0.0333,3.33,6.8778
MIgGIII,H5N4F1G2,A,0.0725,7.25,2.5125
MIgGIII,H6N4F1,A,0.1009,10.09,0.5561
MIgGIII,H6N4F1G1,A,0.0964,9.64,0.6207
MIgGI,H3N4F1,B,0.013,1.3,38.9716
MIgGI,H4N4F1,B,0.0113,1.13,32.6984
MIgGI,H4N4F1G1,B,0.0306,3.06,9.2867
MIgGI,H5N4F1,B,0.0144,1.44,6.9923
MIgGI,H5N4F1G1,B,0.0372,3.72,7.2527
MIgGI,H5N4F1G2,B,0.084,8.4,2.0331
MIgGI,H6N4F1,B,0.0729,7.29,0.8519
MIgGI,H6N4F1G1,B,0.068,6.8,1.9135
MIgGII,H3N4F1,B,0.0154,1.54,27.9812
MIgGII,H4N4F1,B,0.009,0.9,50.5831
MIgGII,H4N4F1G1,B,0.0626,6.26,4.6042
MIgGII,H5N4F1,B,0.027,2.7,9.7946
MIgGII,H5N4F1G1,B,0.0673,6.73,7.0369
MIgGIII,H3N4F1,B,0.0063,0.63,38.9712
MIgGIII,H4N4F1,B,0.0058,0.58,33.3185
MIgGIII,H4N4F1G1,B,0.0142,1.42,7.4533
MIgGIII,H5N4F1,B,0.0111,1.11,9.7274
MIgGIII,H5N4F1G1,B,0.0203,2.03,6.8541
MIgGIII,H5N4F1G2,B,0.046,4.6,2.4587
MIgGIII,H6N4F1,B,0.071,7.1,0.5977
MIgGIII,H6N4F1G1,B,0.0557,5.57,0.6191
MIgGI,H3N4F1,C,0.0105,1.05,38.9007
MIgGI,H4N4F1,C,0.0103,1.03,32.9509
MIgGI,H4N4F1G1,C,0.0286,2.86,9.1911
MIgGI,H5N4F1,C,0.0157,1.57,7.1153
MIgGI,H5N4F1G1,C,0.0313,3.13,7.1339
MIgGI,H5N4F1G2,C,0.0824,8.24,1.9618
MIgGI,H6N4F1,C,0.0805,8.05,0.8522
MIgGI,H6N4F1G1,C,0.0601,6.01,1.8941
MIgGII,H3N4F1,C,0.0112,1.12,27.9775
MIgGII,H4N4F1,C,0.0079,0.79,50.6428
MIgGII,H4N4F1G1,C,0.0315,3.15,4.6341
MIgGII,H5N4F1,C,0.0178,1.78,9.8879
MIgGII,H5N4F1G1,C,0.0378,3.78,6.8578
MIgGIII,H3N4F1,C,0.0074,0.74,38.9393
MIgGIII,H4N4F1,C,0.0073,0.73,33.4493
MIgGIII,H4N4F1G1,C,0.0205,2.05,7.4305
MIgGIII,H5N4F1,C,0.0201,2.01,9.7543
MIgGIII,H5N4F1G1,C,0.022,2.2,6.8209
MIgGIII,H5N4F1G2,C,0.0507,5.07,2.4011
MIgGIII,H6N4F1,C,0.0699,6.99,0.5916
MIgGIII,H6N4F1G1,C,0.0636,6.36,0.613
MIgGI,H3N4F1,D,0.0161,1.61,38.6871
MIgGI,H4N4F1,D,0.0116,1.16,32.5154
MIgGI,H4N4F1G1,D,0.0321,3.21,9.4093
MIgGI,H5N4F1,D,0.0164,1.64,7.0342
MIgGI,H5N4F1G1,D,0.0436,4.36,7.3668
MIgGI,H5N4F1G2,D,0.089,8.9,2.1486
MIgGI,H6N4F1,D,0.069,6.9,0.8602
MIgGI,H6N4F1G1,D,0.0591,5.91,1.9785
MIgGII,H3N4F1,D,0.0088,0.88,27.794
MIgGII,H4N4F1,D,0.0065,0.65,50.5292
MIgGII,H4N4F1G1,D,0.0588,5.88,4.6524
MIgGII,H5N4F1,D,0.029,2.9,9.93
MIgGII,H5N4F1G1,D,0.0265,2.65,7.0944
MIgGIII,H3N4F1,D,0.0144,1.44,38.8735
MIgGIII,H4N4F1,D,0.0119,1.19,33.2681
MIgGIII,H4N4F1G1,D,0.0361,3.61,7.5291
MIgGIII,H5N4F1,D,0.0143,1.43,9.6721
MIgGIII,H5N4F1G1,D,0.0427,4.27,6.9168
MIgGIII,H5N4F1G2,D,0.1004,10.04,2.5116
MIgGIII,H6N4F1,D,0.0627,6.27,0.5986
MIgGIII,H6N4F1G1,D,0.1028,10.28,0.6303
MIgGI,H3N4F1,E,0.0075,0.75,38.5785
MIgGI,H4N4F1,E,0.0069,0.69,32.7503
MIgGI,H4N4F1G1,E,0.0104,1.04,9.2578
MIgGI,H5N4F1,E,0.014,1.4,7.1626
MIgGI,H5N4F1G1,E,0.0232,2.32,7.2941
MIgGI,H5N4F1G2,E,0.0574,5.74,2.0376
MIgGI,H6N4F1,E,0.0605,6.05,0.8892
MIgGI,H6N4F1G1,E,0.0399,3.99,2.0299
MIgGII,H3N4F1,E,0.0122,1.22,27.9317
MIgGII,H4N4F1,E,0.0067,0.67,50.6464
MIgGII,H4N4F1G1,E,0.0305,3.05,4.6096
MIgGII,H5N4F1,E,0.0259,2.59,9.9381
MIgGII,H5N4F1G1,E,0.045,4.5,6.8741
MIgGIII,H3N4F1,E,0.0054,0.54,38.954
MIgGIII,H4N4F1,E,0.0054,0.54,33.2632
MIgGIII,H4N4F1G1,E,0.0121,1.21,7.4881
MIgGIII,H5N4F1,E,0.0128,1.28,9.7186
MIgGIII,H5N4F1G1,E,0.0157,1.57,6.8945
MIgGIII,H5N4F1G2,E,0.0237,2.37,2.4624
MIgGIII,H6N4F1,E,0.0557,5.57,0.5882
MIgGIII,H6N4F1G1,E,0.0561,5.61,0.6311
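
The plotting code further down uses a data frame named `data` without showing how it was created. A minimal sketch of one way to load the table, assuming it is available as CSV text (only the first four rows are repeated here, and `csv_text` is a name made up for this illustration):

```r
# First rows of the table above; the full data set has many more rows.
csv_text <- "peptide,glycan,Step,CV,CV2,Mean
MIgGI,H3N4F1,A,0.0202,2.02,38.7611
MIgGI,H4N4F1,A,0.0204,2.04,32.366
MIgGI,H4N4F1G1,A,0.0591,5.91,9.4399
MIgGI,H5N4F1,A,0.0233,2.33,6.8238"

# Parse the CSV text into a data frame; numeric columns are detected automatically.
data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Show the column types to confirm CV2 and Mean are numeric.
str(data)
```

With a file on disk, `read.csv("path/to/file.csv")` would be the equivalent call.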

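However, I am trying to produce the same kind of plot with R's ggplot, and I can't work out how to define the function inside stat_smooth. This is what I have been trying: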

# Work in progress
ggplot(data=data, aes(x=Mean, y=CV2)) +
    geom_point() +
    stat_smooth(method='nls',formula=y~(a/(b+x))+c) +
    labs(title = "RA vs CV (All analytes)") +
    labs(x = "Mean [%]") +
    labs(y = "CV [%]") +
    theme(plot.title = element_text(hjust = 0.5))

This produces the following warning and error:

Warning messages:
1: In (function (formula, data = parent.frame(), start, control = nls.control(),  :
  No starting values specified for some parameters.
Initializing ‘a’, ‘b’ to '1.'.
Consider specifying 'start' or using a selfStart model
2: Computation failed in `stat_smooth()`:
non-numeric argument to binary operator

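As for the first message, I am fine with R using 1 as the initial estimate for a and b, since that is what I did in Python as well. I am surprised that I don't see a warning about the c term. So I adapted an approach from a linked question to set the starting values to (1, 1, 1) manually, as follows: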

# Work in progress
ggplot(data=data, aes(x=Mean, y=CV2)) +
    geom_point() +
    stat_smooth(method='nls',formula=y~(a/(b+x))+c, method.args = list(start = c(a=1, b=1,c=1))) +
    labs(title = "RA vs CV (All analytes)") +
    labs(x = "Mean [%]") +
    labs(y = "CV [%]") +
    theme(plot.title = element_text(hjust = 0.5))

This yields more useful information about the failure, specifically:

Warning message:
Computation failed in `stat_smooth()`:
step factor 0.000488281 reduced below 'minFactor' of 0.000976562 

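In short, I think this is caused by the model trying to fit y = 0. My Python script just issues a warning about this and then happily ignores it. So, if that is indeed the case, I would like to know how to ignore or otherwise handle this kind of problem.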
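
One way to look at the failure outside ggplot2 is to call `nls()` directly and catch any error. The sketch below is only an illustration: it uses the eight MIgGI / Step A rows from the table rather than the full data set, `try_fit` is a helper name invented here, and whether each algorithm converges on this subset is not claimed.

```r
# Eight rows copied from the table above (MIgGI, Step A).
d <- data.frame(
  Mean = c(38.7611, 32.366, 9.4399, 6.8238, 7.145, 2.2624, 0.9127, 2.289),
  CV2  = c(2.02, 2.04, 5.91, 2.33, 5.67, 12.02, 8.15, 10.17)
)

# Attempt the fit with a given nls algorithm.
# Returns the fitted model on success, or the error message as a string.
try_fit <- function(algorithm) {
  tryCatch(
    nls(CV2 ~ a / (b + Mean) + c,
        data = d,
        start = list(a = 1, b = 1, c = 1),
        algorithm = algorithm),
    error = function(e) conditionMessage(e)
  )
}

# Compare the default Gauss-Newton algorithm with the "port" algorithm.
fits <- lapply(c(default = "default", port = "port"), try_fit)

for (name in names(fits)) {
  fit <- fits[[name]]
  status <- if (inherits(fit, "nls")) "converged" else paste("failed:", fit)
  cat(name, "->", status, "\n")
}
```

If a fit succeeds, `coef(fits$port)` (or `coef(fits$default)`) returns the estimated a, b and c.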

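1 Answer:

Answer 0 (score: 2)

I found that this works if you change the algorithm used to find the solution. You also have to turn off the default standard error calculation. This worked for me with your data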

{{1}}

and gives this plot:

[Figure: the resulting ggplot2 plot]
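
The answer's code did not survive in this copy. The following is a sketch of what the description suggests, not the answerer's original code. It assumes the alternative algorithm is `algorithm = "port"` (the answer does not name it here) and that the standard-error calculation is switched off with `se = FALSE`. As far as I know, `predict()` for `nls` models does not return standard errors, which would explain why the confidence band has to be disabled. The data frame `d` holds only the eight MIgGI / Step A rows from the question.

```r
library(ggplot2)

# Eight rows copied from the question's table (MIgGI, Step A).
d <- data.frame(
  Mean = c(38.7611, 32.366, 9.4399, 6.8238, 7.145, 2.2624, 0.9127, 2.289),
  CV2  = c(2.02, 2.04, 5.91, 2.33, 5.67, 12.02, 8.15, 10.17)
)

p <- ggplot(d, aes(x = Mean, y = CV2)) +
  geom_point() +
  stat_smooth(
    method = "nls",
    formula = y ~ a / (b + x) + c,
    # Assumption: "port" is the alternative algorithm the answer refers to.
    method.args = list(start = c(a = 1, b = 1, c = 1), algorithm = "port"),
    # Disable the confidence band, which stat_smooth cannot compute for nls fits.
    se = FALSE
  ) +
  labs(title = "RA vs CV (All analytes)", x = "Mean [%]", y = "CV [%]") +
  theme(plot.title = element_text(hjust = 0.5))

print(p)
```

If the fit still fails on a given data set, ggplot2 reports a warning and draws only the points, so different starting values can be tried through `method.args`.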