How do I calculate the root mean square deviation in R?

Time: 2019-11-22 09:59:23

Tags: r aggregate

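I should say up front that I have never used the root mean square deviation before. I am just trying to reproduce what I found in an article.

In short, I have to quantify the noise of a "method" (it is the result of several noise sources, because two instruments are coupled). The noise is measured at three different points outside normal operation, where we know the method measures nothing but noise. In the procedure I am following, you then calculate the standard deviation among these three points and multiply it by the factor 1.96 for the 95th percentile confidence interval (which gives the detection limit of the method).
The time resolution is 30 minutes, so it is the standard deviation among three points, then among the following three points, and so on. I already have a dataset organised this way, and I have already calculated the standard deviation.

The article I am following compares the detection limit calculated with the standard deviation against the one calculated with the root mean square deviation. Since I will eventually use this detection limit to filter my data for noise, I would like to compare, as they did, which method works best in my case.

How can I calculate the root mean square deviation in the same way as the standard deviation? That is, for every three points (three different columns), then for the next three points (the same three columns, but the next row), and so on?

I have already tried the rmse function from the Metrics package, but the problem is that it requires two values: actual and predicted. And, as I did for the standard deviation, I would need to use aggregate to iterate the function over every three columns of each row.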

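EDIT

I am pasting an excerpt from the article below, to try to make my needs clearer.

"For time lags that differ greatly from the true time lag, the standard deviation of the flux provides a measure of the random error affecting the flux" ... "Multiplying this measure of random error by α gives, at a given confidence interval (95th percentile, α = 1.96; 99th percentile, α = 3), an estimate of the measurement precision that can be used as the flux limit of detection (LoD) (i.e. LoDσ = α × REσ)." ... "A modification of the LoDσ method is to calculate the random error based on the root mean squared deviation (RMSE) of the flux from zero, which reflects the variability of the cross-covariance function in these regions and also its [...] from zero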
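
For three flux values x_1, x_2, x_3 measured at the three points, the two random-error measures in the excerpt might be written as below. This is only one reading of the excerpt: it assumes the sample standard deviation (n − 1 denominator, as R's sd() computes it), and the symbols RE_RMSE and LoD_RMSE are notation introduced here by analogy with LoDσ, not taken from the article.

```latex
\begin{align}
\bar{x} &= \tfrac{1}{3}\sum_{i=1}^{3} x_i \\
RE_{\sigma} &= \sqrt{\tfrac{1}{2}\sum_{i=1}^{3} (x_i - \bar{x})^2} \\
RE_{RMSE} &= \sqrt{\tfrac{1}{3}\sum_{i=1}^{3} x_i^2} \\
LoD_{\sigma} &= \alpha \times RE_{\sigma}, \qquad LoD_{RMSE} = \alpha \times RE_{RMSE}
\end{align}
```

Under these definitions RE_RMSE^2 = (2/3) RE_σ^2 + \bar{x}^2, so the RMSE from zero also grows when the three values share a common offset from zero.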

Below I share part of my dataset and the code I tried.

Dataset

LOD_ut <- structure(list(`101_LOD` = c(-0.00647656063436054, 0.00645714072316343,
0.00174533523902105, -0.000354643362187957, -0.000599093190801188, 
0.00086188829059792), `101_LOD.1` = c(0.00380625456526623, -0.00398115037246045, 
0.00158673927930099, -0.00537583996746438, -0.00280048350643599, 
0.00348232298529063), `101_LOD.2` = c(-0.00281100080425964, -0.00335537844222041, 
0.00611652518452308, -0.000738139825060029, 0.00485039477849737, 
0.00412428118507656), `107_LOD` = c(0.00264717678436649, 0.00339296025595841, 
0.00392733001719888, 0.0106686039973083, 0.00886643251752075, 
0.0426091484273961), `107_LOD.1` = c(0.000242380702002215, -0.00116108069669281, 
0.0119784744970561, 0.00380805756323248, 0.00190407945251567, 
0.00199684331869391), `107_LOD.2` = c(-0.0102716279438754, -0.00706528150567528, 
-0.0108745954674186, -0.0122962259781756, -0.00590383880635847, 
-0.00166664119985051), `111_LOD` = c(-0.00174374098054644, 0.00383270191075735, 
-0.00118363208946644, 0.00107908760333878, -9.30127551375776e-05, 
-0.00141500588842743), `111_LOD.1` = c(0.000769378300959002, 
0.00253820252869653, 0.00110643824418424, -0.000338050323261079, 
-0.00313666295753596, 0.0043919374295125), `111_LOD.2` = c(0.000177265973907964, 
0.00199829884609846, -0.000490950219515303, -0.00100263695578483, 
0.00122606902671889, 0.00934018452187161), `113_LOD` = c(0.000997977666838309, 
0.0062400770296875, -0.00153620247996209, 0.00136849054508488, 
-0.00145700847633675, -0.000591288575933268), `113_LOD.1` = c(-0.00114161441697546, 
0.00152607521404826, 0.000811193628975422, -0.000799514037634276, 
-0.000319008435039752, -0.0010086036089075), `113_LOD.2` = c(-0.000722312098377764, 
0.00364767954707251, 0.000547744649351312, 0.000352509651080838, 
-0.000852173274761947, 0.00360487150682726), `135_LOD` = c(-0.00634051802134062, 
0.00426062889500736, 0.00484049067127332, 0.00216220020394825, 
0.00165634168942681, -0.00537970105199375), `135_LOD.1` = c(-0.00209301968088832, 
0.00535855274344209, -0.00119679744329422, 0.0041216882161451, 
0.00512978202611836, 0.0014048506490567), `135_LOD.2` = c(0.00022377545723911, 
0.00400550696583795, 0.00198972253447825, 0.00301341644871015, 
0.00256802839330668, 0.00946109288597202), `137_LOD` = c(-0.0108508893475138, 
-0.0231919072487789, -0.00346546003410657, -0.00154066625155414, 
0.0247266017774909, -0.0254464953061609), `137_LOD.1` = c(-0.00363025194918789, 
-0.00291104074373261, 0.0024998477144967, 0.000877707284759669, 
0.0095477003599792, 0.0501795740749602), `137_LOD.2` = c(0.00930498343499501, 
-0.011839104725282, 0.000274929503053888, 0.000715665078729413, 
0.0145503185102915, 0.0890428314632625), `149_LOD` = c(-0.000194406250680231, 
0.000355157226357547, -0.000353931679163222, 0.000101471293242973, 
-0.000429409422518444, 0.000344585379249552), `149_LOD.1` = c(-0.000494386150759807, 
0.000384907974061922, 0.000582537329068263, -0.000173285705433721, 
-6.92758935962043e-05, 0.00237942557324254), `149_LOD.2` = c(0.000368606958615297, 
0.000432568466833549, 3.33092313366271e-05, 0.000715304544370804, 
-0.000656902381786168, 0.000855422043674721), `155_LOD` = c(-0.000696168382693618, 
-0.000917607266525328, 4.77049670728094e-06, 0.000140297660927979, 
-5.99898679530658e-06, 6.71169142984434e-06), `155_LOD.1` = c(-0.000213644203677328, 
-3.44396001911029e-07, -0.000524232671878577, -0.000830180665933627, 
1.47799998238307e-06, -5.97640014667251e-05), `155_LOD.2` = c(-0.000749882784933487, 
0.000345737159390042, -0.00076916001239521, -0.000135205762575321, 
-2.55352420251723e-06, -3.07199008030628e-05), `31_LOD` = c(-0.00212014938530172, 
0.0247411322547065, -0.00107990654365844, -0.000409195814154659, 
-0.00768439381433953, 0.001860128524035), `31_LOD.1` = c(-0.00248488588195854, 
-0.011146734518705, -0.000167943850441196, -0.0021998906531997, 
0.0166775965182051, -0.0156939303287719), `31_LOD.2` = c(0.00210626277375321, 
-0.00327815351414411, -0.00271043947479133, 0.00118991079627845, 
-0.00838520090692615, 0.0255825346347586), `33_LOD` = c(0.0335175783154054, 
0.0130192144768818, 0.0890608024914352, -0.0142431454793663, 
0.00961009674973182, -0.0429774973256228), `33_LOD.1` = c(0.018600175159935, 
0.04588362587764, 0.0517479021554752, 0.0453766081395813, -0.0483559729403664, 
0.123771869764484), `33_LOD.2` = c(0.01906507758481, -0.00984821669825455, 
0.134177176083007, -0.00544320457445977, 0.0516083894733814, 
-0.0941500564321804), `39_LOD` = c(-0.148517395684098, -0.21311281527214, 
0.112875846920874, -0.134256453140454, 0.0429030528286934, -0.0115143877745049
), `39_LOD.1` = c(-0.0431568202849291, -0.159003698955288, 0.0429009071238143, 
-0.126060096927082, -0.078848020069061, -0.0788748111534866), 
    `39_LOD.2` = c(-0.16276833960171, 0.0236589399437796, 0.0828435027244962, 
    -0.50219849047847, -0.105196237549017, -0.161206838628339
    ), `42_LOD` = c(-0.00643926654994104, -0.0069253267922805, 
    7.63419856289838e-05, -0.0185223126108671, 0.00120855708103566, 
    -0.00275288147011515), `42_LOD.1` = c(-0.000866169150506504, 
    -0.00147791175852563, -0.000670310173141084, -0.00757733007180311, 
    0.0151353172950393, -0.00114193461500327), `42_LOD.2` = c(0.00719928454572906, 
    0.00311615354837406, 0.00270759483782046, -0.0108062423259522, 
    0.00158765505419478, -0.0034831499672973), `45_LOD` = c(0.00557787518897268, 
    0.022337270533665, 0.00657118689440082, -0.00247269227623608, 
    0.0191646343214611, 0.0233090596023039), `45_LOD.1` = c(-0.0305395220788143, 
    0.077105031761457, -0.00101713990356452, 0.0147500116150713, 
    -5.43009569586179e-05, -0.0235006181977403), `45_LOD.2` = c(-0.0216498682456909, 
    -0.0413426968184435, -0.0210779895848601, -0.0147549519865421, 
    0.00305229143870313, -0.0483293292336662), `47_LOD` = c(-0.00467568767221499, 
    -0.0199796182799552, 0.00985966068611855, -0.031010117051163, 
    0.0319279109813341, 0.0350743318265918), `47_LOD.1` = c(0.00820166533285921, 
    -0.00748186905620154, -0.010483251821707, -0.00921919551377505, 
    0.0129546148757833, 0.000223462281435923), `47_LOD.2` = c(0.00172469728530889, 
    0.0181683409295075, 0.00264937907258855, -0.0569837400476351, 
    0.00514558635349483, 0.0963339573489031), `59_LOD` = c(-0.00664210061621158, 
    -0.062069664217766, 0.0104345353700492, 0.0115323589989968, 
    -0.000701276829098035, -0.0397759501000331), `59_LOD.1` = c(-0.00844888486350536, 
    0.0207426674766074, -0.0227755432761471, -0.00370561240222376, 
    0.0152046240483297, -0.0127327412801225), `59_LOD.2` = c(-0.000546590647534814, 
    0.0178115310450356, 0.00776130696191998, 0.00162470375408126, 
    -0.036140754156005, 0.0197791914089296), `61_LOD` = c(0.00797528044191513, 
    -0.00358928087671818, 0.000662870138322471, -0.0412142836466128, 
    -0.00571822580078707, -0.0333870884803465), `61_LOD.1` = c(0.000105849888219735, 
    -0.00694734283847093, -0.00656216592134899, 0.00161225110022219, 
    0.0125744958934939, -0.0178560868664668), `61_LOD.2` = c(0.0049288443167774, 
    0.0059411543659837, -0.00165857112209555, -0.0093669075333705, 
    0.00655185371925189, 0.00516436591134869), `69_LOD` = c(0.0140014747729604, 
    0.0119645827116724, 0.0059880663080946, -0.00339119330845176, 
    0.00406436116298777, 0.00374425148741196), `69_LOD.1` = c(0.00465076983995792, 
    0.00664902297016735, -0.00183936649215524, 0.00496509351837152, 
    -0.0224812403463345, -0.0193087796456654), `69_LOD.2` = c(-0.00934638876711703, 
    -0.00802183076602164, 0.00406752039394799, -0.000421337136630527, 
    -0.00406768983408334, -0.0046016148041856), `71_LOD` = c(-0.00206064862123214, 
    0.0058604630066848, -0.00353440181333921, -0.000305197461077327, 
    0.00266085011303462, -0.00105635261106644), `71_LOD.1` = c(3.66652318354654e-06, 
    0.00542612739642576, 0.000860385212430484, 0.00157520645492044, 
    -0.00280256517377998, -0.00474358065422048), `71_LOD.2` = c(-0.00167098030843413, 
    0.0059622082597603, -0.00121597491543965, -0.000791592953383716, 
    -0.0022790991468459, 0.00508978650148816), `75_LOD` = c(NA, 
    -0.00562613898652477, -0.000103076958936504, -3.76628574664693e-05, 
    -0.000325767611573817, 0.000117404893823389), `75_LOD.1` = c(NA, 
    NA, -0.000496324358203359, -0.000517476831074487, -0.00213096062838051, 
    -0.00111202867609916), `75_LOD.2` = c(NA, NA, -0.000169651845347418, 
    -4.72864955070539e-05, -0.00144880109085214, 0.00421635976535877
    ), `79_LOD` = c(-0.0011901810540199, 0.00731686066269579, 
    0.00538551997145174, -0.00578723012473479, -0.0030246805255648, 
    0.00146141135533218), `79_LOD.1` = c(-0.00424278455960268, 
    -0.010593752642875, 0.0065136497427927, -0.00427355522802769, 
    0.000539975609490915, -0.0206849687839064), `79_LOD.2` = c(-0.00366739576561779, 
    -0.00374066839898667, -0.00132764684703939, -0.00534145222725701, 
    0.00920940542227595, -0.0101871763957068), `85_LOD` = c(-0.0120254177480422, 
    0.00369546541331518, -0.00420718877886963, 0.00414911885475517, 
    -0.00130381692844529, -0.00812757789798261), `85_LOD.1` = c(-0.00302024868281014, 
    0.00537704163310547, 0.00184264538884543, -0.00159032685888543, 
    -0.0062127769817834, 0.00349476605688194), `85_LOD.2` = c(0.0122689407380797, 
    -0.00509605601025503, -0.00641413996554198, 0.000592176121486696, 
    0.00131237912317341, -0.00535018996837309), `87_LOD` = c(0.00613621268007298, 
    0.000410268892659307, -0.00239014321624482, -0.00171179729894864, 
    -0.00107159765522861, -0.00708388174601732), `87_LOD.1` = c(0.00144787264098156, 
    -0.0025946273860992, -0.00194897899110034, 0.00157863310440493, 
    -0.0048913305554607, -0.000585669821053749), `87_LOD.2` = c(-0.00224691693198253, 
    -0.00277315666829267, 0.00166487067514155, -0.00173757960229744, 
    -0.00362252480121682, -0.0101992979591839), `93_LOD` = c(-0.0234225447373586, 
    0.0390095666365413, 0.00606244490932179, 0.0264258422783391, 
    0.0161211132913951, -0.0617678157059), `93_LOD.1` = c(-0.0124876313221369, 
    -0.0309636779639578, 0.00610883313140442, -0.0192442672220773, 
    0.0129557286224975, -0.00869066964782635), `93_LOD.2` = c(-0.0219837540560547, 
    -0.00521242297372905, 0.0179965615561871, 0.0081370991723329, 
    1.45427765512579e-06, -0.0111199632179688), `99_LOD` = c(0.00412086456443205, 
    -0.00259940538393106, 0.00742537463584133, -0.00302091572866969, 
    -0.00320466045653491, -0.00168702410433936), `99_LOD.1` = c(0.00280546156134205, 
    -0.00472591065687533, 0.00518402193979284, -0.00130887074314965, 
    0.00148769905391341, 0.00366250488078969), `99_LOD.2` = c(-0.00240469207099292, 
    -9.57307699040024e-05, -0.000145493235845501, 0.000667454164326723, 
    -0.0057445759245933, 0.00433464631989088), H_LOD = c(-6248.9128518109, 
    -10081.9540490064, -6696.91582671427, -5414.20614601348, 
    -3933.64339240365, -13153.7509294302), H_LOD.1 = c(-6.2489128518109, 
    -10.0819540490064, -6.69691582671427, -5.41420614601348, 
    -3.93364339240365, -13.1537509294302), H_LOD.2 = c(-6248.9128518109, 
    -10081.9540490064, -6696.91582671427, -5414.20614601348, 
    -3933.64339240365, -13153.7509294302)), row.names = c(NA, 
6L), class = "data.frame")

Code

LOD_rdu=sapply(split.default(LOD_ut, rep(seq((ncol(LOD_ut) / 3)), each = 3)), function(i)
  apply(i, 1, rmse))

I get this error: Error in mse(actual, predicted) : argument "predicted" is missing, with no default

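2 answers:

Answer 0: (score: 1)

It is hard to understand exactly what you need, but I will try to answer.

According to Wikipedia, the RMSD is used to compare a dataset generated by a model (I guess the one in your article) with the observed distribution.

On CRAN, the rmse function in the modelr package takes two arguments, a model and data:

modelr::rmse(model = ,data = )

This function measures how well your model fits the data. The first argument is a model, which means you would probably generate it with a function such as lm(). Since you did not describe your model, I cannot help you further with that. The second argument is the dataset, and the one you provided puzzles me. R would expect a tidy dataset with two columns: x, the observation time, and y, the value.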
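
As a rough illustration of the model-plus-data interface described above, here is a minimal sketch. The built-in mtcars data and the formula mpg ~ wt are assumptions chosen purely for illustration and have nothing to do with the question's data; the modelr call is guarded so the snippet still runs when that package is not installed.

```r
# Illustrative only: mtcars and mpg ~ wt are stand-ins, not the asker's data.
mod <- lm(mpg ~ wt, data = mtcars)

# RMSE computed by hand: square root of the mean squared residual.
manual_rmse <- sqrt(mean(residuals(mod)^2))
print(manual_rmse)

# The same quantity through modelr, if the package is available.
if (requireNamespace("modelr", quietly = TRUE)) {
  pkg_rmse <- modelr::rmse(model = mod, data = mtcars)
  print(all.equal(pkg_rmse, manual_rmse))
}
```

This shows why the interface does not fit the question directly: it needs a fitted model, whereas the question only has triplets of noise values.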

Answer 1: (score: 1)

You can first group the columns:

GRP = sub("[.][0-9]*","",colnames(LOD_ut))
head(GRP)
[1] "101_LOD" "101_LOD" "101_LOD" "107_LOD" "107_LOD" "107_LOD"

You could also index the groups with 1:(ncol(LOD_ut)/3); the version above just lets you refer to each group by name.

We can use the above like this:

LOD_ut[,GRP=="101_LOD"]
        101_LOD    101_LOD.1     101_LOD.2
1 -0.0064765606  0.003806255 -0.0028110008
2  0.0064571407 -0.003981150 -0.0033553784
3  0.0017453352  0.001586739  0.0061165252
4 -0.0003546434 -0.005375840 -0.0007381398
5 -0.0005990932 -0.002800484  0.0048503948
6  0.0008618883  0.003482323  0.0041242812

This pulls out your first three grouped columns. If you now run apply(.., 1, sd) on them, you get the row-wise standard deviation, and the same call can be repeated for every group.

If you have to compute the RMSE, use the mean as the predicted value.
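
The two steps described above (row-wise standard deviation for every group, then an RMSE) might be sketched as follows. This is a sketch under stated assumptions, not the article's exact procedure: the helper names rmse_about_mean, rmse_from_zero and by_group are made up here, and LOD_demo is a two-row excerpt of the first two groups of the question's data so that the snippet runs on its own. Two RMSE variants are shown, because using the mean as the predicted value gives the deviation about the row mean, whereas the article excerpt in the question describes the deviation from zero.

```r
# Two-row excerpt of the question's data (first two groups only).
LOD_demo <- data.frame(
  `101_LOD`   = c(-0.00647656063436054, 0.00645714072316343),
  `101_LOD.1` = c(0.00380625456526623, -0.00398115037246045),
  `101_LOD.2` = c(-0.00281100080425964, -0.00335537844222041),
  `107_LOD`   = c(0.00264717678436649, 0.00339296025595841),
  `107_LOD.1` = c(0.000242380702002215, -0.00116108069669281),
  `107_LOD.2` = c(-0.0102716279438754, -0.00706528150567528),
  check.names = FALSE
)

# Group label per column: strip the ".1" / ".2" suffix.
GRP <- sub("[.][0-9]*", "", colnames(LOD_demo))

# Hypothetical helpers. Like sd(), they return NA for rows containing NA;
# add na.rm = TRUE to mean() if such rows should be kept.
rmse_about_mean <- function(x) sqrt(mean((x - mean(x))^2))  # mean as "predicted"
rmse_from_zero  <- function(x) sqrt(mean(x^2))              # zero as "predicted"

# Apply a function row-wise within each group of columns.
by_group <- function(df, grp, f) {
  sapply(unique(grp), function(g) apply(df[, grp == g, drop = FALSE], 1, f))
}

sd_tab        <- by_group(LOD_demo, GRP, sd)
rmse_mean_tab <- by_group(LOD_demo, GRP, rmse_about_mean)
rmse_zero_tab <- by_group(LOD_demo, GRP, rmse_from_zero)

# Detection limits at the 95th percentile (alpha = 1.96, from the question).
lod_sd   <- 1.96 * sd_tab
lod_rmse <- 1.96 * rmse_zero_tab

print(lod_sd)
print(lod_rmse)
```

With three values per row, the RMSE about the mean is just the sample standard deviation multiplied by sqrt(2/3), so only the from-zero variant gives information beyond sd. To run this on the full data, replace LOD_demo with LOD_ut.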