How to run piecewise quantile regression in R?

Asked: 2019-12-13 07:02:19

Tags: r quantile-regression

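I am trying to fit a "constraint line" to a bivariate scatterplot using quantile regression in R. My response variable (WaterEr) represents annual soil loss, and my independent variable (RainfallEr) represents rainfall erosivity (see the figure below). [Figure: scatterplot of WaterEr against RainfallEr]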

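The result I am after is based on a series of publications by Hao et al. 2017, in which the values on the x-axis are divided evenly into 100 parts (or bins) to create 100 columns (see below). They then use piecewise quantile regression to define as many boundary points as there are columns, and the constraint line is fitted to those points. For information, they performed these calculations in Origin 9 (OriginLab, USA). [Figure: example from the published approach]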
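
A minimal sketch of the binning step described above, assuming that each "boundary point" is simply a high quantile of the response within one x-axis bin. This is only an illustration of the idea, not necessarily the exact procedure of Hao et al. or of Origin. The data are simulated stand-ins, and `n_bins`, `tau`, `breaks`, `bin` and `boundary` are names introduced here for the example.

```r
# Simulated stand-in data (hypothetical, NOT the real dataset)
set.seed(1)
n <- 5000
RainfallEr <- runif(n, 26, 6000)
WaterEr <- runif(n) * (1 - exp(-RainfallEr / 1500))  # upper envelope saturates

# Split the x range into 100 equal-width bins ("columns")
n_bins <- 100
breaks <- seq(min(RainfallEr), max(RainfallEr), length.out = n_bins + 1)
bin <- cut(RainfallEr, breaks = breaks, include.lowest = TRUE)

# One boundary point per bin: the bin midpoint on x and the
# tau-th quantile of the response within that bin on y
tau <- 0.9  # assumption: same quantile as in the nlrq() calls
boundary <- data.frame(
  RainfallEr = (head(breaks, -1) + tail(breaks, -1)) / 2,
  WaterEr    = as.numeric(tapply(WaterEr, bin, quantile, probs = tau))
)

# Bins that contain no observations yield NA and are dropped
boundary <- boundary[!is.na(boundary$WaterEr), ]
```

The resulting `boundary` data frame holds at most 100 points. They could be overlaid on the scatterplot with `points()` and then used as the input for fitting a constraint line, instead of fitting a single curve to all 188,000 rows at once.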

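Since the response does not appear to be linear, I tried applying non-linear quantile regression with the nlrq() function from the quantreg package. Here is a sample of the code I used: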

library(quantreg)

# create a dummy range of rainfall erosivity that we use to predict soil erosion from our fitted model
predict_range <- data.frame(RainfallEr = seq(26, 6000, length = 10000))


# visualise part of the data.frame (first 20 rows; there are 188,000 rows in total)
combined_df <- data.frame(x = c(-100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354, -100099.882870354),
                y = c(6309106.01324476, 6309606.01324476, 6310106.01324476, 6313106.01324476, 6313606.01324476, 6314106.01324476, 6314606.01324476, 6315106.01324476, 6315606.01324476, 6316106.01324476, 6316606.01324476, 6317106.01324476, 6317606.01324476, 6318106.01324476, 6318606.01324476, 6319106.01324476, 6319606.01324476, 6320106.01324476, 6320606.01324476, 6321106.01324476),
                WaterEr = c(0.00868059950549645, 0.0091251355161706, 0.0326287519829422, 0.000178796614549174, 4.91241536357219e-05, 4.85026861560795e-05, 5.04674993751928e-05, 4.08148993464175e-05, 4.00002440967491e-05, 4.33874760799851e-05, 3.69049351915739e-05, 7.04070050153354e-05, 0.00103683921729508, 0.0172595102342795, 0.0162371833835457, 0.0166259941567392, 0.0170966498768553, 0.0143875260855259, 0.00268803246521925, 0.00201286512639929), 
                RainfallEr = c(220.511938095093, 220.511938095093, 271.830009102821, 266.356387972832, 266.356387972832, 266.356387972832, 276.096389293671, 276.096389293671, 276.096389293671, 281.761203408241, 281.761203408241, 281.761203408241, 246.739157915115, 246.739157915115, 246.739157915115, 210.058827996254, 210.058827996254, 210.058827996254, 180.8417532444, 180.8417532444), 
                Year = c("2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007"))


# Test 1: with polynomial order equation
my.equation <- WaterEr ~ a * RainfallEr^2 + b * RainfallEr  + c

test_nlrq <- nlrq(my.equation, data = combined_df, start = list(a = 0.01, b = 1, c = 0.6), tau = 0.9) # note: the starting values for a, b, and c were chosen arbitrarily
# summary(test_nlrq)

my.line90 <- within(predict_range, 
             WaterEr <- predict(test_nlrq, 
                            newdata = predict_range))


# Test 2: with nlrq with automatic SSlogis() function
test_nlrq2 <- nlrq(WaterEr ~ SSlogis(RainfallEr, Asym, mid, scal), data = combined_df, tau = 0.9)
# summary(test_nlrq2)

my.line90_v2 <- within(predict_range, 
             WaterEr <- predict(test_nlrq2, 
                            newdata = predict_range))
# plot the results (see attachment below)
plot(WaterEr ~ RainfallEr, data = combined_df, pch = 16, cex = 0.5, 
     xlab = "Rainfall erosivity (MJ.mm/ha/h/y)", ylab = "Annual water erosion (t/ha/y)")
lines(WaterEr ~ RainfallEr, data = my.line90, col = "red", lty = 2)
lines(WaterEr ~ RainfallEr, data = my.line90_v2, col = "blue", lty = 2)
legend("topright", legend = c("non-linear rq - poly", "non-linear rq - autoSSlogis"), col = c("red", "blue"), lty = 2)

Plot generated with the quantreg package: [Figure: fitted nlrq() lines over the scatterplot]

As you can see, the regression lines from the nlrq() fits I tested do not look like the ones in the published figure.

Do you have any suggestions?

0 answers:

No answers yet