I have a data frame named ndvi that looks like this:
Year Running NDVI
0 1984 0 0.423529
1 1984 48 0.664205
2 1984 112 0.341656
3 1985 367 0.477519
4 1985 399 0.588417
5 1986 434 0.669474
6 1986 466 0.698148
7 1987 469 0.566785
8 1987 485 0.501238
9 1988 805 0.399277
10 1989 1140 0.666282
11 1990 1492 0.606567
12 1990 1540 0.505155
13 1991 1876 0.597450
14 1992 2180 0.280612
15 1992 2276 0.498419
16 1993 2563 0.413074
17 1993 2579 0.547831
18 1994 2915 0.345050
19 1994 2931 0.460600
I am running a linear model like this:
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import summary_table
#run linear model
ndvi_x = sm.add_constant(ndvi['Running'])
ndvi_y = ndvi['NDVI']
ndvi_regr = sm.OLS(ndvi_y, ndvi_x)
ndvi_res = ndvi_regr.fit()
# Get fitted values from model to plot
ndvi_st, ndvi_data, ndvi_ss2 = summary_table(ndvi_res, alpha=0.05)
ndvi_fitted_values = ndvi_data[:,2]
#get confidence intervals
ndvi_predict_mean_ci_low, ndvi_predict_mean_ci_upp = ndvi_data[:,4:6].T
ndvi_CI_df = pd.DataFrame(columns = ['x_data', 'low_CI', 'upper_CI'])
ndvi_CI_df['x_data'] = ndvi['Year']
ndvi_CI_df['low_CI'] = ndvi_predict_mean_ci_low
ndvi_CI_df['upper_CI'] = ndvi_predict_mean_ci_upp
ndvi_CI_df.sort_values('x_data', inplace = True)
#plot the data
fig, ax = plt.subplots(figsize = (11, 6), sharey = True)
ax.scatter(ndvi['Year'], ndvi['NDVI'], color = 'black')
ax.plot(ndvi['Year'], ndvi_fitted_values, lw = 2, color = 'k')
ax.fill_between(ndvi_CI_df['x_data'], ndvi_CI_df['low_CI'], ndvi_CI_df['upper_CI'], color = 'gray', alpha = 0.4, label = '95% CI')
ax.set_xlabel("Year")
ax.set_ylabel("NDVI")
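This returns a plot (not reproduced here).
What I don't understand is why the line of best fit isn't actually linear, but instead appears to have breaks in it?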
Answer 0 (score: 2)
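The explanatory variable in your regression is Running, so the model is linear in that variable. When you create the plot, however, the x-axis represents Year. Several observations share the same Year but have different Running values, so their fitted values differ and the plotted line is no longer straight.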
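To illustrate the point, here is a minimal sketch that refits the same straight line with NumPy's polyfit instead of statsmodels (an assumption made only to keep the example self-contained; the data are copied from the question). It checks that the fitted values have a constant slope with respect to Running, while several consecutive points share the same Year, which is what produces the vertical jumps in the plot.

```python
import numpy as np

# Data copied from the ndvi data frame in the question
year = np.array([1984, 1984, 1984, 1985, 1985, 1986, 1986, 1987, 1987, 1988,
                 1989, 1990, 1990, 1991, 1992, 1992, 1993, 1993, 1994, 1994])
running = np.array([0, 48, 112, 367, 399, 434, 466, 469, 485, 805,
                    1140, 1492, 1540, 1876, 2180, 2276, 2563, 2579, 2915, 2931])
ndvi_values = np.array([0.423529, 0.664205, 0.341656, 0.477519, 0.588417,
                        0.669474, 0.698148, 0.566785, 0.501238, 0.399277,
                        0.666282, 0.606567, 0.505155, 0.597450, 0.280612,
                        0.498419, 0.413074, 0.547831, 0.345050, 0.460600])

# Ordinary least squares fit of NDVI on Running (degree-1 polynomial)
slope, intercept = np.polyfit(running, ndvi_values, 1)
fitted = intercept + slope * running

# Slope between consecutive fitted points, measured along Running
slopes_vs_running = np.diff(fitted) / np.diff(running)
is_straight_vs_running = np.allclose(slopes_vs_running, slope)

# Consecutive points with the same Year but different fitted values
# are drawn as vertical segments when Year is used as the x-axis
repeated_year = np.diff(year) == 0
has_vertical_jumps = bool(np.any(repeated_year & (np.diff(fitted) != 0)))

print("Straight line against Running:", is_straight_vs_running)
print("Vertical jumps against Year:", has_vertical_jumps)
```

In the question's code, using ndvi['Running'] instead of ndvi['Year'] in the ax.scatter and ax.plot calls and for ndvi_CI_df['x_data'] should give a straight fitted line. Alternatively, if Year is the axis you want, you could regress on Year (or a decimal-year variable) instead of Running.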