Regression using numpy and a csv file

Time: 2019-04-29 00:32:11

Tags: numpy

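I have a homework problem in which, I think, I'm supposed to find the slope of the regression line for two data series: GDPC1 and PCECC96. My teacher gave us this file to get started on the problem, but I keep getting error messages. Can you help me find what is wrong with the code?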

import pandas as pd
import numpy as np

s = pd.Series(np.random.randn(5), name='returns')
s.describe()
s.index = ['AMZN', 'AAPL', 'MSFT', 'GOOG','FB']
df = pd.read_csv('data.csv')

#Accessing Data
import requests
r = requests.get('http://research.stlouisfed.org/fred2/series/GDPC1/downloaddata/GDPC1.csv')
url = 'http://research.stlouisfed.org/fred2/series/GDPC1/downloaddata/GDPC1.csv'
source = requests.get(url).content.decode().split("\n")
data = pd.read_csv(url, index_col=0, parse_dates=True)
data.head()
data[-76:]



df['GDPC1']
df.iloc[2:5, 0:4] #select both rows and columns
df.loc[df.index[:76], ['GDPC1', 'GDPPOT']] # select rows and columns using a mixture of integers and labels
df = df.set_index('DATE')
df['I_ratio']=df['GPDI']/df['GDPC1']

import matplotlib.pyplot as plt

df['GDPC1'].plot()
plt.show()
df.plot(x='GDPC1', y='PCECC96', kind='scatter')
df['const'] = 1

import statsmodels.api as sm

reg1 = sm.OLS(endog=df['PCECC96'], exog=df[['const', 'GDPC1']], missing='drop')
type(reg1)
results = reg1.fit()
type(results)
print(results.summary())
  • I know one problem must be df = pd.read_csv('data.csv'), but even when I type in the file name I still get error messages about GPDI, GDPC1 and PCECC96
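Errors that name GPDI, GDPC1 and PCECC96 would be consistent with the loaded csv not having columns with those names, although the actual traceback is not shown here, so that is only a guess. Below is a minimal sketch of checking for the columns before regressing, and of getting the slope with np.polyfit rather than the statsmodels call above. The numbers are made-up placeholders, not real FRED data, and the names required, missing and clean are only illustrative.

```python
import numpy as np
import pandas as pd

# Placeholder values, NOT real FRED data; they only stand in for the two series.
df = pd.DataFrame({
    "GDPC1": [100.0, 110.0, 120.0, 130.0, 140.0],
    "PCECC96": [65.0, 72.0, 78.0, 85.0, 91.0],
})

# Fail early with a readable message if an expected column is absent.
required = ["GDPC1", "PCECC96"]
missing = [c for c in required if c not in df.columns]
if missing:
    raise KeyError(f"columns not found: {missing}; available: {list(df.columns)}")

# Drop rows where either series is missing, then fit a degree-1 polynomial.
clean = df[required].dropna()
x = clean["GDPC1"].to_numpy()
y = clean["PCECC96"].to_numpy()

# For deg=1, np.polyfit returns the coefficients as [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)
print("slope:", slope, "intercept:", intercept)
```

With a real file, printing df.columns right after pd.read_csv would show which names are actually available before any of the later lines run.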

0 answers:

No answers