Question

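I can't understand this error or find a solution to it, and I'm stuck. I'm following the machine learning tutorial at https://pythonprogramming.net/forecasting-predicting-machine-learning-tutorial, specifically the linear regression part, which is not that difficult. I tried changing the list to something immutable, but I think the difficulty in following along is that the data I'm collecting seems very different from the data the tutorial uses. I'm trying to use my own data. You can compare the code on that site with the code here. What am I doing wrong? How can I get past this obstacle?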

import csv
import numpy as np
import pandas as pd
from sklearn import preprocessing, cross_validation, svm
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from matplotlib import style
import datetime
import math

style.use('ggplot')

df = {}

bid = []
btemp = []
ask = []
atemp = []
low = []
high = []
close = []

file=open("C:/documents/EURUSD.csv", "r")
reader = csv.reader(file)

for line in reader:
    t=line[0],line[1],line[2],line[3],line[4],line[5],line[6],line[7],line[8]
    btemp = line[2] + line[3]
    bid.append(btemp)
    atemp = line[4] + line[5]
    ask.append(atemp)
    low.append(line[6])
    high.append(line[7])
    close.append(line[8])

bid.pop(0)
ask.pop(0)
low.pop(0)
high.pop(0)
close.pop(0)

nBid = [float(i) for i in bid]
nAsk = [float(i) for i in ask]
nHigh = [float(i) for i in high]
nLow = [float(i) for i in low]
nClose = [float(i) for i in close]

df['nClose'] = nClose

diffHighLow = [(x1 - x2) for (x1, x2) in zip(nHigh, nLow)]
sumBidAsk = [x1 + x2 for (x1, x2) in zip(nBid, nAsk)]
nSumBidAsk = []
for a in sumBidAsk:
    aTemp = (a / 2) * 100
    nSumBidAsk.append(aTemp)
df['HL_PCT'] = [x1 / x2 for (x1, x2) in zip(diffHighLow, nSumBidAsk)]

diffCloseBid = [(x1 - x2) for (x1, x2) in zip(nClose, nBid)]
divDiffCloseBid = [(x1 / x2) for (x1, x2) in zip(diffCloseBid, nBid)]
nPCT_change = []
for b in divDiffCloseBid:
    bTemp = b * 100
    nPCT_change.append(bTemp)
df['PCT_change'] = nPCT_change

df['forecast_col'] = df['nClose']
df['forecast_out'] = int(math.ceil(0.01 * len(df)))

df['laebl'] = df['forecast_col'].shift(-forecast_out)
X = np.array(df.drop(['label'], 1))

Edit: now includes the stack trace

File "<ipython-input-4-006cfd724c3e>", line 1, in <module>
runfile('C:/Users/venichhe/Desktop/test3.py', wdir='C:/Users/venichhe/Desktop')

File "C:\Users\venichhe\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)

File "C:\Users\venichhe\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)

File "C:/Users/venichhe/Desktop/test3.py", line 69, in <module>
df['laebl'] = df['forecast_col'].shift(-forecast_out)

AttributeError: 'list' object has no attribute 'shift'

Answer 1

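You are not using a pandas DataFrame at all. You are using a dictionary named df and then trying to use it as if it were a DataFrame. Try loading the data with pandas.read_csv.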
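
As a rough sketch of that suggestion, the snippet below loads a small in-memory CSV with pandas.read_csv and rebuilds the question's columns on a real DataFrame, where shift is available. The column names (bid, ask, low, high, close) and the sample values are assumptions, since the actual layout of EURUSD.csv is not shown. forecast_out is kept as a plain variable rather than stored as a column.

```python
import io
import math

import numpy as np
import pandas as pd

# Hypothetical stand-in for EURUSD.csv; the real file's columns are unknown.
csv_text = """bid,ask,low,high,close
1.1000,1.1002,1.0990,1.1010,1.1005
1.1005,1.1007,1.0995,1.1015,1.1010
1.1010,1.1012,1.1000,1.1020,1.1008
1.1008,1.1010,1.0998,1.1018,1.1012
1.1012,1.1014,1.1002,1.1022,1.1015
"""

# read_csv parses the header row and converts numeric columns to floats.
df = pd.read_csv(io.StringIO(csv_text))

# High/low spread as a percentage of the bid/ask midpoint (assumed formula).
mid = (df["bid"] + df["ask"]) / 2
df["HL_PCT"] = (df["high"] - df["low"]) / mid * 100

# Percentage change from bid to close, as in the question's code.
df["PCT_change"] = (df["close"] - df["bid"]) / df["bid"] * 100

# Number of rows to shift by: a plain integer, not a DataFrame column.
forecast_out = int(math.ceil(0.01 * len(df)))

# Series.shift exists on pandas columns; plain Python lists have no such method.
df["label"] = df["close"].shift(-forecast_out)

# Feature matrix: every column except the label.
X = np.array(df.drop(columns=["label"]))
```

With a real file, the io.StringIO object would be replaced by the path to the CSV. The last forecast_out rows of label are NaN after the shift and usually need to be dropped before fitting a model.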

Answer 2

Try again after correcting the typo. Change

df['laebl'] = df['forecast_col'].shift(-forecast_out)

to

df['label'] = df['forecast_col'].shift(-forecast_out)
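
The spelling matters because the next line of the question drops a column named 'label'. The sketch below, which assumes df is already a pandas DataFrame and uses made-up values, shows the drop failing with the misspelled name and succeeding once the name matches. The AttributeError in the traceback is a separate problem: it is raised because df['forecast_col'] is a plain list there.

```python
import pandas as pd

# Illustrative data only; the values are made up.
df = pd.DataFrame({"forecast_col": [1.0, 2.0, 3.0, 4.0]})
forecast_out = 1

# Misspelled column name, as in the question.
df["laebl"] = df["forecast_col"].shift(-forecast_out)

# Dropping 'label' fails because no column has that exact name.
try:
    df.drop(columns=["label"])
    drop_failed = False
except KeyError:
    drop_failed = True

# After fixing the name, the same drop works.
df = df.rename(columns={"laebl": "label"})
features = df.drop(columns=["label"])
```

The columns= keyword is used here instead of the positional axis argument from the question (df.drop(['label'], 1)), which recent pandas releases no longer accept.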

Python machine learning linear regression numpy list error
