I am using the dataset "ex1data1.txt", but when I run the code that converts it to a NumPy array, I get the following error:
AttributeError Traceback (most recent call last)
<ipython-input-52-7c523f7ba9e1> in <module>()
1 # Converting loaded dataset into numpy array
2
----> 3 X = np.concatenate((np.ones(len(population)).reshape(len(population), 1), population.reshape(len(population),1)), axis=1)
4
5
AttributeError: 'tuple' object has no attribute 'reshape'
The code is as follows:
import csv
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import pandas as pd
import numpy as np
# Loading Dataset
with open('ex1data1.txt') as csvfile:
    population, profit = zip(*[(float(row['Population']), float(row['Profit'])) for row in csv.DictReader(csvfile)])
# Creating DataFrame
df = pd.DataFrame()
df['Population'] = population
df['Profit'] = profit
# Plotting using Seaborn
sns.lmplot(x="Population", y="Profit", data=df, fit_reg=False, scatter_kws={'s':45})
# Converting loaded dataset into numpy array
X = np.concatenate((np.ones(len(population)).reshape(len(population), 1), population.reshape(len(population),1)), axis=1)
y = np.array(profit).reshape(len(profit), 1)
# Creating theta matrix , theta = [[0], [0]]
theta = np.zeros((2, 1))
# Learning rate
alpha = 0.1
# Iterations to be taken
iterations = 1500
# Updated theta and calculated cost
theta, cost = gradientDescent(X, y, theta, alpha, iterations)
I don't know how to fix the reshape problem. Can anyone tell me how to solve this?
Answer 0 (score: 0)
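As you have defined it, population is a tuple (that is what zip returns), and tuples have no reshape method. I suggest two options. The first is to convert it to an array: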
population = np.asarray(population)
Alternatively, you can use the .values attribute of the DataFrame column, which is essentially a NumPy array:
X = np.concatenate((np.ones(len(population)).reshape(len(population), 1), df['Population'].values.reshape(len(population),1)), axis=1)
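To make the first option concrete, here is a minimal sketch. The three sample values are made up for illustration and stand in for the columns read from ex1data1.txt; the name pop is likewise just a placeholder. It shows that the tuple produced by zip has no reshape, and that converting it with np.asarray first lets the design matrix be built as intended.

```python
import numpy as np

# Hypothetical sample values standing in for the columns of ex1data1.txt
population = (6.1101, 5.5277, 8.5186)
profit = (17.592, 9.1302, 13.662)

# zip(*...) produces tuples, and a tuple has no reshape method
print(hasattr(population, "reshape"))  # False

# Convert to an array first, then reshape into a column vector
pop = np.asarray(population, dtype=float).reshape(-1, 1)

# Design matrix: a column of ones (intercept term) next to the population column
X = np.concatenate((np.ones((len(pop), 1)), pop), axis=1)
y = np.asarray(profit, dtype=float).reshape(-1, 1)

print(X.shape, y.shape)  # (3, 2) (3, 1)
```

Passing -1 to reshape lets NumPy infer the number of rows, so len(population) does not have to be repeated.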