Converting a DataFrame to a NumPy array for calculations

Time: 2015-09-18 23:53:30

Tags: python arrays numpy dataframe toarray

I have data in a DataFrame, and I want to run a linear regression on it using a function I wrote myself. Here is the code:

from sklearn.datasets import load_boston
boston = load_boston()

bos = pd.DataFrame(boston.data) # convert to DF
bos.columns = boston.feature_names
bos['PRICE'] = boston.target
y = bos.PRICE
x = bos.drop('PRICE', axis = 1)  # DROP PRICE since only want X-type variables (not Y-target)

xw = df.to_array(x)

xw = np.insert(xw,0,1, axis = 1) # to insert a column of "1" values

However, I get this error:

AttributeError                            Traceback (most recent call last)
<ipython-input-131-272f1b4d26ba> in <module>()
  1 import copy
  2 
----> 3 xw = df.to_array(x)

AttributeError: 'int' object has no attribute 'to_array'

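I'm not sure what the problem is. I need to pass an array of values (x in this case) to a function that performs some matrix operations.

The insert call worked while I was developing the code step by step, but for some reason it fails here.

I tried: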

xw = copy.deepcopy(x)

That did not work.

Any ideas?

1 Answer:

Answer 0: (score: 0)

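Use x.as_matrix() instead of df.to_array(x). The traceback shows that df is bound to an int in your session, and a DataFrame has no to_array method in any case.

See the pandas documentation for more details on as_matrix().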

Here is working code:

from sklearn.datasets import load_boston
import pandas as pd
import numpy as np

boston = load_boston()

bos = pd.DataFrame(boston.data) # convert to DF
bos.columns = boston.feature_names
bos['PRICE'] = boston.target
y = bos.PRICE
x = bos.drop('PRICE', axis = 1)  # DROP PRICE since only want X-type variables (not Y-target)

xw = x.as_matrix()

xw = np.insert(xw,0,1, axis = 1) # to insert a column of "1" values
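
A note for readers on newer library versions: as_matrix() was deprecated in pandas 0.23 and removed in pandas 1.0, and load_boston has been removed from recent scikit-learn releases. The sketch below shows the same conversion with DataFrame.to_numpy() (available since pandas 0.24). The small DataFrame and its column names "a" and "b" are made up for illustration and only stand in for the feature table x above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the feature table x (not the Boston data).
x = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# to_numpy() returns the DataFrame's values as a NumPy ndarray.
xw = x.to_numpy()

# Prepend a column of ones for the intercept term, as in the code above.
xw = np.insert(xw, 0, 1, axis=1)

print(xw.shape)
```

The resulting xw is a plain ndarray with the column of ones first, so it can be passed to a function that does matrix operations. On older pandas versions, x.values gives the same array.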