CSV to matrix in SciPy

Date: 2010-11-15 20:01:44

Tags: csv numpy scipy

I can't get simple matrix operations to work on my data, and for the life of me I can't figure out what I'm doing wrong:

data = np.genfromtxt(dataset1, names=True, delimiter=",", dtype=float)

X = np.matrix(data)
print(X.T*X)

Traceback (most recent call last):
  File "genfromtxt.py", line 11, in <module>
    print(X.T*X)
  File "/usr/lib/pymodules/python2.6/numpy/matrixlib/defmatrix.py", line 319, in __mul__
    return N.dot(self, asmatrix(other))
TypeError: can't multiply sequence by non-int of type 'tuple'

print(data) gives:

[ (3.0, 32.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 9.0, 0.0, 5.5606799999999996, 9.0)
 (4.0, 43.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 9.0, 0.0, 5.7203099999999996, 16.0)
 (5.0, 40.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 9.0, 0.0, 5.9964500000000003, 25.0)
 ...,
 (5.0, 50.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 12.0, 0.0, 6.2146100000000004, 25.0)
 (6.0, 50.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 12.0, 0.0, 6.2915700000000001, 36.0)
 (7.0, 50.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 12.0, 0.0, 6.3716100000000004, 49.0)]
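The parenthesized rows in this output suggest that data is a 1-D structured array (one record per CSV row) rather than a 2-D float array, which is what genfromtxt returns when names=True is passed. That would explain why np.matrix(data) cannot be multiplied. Below is a minimal sketch of the difference, assuming made-up CSV text and hypothetical column names in place of dataset1; skip_header=1 is used here as one possible way to drop the header row and get a plain numeric array.

```python
import io
import numpy as np

# Made-up CSV text standing in for dataset1 (column names are hypothetical).
csv_text = "x1,x2,x3\n3,32,9\n4,43,16\n5,40,25\n"

# names=True: the header row becomes field names and the result is a
# 1-D structured array with one record per data row.
records = np.genfromtxt(io.StringIO(csv_text), names=True,
                        delimiter=",", dtype=float)
print(records.shape)        # one entry per row, not (rows, columns)
print(records.dtype.names)  # field names taken from the header

# skip_header=1: the header row is skipped and the result is a plain
# 2-D float array that supports ordinary matrix operations.
X = np.genfromtxt(io.StringIO(csv_text), skip_header=1, delimiter=",")
print(X.shape)

# The matrix product X^T X works on the plain array.
print(np.dot(X.T, X))
```

If the column names are still needed, they can be read from records.dtype.names while the numeric work is done on the plain array.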

Edit:

Also, this code

reader = csv.reader(open(dataset1, 'r'))
header = reader.next()
X = np.array([[float(col) for col in row] for row in reader])

print(X.shape)
print(X.T.shape)
print(X * X.T)

gives this output:

(4165, 13)
(13, 4165)
Traceback (most recent call last):
  File "genfromtxt.py", line 17, in <module>
    print(X * X.T)
ValueError: shape mismatch: objects cannot be broadcast to a single shape

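2 Answers:

Answer 0 (score: 3)

The problem with the second example appears to be that the * operator performs element-wise multiplication on NumPy arrays. Presumably you want matrix multiplication instead. There are two options:

  1. Use numpy.matrix instead of numpy.array. Multiplication will then be matrix multiplication, and powers with integer exponents will work as expected.

  2. Use numpy.dot(A, B) instead of A*B. This performs matrix multiplication on both arrays and matrices.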
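Both options can be sketched on a small made-up array with the same orientation as the data in the question (more rows than columns). The array contents and variable names here are illustrative assumptions, not the asker's data.

```python
import numpy as np

# Small made-up array: 4 rows, 3 columns.
X = np.arange(12, dtype=float).reshape(4, 3)

# On plain arrays, * is element-wise, so shapes (4, 3) and (3, 4)
# cannot be broadcast together and a ValueError is raised.
try:
    X * X.T
    elementwise_failed = False
except ValueError:
    elementwise_failed = True
print(elementwise_failed)

# Option 2: np.dot performs matrix multiplication on plain arrays.
outer = np.dot(X, X.T)  # (4, 3) times (3, 4) gives (4, 4)
gram = np.dot(X.T, X)   # (3, 4) times (4, 3) gives (3, 3)
print(outer.shape, gram.shape)

# Option 1: with np.matrix, * means matrix multiplication.
M = np.matrix(X)
gram_m = M.T * M
print(np.allclose(gram, gram_m))
```

On Python 3.5 or later, X.T @ X is another way to write the np.dot call for plain arrays, and the current NumPy documentation recommends regular arrays over numpy.matrix.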

Answer 1 (score: 0)

Hey, if you have any experience with Matlab and/or Octave, this page offers many useful tips: http://www.scipy.org/NumPy_for_Matlab_Users