Question

I am reading a CSV file with pandas. The file contains both numeric and text data. How can I store all of this data in a single NumPy matrix?

Answer 1

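A plain NumPy array has one dtype for every element, so mixed text and numeric data calls for a structured array, whose compound dtype gives each field its own type. The usual ways to construct one: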

In [36]: arr = np.empty((2,), dtype='U10,int')
In [37]: arr
Out[37]: array([('', 0), ('', 0)], dtype=[('f0', '<U10'), ('f1', '<i4')])

Or fill it with data from a list of tuples:

In [38]: arr = np.array([('one',1),('Two',2)], dtype='U10,int')
In [39]: arr
Out[39]: array([('one', 1), ('Two', 2)], dtype=[('f0', '<U10'), ('f1', '<i4')])
In [40]: arr.shape
Out[40]: (2,)
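
As a minimal sketch of the same construction with explicit field names instead of the default f0/f1: the names 'name' and 'value' and the sample rows are made up for illustration, and the float field is an assumption about what the CSV holds.

```python
import numpy as np

# Hypothetical field names; the default would be 'f0', 'f1'.
dt = np.dtype([('name', 'U10'), ('value', 'f8')])

# Each tuple is one record; each tuple element fills one field.
arr = np.array([('one', 1.5), ('Two', 2.0)], dtype=dt)

names = arr['name']     # 1d array of strings, dtype '<U10'
values = arr['value']   # 1d array of floats, dtype float64
```

Naming the fields up front makes later access (arr['value']) self-describing, at the cost of writing the dtype out in full.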

The 1d array can be reshaped to 2d (and repeated, tiled, or stacked to make a larger nd array):

In [41]: arr.reshape(2,1)
Out[41]: 
array([[('one', 1)],
       [('Two', 2)]], dtype=[('f0', '<U10'), ('f1', '<i4')])
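
The tiling and stacking mentioned above might look like the following sketch. It rebuilds the same two-record array with an explicit dtype (an assumption made so the integer width does not depend on the platform); the variable names are illustrative.

```python
import numpy as np

arr = np.array([('one', 1), ('Two', 2)],
               dtype=[('f0', 'U10'), ('f1', 'i8')])

col = arr.reshape(2, 1)          # 2 rows, 1 column of records
tiled = np.tile(col, (1, 3))     # repeat the column 3 times -> shape (2, 3)
stacked = np.stack([arr, arr])   # new leading axis -> shape (2, 2)

# Indexing by field name keeps the array shape and drops the record layer.
ints = tiled['f1']               # plain integer array of shape (2, 3)
```

In every case the shape counts records, and the fields stay inside the dtype.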

It can also be turned into an np.matrix, though I don't know why anyone would want to:

In [42]: np.matrix(arr)
Out[42]: matrix([[('one', 1), ('Two', 2)]], dtype=[('f0', '<U10'), ('f1', '<i4')])
In [43]: _.shape
Out[43]: (1, 2)
In [44]: __['f0']
Out[44]: matrix([['one', 'Two']], dtype='<U10')

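Fields are accessed by name, not by column number. There is a basic disconnect between the dimensions recorded in shape and the record components defined by the dtype.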
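
To tie this back to the pandas case in the question, here is a hedged sketch of one possible route: DataFrame.to_records turns a frame into a record array with one field per column. The inline CSV text and the column names are invented stand-ins for the real file, and this is not necessarily the approach the asker ended up using.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file on disk.
csv_text = "name,value\none,1.5\nTwo,2.0\n"
df = pd.read_csv(io.StringIO(csv_text))

# One record per row, one field per column; index=False drops the row index.
rec = df.to_records(index=False)

values = rec['value']            # field access by name works

# The array is 1d, so column-number indexing is not available.
try:
    rec[:, 0]
    column_indexing_works = True
except IndexError:
    column_indexing_works = False
```

Here rec.shape is (2,) even though the frame has two columns: the columns live in rec.dtype.names, not in a second dimension. Text columns typically arrive as Python objects rather than fixed-width strings.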

Python: how to store float and string types in one matrix?
