Question

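I have an array created from a CSV file, but the first row contains the title of each column. That row needs to be str, but because most of the data is float, it is currently float64.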

This is the code that creates the matrix:

self.data = np.genfromtxt(self.path, delimiter=",")

I need to change the first row to strings, but if I use:

self.data[0] = self.data[0].astype(str)

it returns a row of 'nan', which I don't understand.

Thanks.

Answer 1

If your file has column names in the first row, you can pass names=True to np.genfromtxt to read them.

import numpy as np
data = np.genfromtxt('data.csv', delimiter=",", names=True)

data

array([( 1.,  4.,  7.), ( 2.,  5.,  8.), ( 3.,  6.,  9.)], 
      dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])

You can now use data['a'] to get the column named 'a'.

You can also access the column names with data.dtype.names, which returns a tuple of all the column names, here ('a', 'b', 'c').
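As a minimal, self-contained sketch of the above: the CSV contents here are made up to match the output shown, and io.StringIO stands in for the real file so the snippet runs without a data.csv on disk.

```python
import io
import numpy as np

# Hypothetical CSV contents; in the question this would be a file on disk.
csv_text = "a,b,c\n1,4,7\n2,5,8\n3,6,9\n"

# names=True reads the first line as field names instead of data.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

col_a = data["a"]          # one column, selected by its header name
names = data.dtype.names   # tuple of the header strings

print(names)
print(col_a)
```

Note that the result is a one-dimensional structured array with one record per row, so columns are selected by name rather than by a second index.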

Answer 2

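np.genfromtxt parses every field using the requested dtype, which defaults to float64. Your titles are presumably not numbers, so they cannot be parsed and the first row is stored as NaN when the file is loaded. Calling astype(str) afterwards only converts those NaN values to the string 'nan', and assigning the result back into the float64 array turns it into NaN again, because a NumPy array has a single dtype for all of its elements.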

Luckily for you, the function has a way to extract the header. With the names parameter, as in np.genfromtxt(self.path, delimiter=",", names=True), the first line of the input file is read as the column titles, and the function returns a structured array whose dtype.names holds them.
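To illustrate where the NaN row comes from, here is a small sketch with made-up CSV contents. It also shows one possible alternative, which is an assumption on my part and not something the question asked for: read the header line separately as plain strings and pass skip_header=1 so the numeric data stays an ordinary 2-D float array.

```python
import io
import numpy as np

# Hypothetical CSV contents standing in for self.path.
csv_text = "a,b,c\n1,4,7\n2,5,8\n"

# Default call: the header line is parsed as floats and becomes NaN.
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=",")
print(np.isnan(raw[0]).all())

# Alternative: keep the titles as a list of str, and skip the first line
# when loading the numbers.
header = csv_text.splitlines()[0].split(",")
values = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(header)
print(values.shape)
```

Keeping the titles in a separate list avoids mixing str and float in one array, which a plain ndarray cannot represent without falling back to a string or object dtype for every element.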

How do I change the type of a row in an array in Python?
