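I have an array created from a CSV file, but the first row contains the title of each column. That row should be str, but because most of the data is float, the whole array is currently float64.
This is the code that creates the matrix: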
self.data = np.genfromtxt(self.path, delimiter=",")
I need to change the first row to strings, but if I use:
self.data[0] = self.data[0].astype(str)
it returns a row of 'nan', which I don't understand.
Thanks.
Answer 0 (score: 1)
If you have column names, you can use the names argument to pull them out.
import numpy as np
data = np.genfromtxt('data.csv', delimiter=",", names=True)
data
array([( 1., 4., 7.), ( 2., 5., 8.), ( 3., 6., 9.)],
dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])
You can now do things like data['a'] to get the column named 'a'.
You can also access the column names with data.dtype.names, which returns a tuple of all the column names: ('a', 'b', 'c').
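As a minimal sketch of the approach above: the CSV text below is made up to stand in for data.csv, and the names header, col_a and matrix are only illustrative. It reads the file with names=True, pulls the titles out as strings, and stacks the named columns back into an ordinary 2-D float array in case a plain matrix is still needed.

```python
import io
import numpy as np

# Hypothetical CSV contents standing in for data.csv
csv_text = "a,b,c\n1,4,7\n2,5,8\n3,6,9\n"

# names=True uses the first line as field names of a structured array
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

header = list(data.dtype.names)  # column titles as Python strings
col_a = data["a"]                # one column as a plain float64 array

# Stack the named columns into a regular 2-D float array (rows x columns)
matrix = np.column_stack([data[name] for name in data.dtype.names])
```

Keeping the titles in a separate list avoids mixing str and float in a single array, which a float64 ndarray cannot do.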
Answer 1 (score: 0)
The np.genfromtxt function builds an np.ndarray by converting every field to the array's dtype, which defaults to float64. Your titles are presumably not numbers, so they are stored as NaN while the file is being read. By the time you call astype(str), the original text is already gone: you only get the string 'nan', and assigning that back into the float64 array converts it to NaN again.
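The behaviour described above can be reproduced with a small in-memory file. This is only an illustration: the CSV text is invented, and the variable names are not from the question.

```python
import io
import numpy as np

# Hypothetical CSV with a text header row
csv_text = "a,b,c\n1,4,7\n2,5,8\n"

# Without names=True the header is parsed like any other row
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# The titles could not be parsed as floats, so the first row is all NaN
first_row_is_nan = bool(np.isnan(data[0]).all())

# Converting to str only yields the text 'nan'; the titles are already lost
as_text = data[0].astype(str)

# Assigning strings into a float64 array casts them back to float
data[0] = as_text
dtype_after = data.dtype
```

The array keeps its float64 dtype throughout, which is why the first row can never hold the titles.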
Luckily, the function has a way to extract the header. With the names parameter, np.genfromtxt(self.path, delimiter=",", names=True), the first line of the input file is used as the field names of the returned structured array, and the titles are available as a tuple through its dtype.names attribute.
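If a plain 2-D float array is preferred over a structured one, a possible alternative (not taken from the answers here) is to read the header line separately and skip it with the skip_header argument of np.genfromtxt. The CSV text and the names header and values below are assumptions for illustration.

```python
import io
import numpy as np

# Hypothetical CSV contents; with a real file, read its first line instead
csv_text = "a,b,c\n1,4,7\n2,5,8\n3,6,9\n"

# Take the titles from the first line as ordinary Python strings
header = csv_text.splitlines()[0].split(",")

# skip_header=1 ignores the title row, so every remaining field parses as float
values = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
```

Here values stays a regular float64 matrix and header[i] gives the title of column i.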