How do I select specific columns of a NumPy array read from a CSV file?

Asked: 2019-02-07 16:31:01

Tags: python numpy

I am trying this:

import numpy as np

housing_data = np.loadtxt('Housing.csv', delimiter=',')
print(housing_data)
print(housing_data.shape)
x1 = housing_data[:,:,0]
x2 = housing_data[:,:,1]
y = housing_data[:,:,2]

print(x1)
print(x2)
print(y)

My data has shape (47, 3) and looks like this:

[[2.104e+03 3.000e+00 3.999e+05]
 [1.600e+03 3.000e+00 3.299e+05]
 [2.400e+03 3.000e+00 3.690e+05]
 ....

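I am trying to assign the first column to x1, the second column to x2, and the third column to y, but my code does not seem to work. What am I doing wrong?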

2 answers:

Answer 0 (score: 2)

I created a dummy CSV file with random data. I would do it like this:

import numpy as np
import pandas as pd

# read file using pandas, without header and convert it to numpy arrays
housing_data = pd.read_csv('Housing.csv', header=None).values

# print housing data
print(housing_data)
print(housing_data.shape)

# slice through the data
x1 = housing_data[:,0]
x2 = housing_data[:,1]
y = housing_data[:,2]

print(x1)
print(x2)
print(y)

The output (a screenshot in the original answer) shows the full array, its shape, and then the three columns x1, x2, and y.
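
The same two-index slicing also works on the array that np.loadtxt returns, so pandas is optional here. The attempt in the question fails because housing_data[:, :, 0] passes three indices to a two-dimensional array, which raises an IndexError. A minimal sketch, assuming a small in-memory CSV (three rows taken from the question's printed output) as a stand-in for Housing.csv:

```python
import io
import numpy as np

# Stand-in for Housing.csv: three rows copied from the question's printed output.
csv_text = "2104,3,399900\n1600,3,329900\n2400,3,369000\n"

housing_data = np.loadtxt(io.StringIO(csv_text), delimiter=",")

# A 2-D array takes two indices: [rows, columns].
x1 = housing_data[:, 0]   # first column
x2 = housing_data[:, 1]   # second column
y = housing_data[:, 2]    # third column

# Equivalent: transpose, then unpack one variable per column.
a, b, c = housing_data.T

# A third index on a 2-D array raises IndexError.
try:
    housing_data[:, :, 0]
    raised = False
except IndexError:
    raised = True

print(x1, x2, y, raised)
```

The transpose-and-unpack form is handy when every column is needed; plain slicing is clearer when only one or two are.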

Answer 1 (score: 1)

You can select parts of the array with NumPy indexing and slicing:

# Top-right corner: first two rows, columns from index 1 onward
data[:2,1:]

# Row at index 2 (the bottom row of a three-row array)
data[2]

# Same row, with the column slice written out
data[2,:]
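
To make the slices above concrete, here is a short sketch on a hypothetical 3x3 array (the array contents are an assumption for illustration, not the asker's data). It also shows the column selections the question asks about:

```python
import numpy as np

# Hypothetical 3x3 array: [[0 1 2], [3 4 5], [6 7 8]]
data = np.arange(9).reshape(3, 3)

top_right = data[:2, 1:]     # rows 0-1, columns 1-2
last_row = data[2]           # same result as data[2, :]
first_col = data[:, 0]       # every row, column 0
two_cols = data[:, [0, 2]]   # columns 0 and 2 via an index list

print(top_right.shape, last_row, first_col, two_cols.shape)
```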

Or select by condition:

data[data>2]
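
Note that an element-wise mask such as data > 2 returns the matching values as a flat one-dimensional array. To keep whole rows, build the mask from a single column instead. A small sketch, using made-up values for illustration:

```python
import numpy as np

# Made-up values for illustration.
data = np.array([[1, 5, 2],
                 [4, 0, 3],
                 [7, 8, 9]])

# Element-wise mask: the result is flattened to 1-D.
flat = data[data > 2]

# Mask built from the first column: whole rows are kept.
rows = data[data[:, 0] > 2]

print(flat)
print(rows)
```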

You may also want to check the .csv file and the data type:

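# astype returns a converted copy, so assign the result
data = data.astype(float)

# a dtype can also be set explicitly when creating an array
data = np.arange(3, dtype=np.uint8)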
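
One way to check the file and dtype, sketched under the assumption that the CSV might start with a header row (the column names below are invented for illustration; the question does not say whether Housing.csv has a header):

```python
import io
import numpy as np

# Hypothetical file contents with a header row; the column names are invented.
csv_text = "size,bedrooms,price\n2104,3,399900\n1600,3,329900\n"

# skiprows=1 skips the header line, which loadtxt could not parse as a float.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

print(data.dtype)   # loadtxt defaults to a float dtype
print(data.shape)

# astype returns a new array; the original keeps its dtype.
as_int = data.astype(np.int64)
print(as_int.dtype, data.dtype)
```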