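I was searching for code that converts the rows of a .csv file into vectors, so that I could use a dataset in a deep-learning project with TensorFlow. I found the following code: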
import numpy as np

# NUM_LABELS (the number of classes) is assumed to be defined elsewhere in the module.

def extract_data(filename):
    # lists to hold the labels and feature vectors
    labels = []
    fvecs = []
    # iterate over the rows, splitting the label from the features;
    # convert labels to integers and features to floats
    with open(filename) as f:
        for line in f:
            row = line.split(',')
            labels.append(int(row[0]))
            fvecs.append([float(x) for x in row[1:2]])
    # convert the list of float lists into a numpy float matrix
    fvecs_np = np.matrix(fvecs).astype(np.float32)
    # convert the list of int labels into a numpy array
    labels_np = np.array(labels).astype(dtype=np.uint8)
    # convert the int numpy array into a one-hot matrix
    label_onehot = (np.arange(NUM_LABELS) == labels_np[:, None]).astype(np.float32)
    # return a pair of the features matrix and the one-hot label matrix
    return fvecs_np, label_onehot
I tried to walk through the code to learn from it, and then I came across this line:
fvecs.append([float(x) for x in row[1:2]])
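It seems to take the values at the second and third indexes of each row and assign them to x, but I can't fully understand why the author put a for after float(x), or why that whole expression is wrapped in square brackets before being appended.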
Answer 0 (score: 0)
Maybe this will help you understand:
# create a list of ints
x = [-2, -1, 0, 1, 2]
print('x: ', x)
# create an identical list to x
y = [element for element in x]
print('y: ', y)
# create a list where each element is the square of the element in x
s = [element**2 for element in x]
print('s: ', s)
# append a list to a list (to make a list of lists)
d = []
d.append(x)
d.append(s)
print('d: ', d)
# doing it all together
d = []
d.append([-2, -1, 0, 1, 2]) # the same as d.append(x)
d.append([element**2 for element in x]) # the same as d.append(s)
print('d_again: ', d)
Output (under Python 2):
('x: ', [-2, -1, 0, 1, 2])
('y: ', [-2, -1, 0, 1, 2])
('s: ', [4, 1, 0, 1, 4])
('d: ', [[-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]])
('d_again: ', [[-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]])
Using list comprehensions like this is considered "Pythonic".
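Applied to the line from the question, the comprehension is just a compact loop. The sketch below uses a made-up CSV line (`sample_line`, `fvecs_loop`, `features` and `all_features` are names assumed for illustration, not taken from the original code). It shows that the bracketed expression builds a new list, which is then appended to `fvecs` as a single element. Note also that a slice excludes its end index, so `row[1:2]` selects only the element at index 1, whereas `row[1:]` would take every column after the label.

```python
# Hypothetical CSV line for illustration: the label first, then two features.
sample_line = "1,0.5,2.25\n"
row = sample_line.split(',')   # ['1', '0.5', '2.25\n']

# The comprehension from the question...
fvecs = []
fvecs.append([float(x) for x in row[1:2]])

# ...does the same work as this explicit loop.
fvecs_loop = []
features = []
for x in row[1:2]:
    features.append(float(x))
fvecs_loop.append(features)

# Slicing to the end of the row keeps every feature column instead.
all_features = [float(x) for x in row[1:]]

print(fvecs)         # [[0.5]]
print(fvecs_loop)    # [[0.5]]
print(all_features)  # [0.5, 2.25]
```

The square brackets are what turn the `for` expression into a list, so each call to `append` adds one inner list per CSV row, and `fvecs` ends up as a list of lists that NumPy can convert into a matrix.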