How should I use a tf dataset to run model.predict(data) while still having access to the other elements of the dataset?
For example, my tf dataset has the following format: (tensor <(100, 224, 224, 3)>, tensor <(100,)>): the images are preprocessed to tf.float32, and each image's uuid is preprocessed to tf.string.
If I extract the feature vectors like this:

for image_data, uuids in ds.batch(100):
    features = model.predict(image_data)  # returns a NumPy array of features
At this point, features is a NumPy array of shape (100, 2048) and uuids is a tf.string tensor of shape (100,).
How can I combine them in order to write the feature vectors to disk?
From my understanding, I need to have both of them in the same format: either both as tensors, so I can keep using TensorFlow code and save the feature vectors as a TFRecord, or get the uuid as a Python string from the uuid tensor, so I can use plain Python code and save the array to a file with numpy.ndarray.tofile.
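To illustrate the second option, here is a minimal sketch of pulling both sides into NumPy, decoding the uuid bytes, and writing one raw-float32 file per image. The small `features` and `uuids` values below are hypothetical stand-ins for one batch from the loop above:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-ins for one batch:
features = np.random.rand(3, 2048).astype(np.float32)      # like model.predict output
uuids = tf.constant([b"img-001", b"img-002", b"img-003"])  # tf.string tensor

# .numpy() turns the tf.string tensor into an array of Python bytes objects;
# .decode() yields a regular str that can be used in a filename.
for vec, uid in zip(features, uuids.numpy()):
    vec.tofile(f"{uid.decode('utf-8')}.bin")  # raw float32 bytes on disk
```

The files can later be read back with numpy.fromfile(path, dtype=np.float32), since tofile stores no shape or dtype metadata.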
So my questions are:
- How can I make the features a tensor?
- Or can I get the string value out of the uuid tensor?
- Does anything sound wrong in what I am trying to do? Is there a more optimal way to build the input pipeline? Or did I misunderstand the usage of the Keras API and tf datasets?
If I use a plain Python pipeline, I can save the arrays to a file successfully. But I would like to use a tf dataset, because I expect it to be faster and better optimized, since it offers parallel map, batching, and autotuning of the number of parallel calls.
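For reference, the tf.data features mentioned above (parallel map, batching, autotuned parallelism) combine roughly like this. This is a hedged sketch using placeholder in-memory tensors instead of real image files, and `preprocess` is a hypothetical per-image step:

```python
import tensorflow as tf

# Placeholder data standing in for decoded images and their uuids.
images = tf.random.uniform((10, 224, 224, 3), dtype=tf.float32)
uuids = tf.constant([f"uuid-{i}" for i in range(10)])

def preprocess(img, uid):
    # Hypothetical per-image preprocessing; a real pipeline might
    # decode/resize here instead.
    return img / 255.0, uid

ds = (tf.data.Dataset.from_tensor_slices((images, uuids))
      .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)  # parallel map
      .batch(4)                                              # batching
      .prefetch(tf.data.AUTOTUNE))                           # overlap prep with compute

for batch_imgs, batch_uuids in ds:
    pass  # batch_imgs: (<=4, 224, 224, 3), batch_uuids: (<=4,)
```

With this layout, the uuids travel through the pipeline alongside the images, so each predicted batch can still be matched to its ids.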