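As shown here, youtube-8m tf-records are stored in the format given below. I wrote some code to extract the features, but there is a problem. The code successfully reads every element in the features, but it cannot read feature_lists. In fact, the parsed example does not contain feature_lists at all, and I get an error when I try to access it. How can I read feature_lists? I attach the data format, my code and the output: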
Data format:
context: {
  feature: {
    key  : "video_id"
    value: {
      bytes_list: {
        value: [YouTube video id string]
      }
    }
  }
  feature: {
    key  : "labels"
    value: {
      int64_list: {
        value: [1, 522, 11, 172] # The meaning of the labels can be found here.
      }
    }
  }
}
feature_lists: {
  feature_list: {
    key  : "rgb"
    value: {
      feature: {
        bytes_list: {
          value: [1024 8bit quantized features]
        }
      }
      feature: {
        bytes_list: {
          value: [1024 8bit quantized features]
        }
      }
      ... # Repeated for every second of the video, up to 300
    }
  }
  feature_list: {
    key  : "audio"
    value: {
      feature: {
        bytes_list: {
          value: [128 8bit quantized features]
        }
      }
      feature: {
        bytes_list: {
          value: [128 8bit quantized features]
        }
      }
      ... # Repeated for every second of the video, up to 300
    }
  }
}
This is the code:
import tensorflow as tf
from pprint import pprint

def readTfRecordSamples(tfrecords_filename):
    record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename)
    for string_record in record_iterator:
        example = tf.train.Example()
        example.ParseFromString(string_record)
        print("Example :")
        pprint(example)
        img_string = example.features
        print("Features are : \n")
        pprint(img_string)
        classID = example.features.feature['labels'].int64_list.value[0]
        videoID = example.features.feature['video_id'].bytes_list.value[0]
        print(classID, videoID)
        # Raises an error: the parsed Example has no feature_lists
        rgbArray = (example.feature_lists.feature_list['rgb']
                    .bytes_list
                    .value[0])
        raw_input(LineSeperator)
Answer 0 (score: 0)
Instead of
example = tf.train.Example()
try
example = tf.train.SequenceExample()
and then verify with
print(example)
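At least this works with Google's AudioSet dataset, which is said to have the same structure as youtube-8m (it also contains feature_lists) and has to be read as a sequence example.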
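Note that once the record is parsed as a SequenceExample, the field names follow the data format in the question: the per-video fields sit under context rather than features, and each entry of feature_list holds a repeated feature (one per frame) rather than a bytes_list directly. Below is a minimal sketch of how the reading loop might look under those assumptions. The names read_sequence_examples and decode_frame are hypothetical helpers made up for illustration, and the code assumes the TF 1.x API used in the question.

```python
def decode_frame(raw):
    """Turn one frame's byte string into a list of ints in 0..255."""
    return list(bytearray(raw))


def read_sequence_examples(tfrecords_filename):
    """Yield (video_id, labels, rgb_frames) for each record in the file."""
    # Imported here so decode_frame stays usable without TensorFlow installed.
    import tensorflow as tf

    # TF 1.x API, as in the question; in TF 2.x the same iterator is
    # available as tf.compat.v1.io.tf_record_iterator.
    for string_record in tf.python_io.tf_record_iterator(path=tfrecords_filename):
        example = tf.train.SequenceExample()
        example.ParseFromString(string_record)

        # Per-video fields live in `context`, not `features`.
        context = example.context.feature
        video_id = context['video_id'].bytes_list.value[0]
        labels = list(context['labels'].int64_list.value)

        # Each feature_list holds one Feature per frame, under `.feature`.
        rgb_features = example.feature_lists.feature_list['rgb'].feature
        rgb_frames = [decode_frame(f.bytes_list.value[0]) for f in rgb_features]
        yield video_id, labels, rgb_frames
```

The 'audio' list can be read the same way by swapping the key. The values returned by decode_frame are still the raw 8-bit quantized codes, not dequantized floats.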