Question

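As shown here, the youtube-8m tf-records are stored in the format given at the end of my question. I wrote some code to extract the features, but there is a problem: the code successfully reads every element in the context features, but it cannot read the feature_lists. In fact, the parsed example does not contain feature_lists at all, and I get an error when I try to access it. How can I read the feature_lists? I have attached the data format, my code, and the output: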

Data format:

context: {
  feature: {
    key  : "video_id"
    value: {
      bytes_list: {
        value: [YouTube video id string]
      }
    }
  }
  feature: {
    key  : "labels"
    value: {
      int64_list: {
        value: [1, 522, 11, 172] # The meaning of the labels can be found here.
      }
    }
  }
}

feature_lists: {
  feature_list: {
    key  : "rgb"
    value: {
      feature: {
        bytes_list: {
          value: [1024 8bit quantized features]
        }
      }
      feature: {
        bytes_list: {
          value: [1024 8bit quantized features]
        }
      }
      ... # Repeated for every second of the video, up to 300
    }
  }
  feature_list: {
    key  : "audio"
    value: {
      feature: {
        bytes_list: {
          value: [128 8bit quantized features]
        }
      }
      feature: {
        bytes_list: {
          value: [128 8bit quantized features]
        }
      }
      ... # Repeated for every second of the video, up to 300
    }
  }

}

And here is the code:

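import tensorflow as tf
from pprint import pprint


def readTfRecordSamples(tfrecords_filename):
    # Iterate over the serialized records in the TFRecord file (TF 1.x API).
    record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename)

    for string_record in record_iterator:
        # Parse each record as a plain tf.train.Example.
        example = tf.train.Example()
        example.ParseFromString(string_record)
        print("Example :")
        pprint(example)

        img_string = example.features
        print("Features are : \n")
        pprint(img_string)

        # The context-level features are read without problems.
        classID = example.features.feature['labels'].int64_list.value[0]
        videoID = example.features.feature['video_id'].bytes_list.value[0]
        print(classID, videoID)

        # This access raises an error.
        rgbArray = (example.feature_lists.feature_list['rgb']
                    .bytes_list
                    .value[0])

        # Pause between records; LineSeperator is assumed to be defined elsewhere.
        raw_input(LineSeperator)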

Answer 1

Instead of

example = tf.train.Example()

try

example = tf.train.SequenceExample()

Then verify with

print(example)

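This at least works for Google's AudioSet dataset, which is said to have the same structure as youtube-8m (it also contains feature_lists) and needs to be read as a SequenceExample.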
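
For illustration, here is a minimal sketch of reading the frame-level data once the record is parsed as a SequenceExample. Note that a SequenceExample exposes the video-level features under `context` rather than `features`, and that each `feature_list` holds one `feature` per frame, so the frames have to be iterated. The helper names `make_toy_sequence_example` and `read_frames` and the toy values are made up for this example. It builds a tiny record in memory instead of reading a real youtube-8m file, so the frames here are 3 bytes long rather than 1024.

```python
import tensorflow as tf


def make_toy_sequence_example():
    """Serialize a tiny record shaped like the format in the question (toy data)."""
    example = tf.train.SequenceExample()
    example.context.feature["video_id"].bytes_list.value.append(b"toy_video")
    example.context.feature["labels"].int64_list.value.extend([1, 522])
    # Two toy "frames"; real records hold one 1024-byte rgb string per second.
    for frame in (bytearray([0, 127, 255]), bytearray([10, 20, 30])):
        feature = example.feature_lists.feature_list["rgb"].feature.add()
        feature.bytes_list.value.append(bytes(frame))
    return example.SerializeToString()


def read_frames(serialized):
    """Parse one serialized SequenceExample and return (video_id, labels, rgb_frames)."""
    example = tf.train.SequenceExample()
    example.ParseFromString(serialized)

    # Video-level features live under `context`, not `features`.
    video_id = example.context.feature["video_id"].bytes_list.value[0]
    labels = list(example.context.feature["labels"].int64_list.value)

    # Each frame is one bytes string; turn it into a list of 8-bit integers.
    rgb_frames = [
        list(bytearray(feature.bytes_list.value[0]))
        for feature in example.feature_lists.feature_list["rgb"].feature
    ]
    return video_id, labels, rgb_frames


video_id, labels, rgb_frames = read_frames(make_toy_sequence_example())
print(video_id, labels, len(rgb_frames))
```

With a real file, the same `read_frames` call could be applied to each `string_record` produced by the record iterator in the question's code. The values are still the 8-bit quantized features; converting them back to floats is a separate step not shown here.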

Reading frame-level features from youtube-8m tf-records
