Reading frame-level features from youtube-8m tf-records

Time: 2017-08-22 07:55:12

Tags: tensorflow

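The youtube-8m tf-records are stored in the format shown at the end of my question. I wrote some code to extract the features, but there is a problem. The code successfully reads every element in features, but it cannot read feature_lists. In fact, the parsed example does not contain feature_lists at all, and I get an error when I try to access it. How can I read feature_lists? I have attached the data format, my code, and the output: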

Data format:

context: {
  feature: {
    key  : "video_id"
    value: {
      bytes_list: {
        value: [YouTube video id string]
      }
    }
  }
  feature: {
    key  : "labels"
    value: {
      int64_list: {
        value: [1, 522, 11, 172] # The meaning of the labels can be found here.
      }
    }
  }
}

feature_lists: {
  feature_list: {
    key  : "rgb"
    value: {
      feature: {
        bytes_list: {
          value: [1024 8bit quantized features]
        }
      }
      feature: {
        bytes_list: {
          value: [1024 8bit quantized features]
        }
      }
      ... # Repeated for every second of the video, up to 300
    }
  }
  feature_list: {
    key  : "audio"
    value: {
      feature: {
        bytes_list: {
          value: [128 8bit quantized features]
        }
      }
      feature: {
        bytes_list: {
          value: [128 8bit quantized features]
        }
      }
      ... # Repeated for every second of the video, up to 300
    }
  }

}

Here is the code:

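import tensorflow as tf
from pprint import pprint

# Prompt shown between records. Its definition is not given in the question,
# so this value is an assumption.
LineSeperator = "-" * 40

def readTfRecordSamples(tfrecords_filename):
    # Iterate over the raw serialized records (TensorFlow 1.x API).
    record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename)

    for string_record in record_iterator:
        example = tf.train.Example()
        example.ParseFromString(string_record)
        print("Example :")
        pprint(example)

        img_string = example.features
        print("Features are : \n")
        pprint(img_string)

        # This part works: the context features are read as expected.
        classID = example.features.feature['labels'].int64_list.value[0]
        videoID = example.features.feature['video_id'].bytes_list.value[0]
        print(classID, videoID)

        # Raises an error: tf.train.Example has no feature_lists field.
        rgbArray = (example.feature_lists.feature_list['rgb']
                    .bytes_list
                    .value[0])

        # Python 2: pause before moving on to the next record.
        raw_input(LineSeperator)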

1 Answer:

Answer 0: (score: 0)

Instead of

example = tf.train.Example()

use

example = tf.train.SequenceExample()

Then verify with

print(example)

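At least this works for Google's AudioSet dataset, which reportedly has the same structure as youtube-8m (it also contains feature_lists) and needs to be read as a SequenceExample.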
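A minimal sketch of what the access might look like once the record is parsed as a SequenceExample. The helper name parse_sequence_example and the returned dictionary layout are hypothetical, chosen for illustration. The sketch assumes each frame stores its quantized values in a single bytes entry, as the data format in the question suggests, and returns the raw 8-bit values without dequantizing them.

```python
import tensorflow as tf


def parse_sequence_example(serialized):
    """Parse one serialized record into plain Python values.

    Hypothetical helper: `serialized` is one raw record string, for
    example a string_record yielded by the iterator in the question.
    """
    seq = tf.train.SequenceExample()
    seq.ParseFromString(serialized)

    # Video-level data lives under `context`, not `features`.
    video_id = seq.context.feature['video_id'].bytes_list.value[0]
    labels = list(seq.context.feature['labels'].int64_list.value)

    def frames(key):
        # A feature_list holds one Feature per frame; each Feature is
        # assumed to carry a single bytes value of 8-bit quantized features.
        feature_list = seq.feature_lists.feature_list[key]
        return [list(bytearray(f.bytes_list.value[0]))
                for f in feature_list.feature]

    return {
        'video_id': video_id,
        'labels': labels,
        'rgb': frames('rgb'),      # one list of ints per frame
        'audio': frames('audio'),  # empty list if the key is absent
    }
```

Inside the loop from the question, this could be called as parse_sequence_example(string_record). Note that frame data sits in feature_list[key].feature[i].bytes_list, so there is one extra level of indexing compared with the access attempted in the question.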