Cannot iterate over a TensorFlow dataset created from the LibSVM generator: 'NoneType' object does not support item assignment

Time: 2019-05-16 08:15:47

Tags: python tensorflow ranking tensorflow-datasets

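I am trying to run the TensorFlow Ranking example with some custom data. The example ships with its own data.

Basically, I want to use tensorflow.data.Dataset.from_generator() to build a TensorFlow dataset that TF-Ranking can consume.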

I wrote the data to a LibSVM file with:

from sklearn.datasets import dump_svmlight_file
dump_svmlight_file(X=X, y=y, f=f, query_id=query_id)

The file looks like this:

0 qid:10 0:53156 1:6456 2:700
1 qid:10 0:48112 1:3535 2:700
2 qid:10 0:48112 1:3655 2:16500
3 qid:10 0:51641 1:8871 2:1200
4 qid:10 0:13207 1:2790 2:400
5 qid:10 0:8175  1:1656 2:700
6 qid:21 0:8175  1:1776 2:2700
7 qid:21 0:9620  1:2424 2:1600
8 qid:21 0:8079  1:2443 2:700
9 qid:25 0:13428 1:3777 2:800

Then I create the dataset with the following code:

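import tensorflow as tf
import tensorflow_ranking as tfr

_NUM_FEATURES_OWN = 3
_LIST_SIZE_OWN = 10

# _TRAIN_DATA_PATH_OWN is the path to the LibSVM file written above.
train_dataset_OWN = tf.data.Dataset.from_generator(
    tfr.data.libsvm_generator(
        _TRAIN_DATA_PATH_OWN, _NUM_FEATURES_OWN, _LIST_SIZE_OWN),
    # Feature keys are the strings "1" .. "3", one per feature.
    output_types=(
        {str(k): tf.float32 for k in range(1, _NUM_FEATURES_OWN + 1)},
        tf.float32,
    ),
    output_shapes=(
        {str(k): tf.TensorShape([_LIST_SIZE_OWN, 1])
         for k in range(1, _NUM_FEATURES_OWN + 1)},
        tf.TensorShape([_LIST_SIZE_OWN]),
    ),
)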

The dataset is created without complaint. However, when I try to iterate over it, I get this error:

train_dataset_OWN.make_one_shot_iterator().get_next()
InvalidArgumentError: TypeError: 'NoneType' object does not support item assignment
Traceback (most recent call last):

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/script_ops.py", line 207, in __call__
    ret = func(*args)

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 449, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id))

  File "/root/.local/lib/python3.6/site-packages/tensorflow_ranking/python/data.py", line 477, in inner_generator
    num_features, list_size, doc_list)

  File "/root/.local/lib/python3.6/site-packages/tensorflow_ranking/python/data.py", line 424, in _libsvm_generate
    features.get(feature_id)[idx, 0] = value

TypeError: 'NoneType' object does not support item assignment


     [[{{node PyFunc}}]] [Op:IteratorGetNextSync]

I created an example notebook here: https://colab.research.google.com/drive/1hAVJrQmbXD5h1pZfCKpkvSJib4_OaL1J

1 answer:

Answer 0 (score: 1)

I struggled with the same problem.
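
As far as I can tell from the traceback, the failing line is features.get(feature_id)[idx, 0] = value. If the generator only prepares buffers for the feature ids "1" to num_features (which is what the output_types keys in the question suggest, but I have not verified it against every version of the library), then a feature id of "0" from the file makes .get() return None, and assigning into None raises exactly this TypeError. A minimal sketch of that mechanism, using plain dicts as a hypothetical stand-in for the library's arrays:

```python
# Hypothetical stand-in for the per-feature buffers (the real library uses arrays).
num_features = 3
features = {str(k): {} for k in range(1, num_features + 1)}  # keys "1", "2", "3"


def assign(feature_id, idx, value):
    # Same shape as the failing line in the traceback.
    features.get(feature_id)[idx, 0] = value


assign("1", 0, 53156.0)  # a one-based id finds its buffer

try:
    assign("0", 0, 53156.0)  # a zero-based id has no buffer, so .get() gives None
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message matches the one in the question, which is why I suspected the feature ids in the file rather than the from_generator() call.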

What worked for me was indexing the features from 1 instead of 0, for example:

0 qid:10 1:53156 2:6456 3:700
1 qid:10 1:48112 2:3535 3:700
2 qid:10 1:48112 2:3655 3:16500
3 qid:10 1:51641 2:8871 3:1200
4 qid:10 1:13207 2:2790 3:400
5 qid:10 1:8175  2:1656 3:700
6 qid:21 1:8175  2:1776 3:2700
7 qid:21 1:9620  2:2424 3:1600
8 qid:21 1:8079  2:2443 3:700
9 qid:25 1:13428 2:3777 3:800
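
If the file has already been written with zero-based ids, it can be shifted after the fact. Below is a rough sketch using only the standard library; to_one_based and convert_file are names I made up for illustration, and the sketch assumes lines of the form "label qid:N id:value ..." without trailing "#" comments. If the file comes from sklearn's dump_svmlight_file, passing zero_based=False should write one-based ids directly, which avoids the conversion altogether.

```python
def to_one_based(line):
    """Shift every numeric feature id in one LibSVM line up by one.

    The label (no colon) and the qid token (non-numeric key) are left alone.
    """
    out = []
    for token in line.split():
        key, sep, value = token.partition(":")
        if sep and key.isdigit():
            token = f"{int(key) + 1}:{value}"
        out.append(token)
    return " ".join(out)


def convert_file(src_path, dst_path):
    """Write a one-based copy of a zero-based LibSVM file, skipping blank lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(to_one_based(line) + "\n")


print(to_one_based("0 qid:10 0:53156 1:6456 2:700"))
```

Run it only once per file, since a second pass would shift the ids again.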