Question

Here is my code:

img_path = tf.read_file(testqueue[0])
my_img = tf.image.decode_jpeg(img_path)
sess.run(my_img)
print my_img.get_shape()

The result is:

(?, ?, ?)

Why do I get this result?

Answer 1

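To answer this question with some detail:

Static information

tensor_name.shape returns the shape information that is available at graph-construction time. It relies only on the tensor's static properties.

tf.decode_jpeg is registered as an op in the TensorFlow source. While the graph is being built, TensorFlow runs shape propagation under an InferenceContext. Given the shape properties that are known for the input tensors, each operation provides a hint about what its output tensor will look like.

For example, an "rgb2gray" operation would simply copy the shape of its input tensor (say [b', h', w', c']) and set the output shape to [b', h', w', 1]. It does not need to know the exact values of b', h', w', because it can just carry those dimensions over.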
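The "rgb2gray" case can be sketched in a few lines of plain Python. This is only a toy model: the function name rgb2gray_shape and the use of None for an unknown dimension are assumptions made for illustration, not TensorFlow's API.

```python
from typing import Optional, Tuple

# None marks a dimension that is unknown while the graph is being built.
Shape = Tuple[Optional[int], ...]


def rgb2gray_shape(input_shape: Shape) -> Shape:
    """Hypothetical shape function: carry b, h, w over and force channels to 1."""
    if len(input_shape) != 4:
        raise ValueError("expected a rank-4 shape [b, h, w, c]")
    b, h, w, _ = input_shape
    # b, h and w are copied as they are, known or not.
    return (b, h, w, 1)


print(rgb2gray_shape((None, 128, 128, 3)))       # (None, 128, 128, 1)
print(rgb2gray_shape((None, None, None, None)))  # (None, None, None, 1)
```

Note that the function never looks at any pixel data; it only transforms shape metadata.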

Looking at the concrete implementation of the shape function for tf.decode_jpeg, this operation can clearly take the channels attribute into account:

// read the attribute "channels" from "tf.image.decode_jpeg(..., channels)"
TF_RETURN_IF_ERROR(c->GetAttr("channels", &channels));
// ....
// set the tensor information "my_img.get_shape()" will have
c->set_output(0, c->MakeShape({InferenceContext::kUnknownDim,
                                 InferenceContext::kUnknownDim, channels_dim}));

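The first two dimensions are set to InferenceContext::kUnknownDim, because the op only knows that there is a height and a width, while the concrete values can vary from image to image. For the channel axis it makes the best guess it can. If you specify the attribute, as in tf.decode_jpeg(..., channels=3), it can and will set the last dimension to 3.

In the question, channels is left at its default of 0, so the if-branch for channels == 0 is taken and the last dimension stays unknown as well. This results in the shape (?, ?, ?).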
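The behaviour of that shape function can be mimicked with a small Python model. It is a sketch under stated assumptions: decode_jpeg_static_shape and format_shape are hypothetical helpers, and None stands in for kUnknownDim.

```python
from typing import Optional, Tuple

Shape3 = Tuple[Optional[int], Optional[int], Optional[int]]


def decode_jpeg_static_shape(channels: int = 0) -> Shape3:
    """Toy model of the quoted shape function; None plays the role of kUnknownDim."""
    # channels == 0 means "use whatever the file contains", so nothing is known yet.
    channels_dim = None if channels == 0 else channels
    # Height and width are never known before the file has been decoded.
    return (None, None, channels_dim)


def format_shape(shape: Shape3) -> str:
    """Render a shape the way the question's output shows it, with ? for unknown."""
    return "(" + ", ".join("?" if d is None else str(d) for d in shape) + ")"


print(format_shape(decode_jpeg_static_shape()))            # (?, ?, ?)
print(format_shape(decode_jpeg_static_shape(channels=3)))  # (?, ?, 3)
```

The first call corresponds to the code in the question, where no channels argument is passed.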

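Runtime information

tf.shape, on the other hand, is an op of its own: it is defined in the Python API and ends up in a C++ kernel. That kernel inspects the actual tensor content: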

// get actual tensor-shape from the value itself
TensorShape shape;
OP_REQUIRES_OK(ctx, shape_op_helpers::GetRegularOrVariantShape(ctx, 0, &shape));
const int rank = shape.dims();
// write the tensor result from "tf.shape(...)"
for (int i = 0; i < rank; ++i) {
  int64 dim_size = shape.dim_size(i);
  // ...
  vec(i) = static_cast<OutType>(dim_size); // the actual size for dim "i"
}

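It is as if tf.shape were telling the preceding operations:

You can tell me whatever conclusions you reached a few minutes ago. I don't care how smart you were at that point, or what your best guess about the shape was. Look, I just inspect the concrete tensor, which has content now, and I am done.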
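The contrast between the two kinds of shape can be illustrated without TensorFlow. In this sketch a nested list stands in for a decoded tensor, and runtime_shape is a hypothetical helper that assumes the value is rectangular.

```python
def runtime_shape(value):
    """Measure the shape of a concrete nested-list value (assumed rectangular)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        # Descend into the first element; stop when the list is empty.
        value = value[0] if value else None
    return tuple(shape)


# What the graph could infer before any data existed: nothing.
static_shape = (None, None, None)

# A stand-in for a decoded image with 2 rows, 4 columns and 3 channels.
image = [[[0, 0, 0] for _ in range(4)] for _ in range(2)]

print(static_shape)          # (None, None, None)
print(runtime_shape(image))  # (2, 4, 3)
```

The static shape is a prediction made from metadata, while the runtime shape is measured from the value itself, which is why it is always fully known.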

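Consequences

This has some important consequences:

tf.shape(tensor_name) returns a tensor, whereas tensor_name.shape is static shape information and not a tensor
Some attributes require plain integers, so the tensor returned by tf.shape cannot be used for them
Graph optimizations (such as XLA) can only rely on tensor_name.shape
If you know the shape of your images (say, a database that contains only 128x128x3 images), you should set the shape explicitly, e.g., using tf.reshape(img, [128, 128, 3])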
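Declaring a known shape amounts to filling in the dimensions that static inference left unknown. The sketch below models that idea in plain Python; merge_shapes is a hypothetical helper, not TensorFlow code, and None again marks an unknown dimension.

```python
from typing import Optional, Tuple

Shape = Tuple[Optional[int], ...]


def merge_shapes(inferred: Shape, declared: Shape) -> Shape:
    """Combine an inferred static shape with a shape the user declares."""
    if len(inferred) != len(declared):
        raise ValueError("shapes must have the same rank")
    merged = []
    for a, b in zip(inferred, declared):
        # Two known dimensions must agree; otherwise the declaration is wrong.
        if a is not None and b is not None and a != b:
            raise ValueError(f"incompatible dimensions {a} and {b}")
        merged.append(a if a is not None else b)
    return tuple(merged)


print(merge_shapes((None, None, None), (128, 128, 3)))  # (128, 128, 3)
print(merge_shapes((None, None, 3), (128, 128, None)))  # (128, 128, 3)
```

Once every dimension is a plain integer, the shape can be used in the places that reject tensors.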

You may also be interested in tf.image.extract_jpeg_shape, which is implemented in the TensorFlow source as well.
