
Question

Does anyone know how to deal with TensorFlow's 'work_element_count' error?

F ./tensorflow/core/util/cuda_launch_config.h:127] Check failed: work_element_count > 0 (0 vs. 0)
Aborted (core dumped)

Here is part of my source code:

import sys

import numpy as np
import tensorflow as tf


class DiscriminatorModel:
    def __init__(self, session, some_parameters):
        self.sess = session
        self.parameters = some_parameters

    def build_feed_dict(self, input_frames, gt_output_frames, generator):
        feed_dict = {}
        batch_size = np.shape(gt_output_frames)[0]
        print(batch_size) # 1

        print(np.shape(generator.input_frames_train))   # (?,7,32,32,32,1)
        print(np.shape(input_frames))                   # (1,7,32,32,32,1)
        print(np.shape(generator.gt_frames_train))      # (?,7,32,32,32,1)
        print(np.shape(gt_output_frames))               # (1,7,32,32,32,1)

        g_feed_dict={generator.input_frames_train:input_frames,
                     generator.gt_frames_train:gt_output_frames}

        def getshape(d):
            if isinstance(d, dict):
                return {k:getshape(d[k]) for k in d}
            else:
                return None
        print("g_feed_dict shape :", getshape(g_feed_dict),"\n")
        # {<tf.Tensor 'generator/data/Placeholder:0' shape=(?, 32, 32, 32, 1) dtype=float32>: None, <tf.Tensor 'generator/data/Placeholder_1:0' shape=(?, 32, 32, 32, 1) dtype=float32>: None}

        print(sys.getsizeof(generator.scale_preds_train))    # 96
        print(sys.getsizeof(g_feed_dict))                    # 288


        # error occurs here.
        g_scale_preds = self.sess.run(generator.scale_preds_train, feed_dict=g_feed_dict)
        # F ./tensorflow/core/util/cuda_launch_config.h:127] Check failed: work_element_count > 0 (0 vs. 0)
        # Aborted (core dumped)

    def train_step(self, batch, generator):
        print(np.shape(batch))    # [1, 7, 32, 32, 32, 2]
        input_frames = batch[:, :, :, :, :, :-1]
        gt_output_frames = batch[:, :, :, :, :, -1:]

        feed_dict = self.build_feed_dict(input_frames, gt_output_frames, generator)

class GeneratorModel:
    def __init__(self, session, some_parameters):
        self.sess = session
        self.parameters = some_parameters

        self.input_frames_train = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])
        self.gt_frames_train = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])

        self.input_frames_test = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])
        self.gt_frames_test = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])

        self.scale_preds_train = []
        for p in range(4):
            # scale size, 4 --> 8 --> 16 --> 32
            sc = 4*(2**p)
            # this passes tf.Tensor array of shape (1,7,sc,sc,sc,1)
            train_preds = calculate(self.width_train,
                                    self.height_train,
                                    self.depth_train,
                                    ...)
            self.scale_preds_train.append(train_preds)

        # [ <..Tensor shape=(1,7,4,4,4,1) ....>,
        #   <..Tensor shape=(1,7,8,8,8,1) ....>,
        #   <..Tensor shape=(1,7,16,16,16,1)..>,
        #   <..Tensor shape=(1,7,32,32,32,1)..> ]
        print(self.scale_preds_train)

sess = tf.Session()
d_model = DiscriminatorModel(sess, some_parameters)
g_model = GeneratorModel(sess, some_parameters)
sess.run(tf.global_variables_initializer())

# this returns numpy array of shape [1,7,32,32,32,2]
batch = get_batch()

# trouble here.
d_model.train_step(batch, g_model)

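I have seen some suggestions about this:

Use CUDA 9.0 / cuDNN 7.0 / tensorflow-gpu 1.7.0 (-> I am already using these.)
Check that the batch size is greater than 0 (-> it seems to be.)
Do not use more GPUs than the number of samples in a batch (-> I don't.)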
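
The batch-size check can be made explicit before calling sess.run. Below is a minimal sketch, assuming the fed values are NumPy arrays whose shapes are obtained with np.shape. The helper name check_feed_shapes is hypothetical and not part of TensorFlow.

```python
from math import prod


def check_feed_shapes(shapes):
    """Raise ValueError if any fed array is empty.

    `shapes` maps a label (any string) to a shape tuple, for example the
    result of np.shape(array). Hypothetical helper, not a TensorFlow API.
    """
    for name, shape in shapes.items():
        if len(shape) == 0:
            continue  # a scalar always holds exactly one element
        if shape[0] == 0:
            raise ValueError(f"{name}: batch size is 0 (shape {shape})")
        if prod(shape) == 0:
            raise ValueError(f"{name}: zero-sized dimension in shape {shape}")


# Shapes taken from the question: both fed arrays hold one sample.
check_feed_shapes({
    "input_frames": (1, 7, 32, 32, 32, 1),
    "gt_output_frames": (1, 7, 32, 32, 32, 1),
})
```

If this passes, the fed data itself is not empty, which points toward a tensor created inside the graph rather than the inputs.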

I am using one 11 GB GPU out of five, selected with

~$ CUDA_VISIBLE_DEVICES=2 python3 foo.py

and the batch size is 1. Can anyone tell me what I am missing or doing wrong?

Edit 1.

I found a case that avoids this error. If I modify the input a little, for example

# ... previous code does not change
print(sys.getsizeof(g_feed_dict))                    # 288
temp_index = 0
temp_input = [generator.scale_preds_train[temp_index],
              generator.scale_preds_train[temp_index],
              generator.scale_preds_train[temp_index],
              generator.scale_preds_train[temp_index]]
# this <temp_input> does not raise error here.
# however, temp_index > 0 does not work.
g_scale_preds = self.sess.run(temp_input, feed_dict=g_feed_dict)

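this passes to sess.run an input whose shapes look like

[(1,7,4,4,4,1), (1,7,4,4,4,1), (1,7,4,4,4,1), (1,7,4,4,4,1)]

whereas it should (originally) be a list of scaled shapes such as [(1,7,4,4,4,1), (1,7,8,8,8,1), (1,7,16,16,16,1), (1,7,32,32,32,1)]. Also, the arrays in the feed_dict dictionary have shape (1,7,32,32,32,1).

The error seems to come from tensorflow-gpu trying to reach a wrong array index (where no memory is actually allocated), hence "work element count is 0" (but I am not sure).

I cannot understand why temp_index > 0 (e.g. 1, 2, 3) raises the same Check failed error, while 0 is the only value that runs without an error.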
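
One way to narrow this down without triggering the abort is to inspect the static shapes of the fetched tensors before running them, since the failed check concerns a GPU launch configuration requested for zero elements. The following is only a sketch: it assumes each shape was obtained with tensor.shape.as_list() (where unknown dimensions appear as None), and zero_sized_fetches is a hypothetical name.

```python
def zero_sized_fetches(static_shapes):
    """Return indices of shapes that contain a dimension equal to 0.

    `static_shapes` is a list of shape lists; None marks an unknown
    dimension and is skipped, so only statically known zeros are caught.
    """
    return [i for i, shape in enumerate(static_shapes)
            if any(dim == 0 for dim in shape if dim is not None)]


# Shapes printed in the question for generator.scale_preds_train.
shapes = [
    [1, 7, 4, 4, 4, 1],
    [1, 7, 8, 8, 8, 1],
    [1, 7, 16, 16, 16, 1],
    [1, 7, 32, 32, 32, 1],
]
print(zero_sized_fetches(shapes))
```

The final outputs listed in the question contain no zero dimension, so the same check would have to be applied to the intermediate tensors created inside calculate to be informative.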

Edit 2.

After changing the GPU from a TITAN Xp to a GeForce GTX, the error log shows

Floating point exception (core dumped)

with the same code (sess.run).

Answer 1

In my case, one of the conv layers had 0 output feature maps, which caused this problem.
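
A layer can end up with 0 output feature maps when the filter count is computed rather than written out, for example through integer division. Here is a minimal sketch with hypothetical names (scaled_filters, base_filters, reduction), none of which come from the question's code:

```python
def scaled_filters(base_filters, reduction):
    """Compute a conv layer's output channel count, refusing to return 0."""
    if reduction <= 0:
        raise ValueError("reduction must be positive")
    filters = base_filters // reduction  # integer division can floor to 0
    if filters < 1:
        raise ValueError(
            f"{base_filters} // {reduction} gives {filters} output feature maps")
    return filters


print(scaled_filters(64, 4))  # 16
```

Calling scaled_filters(2, 4) raises a ValueError at graph-construction time instead of aborting later inside the GPU kernel launch.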

Answer 2

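I have solved it now.

Just as the GTX error log told me, something became zero, and it was in fact a denominator (so it had nothing to do with any of the code above). The specs at the time of my last debugging session were:

CUDA 8.0 / Tensorflow 1.8.0

and of course the GeForce GTX. I think the log looked different (and slightly more detailed) because of the versions rather than the actual GPU, even though changing versions did not solve the problem by itself.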
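
Since the root cause was a denominator that became zero, a check at the point where that value is produced can turn the crash into a readable Python error. This is a sketch under the assumption that the value is a plain Python number known before the graph is run; require_nonzero and stride are hypothetical names.

```python
def require_nonzero(value, name):
    """Return `value` unchanged, or raise if it would be a zero denominator."""
    if value == 0:
        raise ValueError(f"{name} is 0 and would be used as a denominator")
    return value


stride = require_nonzero(2, "stride")  # hypothetical parameter
print(32 // stride)  # 16
```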

Answer 3

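I ran into the same problem while training a model on Colab. The problem was 'num_classes': it was set to 2 in the config file, while my model has 36 classes.

You should pay attention to num_classes in your config file.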
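
This kind of mismatch can be caught before training starts. Below is a minimal sketch, assuming the configured value and the list of label names have already been loaded into Python. The function name check_num_classes and the label names are hypothetical.

```python
def check_num_classes(configured, label_names):
    """Raise ValueError if the configured class count differs from the labels."""
    actual = len(set(label_names))
    if configured != actual:
        raise ValueError(
            f"num_classes is {configured} but the dataset has {actual} classes")


labels = [f"class_{i}" for i in range(36)]  # hypothetical label names
check_num_classes(36, labels)  # passes silently
```

With the values from this answer, check_num_classes(2, labels) raises a ValueError that names both numbers.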

Tensorflow Check failed: work_element_count > 0
