Google Cloud Dataflow jobs mysteriously breaking

Asked: 2019-02-26 22:00:57

Tags: google-cloud-platform google-cloud-dataflow

I have been repeatedly trying to run a set of Google Cloud Dataflow jobs that were working fine until recently but have now started failing. This error has been the most confusing one, simply because I have no idea what code it is referring to, and it appears to be internal to GCP.

My job ID is: 2019-02-26_13_27_30-16974532604317793751

I am running these jobs on n1-standard-96 instances.

The full traceback, for reference:

  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 642, in do_work
    work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 156, in execute
    op.start()
  File "dataflow_worker/shuffle_operations.py", line 49, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
    def start(self):
  File "dataflow_worker/shuffle_operations.py", line 50, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
    with self.scoped_start_state:
  File "dataflow_worker/shuffle_operations.py", line 65, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
    with self.scoped_process_state:
  File "dataflow_worker/shuffle_operations.py", line 66, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
    with self.shuffle_source.reader() as reader:
  File "dataflow_worker/shuffle_operations.py", line 68, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
    for key_values in reader:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", line 433, in __iter__
    for entry in entries_iterator:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", line 272, in next
    return next(self.iterator)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", line 230, in __iter__
    chunk, next_position = self.reader.Read(start_position, end_position)
  File "third_party/windmill/shuffle/python/shuffle_client.pyx", line 133, in shuffle_client.PyShuffleReader.Read
IOError: Shuffle read failed: DATA_LOSS: Missing last fragment of a large value.

1 answer:

Answer 0 (score: 0):

Could it be that the input data is larger now and Dataflow can no longer handle it?

My jobs had shuffle problems as well. They started working when I switched to the optional Dataflow Shuffle service, so you may want to try it. Just add the following to your job command:

--experiments shuffle_mode=service

Reference: see the "Using Cloud Dataflow Shuffle" section of this page.
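
For concreteness, here is a minimal sketch of how that flag can also be passed through Beam's PipelineOptions when launching a Python pipeline, instead of on the command line. The project, region, and bucket names are placeholders, not values from the question:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Placeholder project/region/bucket values; substitute your own.
options = PipelineOptions([
    '--runner=DataflowRunner',
    '--project=my-project',
    '--region=us-central1',
    '--temp_location=gs://my-bucket/temp',
    '--experiments=shuffle_mode=service',  # opt in to the Dataflow Shuffle service
])

pipeline = beam.Pipeline(options=options)
# ... build the pipeline as usual, then submit it:
# pipeline.run()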