Dataflow job fails with backend errors when writing to BigQuery

Date: 2019-11-27 19:16:43

Tags: python google-bigquery google-cloud-dataflow

I have a job that fails at the final import into BigQuery with several different errors. I have run it five times and it fails every time, although the error messages sometimes differ. The job runs fine when I run it locally against a SQLite database, so I believe the problem is on the Google backend.

One error message:


Another error message:

**Workflow failed. Causes: S04:write meter_traces_combined to BigQuery/WriteToBigQuery/NativeWrite failed., BigQuery import job "dataflow_job_5111748333716803539" failed., BigQuery creation of import job for table "meter_traces_combined" in dataset "ebce" in project "oeem-ebce-platform" failed., BigQuery execution failed., Unknown error.**

A third error message:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 649, in do_work
        work_executor.execute()
      File "/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py", line 178, in execute
        op.finish()
      File "dataflow_worker/native_operations.py", line 93, in dataflow_worker.native_operations.NativeWriteOperation.finish
      File "dataflow_worker/native_operations.py", line 94, in dataflow_worker.native_operations.NativeWriteOperation.finish
      File "dataflow_worker/native_operations.py", line 95, in dataflow_worker.native_operations.NativeWriteOperation.finish
      File "/usr/local/lib/python3.7/site-packages/dataflow_worker/nativefileio.py", line 465, in __exit__
        self.file.close()
      File "/usr/local/lib/python3.7/site-packages/apache_beam/io/filesystemio.py", line 217, in close
        self._uploader.finish()
      File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/gcsio.py", line 588, in finish
        raise self._upload_thread.last_error  # pylint: disable=raising-bad-type
      File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/gcsio.py", line 565, in _start_upload
        self._client.objects.Insert(self._insert_request, upload=self._upload)
      File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 1154, in Insert
        upload=upload, upload_config=upload_config)
      File "/usr/local/lib/python3.7/site-packages/apitools/base/py/base_api.py", line 715, in _RunMethod
        http_request, client=self.client)
      File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 908, in InitializeUpload
        return self.StreamInChunks()
      File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 1020, in StreamInChunks
        additional_headers=additional_headers)
      File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 971, in __StreamMedia
        self.RefreshResumableUploadState()
      File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 873, in RefreshResumableUploadState
        self.stream.seek(self.progress)
      File "/usr/local/lib/python3.7/site-packages/apache_beam/io/filesystemio.py", line 301, in seek
        offset, whence, self.position, self.last_block_position))
    NotImplementedError: offset: 10485760, whence: 0, position: 16777216, last: 8388608
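
For what it's worth, the failing frame is Beam's pipe-backed upload stream, which appears to support rewinding only to the start of the most recently uploaded block, while the resumable-upload retry asked for a mid-block offset. A rough stdlib mimic of that constraint (the class and method names below are made up for illustration, not Beam's actual API):

```python
class OneShotUploadStream:
    """Illustrative mimic of a pipe-backed upload stream.

    Bytes are consumed from a pipe as they are uploaded, so the stream
    can only rewind to the start of the block it most recently handed
    out; anything earlier is gone.
    """

    def __init__(self):
        self.position = 0
        self.last_block_position = 0

    def read_block(self, nbytes):
        # Hand out the next block; remember where it started so a retry
        # can re-send exactly this block and nothing earlier.
        self.last_block_position = self.position
        self.position += nbytes

    def seek(self, offset):
        if offset != self.last_block_position:
            # Earlier bytes were already consumed from the pipe.
            raise NotImplementedError(
                'offset: %d, whence: 0, position: %d, last: %d'
                % (offset, self.position, self.last_block_position))
        self.position = offset


MiB = 1024 * 1024
stream = OneShotUploadStream()
stream.read_block(8 * MiB)   # position 8 MiB; last block starts at 0
stream.read_block(8 * MiB)   # position 16 MiB; last block starts at 8 MiB
stream.seek(8 * MiB)         # rewind to the last block boundary: allowed
try:
    stream.read_block(8 * MiB)
    stream.seek(10 * MiB)    # mid-block offset, as in the traceback
except NotImplementedError as e:
    print(e)  # offset: 10485760, whence: 0, position: 16777216, last: 8388608
```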

Any ideas? The job ID is 2019-11-27_09_50_34-1251118406325466877, in case anyone from Google is reading this. Thanks.

1 answer:

Answer 0: (score: 0)

Google Cloud Support here. I checked your job and found two internal issues that are likely related to this failure. As Alex Amato suggested in the comments, I would try using

    --experiments=use_beam_bq_sink
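
For reference, a sketch of how that flag would be passed when launching the pipeline from the command line (the script name, region, and bucket below are placeholders, not values from the original job):

```shell
python my_pipeline.py \
  --runner=DataflowRunner \
  --project=oeem-ebce-platform \
  --region=us-central1 \
  --temp_location=gs://my-bucket/tmp \
  --experiments=use_beam_bq_sink
```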

Otherwise, I would recommend opening a ticket directly with GCP Support, as this may require further investigation.

I hope that helps.