Question

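My goal is to copy a table from a PostgreSQL database running on AWS RDS to a .csv file on Amazon S3. For this I am using AWS Data Pipeline and found the following tutorial, but after following all the steps my pipeline is stuck at "WAITING FOR RUNNER" (see the screenshot). The AWS documentation states: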

Ensure that you set a valid value for either the runsOn or workerGroup fields for those tasks.

However, the "Runs On" field is set. Any idea why this pipeline is stuck?

Here is my definition file:

{
  "objects": [
    {
      "output": {
        "ref": "DataNodeId_Z8iDO"
      },
      "input": {
        "ref": "DataNodeId_hEUzs"
      },
      "name": "DefaultCopyActivity01",
      "runsOn": {
        "ref": "ResourceId_oR8hY"
      },
      "id": "CopyActivityId_8zaDw",
      "type": "CopyActivity"
    },
    {
      "resourceRole": "DataPipelineDefaultResourceRole",
      "role": "DataPipelineDefaultRole",
      "name": "DefaultResource1",
      "id": "ResourceId_oR8hY",
      "type": "Ec2Resource",
      "terminateAfter": "1 Hour"
    },
    {
      "*password": "xxxxxxxxx",
      "name": "DefaultDatabase1",
      "id": "DatabaseId_BWxRr",
      "type": "RdsDatabase",
      "region": "eu-central-1",
      "rdsInstanceId": "aqueduct30v05.cgpnumwmfcqc.eu-central-1.rds.amazonaws.com",
      "username": "xxxx"
    },
    {
      "name": "DefaultDataFormat1",
      "id": "DataFormatId_wORsu",
      "type": "CSV"
    },
    {
      "database": {
        "ref": "DatabaseId_BWxRr"
      },
      "name": "DefaultDataNode2",
      "id": "DataNodeId_hEUzs",
      "type": "SqlDataNode",
      "table": "y2018m07d12_rh_ws_categorization_label_postgis_v01_v04",
      "selectQuery": "SELECT * FROM y2018m07d12_rh_ws_categorization_label_postgis_v01_v04 LIMIT 100"
    },
    {
      "failureAndRerunMode": "CASCADE",
      "resourceRole": "DataPipelineDefaultResourceRole",
      "role": "DataPipelineDefaultRole",
      "pipelineLogUri": "s3://rutgerhofste-data-pipeline/logs",
      "scheduleType": "ONDEMAND",
      "name": "Default",
      "id": "Default"
    },
    {
      "dataFormat": {
        "ref": "DataFormatId_wORsu"
      },
      "filePath": "s3://rutgerhofste-data-pipeline/test",
      "name": "DefaultDataNode1",
      "id": "DataNodeId_Z8iDO",
      "type": "S3DataNode"
    }
  ],
  "parameters": []
}

Answer 1

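Usually, the "WAITING FOR RUNNER" state means that the pipeline is waiting for a resource (such as an EMR cluster). You do not seem to have set the "workerGroup" field. This means you have specified what should be done, but not who should do it.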
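The condition quoted from the AWS documentation (every activity needs a valid runsOn or workerGroup) can be checked against a definition file before activating the pipeline. The following is only a minimal sketch: `check_runners` is a hypothetical helper, not part of any AWS SDK, and it assumes that activity objects can be recognized by a `type` ending in "Activity". The sample below is a trimmed copy of the definition from the question.

```python
# Hypothetical helper (not an AWS API): checks that each activity in a
# Data Pipeline definition names something that can run it.

def check_runners(definition):
    """Return a list of (activity_id, problem) tuples; empty if none found."""
    objects = {obj["id"]: obj for obj in definition.get("objects", [])}
    problems = []
    for obj in objects.values():
        # Assumption: activity types end in "Activity" (e.g. CopyActivity).
        if not obj.get("type", "").endswith("Activity"):
            continue
        runs_on = obj.get("runsOn")
        worker_group = obj.get("workerGroup")
        if runs_on is None and not worker_group:
            problems.append((obj["id"], "neither runsOn nor workerGroup is set"))
        elif runs_on is not None and runs_on.get("ref") not in objects:
            problems.append((obj["id"], "runsOn references an unknown object"))
    return problems


# Trimmed copy of the definition posted in the question.
definition = {
    "objects": [
        {
            "id": "CopyActivityId_8zaDw",
            "type": "CopyActivity",
            "runsOn": {"ref": "ResourceId_oR8hY"},
        },
        {
            "id": "ResourceId_oR8hY",
            "type": "Ec2Resource",
            "terminateAfter": "1 Hour",
        },
        {"id": "Default", "name": "Default", "scheduleType": "ONDEMAND"},
    ]
}

print(check_runners(definition))  # prints []
```

An empty list only means that this particular condition holds for the file, which matches the asker's statement that runsOn is set. It says nothing about whether the referenced resource can actually start and reach the service.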
