我的目标是将在AWS RDS上运行的postgreSQL数据库中的表复制到Amazone S3上的.csv文件。为此,我使用了AWS数据管道并找到了以下tutorial,但是当我按照所有步骤进行操作时,我的管道停留在"WAITING FOR RUNNER"
上,请参见屏幕截图。 AWS documentation指出:
确保为runsOn或workerGroup设置有效值 这些任务的字段
但是设置了“运行于”字段。知道为什么该管道卡住了吗?
和我的定义文件:
{
"objects": [
{
"output": {
"ref": "DataNodeId_Z8iDO"
},
"input": {
"ref": "DataNodeId_hEUzs"
},
"name": "DefaultCopyActivity01",
"runsOn": {
"ref": "ResourceId_oR8hY"
},
"id": "CopyActivityId_8zaDw",
"type": "CopyActivity"
},
{
"resourceRole": "DataPipelineDefaultResourceRole",
"role": "DataPipelineDefaultRole",
"name": "DefaultResource1",
"id": "ResourceId_oR8hY",
"type": "Ec2Resource",
"terminateAfter": "1 Hour"
},
{
"*password": "xxxxxxxxx",
"name": "DefaultDatabase1",
"id": "DatabaseId_BWxRr",
"type": "RdsDatabase",
"region": "eu-central-1",
"rdsInstanceId": "aqueduct30v05.cgpnumwmfcqc.eu-central-1.rds.amazonaws.com",
"username": "xxxx"
},
{
"name": "DefaultDataFormat1",
"id": "DataFormatId_wORsu",
"type": "CSV"
},
{
"database": {
"ref": "DatabaseId_BWxRr"
},
"name": "DefaultDataNode2",
"id": "DataNodeId_hEUzs",
"type": "SqlDataNode",
"table": "y2018m07d12_rh_ws_categorization_label_postgis_v01_v04",
"selectQuery": "SELECT * FROM y2018m07d12_rh_ws_categorization_label_postgis_v01_v04 LIMIT 100"
},
{
"failureAndRerunMode": "CASCADE",
"resourceRole": "DataPipelineDefaultResourceRole",
"role": "DataPipelineDefaultRole",
"pipelineLogUri": "s3://rutgerhofste-data-pipeline/logs",
"scheduleType": "ONDEMAND",
"name": "Default",
"id": "Default"
},
{
"dataFormat": {
"ref": "DataFormatId_wORsu"
},
"filePath": "s3://rutgerhofste-data-pipeline/test",
"name": "DefaultDataNode1",
"id": "DataNodeId_Z8iDO",
"type": "S3DataNode"
}
],
"parameters": []
}
答案 0 :(得分:0)
通常,“正在等待运行程序”状态表示它正在等待资源(例如EMR群集)。您似乎尚未设置“ workGroup”字段。这意味着您已指定要执行的操作,但未指定应由谁执行。