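I am using Python 2.7. I am trying to run a sqoop command through Python's subprocess.Popen to transfer a table to HDFS. Whenever I execute Popen(sqoop_cmd, stdout=subprocess.PIPE), the process exits and process.poll() returns status 1.
Looking at the output, I get: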
Warning: /opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.21/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
It does not get past this point. However, if I run the same command in a Linux terminal, I get the same warning but execution continues:
Warning: /opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.21/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/03/07 14:41:24 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.9.0
Enter password:
18/03/07 14:41:36 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
18/03/07 14:41:36 INFO manager.SqlManager: Using default fetchSize of 1000
18/03/07 14:41:36 INFO tool.CodeGenTool: Beginning code generation
18/03/07 14:41:36 INFO tool.CodeGenTool: Will generate java class as codegen_PRE_TRANSACTION_****.....
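My guess is that once the warning is emitted, Popen returns with the warning's status code and does not run the rest of the process. Here is my full code: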
import subprocess

with open(log_file, 'a+') as fh:
    process = subprocess.Popen(sqoop_cmd_list, stdout=subprocess.PIPE)
    while True:
        output = process.stdout.readline()
        # An empty read plus a non-None poll() means the process has exited.
        if output == '' and process.poll() is not None:
            break
        if output:
            print output.strip()
            fh.write(output)
    rc = process.poll()
and sqoop_cmd_list is:
sqoop_cmd_list = ['sqoop',
                  'import',
                  '--connect', jdbc_connector,
                  '--username', username,
                  '--password', password,
                  '--table', table_name,
                  '--split-by', column_name,
                  '--target-dir',
                  '--delete-target-dir',
                  destination_path,
                  '--as-parquetfile'
                  ]
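If the process really is stopping because of the warning, can I ignore the warning and let it continue? Or is there another way to run sqoop from Python?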
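
One way to narrow this down: a warning written to stderr does not by itself change a process's exit status, so a return code of 1 means the command itself exited with a failure. With only stdout=subprocess.PIPE, anything the command writes to stderr is not captured in the log file, and that may be where sqoop reports the actual error. Below is a minimal sketch, not a definitive fix, that merges stderr into stdout so the full output is logged. The helper name run_and_log and the demo log file name are hypothetical, and the demo uses the Python interpreter as a stand-in child process instead of sqoop.

```python
import os
import subprocess
import sys
import tempfile


def run_and_log(cmd_list, log_path):
    """Run cmd_list, echo and log stdout plus stderr, return the exit code.

    run_and_log is a hypothetical helper name, not part of any library.
    """
    with open(log_path, 'ab') as fh:
        process = subprocess.Popen(
            cmd_list,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # send stderr down the same pipe
        )
        # readline() returns b'' only at end of file, that is, once the
        # child has closed its output.
        for line in iter(process.stdout.readline, b''):
            print(line.decode('utf-8', 'replace').rstrip())
            fh.write(line)
        process.stdout.close()
        # wait() blocks until the child exits and returns its exit status.
        return process.wait()


if __name__ == '__main__':
    # Stand-in command (assumption): a child that warns on stderr and then
    # exits with status 1, similar in shape to the failing run above.
    demo_cmd = [
        sys.executable, '-c',
        "import sys; sys.stderr.write('Warning: demo\\n'); sys.exit(1)",
    ]
    demo_log = os.path.join(tempfile.gettempdir(), 'demo_sqoop.log')
    rc = run_and_log(demo_cmd, demo_log)
    print('exit code: %d' % rc)
```

Calling run_and_log(sqoop_cmd_list, log_file) in place of the loop above should put whatever sqoop prints after the Accumulo warning into the log, which is the first thing to check before trying to suppress the warning.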