Below is the Python code I am running to call Sqoop, but apart from the following lines it does not capture the logs:
Warning: /usr/hdp/2.6.4.0-91/accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation.
import subprocess
job = "sqoop-import --direct --connect 'jdbc:sqlserver://host' --username myuser --password-file /user/ivr_sqoop --table data_app_det --delete-target-dir --verbose --split-by attribute_name_id --where \"db_process_time BETWEEN ('2018-07-15') and ('9999-12-31')\""
print job
with open('save.txt','w') as fp:
    proc = subprocess.Popen(job, stdout=fp, stderr=subprocess.PIPE, shell=True)
    stdout, stderr = proc.communicate()
print "Here is the return code :: " + str(proc.returncode)
print stdout
Please let me know if there is anything wrong with the way I am invoking it.
Note: the same sqoop command run on its own works fine and produces all the logs.
I also tried the following, with the same result:
import subprocess
job = "sqoop-import --direct --connect 'jdbc:sqlserver://host' --username myuser --password-file /user/ivr_sqoop --table data_app_det --delete-target-dir --verbose --split-by attribute_name_id --where \"db_process_time BETWEEN ('2018-07-15') and ('9999-12-31')\""
proc = subprocess.Popen(job, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
stdout, stderr = proc.communicate()
and also with '2> mylog.log' at the end of the cmd:
import subprocess
job = "sqoop-import --direct --connect 'jdbc:sqlserver://host' --username myuser --password-file /user/ivr_sqoop --table data_app_det --delete-target-dir --verbose --split-by attribute_name_id --where \"db_process_time BETWEEN ('2018-07-15') and ('9999-12-31')\" > mylog.log "
proc = subprocess.Popen(job, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
stdout, stderr = proc.communicate()
I found the following similar question, but it has no answer either:
Subprocess Popen : Ignore Accumulo warning and continue execution of Sqoop
Answer 0 (score: 2):
It is not capturing the Sqoop logs because you added shell=True. Remove shell=True from the command and add universal_newlines=True instead; it will then show the console logs (universal_newlines=True also makes communicate() return text rather than bytes).
Working code snippet:
import subprocess
import logging

logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG)

# Function to run Hadoop command
def run_unix_cmd(args_list):
    """
    run linux commands
    """
    print('Running system command: {0}'.format(' '.join(args_list)))
    proc = subprocess.Popen(args_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
    s_output, s_err = proc.communicate()
    s_return = proc.returncode
    return s_return, s_output, s_err

# Create Sqoop Job
def sqoop_job():
    """
    Create Sqoop job
    """
    cmd = ['sqoop', 'import', '--connect', 'jdbc:oracle:thin:@//host:port/schema', '--username', 'user', '--password', 'XX', '--query', '"your query"', '-m', '1', '--target-dir', 'tgt_dir']
    print(cmd)
    (ret, out, err) = run_unix_cmd(cmd)
    print(ret, out, err)
    if ret == 0:
        logging.info('Success.')
    else:
        logging.info('Error.')

if __name__ == '__main__':
    sqoop_job()
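Applied to the sqoop-import command from the question, the same idea might look roughly like this. This is only an untested sketch: it assumes shlex.split reproduces the quoting the shell was handling for the --connect and --where arguments, and it checks both streams since the Sqoop console output may land on either one.

import shlex
import subprocess

# Untested sketch: split the original command string into an argument list so
# shell=True is no longer needed. shlex.split keeps the quoted --where clause
# as a single argument, the way the shell did.
job = ("sqoop-import --direct --connect 'jdbc:sqlserver://host' "
       "--username myuser --password-file /user/ivr_sqoop --table data_app_det "
       "--delete-target-dir --verbose --split-by attribute_name_id "
       "--where \"db_process_time BETWEEN ('2018-07-15') and ('9999-12-31')\"")

proc = subprocess.Popen(shlex.split(job),
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,
                        universal_newlines=True)
stdout, stderr = proc.communicate()

# Check both streams; much of the Sqoop/Hadoop console logging may appear on stderr.
print(stdout)
print(stderr)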