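I am writing a Python script that uses the beeline CLI to pull metadata and write it to a file for me to parse. I cannot work out the correct escape sequence for passing a command from Python through to beeline.
Here is my code: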
#!/usr/bin/python
import commands
import subprocess
import sys

hive_cmd = "beeline -u \"jdbc:hive2://$(hostname -f):10000/;principal=hive/_HOST@some_company.COM\" \
    --silent=true\
    --outputformat=csv2\
    --showHeader=false\
    --force=true\
    --showWarnings=false\
    -e 'USE some_database; "

if __name__ == '__main__':
    describe_table_cmd = ""
    f = open('run_dml.sql', 'w+')
    tables_list_cmd = hive_cmd + "SHOW TABLES;'"
    status, tables_list = commands.getstatusoutput(tables_list_cmd)
    for table_name in tables_list.splitlines():
        if not 'HotSpot' in table_name:
            describe_table_cmd += "\!sh echo {0};\ndescribe {0};\n".format(str(table_name))
        if table_name == "2table":
            break
    print(describe_table_cmd)
    status2, columns_list = commands.getstatusoutput(hive_cmd + describe_table_cmd + "'")
    for line in columns_list.splitlines():
        f.write("{0}\n".format(line))
    f.close()
This is the console output:
[username@server ~]$ ./r.py
\!sh echo 1table;
describe 1table;
\!sh echo 2table;
describe 2table;
[username@server ~]$
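The printed commands suggest that the backslash reaches beeline unchanged: `\!` is not a recognized Python escape sequence, so Python keeps the backslash in the string. A minimal check of that behavior, assuming Python 3 and writing the literal as `\\!` (the same string) to avoid an invalid-escape warning:

```python
# "\!" is not a recognized Python escape, so the backslash stays in the string.
# The template is spelled with "\\!" here, which produces the identical text.
template = "\\!sh echo {0};\ndescribe {0};\n"

# Same formatting step as in the script, for one example table name.
fragment = template.format("1table")

# Prints the two lines exactly as they appear in the console output,
# including the leading backslash.
print(fragment)
```

If beeline expects a bare `!sh`, the leading backslash alone could be enough to stop it from treating the line as a beeline command, though that part is an assumption and not something the output above proves.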
And this is what ends up in run_dml.sql:
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Error: Error while compiling statement: FAILED: ParseException line 1:1 cannot recognize input near '!' 'sh' 'echo' (state=42000,code=40000)
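Obviously, I am trying to run an echo statement on the shell so that each table's name appears next to its describe output. If there is a better way to fetch Hive metadata (table names and their columns) dynamically, I would be happy to use it. Until then, this is the offending line: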
describe_table_cmd += "\!sh echo {0};\ndescribe {0};\n".format(str(table_name))
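One possible way to sidestep the escaping problem is to drop `!sh echo` altogether and label each row in Python. The sketch below is a rough outline under several assumptions: it targets Python 3 and `subprocess` instead of the `commands` module, it passes arguments as a list so no shell quoting is involved, `socket.getfqdn()` stands in for `$(hostname -f)`, and the helper names `build_beeline_args`, `run_beeline`, and `describe_tables` are hypothetical. It starts one beeline process per table, which is simple but slow for large databases.

```python
import socket
import subprocess


def build_beeline_args(statement, host=None):
    """Build the argv list for one beeline call that runs a single statement."""
    host = host or socket.getfqdn()  # assumption: replaces $(hostname -f)
    url = "jdbc:hive2://{0}:10000/;principal=hive/_HOST@some_company.COM".format(host)
    return [
        "beeline", "-u", url,
        "--silent=true",
        "--outputformat=csv2",
        "--showHeader=false",
        "-e", statement,
    ]


def run_beeline(statement, host=None, runner=subprocess.check_output):
    """Run one statement and return its non-empty stdout lines."""
    output = runner(build_beeline_args(statement, host))
    if isinstance(output, bytes):
        output = output.decode("utf-8", "replace")
    return [line for line in output.splitlines() if line.strip()]


def describe_tables(database, host=None, runner=subprocess.check_output):
    """Yield one 'table,<describe row>' string per column of every table."""
    for table in run_beeline("SHOW TABLES IN {0}".format(database), host, runner):
        statement = "DESCRIBE {0}.{1}".format(database, table)
        for row in run_beeline(statement, host, runner):
            # The table name is prepended here, so no shell echo is needed.
            yield "{0},{1}".format(table, row)
```

Writing the result is then a matter of iterating over `describe_tables("some_database")` and writing each row to `run_dml.sql`. Because only stdout is captured, the Java HotSpot warnings should not end up in the file if they are printed to stderr, which is an assumption about the environment.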