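I'm writing a script that submits multiple (30) SQL queries to Google BigQuery. What is the best way to loop over the queries? My code works, but it doesn't feel very Pythonic.
I need to pass the query name in the job_id and then submit the query.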
def run_query(query, job_id):
    try:
        query_job = client.query(query, job_id=job_id)
        polling = 1
        while query_job.done() is False:
            if "q1_" in job_id:
                time.sleep(20)
                print("Job State : {} - Polling : {}".format(query_job.state, polling))
                polling += 1
                query_job.reload()
            else:
                time.sleep(1)
                print("Job State : {} - Polling : {}".format(query_job.state, polling))
                polling += 1
                query_job.reload()
    except Conflict as err:
        print("Could not run Query. System Message: \n{}".format(err))
        sys.exit()
q1 = """SELECT * FROM XYZ"""
q2 = """SELECT TOP 10 * FROM YZF"""
q3 = """select id from fjfj"""
q4 = """SELECT * FROM XYZ"""
q5 = """SELECT TOP 10 * FROM YZF"""
q6 = """select id from fjfj"""
query_jobs = [q1, q2, q3, q4, q5, q6]
q = 0
for query in query_jobs:
    randid = str(uuid.uuid4())
    q += 1
    queries = "q" + str(q)
    job_id = queries + "_" + randid
    run_query(query, job_id)
    print(job_id)
Answer 0 (score: 1)
Your code looks fine to me. You could improve it slightly by using enumerate in the loop instead of a manual counter:
# start=1 keeps the numbering at q1, q2, ... as in the original counter
for i, query in enumerate(query_jobs, start=1):
    randid = str(uuid.uuid4())
    queries = "q" + str(i)
    job_id = queries + "_" + randid
    run_query(query, job_id)
    print(job_id)
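To check the naming scheme without touching BigQuery, here is a minimal sketch that only builds the job IDs. The helper name make_job_id is hypothetical (it is not part of any library), and start=1 is used so the first ID begins with q1_, as in the question.

```python
import uuid


def make_job_id(position):
    """Return an ID such as 'q1_<uuid4>' (hypothetical helper for illustration)."""
    return "q{}_{}".format(position, uuid.uuid4())


query_jobs = ["SELECT * FROM XYZ", "select id from fjfj"]

# enumerate(..., start=1) yields (1, first_query), (2, second_query), ...
job_ids = [make_job_id(i) for i, _query in enumerate(query_jobs, start=1)]
for job_id in job_ids:
    print(job_id)
```

Each printed ID starts with the query's position and ends with a random UUID, so the "q1_" check in run_query still works.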
Answer 1 (score: 0)
First, you can simplify the run_query function by replacing the duplicated if/else branches with a single call:
time.sleep(20 if "q1_" in job_id else 1)
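This is a conditional expression. It works in Python 3 and also in Python 2.5 and later, so 2.7 is fine.
Next, take a look at Python string formatting, which really helps when building the job ID.
Finally, you could end up with something like this: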
query_jobs = ["""SELECT * FROM XYZ""",
              # (...)
              """select id from fjfj"""]
# start=1 so the first job ID is q1_..., matching the check in run_query
for i, query in enumerate(query_jobs, start=1):
    job_id = "q%s_%s" % (i, uuid.uuid4())
    run_query(query, job_id)
    print(job_id)
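The first suggestion in this answer can be sketched as a complete polling loop. This is only an illustration under stated assumptions: FakeJob is a made-up stand-in exposing the done(), reload() and state members that the question's code relies on, wait_for_job is a hypothetical name, and the sleep function is passed in so the demonstration does not actually wait.

```python
import time


class FakeJob(object):
    """Stand-in for a BigQuery job (an assumption for illustration only)."""

    def __init__(self, polls_needed):
        self.polls_left = polls_needed
        self.state = "RUNNING"

    def done(self):
        return self.polls_left <= 0

    def reload(self):
        self.polls_left -= 1
        if self.polls_left <= 0:
            self.state = "DONE"


def wait_for_job(query_job, job_id, sleep=time.sleep):
    """Poll until the job finishes and return the number of polls."""
    # One conditional expression replaces the duplicated if/else branches.
    delay = 20 if "q1_" in job_id else 1
    polls = 0
    while not query_job.done():
        sleep(delay)
        polls += 1
        print("Job State : {} - Polling : {}".format(query_job.state, polls))
        query_job.reload()
    return polls


# A no-op sleep keeps the demonstration instantaneous.
polls = wait_for_job(FakeJob(polls_needed=3), "q2_demo", sleep=lambda seconds: None)
```

In the real script the loop body would stay inside run_query, with query_job coming from client.query as in the question.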
Answer 2 (score: 0)
I would suggest keeping the queries in a dictionary whose keys describe what each query is for.
QUERIES = {
    "q1_XYZ": """SELECT * FROM XYZ""",
    "q2_YZF": """SELECT TOP 10 * FROM YZF""",
    "q3_FJFJ": """select id from fjfj""",
    "q4_XYZ2": """SELECT * FROM XYZ""",
    "q5_YZF": """SELECT TOP 10 * FROM YZF""",
    "q6_FJFJ": """select id from fjfj"""
}
# Iterate over the dictionary defined above (QUERIES, not query_jobs)
for job_id, query in QUERIES.items():
    run_query(query, job_id)
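One caveat, offered as an assumption about how the script is run: BigQuery is expected to reject a job ID that was already used in the same project (the question's code catches Conflict for that case), so fixed dictionary keys could collide on a second run. A minimal sketch that keeps the descriptive key as a prefix and appends a random suffix, as the question does; unique_job_id is a hypothetical helper name.

```python
import uuid

QUERIES = {
    "q1_XYZ": "SELECT * FROM XYZ",
    "q3_FJFJ": "select id from fjfj",
}


def unique_job_id(name):
    """Append a random UUID so a rerun does not reuse an earlier job ID."""
    return "{}_{}".format(name, uuid.uuid4())


# Map each descriptive name to the ID that would be passed to run_query.
job_ids = {name: unique_job_id(name) for name in QUERIES}
for name, job_id in sorted(job_ids.items()):
    print(job_id)
```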
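Depending on how complex this gets, I would suggest adding more attributes to each entry. The benefit is that if run_query later needs more complex logic, you can drive it from those attributes rather than from the query's job_id.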
QUERIES = {
    "q1_XYZ": {'query': """SELECT * FROM XYZ""", 'is_A': True, 'cost': 100},
    # << more samples >>
}
# Iterate over the dictionary defined above (QUERIES, not query_jobs)
for job_id, details in QUERIES.items():
    run_query(details['query'], job_id)
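As one possible use of such attributes, the polling interval could live in the dictionary instead of being inferred from "q1_" in the job ID. This is a sketch only: the poll_seconds attribute, DEFAULT_POLL_SECONDS and poll_seconds_for are hypothetical names introduced for illustration.

```python
QUERIES = {
    "q1_XYZ": {"query": "SELECT * FROM XYZ", "poll_seconds": 20},
    "q3_FJFJ": {"query": "select id from fjfj"},
}

DEFAULT_POLL_SECONDS = 1


def poll_seconds_for(details):
    """Read the polling interval from the entry, falling back to the default."""
    return details.get("poll_seconds", DEFAULT_POLL_SECONDS)


for job_id, details in sorted(QUERIES.items()):
    print("{}: poll every {}s".format(job_id, poll_seconds_for(details)))
```

run_query could then take the interval as a parameter, so slow queries are marked explicitly rather than by naming convention.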