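The script takes far longer to run than expected. Looping over just 1,250 records and inserting them into a table takes more than 20 minutes. Please let me know whether this is normal.
The script below extracts 11 columns from an API (JSON) and loads each row into a table (Oracle).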
Is there a way to speed this up, for example with an index? Any suggestions are welcome.
Script:
import json
import requests

# url, user, passwd, cursor, con and PrestageTable are defined elsewhere in the script
auth_values = (user, passwd)
response = requests.get(url, auth=auth_values)
json_data = json.loads(response.text)
for data in json_data['result']:
    branchFullName = data['full_name']
    branchNum = data['u_branch_id']
    branchName = data['u_branch_name']
    sysId = data['sys_id']
    sys_updated_on = data['sys_updated_on']
    sys_created_on = data['sys_created_on']
    # One execute() call per record
    cursor.execute(
        "INSERT INTO " + PrestageTable + " (BRANCH_FULL_NAME, BRANCH_NUM, BRANCH_NAME, "
        "SYS_ID, SYS_CREATED_ON, SYS_UPDATED_ON) VALUES (:1, :2, :3, :4, :5, :6)",
        # Bind values listed in the same order as the columns
        (branchFullName, branchNum, branchName, sysId, sys_created_on, sys_updated_on))
con.commit()
I have added the JSON file.
Updated:
insert_data = []
for data in json_data['result']:
    branchFullName = data['full_name']
    branchNum = data['u_branch_id']
    branchName = data['u_branch_name']
    sysId = data['sys_id']
    sys_updated_on = data['sys_updated_on']
    sys_created_on = data['sys_created_on']
    insert_data.append(
        (branchFullName, branchNum, branchName, sysId, sys_updated_on, sys_created_on)
    )
# mogrify() is a psycopg2 cursor method and returns bytes on Python 3, hence the decode()
args_str = ','.join(cursor.mogrify("(%s,%s,%s,%s,%s,%s)", x).decode() for x in insert_data)
cursor.execute(f"INSERT INTO {PrestageTable} VALUES " + args_str)
con.commit()
Answer 0 (score: 1)
As I said, you should insert multiple rows at once and call execute only one time.
Try this:
insert_data = []
for data in json_data['result']:
    ...  # branchFullName, branchNum, etc. variables
    insert_data.append(
        (branchFullName, branchNum, branchName, sysId, sys_updated_on, sys_created_on)
    )
# mogrify() is a psycopg2 cursor method and returns bytes on Python 3, hence the decode()
args_str = ','.join(cursor.mogrify("(%s,%s,%s,%s,%s,%s)", x).decode() for x in insert_data)
cursor.execute(f"INSERT INTO {PrestageTable} VALUES " + args_str)
con.commit()
Note that execute is outside the loop.
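The idea in this answer, one statement carrying many rows, can also be written with bound placeholders instead of mogrify. The following is only a minimal sketch under stated assumptions: it uses the standard-library sqlite3 module as a stand-in database (not Oracle), the table name `prestage`, the helper `build_multirow_insert` and the sample rows are hypothetical, and the `?` placeholder style is SQLite's. Older Oracle versions do not accept the multi-row `VALUES (...), (...)` form, so this illustrates the technique rather than a drop-in fix for the question.

```python
import sqlite3


def build_multirow_insert(table, columns, row_count):
    """Return one INSERT statement with a placeholder group per row."""
    group = "(" + ", ".join("?" for _ in columns) + ")"
    groups = ", ".join(group for _ in range(row_count))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {groups}"


# Hypothetical sample rows standing in for json_data['result'].
rows = [
    ("North / 001", "001", "North", "a1"),
    ("South / 002", "002", "South", "b2"),
]
columns = ["BRANCH_FULL_NAME", "BRANCH_NUM", "BRANCH_NAME", "SYS_ID"]

demo_con = sqlite3.connect(":memory:")
demo_con.execute(
    "CREATE TABLE prestage "
    "(BRANCH_FULL_NAME TEXT, BRANCH_NUM TEXT, BRANCH_NAME TEXT, SYS_ID TEXT)"
)

# Build the statement once, flatten the row tuples into one value list,
# and call execute() a single time, outside any loop.
sql = build_multirow_insert("prestage", columns, len(rows))
flat_values = [value for row in rows for value in row]
demo_con.execute(sql, flat_values)
demo_con.commit()

inserted = demo_con.execute("SELECT COUNT(*) FROM prestage").fetchone()[0]
print(inserted)
```

Databases limit how many bind variables a single statement may carry, so for large inputs the rows would have to be split into several statements.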
Answer 1 (score: 1)
Use cursor.executemany() to insert all rows at once. This requires building a two-dimensional list of parameters, with one entry per row.
params = []
for data in json_data['result']:
    branchFullName = data['full_name']
    branchNum = data['u_branch_id']
    branchName = data['u_branch_name']
    sysId = data['sys_id']
    sys_updated_on = data['sys_updated_on']
    sys_created_on = data['sys_created_on']
    # One tuple per row, in the same order as the column list below
    params.append((branchFullName, branchNum, branchName, sysId, sys_created_on, sys_updated_on))
# A single executemany() call sends all rows
cursor.executemany(
    "INSERT INTO " + PrestageTable + " (BRANCH_FULL_NAME, BRANCH_NUM, BRANCH_NAME, "
    "SYS_ID, SYS_CREATED_ON, SYS_UPDATED_ON) VALUES (:1, :2, :3, :4, :5, :6)",
    params)
con.commit()
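For larger result sets it may be worth sending the parameter list in batches rather than in one call. The sketch below shows one possible way to do that and is not part of the original answer: the helpers `to_params` and `insert_in_batches`, the batch size, the sample records and the table name `prestage` are all hypothetical, and sqlite3 with `?` placeholders is used only so the example runs without an Oracle instance. With an Oracle driver the statement would use `:1 ... :6` as in the answer above.

```python
import sqlite3

# JSON keys in the same order as the target columns.
COLUMNS = (
    "full_name", "u_branch_id", "u_branch_name",
    "sys_id", "sys_created_on", "sys_updated_on",
)


def to_params(records):
    """Turn a list of JSON objects into a list of row tuples."""
    return [tuple(record[key] for key in COLUMNS) for record in records]


def insert_in_batches(cursor, sql, params, batch_size=500):
    """Call executemany() once per slice of at most batch_size rows."""
    for start in range(0, len(params), batch_size):
        cursor.executemany(sql, params[start:start + batch_size])


# Hypothetical records standing in for json_data['result'].
records = [
    {"full_name": f"Branch / {i:03d}", "u_branch_id": f"{i:03d}",
     "u_branch_name": f"Branch {i}", "sys_id": f"id{i}",
     "sys_created_on": "2020-01-01", "sys_updated_on": "2020-01-02"}
    for i in range(3)
]

demo_con = sqlite3.connect(":memory:")
demo_cur = demo_con.cursor()
demo_cur.execute(
    "CREATE TABLE prestage (BRANCH_FULL_NAME TEXT, BRANCH_NUM TEXT, "
    "BRANCH_NAME TEXT, SYS_ID TEXT, SYS_CREATED_ON TEXT, SYS_UPDATED_ON TEXT)"
)

params = to_params(records)
insert_in_batches(
    demo_cur, "INSERT INTO prestage VALUES (?, ?, ?, ?, ?, ?)", params, batch_size=2
)
demo_con.commit()  # commit once, after all batches

row_count = demo_cur.execute("SELECT COUNT(*) FROM prestage").fetchone()[0]
print(row_count)
```

Building the tuples from a fixed key list keeps the value order and the column order in one place, which avoids mismatches between the two.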
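Answer 2 (score: 0)
Something is missing from this code. The loop itself takes almost no time to run, so the problem lies either in retrieving the data or in inserting into Oracle. First, I suggest finding out where the time is actually spent; a profiling tool such as perf_tool can help a lot. It is hard to tell from here what is going wrong, but I think that after a few checks you will find the problem is the writes to the database, in which case the solution may be a bulk insert or dealing with the indexes.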
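A rough way to check where the time goes, before reaching for a full profiler, is to time each phase separately. This is a minimal sketch that uses the standard-library `time.perf_counter` rather than perf_tool; the `timed` helper and the two stand-in functions `fetch_records` and `insert_records` are hypothetical placeholders for the real API call and the real database write.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def fetch_records():
    # Stand-in for requests.get(...) plus JSON parsing.
    time.sleep(0.05)
    return [{"sys_id": str(i)} for i in range(1250)]


def insert_records(records):
    # Stand-in for the database insert and commit.
    time.sleep(0.01)
    return len(records)


with timed("fetch"):
    records = fetch_records()
with timed("insert"):
    inserted = insert_records(records)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s")
```

Whichever phase dominates the printed timings is the one worth optimizing first.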
Answer 3 (score: 0)
I made the following changes, and it now runs much faster. Code:
Updated:
insert_data = []
for data in json_data['result']:
    branchFullName = data['full_name']
    branchNum = data['u_branch_id']
    branchName = data['u_branch_name']
    sysId = data['sys_id']
    sys_updated_on = data['sys_updated_on']
    sys_created_on = data['sys_created_on']
    insert_data.append(
        (branchFullName, branchNum, branchName, sysId, sys_updated_on, sys_created_on)
    )
# executemany() takes the statement and the list of row tuples as separate arguments
cursor.executemany(
    f"INSERT INTO {PrestageTable} VALUES (:1, :2, :3, :4, :5, :6)", insert_data)
con.commit()