Iterate over JSON data, create a dict/list, then insert into a PostgreSQL database with psycopg2?

Date: 2017-04-27 20:01:38

Tags: python json postgresql dictionary

This is a multi-part problem I'm working through. Yes, I'm new to Python, so I may not phrase things correctly.

I'm trying to figure out how to iterate over some JSON data I pull from an API to extract the values; then I believe I need to build a dict or list of that data, which I then want to "INSERT INTO" a PostgreSQL database.

Here is the JSON data I pull from the device:

{u'service_group_stat': {u'cur_conns': 344,
                         u'member_stat_list': [{u'cur_conns': 66,
                                                u'req_bytes': 1476212423,
                                                u'req_pkts': 10449342,
                                                u'resp_bytes': 33132743858,
                                                u'resp_pkts': 25652317,
                                                u'server': u'WWW0006',
                                                u'tot_conns': 172226},
                                               {u'cur_conns': 64,
                                                u'req_bytes': 1666275823,
                                                u'req_pkts': 11982676,
                                                u'resp_bytes': 37575461036,
                                                u'resp_pkts': 29175599,
                                                u'server': u'WWW0005',
                                                u'tot_conns': 205244},
                                               {u'cur_conns': 89,
                                                u'req_bytes': 1671222671,
                                                u'req_pkts': 11940864,
                                                u'resp_bytes': 37064038202,
                                                u'resp_pkts': 28747313,
                                                u'server': u'WWW0004',
                                                u'tot_conns': 195789},
                                               {u'cur_conns': 37,
                                                u'req_bytes': 94117510958,
                                                u'req_pkts': 585916896,
                                                u'resp_bytes': 1860691638618,
                                                u'resp_pkts': 1439228725,
                                                u'server': u'WWW0003',
                                                u'tot_conns': 7366402},
                                               {u'cur_conns': 42,
                                                u'req_bytes': 98580368121,
                                                u'req_pkts': 642797814,
                                                u'resp_bytes': 1934241923560,
                                                u'resp_pkts': 1498242871,
                                                u'server': u'WWW0002',
                                                u'tot_conns': 7221995},
                                               {u'cur_conns': 46,
                                                u'req_bytes': 94886760323,
                                                u'req_pkts': 593577169,
                                                u'resp_bytes': 1863028601218,
                                                u'resp_pkts': 1441197389,
                                                u'server': u'WWW0001',
                                                u'tot_conns': 7260787}],
                         u'name': u'SG_SITE1.BUSINESS.COM_443',
                         u'req_bytes': 292398350319,
                         u'req_pkts': 1856664761,
                         u'resp_bytes': 5765734406492,
                         u'resp_pkts': 4462244214,
                         u'tot_conns': 22422443}}

It is stored as "data".

So I think I need something like this:

for row in data['service_group_stat']['member_stat_list']:
    SRVR_NAME = row['server']
    CURR_CONNS = row['cur_conns']
    TOTAL_CONNS = row['tot_conns']
    REQ_BYTES = row['req_bytes']
    REQ_PKTS = row['req_pkts']
    RESP_BYTES = row['resp_bytes']
    RESP_PKTS = row['resp_pkts']

The goal is to extract the metrics for each server in that list.

Basically, I want to extract the following and insert it into the database:

SRVR_NAME  CURR_CONNS  TOT_CONNS  REQ_BYTES    REQ_PKTS   RESP_BYTES     RESP_PKTS
WWW0006    66          172226     1476212423   10449342   33132743858    25652317
WWW0005    64          205244     1666275823   11982676   37575461036    29175599
WWW0004    89          195789     1671222671   11940864   37064038202    28747313
WWW0003    37          7366402    94117510958  585916896  1860691638618  1439228725
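The loop above can be exercised directly against the pasted JSON. A minimal sketch (using a two-server subset of the data in place of the live API response) that collects the values into tuples and prints the table:

```python
# Two-server subset of the JSON shown above, standing in for `data`.
data = {'service_group_stat': {'member_stat_list': [
    {'cur_conns': 66, 'req_bytes': 1476212423, 'req_pkts': 10449342,
     'resp_bytes': 33132743858, 'resp_pkts': 25652317,
     'server': 'WWW0006', 'tot_conns': 172226},
    {'cur_conns': 64, 'req_bytes': 1666275823, 'req_pkts': 11982676,
     'resp_bytes': 37575461036, 'resp_pkts': 29175599,
     'server': 'WWW0005', 'tot_conns': 205244},
]}}

# Collect one tuple per server, in the column order of the target table.
rows = []
for row in data['service_group_stat']['member_stat_list']:
    rows.append((row['server'], row['cur_conns'], row['tot_conns'],
                 row['req_bytes'], row['req_pkts'],
                 row['resp_bytes'], row['resp_pkts']))

# Print the same table layout as above.
for r in rows:
    print('%-10s %10d %10d %12d %10d %14d %11d' % r)
```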

I'm not sure whether I should read the details for one server, build a dict from that server's values, and then insert servers into the database one at a time?

Something like this:

# Create DICT of server metrics from JSON data

metrics = ({"SRVR_NAME":'SRVR_NAME', "CURR_CONNS":'CURR_CONNS',
             "TOTAL_CONNS":'TOTAL_CONNS', "REQ_BYTES":'REQ_BYTES', "REQ_PKTS":'REQ_PKTS', "RESP_BYTES":'RESP_BYTES',
             "RESP_PKTS":'RESP_PKTS'})

# Execute storing metrics for server
# Table: metrics    Fields:(SRVR_NAME, CURR_CONNS, TOTAL_CONNS, REQ_BYTES, REQ_PKTS, RESP_BYTES, RESP_PKTS)

# Using psycopg2 to interact with PostgreSQL

cur.executemany("""INSERT INTO metrics("SRVR_NAME", "CURR_CONNS", "TOTAL_CONNS", "REQ_BYTES", 
                    "REQ_PKTS", "RESP_BYTES", "RESP_PKTS") VALUES (%(SRVR_NAME)s, %(DATE)s, %(TIME)s, %(CURR_CONNS)s, 
                    %(TOTAL_CONNS)s, %(REQ_BYTES)s, %(REQ_PKTS)s, %(RESP_BYTES)s, %(RESP_PKTS)s)""", metrics)

I think that's close to how I would record the metrics for a single server at a time?

I haven't worked the whole thing out yet, but I think it's close.

But the real question is: can I build one bigger list or dict containing all the details for every server and then do a bulk insert? Or should I loop over the new dict/list and insert on each pass/iteration?
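A sketch of the bulk approach (table and column names taken from the attempt above; there is no live connection here, so the actual call is shown commented out). Because the member dicts already have keys matching the named placeholders, the whole list can be handed to one call:

```python
# Stand-in for `data`, the JSON pulled from the device.
data = {'service_group_stat': {'member_stat_list': [
    {'server': 'WWW0006', 'cur_conns': 66, 'tot_conns': 172226,
     'req_bytes': 1476212423, 'req_pkts': 10449342,
     'resp_bytes': 33132743858, 'resp_pkts': 25652317},
    {'server': 'WWW0005', 'cur_conns': 64, 'tot_conns': 205244,
     'req_bytes': 1666275823, 'req_pkts': 11982676,
     'resp_bytes': 37575461036, 'resp_pkts': 29175599},
]}}

# Named %(...)s placeholders must match the dict keys exactly.
sql = """INSERT INTO metrics("SRVR_NAME", "CURR_CONNS", "TOTAL_CONNS",
                             "REQ_BYTES", "REQ_PKTS", "RESP_BYTES", "RESP_PKTS")
         VALUES (%(server)s, %(cur_conns)s, %(tot_conns)s,
                 %(req_bytes)s, %(req_pkts)s, %(resp_bytes)s, %(resp_pkts)s)"""

# The member dicts can be used as the parameter sequence directly.
params = data['service_group_stat']['member_stat_list']
# With a live psycopg2 cursor this is a single bulk call:
# cur.executemany(sql, params)
```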

Hopefully this makes sense. I'm just not great at breaking this down and explaining it simply.

I'd like to learn the most efficient way to accomplish this.

Update #1

Using some of the suggested code, I tried the following:

now = datetime.datetime.now()
date = str(now.strftime("%m-%d-%Y"))
time = str(now.strftime("%H:%M:%S"))

# Define a DICT/LIST of ServiceGroup names that we will pull stats for

name = 'SG_ACCOUNT.BUSINESS.COM_443'

# Pull stats from A10 LB for ServiceGroup and store in memory
# Will want to eventually iterate through a DICT/LIST of SG names

data = c.slb.service_group.stats(name)

srv = """INSERT INTO server("SG_NAME", "SRVR_NAME") VALUES ('name', %(server)s)"""
argslist1 = data[u'service_group_stat'][u'member_stat_list']
psycopg2.extras.execute_batch(cur, srv, argslist1, page_size=100)

sql = """INSERT INTO metrics("SRVR_NAME", "DATE", "TIME", "CURR_CONNS", "TOTAL_CONNS", 
                             "REQ_BYTES", "REQ_PKTS", "RESP_BYTES", 
                             "RESP_PKTS") VALUES (%(server)s, 
             'date', 'time', %(curr_conns)s, %(total_conns)s, 
             %(req_bytes)s, %(req_pkts)s, %(resp_bytes)s, %(resp_pkts)s)"""
argslist2 = data[u'service_group_stat'][u'member_stat_list']
psycopg2.extras.execute_batch(cur, sql, argslist2, page_size=100)

However, it fails because I can't seem to bring in previously declared values, such as:

date = str(now.strftime("%m-%d-%Y"))
time = str(now.strftime("%H:%M:%S"))
name = 'SG_ACCOUNT.BUSINESS.COM_443'

When it tries to insert 'name', I get the following error:

DETAIL:  Key (SG_NAME)=(name) is not present in table "servicegroup".

I don't think it's actually pulling the value defined for 'name'; instead, it's trying to insert the literal string "name", which is not present in the "servicegroup" table.
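One way to fix this (a sketch, with stand-in data in place of the live API response): since 'name', 'date' and 'time' are quoted as string literals inside the SQL, PostgreSQL receives the word name rather than the Python variable's value. Passing them through named %(...)s placeholders and merging them into each row's dict makes the parameters resolve:

```python
import datetime

now = datetime.datetime.now()
date = now.strftime("%m-%d-%Y")
time = now.strftime("%H:%M:%S")
name = 'SG_ACCOUNT.BUSINESS.COM_443'

# Stand-in for data[u'service_group_stat'][u'member_stat_list'].
member_stats = [
    {'server': 'WWW0006', 'cur_conns': 66, 'tot_conns': 172226,
     'req_bytes': 1476212423, 'req_pkts': 10449342,
     'resp_bytes': 33132743858, 'resp_pkts': 25652317},
]

# dict(row, ...) copies each row and adds the shared values to the copy,
# so every parameter dict carries sg_name/date/time alongside its metrics.
rows = [dict(row, sg_name=name, date=date, time=time)
        for row in member_stats]

sql = """INSERT INTO metrics("SRVR_NAME", "DATE", "TIME", "CURR_CONNS",
                             "TOTAL_CONNS", "REQ_BYTES", "REQ_PKTS",
                             "RESP_BYTES", "RESP_PKTS")
         VALUES (%(server)s, %(date)s, %(time)s, %(cur_conns)s,
                 %(tot_conns)s, %(req_bytes)s, %(req_pkts)s,
                 %(resp_bytes)s, %(resp_pkts)s)"""
# With a live cursor:
# psycopg2.extras.execute_batch(cur, sql, rows, page_size=100)
```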

1 Answer:

Answer 0 (Score: 1)

http://initd.org/psycopg/docs/extras.html#module-psycopg2.extras notes that .executemany() is not the most performant method. A faster option is execute_batch() from http://initd.org/psycopg/docs/extras.html#fast-execution-helpers:
import psycopg2.extras
cur = conn.cursor()
sql = """INSERT INTO metrics("SRVR_NAME", "CURR_CONNS", "TOTAL_CONNS", 
                             "REQ_BYTES", "REQ_PKTS", "RESP_BYTES", 
                             "RESP_PKTS") VALUES (%(server)s, 
             %(cur_conns)s, %(tot_conns)s, %(req_bytes)s, 
             %(req_pkts)s, %(resp_bytes)s, %(resp_pkts)s)"""
argslist = data[u'service_group_stat'][u'member_stat_list']
psycopg2.extras.execute_batch(cur, sql, argslist, page_size=100)

Here argslist is taken from data[u'service_group_stat'][u'member_stat_list'] and looks like this (note that the named placeholders in the query must match these dict keys exactly):

[{u'cur_conns': 66,
  u'req_bytes': 1476212423,
  u'req_pkts': 10449342,
  u'resp_bytes': 33132743858L,
  u'resp_pkts': 25652317,
  u'server': u'WWW0006',
  u'tot_conns': 172226},
 {u'cur_conns': 64,
  u'req_bytes': 1666275823,
  u'req_pkts': 11982676,
  u'resp_bytes': 37575461036L,
  u'resp_pkts': 29175599,
  u'server': u'WWW0005',
  u'tot_conns': 205244},
 {u'cur_conns': 89,
  u'req_bytes': 1671222671,
  u'req_pkts': 11940864,
  u'resp_bytes': 37064038202L,
  u'resp_pkts': 28747313,
  u'server': u'WWW0004',
  u'tot_conns': 195789},
 {u'cur_conns': 37,
  u'req_bytes': 94117510958L,
  u'req_pkts': 585916896,
  u'resp_bytes': 1860691638618L,
  u'resp_pkts': 1439228725,
  u'server': u'WWW0003',
  u'tot_conns': 7366402},
 {u'cur_conns': 42,
  u'req_bytes': 98580368121L,
  u'req_pkts': 642797814,
  u'resp_bytes': 1934241923560L,
  u'resp_pkts': 1498242871,
  u'server': u'WWW0002',
  u'tot_conns': 7221995},
 {u'cur_conns': 46,
  u'req_bytes': 94886760323L,
  u'req_pkts': 593577169,
  u'resp_bytes': 1863028601218L,
  u'resp_pkts': 1441197389,
  u'server': u'WWW0001',
  u'tot_conns': 7260787}]