Converting flattened JSON to a DataFrame in Python 2.7

Date: 2017-07-11 19:59:16

Tags: python json python-2.7 dataframe flatten

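I am trying to read some data from a REST API and write it to a database table. I wrote the code below, but I am stuck with a flattened JSON object. Could you help me find a way to convert this JSON into a DataFrame?

Code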

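    import json

    import requests
    import pandas
    from pandas.io.json import json_normalize
    from flatten_json import flatten

    j_username = 'ABCD'
    j_password = '12456'
    query = '"id = 112233445566"'
    print(query)

    # 'Url' is a placeholder for the real endpoint; the format string needs
    # a %s so that the query is actually substituted into it.
    r = requests.get('Url?query=%s' % query, auth=(j_username, j_password))

    # Parse the response body once and reuse the result.
    first_response = r.json()
    print(first_response)
    string_data = json.dumps(first_response)
    normalized_r = json_normalize(first_response)

    # flatten() collapses the nested structure into a single flat dict.
    r_flattened = flatten(first_response)
    print(r_flattened)
    r_flattened_str = json.dumps(r_flattened)
    print(type(r_flattened))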

The flattened JSON output looks like this:

    {
        'data_0_user-35': u'Xyz',
        'data_0_user-34': None,
        'data_0_user-37': u'CC',
        'data_0_user-36': None,
        'data_0_user-31': u'Regular',
        'data_0_user-33': None,
        'data_0_user-32': None,
        'data_0_target-rcyc_id': 0101,
        'data_0_to-mail': None,
        'data_0_closing-version': None,
        'data_0_user-44': None,
        'data_0_test-reference': None,
        'data_0_request-server': None,
        'data_0_target-rcyc_type': u'regular type',
        'data_0_project': None,
        'data_0_user-01': u'Application Name',
        'data_0_user-02': None,
        'data_0_user-03': None,
        .......
    }

Expected output

    data_0_user-35   data_0_user-34   data_0_user-37   .........
    Xyz              None             CC               .........
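
For illustration, here is a minimal sketch of one way a flat dictionary like the one above could be turned into that one-row layout. It assumes pandas is installed; the `flat` dictionary is a hand-copied subset of the output shown, not a real API response.

```python
import pandas as pd

# Subset of the flattened output above, copied by hand for illustration.
flat = {
    'data_0_user-35': 'Xyz',
    'data_0_user-34': None,
    'data_0_user-37': 'CC',
    'data_0_user-31': 'Regular',
}

# Wrapping the dict in a list gives one row, with each key as a column.
df = pd.DataFrame([flat])
print(df)
```

Passing the dict without the list wrapper would fail, because pandas cannot build a frame from scalar values alone without an index.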

1 Answer:

Answer 0: (score: 0)

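I finally solved this. The code below reads data from the REST API, converts it into a DataFrame, and then writes it to an Oracle database. Thanks to my friends and to some great people in the community whose answers helped me solve it.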

        import datetime as dt

        import cx_Oracle
        import pandas as pd
        import requests
        # On pandas >= 1.0 this function is also available as pandas.json_normalize.
        from pandas.io.json import json_normalize

        auth = ('!username', 'password#')

        # Fetch today's records to find the highest id (value1).
        date = "'%s'" % dt.datetime.today().strftime("%Y-%m-%d")
        query2 = '"creation-time=%s"' % date
        r = requests.get('url?query=%s' % query2, auth=auth)
        response_data_normalize = json_normalize(r.json()['data'])
        # int() turns the numpy scalar returned by max() into a plain Python int.
        converted_value = int(response_data_normalize['value1'].max())

        # Request each id in turn and keep only the columns of interest.
        columns = ['value1', 'value2', 'value3', 'value4']
        frames = []
        for i in range(2175, converted_value + 1):  # 2175 is just a reference number to start the comparison from, specific to my work
            id_query = '"id = %s"' % i
            r = requests.get('url?query=%s' % id_query, auth=auth)
            frames.append(json_normalize(r.json()['data']).loc[:, columns])

        # Concatenate once at the end instead of appending inside the loop.
        frame = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame(columns=columns)

        # Write the collected rows to Oracle.
        con = cx_Oracle.connect('USERNAME', 'PASSWORD', cx_Oracle.makedsn('HOSTNAME', PORTNUMBER, 'SERVICENAME'))
        cur = con.cursor()
        rows = [tuple(x) for x in frame.values]
        print(rows)
        cur.executemany('''INSERT INTO TABLENAME (Value1, Value2, Value3, Value4) VALUES (:1, :2, :3, :4)''', rows)
        con.commit()
        cur.close()
        con.close()
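
To try the normalize, concatenate, and insert steps without a live API or an Oracle instance, a rough self-contained sketch might look like the following. Everything in it is an assumption made for illustration: the `responses` payloads are invented stand-ins for `r.json()`, the `to_python` helper is hypothetical, and an in-memory `sqlite3` database replaces Oracle (so the placeholders are `?` rather than `:1`).

```python
import sqlite3

import pandas as pd

try:
    from pandas import json_normalize  # pandas >= 1.0
except ImportError:
    from pandas.io.json import json_normalize  # older pandas

# Invented payloads standing in for two r.json() results.
responses = [
    {'data': [{'value1': 2175, 'value2': 'a', 'value3': 'x', 'value4': None}]},
    {'data': [{'value1': 2176, 'value2': 'b', 'value3': 'y', 'value4': 'z'}]},
]
columns = ['value1', 'value2', 'value3', 'value4']

# One small frame per response, concatenated once at the end.
frames = [json_normalize(resp['data'])[columns] for resp in responses]
frame = pd.concat(frames, ignore_index=True)


def to_python(value):
    """Hypothetical helper: plain Python value, with missing data as None."""
    if pd.isnull(value):
        return None
    return value.item() if hasattr(value, 'item') else value


# Database drivers generally expect plain Python types, not numpy scalars.
rows = [tuple(to_python(v) for v in row)
        for row in frame.itertuples(index=False, name=None)]

# In-memory SQLite stands in for the Oracle connection here.
con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute('CREATE TABLE tablename '
            '(value1 INTEGER, value2 TEXT, value3 TEXT, value4 TEXT)')
cur.executemany('INSERT INTO tablename (value1, value2, value3, value4) '
                'VALUES (?, ?, ?, ?)', rows)
con.commit()
stored = cur.execute(
    'SELECT value1, value4 FROM tablename ORDER BY value1').fetchall()
print(stored)
cur.close()
con.close()
```

Converting each value before the insert serves the same purpose as the numpy-to-Python conversion in the answer: it keeps driver-specific type errors out of the `executemany` call.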