How to flatten an array from JSON into pandas

Time: 2017-03-16 01:31:38

Tags: python json pandas dataframe time-series

I have a JSON:

[{
"analogData": [
[122483,104],[122493,100],[122503,106],[122513,106],[122523,107],
[122533,99],[122543,103],[122553,98],[122563,106],[122573,95],
[122583,98],[122593,97],[122603,95],[122613,101],[122623,99],
[122633,98],[122643,101],[122653,102],[122663,105],[122673,99],
[122683,102],[122693,107],[122703,106],[122713,108],[122723, 99],
[122733,98],[122743,104],[122753,104],[122763,96],[122773,99]],
"upTime": 132833,"deviceId": "5c6d27a","state": "idle"},  next 30 readings, ...]

The same structure repeats for the next set of readings.

I was able to load the JSON into a dataframe, where it shows up as:

... 0  [[122483, 104], [122493, 100], [122503, 106], ...
1      [[41614, 103], [41624, 105], [41634, 102], [41...
2      [[22674, 113], [22684, 89], [22694, 106], [227...
3      [[220570, 81], [220580, 81], [220590, 81], [22...
4      [[160474, 85], [160484, 86], [160494, 86], [16...

I would like to get "analogData" reformatted into a dataframe with 4 columns: index, uptime, time and level:

index uptime  time    level   
1     132833  122483  104
2     132833  122493  100
3     132833  122503  106

...

2 answers:

Answer 0: (score: 0)

Assuming the JSON data is loaded into the variable data, you could try some plain data-structure manipulation:

import json

import pandas

# load up the JSON file, assuming that's how it's saved
with open("test.json") as json_file:
    data = json.load(json_file)  # now a Python list of dicts

new_data = []

for record in data:  # one dict per batch of readings
    # each entry of analogData is a [time, level] pair
    for time, level in record["analogData"]:
        new_data.append({
            "uptime": record["upTime"],
            "time": time,
            "level": level,
        })

df = pandas.DataFrame(new_data, columns=["uptime", "time", "level"])

I'm not sure how you want the index column to look, but pandas creates its own default index (starting at 0) when the dataframe is built.
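If the nested column has already been loaded into a dataframe, as in the question, the same flattening might also be done inside pandas. The following is only a sketch, assuming pandas 0.25 or later (for `DataFrame.explode`); the two-record `data` list is a shortened, hypothetical sample in the question's shape, and the second record's `upTime` value is made up for illustration. It also shows one way to get an index that starts at 1, as in the desired output.

```python
import pandas as pd

# Hypothetical, shortened sample in the same shape as the question's JSON.
# The second record's upTime (51964) is invented for illustration.
data = [
    {"analogData": [[122483, 104], [122493, 100], [122503, 106]],
     "upTime": 132833, "deviceId": "5c6d27a", "state": "idle"},
    {"analogData": [[41614, 103], [41624, 105]],
     "upTime": 51964, "deviceId": "5c6d27a", "state": "idle"},
]

raw = pd.DataFrame(data)

# One row per [time, level] pair; upTime is repeated next to each pair.
long = raw[["upTime", "analogData"]].explode("analogData").reset_index(drop=True)

# Split each two-element list into its own columns.
pairs = pd.DataFrame(long["analogData"].tolist(), columns=["time", "level"])

df = pd.concat([long["upTime"].rename("uptime"), pairs], axis=1)

# Replace the default 0-based index with a 1-based one named "index".
df.index = pd.RangeIndex(1, len(df) + 1, name="index")

print(df)
```

Whether this is preferable to the explicit loop is a matter of taste; the loop avoids building the intermediate frame, while this version stays within pandas once the data is loaded.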

Answer 1: (score: 0)

Basically, you load the JSON into a dictionary with the json module and then extract the values you need.

Doing it the Pythonic way:

import json

my_json = '{"analogData": [[122483,104],[122493,100],[122503,106],[122513,106],[122523,107],[122533,99],[122543,103],[122553,98],[122563,106],[122573,95],[122583,98],[122593,97],[122603,95],[122613,101],[122623,99],[122633,98],[122643,101],[122653,102],[122663,105],[122673,99],[122683,102],[122693,107],[122703,106],[122713,108],[122723, 99],[122733,98],[122743,104],[122753,104],[122763,96],[122773,99]], "upTime": 132833,"deviceId": "5c6d27a","state": "idle"}'

buffer = json.loads(my_json)
aD = buffer["analogData"]  # list of [time, level] pairs
uT = buffer["upTime"]
# prefix each pair with a 1-based row number and the shared uptime
print([[i, uT] + pair for i, pair in enumerate(aD, start=1)])

Output:

[[1, 132833, 122483, 104], [2, 132833, 122493, 100], [3, 132833, 122503, 106], [4, 132833, 122513, 106], [5, 132833, 122523, 107], [6, 132833, 122533, 99], [7, 132833, 122543, 103], [8, 132833, 122553, 98], [9, 132833, 122563, 106], [10, 132833, 122573, 95], [11, 132833, 122583, 98], [12, 132833, 122593, 97], [13, 132833, 122603, 95], [14, 132833, 122613, 101], [15, 132833, 122623, 99], [16, 132833, 122633, 98], [17, 132833, 122643, 101], [18, 132833, 122653, 102], [19, 132833, 122663, 105], [20, 132833, 122673, 99], [21, 132833, 122683, 102], [22, 132833, 122693, 107], [23, 132833, 122703, 106], [24, 132833, 122713, 108], [25, 132833, 122723, 99], [26, 132833, 122733, 98], [27, 132833, 122743, 104], [28, 132833, 122753, 104], [29, 132833, 122763, 96], [30, 132833, 122773, 99]]
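The snippet above handles a single record, while the JSON in the question is a list of such records. A minimal sketch of how the same idea might extend to the whole list, numbering rows continuously across records, is shown here. The two-record `my_json` string is a shortened, hypothetical payload, and the second `upTime` value is invented for illustration.

```python
import json

# Hypothetical, shortened payload: a list of records as in the question.
# The second record's upTime (51964) is invented for illustration.
my_json = (
    '[{"analogData": [[122483,104],[122493,100]], "upTime": 132833},'
    ' {"analogData": [[41614,103]], "upTime": 51964}]'
)

records = json.loads(my_json)

# One [uptime, time, level] row per reading, across all records.
rows = [
    [rec["upTime"], time, level]
    for rec in records
    for time, level in rec["analogData"]
]

# Prefix each row with a 1-based running index.
numbered = [[i] + row for i, row in enumerate(rows, start=1)]

print(numbered)
# [[1, 132833, 122483, 104], [2, 132833, 122493, 100], [3, 51964, 41614, 103]]
```

From there, `rows` could be passed to `pandas.DataFrame(rows, columns=["uptime", "time", "level"])` to obtain the dataframe the question asks for.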