我有这样的JSON:
in_str='''{
"prices": [
[1, 10],
[2, 20],
[3, 30]
],
"total_volumes": [
[1, 100],
[2, 200],
[3, 300]
]
}'''
我正在尝试生成3列(id,价格,数量)的熊猫数据框:
1 10 100
2 20 200
3 30 300
我尝试使用pandas.read_json()
,但这给了我两列,我不知道下一步要去哪里。 json_normalize()
仅使我返回两行。
import pandas as pd
import json
from pandas.io.json import json_normalize
in_str='''{
"prices": [
[1, 10],
[2, 20],
[3, 30]
],
"total_volumes": [
[1, 10],
[2, 20],
[3, 30]
]
}'''
df = pd.read_json(in_str)
json_normalize(json.loads(in_str))
答案 0 :(得分:1)
我们可以照常阅读,重新转换和合并:
df = pd.read_json(in_str)
pd.merge(*[pd.DataFrame(np.array(df[col].to_list()),
columns=['id', col]) for col in df],
on='id')
输出:
id prices total_volumes
0 1 10 100
1 2 20 200
2 3 30 300
答案 1 :(得分:1)
pd.read_json
没有适合此结构的orient
参数。看起来更容易直接转换json和构造数据帧:
>>> pd.DataFrame({key: dict(value) for key, value in json.loads(in_str).items()})
prices total_volumes
1 10 100
2 20 200
3 30 300
答案 2 :(得分:1)
您可以将json预处理为适当的字典,然后用它来构造数据框
import ast
d = ast.literal_eval(in_str)
d1 = {k: dict(v) for k, v in d.items()}
df = pd.DataFrame(d1).rename_axis('id').reset_index()
Out[857]:
id prices total_volumes
0 1 10 100
1 2 20 200
2 3 30 300
答案 3 :(得分:1)
如果您尚未使用pandas
,则不必下载它。这是一个使用内置json
解析器读取文件,并使用本机数据结构将数据处理为所需形状(也可能是更有用的形状)的解决方案。
import json
in_str='''{
"prices": [
[1, 10],
[2, 20],
[3, 30]
],
"total_volumes": [
[1, 100],
[2, 200],
[3, 300]
]
}'''
in_json = json.loads(in_str)
# you can use json.load(filename) if you're using a file here.
print(in_json)
'''
>>> {'prices': [[1, 10], [2, 20], [3, 30]], 'total_volumes': [[1, 100], [2, 200], [3, 300]]}
'''
# Here we're going to merge the two data sets to make them iterable in one go.
inventory = dict()
for item_id, price in in_json["prices"]:
inventory[item_id] = {"price": price}
for item_id, volume in in_json["total_volumes"]:
if isinstance(inventory.get(item_id), dict):
inventory[item_id]["volume"] = volume
else:
inventory[item_id] = {"volume": volume}
print(inventory)
'''
>>> {1: {'price': 10, 'volume': 100}, 2: {'price': 20, 'volume': 200}, 3: {'price': 30, 'volume': 300}}
'''
# Now that the data is all in one dict, we can just iterate through it to get the rows in the shape that you want.
inventory_table = list()
for item_id, info in inventory.items():
row = [item_id, info.get("price"), info.get("volume")]
print(row)
'''
>>> [1, 10, 100]
>>> [2, 20, 200]
>>> [3, 30, 300]
'''
inventory_table.append(row)
# the final form
print(inventory_table)
'''
>>> [[1, 10, 100], [2, 20, 200], [3, 30, 300]]
'''
现在我们以此为基线,我们可以制作人们在Python中流口水的一些单行代码:
import json
in_str='''{
"prices": [
[1, 10],
[2, 20],
[3, 30]
],
"total_volumes": [
[1, 100],
[2, 200],
[3, 300]
]
}'''
in_json = json.loads(in_str)
inventory = {item_id: {"price": price} for item_id, price in in_json["prices"]}
for item_id, volume in in_json["total_volumes"]:
if isinstance(inventory.get(item_id), dict):
inventory[item_id]["volume"] = volume
else:
inventory[item_id] = {"volume": volume}
print(inventory)
inventory_table = [[item_id, info.get("price"), info.get("volume")] for item_id, info in inventory.items()]
print(inventory_table)
'''
>>> [[1, 10, 100], [2, 20, 200], [3, 30, 300]]
'''