I have a JSON document that looks like this:
{
  "formatVersion" : "v1.0",
  "disclaimer" : "This pricing list is for informational purposes only ...",
  "offerCode" : "AmazonEC2",
  "version" : "20181122020351",
  "publicationDate" : "2018-11-22T02:03:51Z",
  "products" : {
    "G5FFNNK98ETA2UBE" : {
      "sku" : "G5FFNNK98ETA2UBE",
      "productFamily" : "Compute Instance",
      "attributes" : {
        "servicecode" : "AmazonEC2",
        "location" : "Asia Pacific (Tokyo)",
        "locationType" : "AWS Region",
        "instanceType" : "c4.4xlarge",
        "currentGeneration" : "Yes",
        "instanceFamily" : "Compute optimized",
        "vcpu" : "16",
        "physicalProcessor" : "Intel Xeon E5-2666 v3 (Haswell)",
        "clockSpeed" : "2.9 GHz",
        "memory" : "30 GiB",
        "storage" : "EBS only",
and I am trying to convert it to a Pandas DataFrame with the following code:
df = pd.DataFrame()
for sku, data in json.loads(ec2offer)['products'].items():
    if data['productFamily'] == 'Compute Instance':
        new_df = pd.DataFrame.from_dict(data['attributes'], index=[0])
        df.append(new_df, ignore_index=True)
print(df)
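Before I added index=[0], I got the error "ValueError: If using all scalar values, you must pass an index", so I added it, following Constructing pandas DataFrame from values in variables gives "ValueError: If using all scalar values, you must pass an index".
Now I get this error instead:
TypeError: from_dict() got an unexpected keyword argument 'index'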
TL;DR
Forget the code above. What is the simplest way to add each "attributes" structure in the JSON above as its own row in a Pandas DataFrame?
Expected output
instanceType    memory    ...
c4.4xlarge      30 GiB    ...
...             ...       ...
Answer 0 (score: 1)
jsonstr = {
    "formatVersion": "v1.0",
    "disclaimer": "This pricing list is for informational purposes only ...",
    "offerCode": "AmazonEC2",
    "version": "20181122020351",
    "publicationDate": "2018-11-22T02:03:51Z",
    "products": {
        "G5FFNNK98ETA2UBE": {
            "sku": "G5FFNNK98ETA2UBE",
            "productFamily": "Compute Instance",
            "attributes": {
                "servicecode": "AmazonEC2",
                "location": "Asia Pacific (Tokyo)",
                "locationType": "AWS Region",
                "instanceType": "c4.4xlarge",
                "currentGeneration": "Yes",
                "instanceFamily": "Compute optimized",
                "vcpu": "16",
                "physicalProcessor": "Intel Xeon E5-2666 v3 (Haswell)",
                "clockSpeed": "2.9 GHz",
                "memory": "30 GiB",
                "storage": "EBS only"
            }
        },
        "G5FFNNK98ETA2VIB": {
            "sku": "G5FFNNK98ETA2UBE",
            "productFamily": "Compute Instance",
            "attributes": {
                "servicecode": "AmazonEC22",
                "location": "Asia Pacific (Tokyo)",
                "locationType": "AWS Region",
                "instanceType": "c4.4xlarge",
                "currentGeneration": "Yes",
                "instanceFamily": "Compute optimized",
                "vcpu": "16",
                "physicalProcessor": "Intel Xeon E5-2666 v3 (Haswell)",
                "clockSpeed": "2.9 GHz",
                "memory": "30 GiB",
                "storage": "EBS only"
            }
        }
    }
}
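import pandas as pd

# Collect each product's "attributes" dict, keyed by its product ID.
d = {}
for product in jsonstr['products'].keys():
    d[product] = jsonstr['products'][product]['attributes']

# Products come out as columns, so transpose to get one row per product,
# then replace the product-ID index with a default integer index.
df = pd.DataFrame(d).T.reset_index(drop=True)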
Output:
df
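A variation that keeps the productFamily filter from the question may also be worth sketching. DataFrame.from_dict has no index parameter, and DataFrame.append returned a new frame instead of modifying df in place (it was removed in pandas 2.0), so collecting the rows first and building the frame once sidesteps both problems. The offer dict below is a trimmed-down stand-in for the parsed JSON, and its second product is made up purely for illustration.

```python
import pandas as pd

# Trimmed stand-in for json.loads(ec2offer); the second product is hypothetical.
offer = {
    "products": {
        "G5FFNNK98ETA2UBE": {
            "sku": "G5FFNNK98ETA2UBE",
            "productFamily": "Compute Instance",
            "attributes": {"instanceType": "c4.4xlarge", "memory": "30 GiB"},
        },
        "HYPOTHETICALSKU1": {
            "sku": "HYPOTHETICALSKU1",
            "productFamily": "Storage",
            "attributes": {"storageMedia": "SSD-backed"},
        },
    }
}

# One attributes dict per product whose family matches.
rows = [
    product["attributes"]
    for product in offer["products"].values()
    if product.get("productFamily") == "Compute Instance"
]

# A list of dicts becomes one row per dict, so no index argument is needed.
df = pd.DataFrame(rows)
print(df)
```

Building the frame once from a list also avoids the repeated copying that appending inside a loop would cause.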
Answer 1 (score: 0)
You can use json_normalize, as was done in this question:
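A minimal sketch of that approach, assuming pandas 1.0 or later, where json_normalize is available as pd.json_normalize (older versions exposed it as pandas.io.json.json_normalize). The offer dict is a trimmed-down stand-in for the parsed JSON, and the second product is hypothetical.

```python
import pandas as pd

# Trimmed stand-in for the parsed JSON; the second product is hypothetical.
offer = {
    "products": {
        "G5FFNNK98ETA2UBE": {
            "sku": "G5FFNNK98ETA2UBE",
            "productFamily": "Compute Instance",
            "attributes": {"instanceType": "c4.4xlarge", "memory": "30 GiB"},
        },
        "HYPOTHETICALSKU2": {
            "sku": "HYPOTHETICALSKU2",
            "productFamily": "Compute Instance",
            "attributes": {"instanceType": "c4.8xlarge", "memory": "60 GiB"},
        },
    }
}

# Flatten every product; nested keys become dotted column names
# such as "attributes.memory".
flat = pd.json_normalize(list(offer["products"].values()))

# Keep only the attribute columns and strip the "attributes." prefix.
attrs = flat.filter(like="attributes.").rename(
    columns=lambda name: name.split(".", 1)[1]
)
print(attrs)
```

The flat frame still carries sku and productFamily as ordinary columns, so filtering on productFamily before selecting the attribute columns is straightforward.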