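Getting a DataFrame from a JSON file containing a list of dictionaries in Python

Time: 2018-03-21 15:45:27

Tags: json python-3.x list dictionary dataframe

I need to build a DataFrame from a JSON response, with the id, likeCount, and displayName of each content item. Everything works except displayName.

It fails with:

KeyError: 'author'

The code I am using: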

import pandas as pd

df = pd.DataFrame([])

for i in json_data['list']:
    # Build a one-row frame per item and append it to the running result
    df = df.append(pd.DataFrame({'Content_id': [i['contentID']],
                                 'subject': [i['subject']],
                                 'published': [i['published']],
                                 'updated': [i['updated']],
                                 'viewCount': i['viewCount'],
                                 'type': i['type'],
                                 'name': [i['author']['displayName']]},
                                index=[0]),
                   ignore_index=True)
print(df.head())



{
"itemsPerPage": 100,
"links": {
    "next": "https:"
},
"list": [
    {
        "id": "77248",
        "resources": {
            "entitlements": {
                "allowed": [
                    "GET"
                ],
                "ref": "https:"
            },
            "outcomeTypes": {
                "allowed": [
                    "GET"
                ],
                "ref": "https:"
            },
            "childOutcomeTypes": {
                "allowed": [
                    "GET"
                ],
                "ref": "https:"
            },
            "followingIn": {
                "allowed": [
                    "POST",
                    "GET"
                ],
                "ref": "https:"
            },
            "editHTML": {
                "allowed": [
                    "GET"
                ],
                "ref": "https:"
            },
            "attachments": {
                "allowed": [
                    "POST",
                    "GET"
                ],
                "ref": "https:"
            },
            "comments": {
                "allowed": [
                    "POST",
                    "GET"
                ],
                "ref": "https:"
            },
            "read": {
                "allowed": [
                    "DELETE",
                    "POST"
                ],
                "ref": "https:"
            },
            "followers": {
                "allowed": [
                    "GET"
                ],
                "ref": "https"
            },
            "versions": {
                "allowed": [
                    "GET"
                ],
                "ref": "https:"
            },
            "outcomes": {
                "allowed": [
                    "POST",
                    "GET"
                ],
                "ref": "https"
            },
            "self": {
                "allowed": [
                    "GET",
                    "PUT"
                ],
                "ref": "https:"
            },
            "html": {
                "allowed": [
                    "GET"
                ],
                "ref": "https:"
            },
            "extprops": {
                "allowed": [
                    "DELETE",
                    "POST",
                    "GET"
                ],
                "ref": "https:"
            },
            "likes": {
                "allowed": [
                    "POST",
                    "GET"
                ],
                "ref": "https:"
            }
        },
        "followerCount": 1,
        "followed": false,
        "likeCount": 0,
        "published": "2018-03-20T17:44:07.623+0000",
        "tags": [],
        "updated": "2018-03-20T17:44:07.639+0000",
        "iconCss": "jive-icon-document",
        "parentPlace": {
            "id": "1063",
            "html": "https:",
            "name": "A's Sa",
            "type": "group",
            "uri": "https:"
        },
        "contentID": "1720297",
        "author": {
            "id": "361666",
            "resources": {
                "reports": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "followingIn": {
                    "allowed": [
                        "POST",
                        "GET"
                    ],
                    "ref": "https:"
                },
                "images": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "activity": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "manager": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "social": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "recognition": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "trendingContent": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "trendingPlaces": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "avatar": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "followers": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "colleagues": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https"
                },
                "following": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "members": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "self": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "html": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                },
                "extprops": {
                    "allowed": [
                        "GET"
                    ],
                    "ref": "https:"
                }
            },
            "displayName": "R S",
            "emails": [
                {
                    "jive_label": "Email",
                    "primary": true,
                    "type": "work",
                    "value": "s.r@rjz.com",
                    "jive_displayOrder": 2,
                    "jive_showSummaryLabel": false
                }
            ],
            "jive": {
                "enabled": true,
                "level": {
                    "description": "Level 2",
                    "imageURI": "https:",
                    "name": "Novice",
                    "points": 154
                },
                "externalContributor": false,
                "username": "522164052a",
                "visible": true
            },
            "name": {
                "familyName": "S",
                "formatted": "R S",
                "givenName": "R"
            },
            "type": "person"
        },
        "content": {
            "text": "<body><!-- [] --><div class=\"jive-rendered-content\"><p>test zebra madagascar</p></div><!-- [] --></body>",
            "editable": false,
            "type": "text/html"
        },
        "parent": "https:",
        "favoriteCount": 0,
        "replyCount": 0,
        "status": "published",
        "subject": "Zebra",
        "viewCount": 2,
        "visibleToExternalContributors": false,
        "parentVisible": true,
        "parentContentVisible": true,
        "lastActivity": 1521567847639,
        "authorship": "open",
        "categories": [],
        "visibility": "place",
        "outcomeTypes": [
            {
                "id": "3",
                "name": "pending",
                "confirmUnmark": false,
                "shareable": true,
                "confirmExclusion": false,
                "noteRequired": true,
                "urlAllowed": false,
                "generalNote": false
            },
            {
                "id": "6",
                "name": "success",
                "communityAudience": "true",
                "confirmUnmark": false,
                "shareable": false,
                "confirmExclusion": false,
                "noteRequired": true,
                "urlAllowed": false,
                "generalNote": true
            },
            {
                "id": "2",
                "name": "finalized",
                "confirmUnmark": true,
                "shareable": false,
                "confirmExclusion": true,
                "noteRequired": false,
                "urlAllowed": false,
                "generalNote": false
            },
            {
                "id": "9",
                "name": "wip",
                "confirmContentEdit": "true",
                "confirmUnmark": true,
                "shareable": false,
                "confirmExclusion": true,
                "noteRequired": false,
                "urlAllowed": false,
                "generalNote": false
            },
            {
                "id": "7",
                "name": "outdated",
                "confirmUnmark": false,
                "shareable": false,
                "confirmExclusion": false,
                "noteRequired": false,
                "urlAllowed": true,
                "generalNote": false
            }
        ],
        "attachments": [],
        "restrictComments": false,
        "type": "document",
        "lastActivityDate": "2018-03-20T17:44:07.639+0000"
    }
],
"startIndex":0
}

The output I want is:

[image: desired output]

2 answers:

Answer 0: (score: 0)

KeyError: 'anything' usually means that the key was not found in your dictionary.

I suggest using get instead:

import pandas as pd

rows = []
for i in json_data.get('list', []):
    row = {'Content_id': i.get('contentID', None),
           'subject': i.get('subject', None),
           'published': i.get('published', None),
           'updated': i.get('updated', None),
           'likeCount': i.get('likeCount', None)}

    if i.get('author', None):  # Test whether the key exists (and is not empty)
        row['name'] = i.get('author').get('displayName', None)
        # NOTE: displayname --> displayName
    else:  # Included for consistency
        row['name'] = None

    rows.append(row)

# DataFrame.append was removed in pandas 2.0, so build the frame once from the collected rows
df = pd.DataFrame(rows)
print(df.head())

Note: get defaults to None if only one argument is passed.
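To make the difference concrete, here is a minimal sketch comparing square-bracket access with get on a record that has no author key. The record below is a hypothetical, trimmed-down item made up for illustration; it is not taken from the real response.

```python
# Hypothetical record that lacks an "author" key, used only for illustration.
record = {"contentID": "1720297", "subject": "Zebra"}

# Square-bracket access raises KeyError when the key is absent.
try:
    record["author"]["displayName"]
    raised = False
except KeyError:
    raised = True

# get returns None (or the supplied default) instead of raising.
missing_author = record.get("author")

# Falling back to an empty dict keeps the nested lookup safe in one expression.
display_name = (record.get("author") or {}).get("displayName")

print(raised, missing_author, display_name)
```

The `or {}` fallback is one possible way to shorten the if/else above; it also covers the case where the key is present but its value is None.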

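Answer 1: (score: 0)

I am not sure whether something is missing from the code or the JSON you provided, but I do not get the error you mention in the question.

Here is the JSON, formatted: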

{
   "itemsPerPage": 100,
   "links": {
      "next": "https:"
   },
   "list": [
      {
         "id": "77248",
         "resources": {
            "entitlements": {
               "allowed": [
                  "GET"
               ],
               "ref": "https:"
            },
            "outcomeTypes": {
               "allowed": [
                  "GET"
               ],
               "ref": "https:"
            },
            "childOutcomeTypes": {
               "allowed": [
                  "GET"
               ],
               "ref": "https:"
            },
            "followingIn": {
               "allowed": [
                  "POST",
                  "GET"
               ],
               "ref": "https:"
            },
            "editHTML": {
               "allowed": [
                  "GET"
               ],
               "ref": "https:"
            },
            "attachments": {
               "allowed": [
                  "POST",
                  "GET"
               ],
               "ref": "https:"
            },
            "comments": {
               "allowed": [
                  "POST",
                  "GET"
               ],
               "ref": "https:"
            },
            "read": {
               "allowed": [
                  "DELETE",
                  "POST"
               ],
               "ref": "https:"
            },
            "followers": {
               "allowed": [
                  "GET"
               ],
               "ref": "https"
            },
            "versions": {
               "allowed": [
                  "GET"
               ],
               "ref": "https:"
            },
            "outcomes": {
               "allowed": [
                  "POST",
                  "GET"
               ],
               "ref": "https"
            },
            "self": {
               "allowed": [
                  "GET",
                  "PUT"
               ],
               "ref": "https:"
            },
            "html": {
               "allowed": [
                  "GET"
               ],
               "ref": "https:"
            },
            "extprops": {
               "allowed": [
                  "DELETE",
                  "POST",
                  "GET"
               ],
               "ref": "https:"
            },
            "likes": {
               "allowed": [
                  "POST",
                  "GET"
               ],
               "ref": "https:"
            }
         },
         "followerCount": 1,
         "followed": false,
         "likeCount": 0,
         "published": "2018-03-20T17:44:07.623+0000",
         "tags": [],
         "updated": "2018-03-20T17:44:07.639+0000",
         "iconCss": "jive-icon-document",
         "parentPlace": {
            "id": "1063",
            "html": "https:",
            "name": "A's Sa",
            "type": "group",
            "uri": "https:"
         },
         "contentID": "1720297",
         "author": {
            "id": "361666",
            "resources": {
               "reports": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "followingIn": {
                  "allowed": [
                     "POST",
                     "GET"
                  ],
                  "ref": "https:"
               },
               "images": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "activity": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "manager": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "social": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "recognition": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "trendingContent": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "trendingPlaces": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "avatar": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "followers": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "colleagues": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https"
               },
               "following": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "members": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "self": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "html": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               },
               "extprops": {
                  "allowed": [
                     "GET"
                  ],
                  "ref": "https:"
               }
            },
            "displayName": "Rahman2 Syd",
            "emails": [
               {
                  "jive_label": "Email",
                  "primary": true,
                  "type": "work",
                  "value": "s.r@rjz.com",
                  "jive_displayOrder": 2,
                  "jive_showSummaryLabel": false
               }
            ],
            "jive": {
               "enabled": true,
               "level": {
                  "description": "Level 2",
                  "imageURI": "https:",
                  "name": "Novice",
                  "points": 154
               },
               "externalContributor": false,
               "username": "522164052a",
               "visible": true
            },
            "name": {
               "familyName": "S",
               "formatted": "R S",
               "givenName": "R"
            },
            "type": "person"
         },
         "content": {
            "text": "<body><!-- [] --><div class=\"jive-rendered-content\"><p>test zebra madagascar<\/p><\/div><!-- [] --><\/body>",
            "editable": false,
            "type": "text/html"
         },
         "parent": "https:",
         "favoriteCount": 0,
         "replyCount": 0,
         "status": "published",
         "subject": "Zebra",
         "viewCount": 2,
         "visibleToExternalContributors": false,
         "parentVisible": true,
         "parentContentVisible": true,
         "lastActivity": 1521567847639,
         "authorship": "open",
         "categories": [],
         "visibility": "place",
         "outcomeTypes": [
            {
               "id": "3",
               "name": "pending",
               "confirmUnmark": false,
               "shareable": true,
               "confirmExclusion": false,
               "noteRequired": true,
               "urlAllowed": false,
               "generalNote": false
            },
            {
               "id": "6",
               "name": "success",
               "communityAudience": "true",
               "confirmUnmark": false,
               "shareable": false,
               "confirmExclusion": false,
               "noteRequired": true,
               "urlAllowed": false,
               "generalNote": true
            },
            {
               "id": "2",
               "name": "finalized",
               "confirmUnmark": true,
               "shareable": false,
               "confirmExclusion": true,
               "noteRequired": false,
               "urlAllowed": false,
               "generalNote": false
            },
            {
               "id": "9",
               "name": "wip",
               "confirmContentEdit": "true",
               "confirmUnmark": true,
               "shareable": false,
               "confirmExclusion": true,
               "noteRequired": false,
               "urlAllowed": false,
               "generalNote": false
            },
            {
               "id": "7",
               "name": "outdated",
               "confirmUnmark": false,
               "shareable": false,
               "confirmExclusion": false,
               "noteRequired": false,
               "urlAllowed": true,
               "generalNote": false
            }
         ],
         "attachments": [],
         "restrictComments": false,
         "type": "document",
         "lastActivityDate": "2018-03-20T17:44:07.639+0000"
      }
   ]
}

The code you provided is:

for i in json_data['list']:

    df=df.append(pd.DataFrame({'Content_id':[i['contentID']],'subject':[i['subject']],'published':[i['published']],'updated':[i['updated']],'likeCount':i['likeCount'],'name':i['author']['displayname']},index=[0]),ignore_index=True)

print(f.head())

Running it gives you two errors:

  • KeyError: displayname

The key is actually displayName, not displayname.

  • f.head() - change it to df.head()
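A quick way to confirm the spelling of a key is to test for it and list the keys that are actually present. This is only a sketch; the author dictionary below is a hypothetical, shortened version of the one in the JSON.

```python
# Hypothetical, trimmed-down author record used only for illustration.
author = {"id": "361666", "displayName": "R S", "type": "person"}

# Dictionary keys are case-sensitive, so only the exact spelling matches.
has_lowercase = "displayname" in author
has_camelcase = "displayName" in author

# Listing the keys is a quick way to spot the spelling the response actually uses.
available_keys = sorted(author)
print(has_lowercase, has_camelcase, available_keys)
```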

I am also not sure what your initial df is. Assuming it is an empty DataFrame, printing df gives you

  Content_id  likeCount         name                     published subject                       updated
0    1720297          0  Rahman2 Syd  2018-03-20T17:44:07.623+0000   Zebra  2018-03-20T17:44:07.639+0000

I am not sure whether this is the output you are looking for, but it fixes the errors, and I will leave the rest for you to figure out.
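As an alternative to the loop, pandas can flatten the nested author dictionary itself. This is a sketch that swaps in pd.json_normalize (available in pandas 1.0 and later), which neither the question nor the answers use. The json_data below is a hypothetical, cut-down stand-in for the real response, and its second record is invented to show what happens when author is missing.

```python
import pandas as pd

# Hypothetical stand-in for the real response; the second record has no "author".
json_data = {
    "list": [
        {"contentID": "1720297", "subject": "Zebra", "likeCount": 0,
         "author": {"displayName": "R S"}},
        {"contentID": "0000000", "subject": "No author", "likeCount": 1},
    ]
}

# Nested dictionaries become dotted column names such as "author.displayName".
flat = pd.json_normalize(json_data["list"])

# reindex keeps only the wanted columns and creates any that no record supplied.
wanted = ["contentID", "subject", "likeCount", "author.displayName"]
df = flat.reindex(columns=wanted)
df = df.rename(columns={"contentID": "Content_id", "author.displayName": "name"})

print(df.head())
```

Records without an author end up with a missing value in the name column instead of raising KeyError.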