JSON to pandas DataFrame with a specific format

Time: 2019-12-21 19:12:15

Tags: python json pandas dataframe

I need to load the contents of a JSON file into a pandas DataFrame in a specific format so that I can run pandasql queries to transform the data.

{
    "data": [
        {
            "bugURL": null,
            "hidden": false,
            "issueName": "Portability Flaw: Locale Dependent Comparison",
            "folderGuid": "223d7f10-3c78-4631-9adf-d60f7762a25d",
            "lastScanId": 1054162,
            "engineType": "SCA",
            "issueStatus": "Unreviewed",
            "friority": "High",
            "analyzer": "Control Flow",
            "primaryLocation": "AdminWSHelper.java",
            "reviewed": null,
            "id": 20114769,
            "suppressed": false,
            "hasAttachments": false,
            "engineCategory": "STATIC",
            "projectVersionName": null,
            "removedDate": null,
            "severity": 2.0,
            "_href": "https://fortifyssc.xxx.com/api/v1/projectVersions/23004/issues/20114769",
            "displayEngineType": "SCA",
            "foundDate": "2018-12-13T14:44:28.000+0000",
            "confidence": 5.0,
            "impact": 2.5,
            "primaryRuleGuid": "D8E9ED3B-22EC-4CBA-98C8-7C67F73CCF4C",
            "projectVersionId": 23004,
            "scanStatus": "UPDATED",
            "audited": false,
            "kingdom": "Code Quality",
            "folderId": 288551,
            "revision": 0,
            "likelihood": 1.0,
            "removed": false,
            "issueInstanceId": "FD29B2E76A8C579FC0F7A9ED2BDD4832",
            "hasCorrelatedIssues": false,
            "primaryTag": null,
            "lineNumber": 477,
            "projectName": null,
            "fullFileName": "D:/view_store/AR/CS-SC-TRUNK-HP-FORTIFY/SRN_SC_Common/Source/scx/web/src/main/java/com/xxx/xxx/sc/scx/helper/AdminWSHelper.java"
        },
        {
            "bugURL": null,
            "hidden": false,
            "issueName": "Null Dereference",
            "folderGuid": "223d7f10-3c78-4631-9adf-d60f7762a25d",
            "lastScanId": 1054162,
            "engineType": "SCA",
            "issueStatus": "Unreviewed",
            "friority": "High",
            "analyzer": "Control Flow",
            "primaryLocation": "AdminWSHelper.java",
            "reviewed": null,
            "id": 20114572,
            "suppressed": false,
            "hasAttachments": false,
            "engineCategory": "STATIC",
            "projectVersionName": null,
            "removedDate": null,
            "severity": 3.0,
            "_href": "https://fortifyssc.xxx.com/api/v1/projectVersions/23004/issues/20114572",
            "displayEngineType": "SCA",
            "foundDate": "2018-12-13T14:44:28.000+0000",
            "confidence": 5.0,
            "impact": 3.0,
            "primaryRuleGuid": "B32F92AC-9605-0987-E73B-CCB28279AA24",
            "projectVersionId": 23004,
            "scanStatus": "UPDATED",
            "audited": false,
            "kingdom": "Code Quality",
            "folderId": 288551,
            "revision": 0,
            "likelihood": 0.8,
            "removed": false,
            "issueInstanceId": "71C5977F7D157D875160E5C306ACD805",
            "hasCorrelatedIssues": false,
            "primaryTag": null,
            "lineNumber": 552,
            "projectName": null,
            "fullFileName": "D:/view_store/AR/CS-SC-TRUNK-HP-FORTIFY/SRN_SC_Common/Source/scx/web/src/main/java/com/xxx/xxx/sc/scx/helper/AdminWSHelper.java"
        }]

}

After parsing the JSON, I want to get a DataFrame like the one described below.

issueName | primaryLocation | issueStatus | foundDate | projectVersionId | fullFileName

Null Dereference | AdminWSHelper.java | xxx | xxx | xxx | 12345 | yyyy | .../helper/AdminWSHelper.java

Null Dereference | AdminWSHelper.java | xxx | xxx | xxx | 12345 | yyyy | .../helper/AdminWSHelper.java

This is what I tried:

dataSource = FortifyClient.readFileContent('sample.json')    
print('json data : '+dataSource)
df = pd.io.json.json_normalize(dataSource) 

I get this error:

Traceback (most recent call last):
  File "C:\Java_Projects\FortiFyReportingEngine\FortifyClient.py", line 67, in <module>
    df = pd.io.json.json_normalize(dataSource) 
  File "C:\Python\Python37-32\lib\site-packages\pandas\io\json\_normalize.py", line 258, in json_normalize
    if any([isinstance(x, dict) for x in y.values()] for y in data):
  File "C:\Python\Python37-32\lib\site-packages\pandas\io\json\_normalize.py", line 258, in <genexpr>
    if any([isinstance(x, dict) for x in y.values()] for y in data):
AttributeError: 'str' object has no attribute 'values'

Can someone help me achieve this?

1 answer:

Answer 0 (score: 0)

This should work:

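import json

import pandas as pd

with open('sample.json') as f:
    # json.load reads from a file object; json.loads expects a string
    d = json.load(f)

# Each dict in the "data" list becomes one row
df = pd.DataFrame(d['data'])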
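
The traceback in the question suggests that json_normalize received the raw file text rather than parsed JSON: dataSource is concatenated with a string in the print call, so it appears to be a str, and iterating over a str yields single characters that have no .values() method. Below is a minimal sketch of parsing the text first and then keeping only the columns listed in the question. The inline `raw` string is a trimmed stand-in for sample.json, and the names `raw`, `parsed`, and `wanted` are only illustrative.

```python
import json

import pandas as pd

# Trimmed stand-in for the contents of sample.json (assumption: the real
# file has the same {"data": [...]} layout, with more fields per record).
raw = """
{
    "data": [
        {
            "issueName": "Portability Flaw: Locale Dependent Comparison",
            "primaryLocation": "AdminWSHelper.java",
            "issueStatus": "Unreviewed",
            "foundDate": "2018-12-13T14:44:28.000+0000",
            "projectVersionId": 23004,
            "severity": 2.0,
            "fullFileName": "D:/view_store/AR/CS-SC-TRUNK-HP-FORTIFY/SRN_SC_Common/Source/scx/web/src/main/java/com/xxx/xxx/sc/scx/helper/AdminWSHelper.java"
        },
        {
            "issueName": "Null Dereference",
            "primaryLocation": "AdminWSHelper.java",
            "issueStatus": "Unreviewed",
            "foundDate": "2018-12-13T14:44:28.000+0000",
            "projectVersionId": 23004,
            "severity": 3.0,
            "fullFileName": "D:/view_store/AR/CS-SC-TRUNK-HP-FORTIFY/SRN_SC_Common/Source/scx/web/src/main/java/com/xxx/xxx/sc/scx/helper/AdminWSHelper.java"
        }
    ]
}
"""

# Parse the text into Python objects before handing it to pandas.
parsed = json.loads(raw)

# Columns requested in the question, in the order they should appear.
wanted = [
    "issueName",
    "primaryLocation",
    "issueStatus",
    "foundDate",
    "projectVersionId",
    "fullFileName",
]

# One row per dict in "data", then restrict to the wanted columns.
df = pd.DataFrame(parsed["data"])[wanted]
print(df)
```

If the string comes from a helper such as FortifyClient.readFileContent, passing its return value to json.loads in place of `raw` should follow the same pattern, assuming the helper returns the file text unchanged.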