How to read MongoDB-exported JSON into a pandas DataFrame

Time: 2018-11-09 17:01:37

Tags: python json mongodb pandas

I am using the following code to export JSON from a MongoDB query:

import json
from bson import json_util

with open(r'/Month/Applications_test.json', 'w') as f:
    for x in dic:  # dic: presumably the documents returned by the query
        json.dump(x, f, default=json_util.default)

It works fine and returns the following JSON:

{
  "_class": "Application",
  "_id": "123",
  "applicationTimeStamp": {
    "$date": 1541466008000
  },
  "createdDateTime": {
    "$date": 1541466008084
  }
}
{
  "_class": "Application",
  "_id": "124",
  "applicationTimeStamp": {
    "$date": 1540080000000
  },
  "createdDateTime": {
    "$date": 1540080000096
  }
}
{
  "_class": "Application",
  "_id": "125",
  "applicationTimeStamp": {
    "$date": 1540080000000
  },
  "createdDateTime": {
    "$date": 1540080000097
  }
}

I am trying to read it with the following pandas code:

data_df = pd.read_json(r'/Month/Applications_test.json', lines = True)

I get the following error:

ValueError: Unexpected character found when decoding array value (2)

What I want is a pandas DataFrame with the following contents:

_class      | _id | applicationTimeStamp | createdDateTime
Application | 123 | 10/07/2018           | 10/07/2018
Application | 124 | 10/07/2018           | 10/07/2018
Application | 125 | 10/07/2018           | 10/07/2018

How can I read the JSON above into a pandas DataFrame?

Thanks!

1 answer:

Answer 0 (score: 3)

You have to use read_json this way:

            _class  _id  applicationTimeStamp  createdDateTime
$date  Application  123         1541466008000    1541466008084

It returns a DataFrame like this:

        _class            ...                      createdDateTime
0  Application            ...             {'$date': 1541466008084}
1  Application            ...             {'$date': 1540080000096}
2  Application            ...             {'$date': 1540080000097}
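A frame like the one above can be reproduced with a minimal sketch, under one assumption worth checking against your own export: `lines=True` expects exactly one JSON document per line, whereas `json.dump` in a loop writes the documents back to back with no newline between them, which is a likely cause of the ValueError in the question. The sketch below uses plain dicts copied from the question instead of a live MongoDB cursor, and an in-memory `io.StringIO` buffer instead of the real file; both are stand-ins for illustration.

```python
import io
import json

import pandas as pd

# Sample documents copied from the question (already plain JSON types,
# so no bson helper is needed here; a real cursor may need json_util).
docs = [
    {"_class": "Application", "_id": "123",
     "applicationTimeStamp": {"$date": 1541466008000},
     "createdDateTime": {"$date": 1541466008084}},
    {"_class": "Application", "_id": "124",
     "applicationTimeStamp": {"$date": 1540080000000},
     "createdDateTime": {"$date": 1540080000096}},
    {"_class": "Application", "_id": "125",
     "applicationTimeStamp": {"$date": 1540080000000},
     "createdDateTime": {"$date": 1540080000097}},
]

# Write one document per line (newline-delimited JSON).
buf = io.StringIO()
for doc in docs:
    json.dump(doc, buf)
    buf.write("\n")  # the newline is what lines=True relies on

# Read the buffer back, one row per line.
buf.seek(0)
data_df = pd.read_json(buf, lines=True)
print(data_df)
```

With a real file, the same idea would be to add `f.write("\n")` after each `json.dump` call in the export loop and keep the `pd.read_json(..., lines=True)` call unchanged. The date columns still hold `{'$date': ...}` dicts at this point.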

You can then convert the timestamps.
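As a rough illustration of the timestamp conversion, the sketch below assumes each date cell is a dict of the form `{'$date': <milliseconds since the Unix epoch>}`, as in the exported JSON in the question. The small frame is built by hand purely for the example; the loop pulls out the integer and hands it to `pd.to_datetime` with `unit="ms"`.

```python
import pandas as pd

# Hand-built frame mimicking the shape read from the export
# (values taken from the question's sample documents).
df = pd.DataFrame({
    "_class": ["Application", "Application"],
    "_id": ["123", "124"],
    "applicationTimeStamp": [{"$date": 1541466008000}, {"$date": 1540080000000}],
    "createdDateTime": [{"$date": 1541466008084}, {"$date": 1540080000096}],
})

for col in ["applicationTimeStamp", "createdDateTime"]:
    # Extract the millisecond value from each {'$date': ...} dict.
    millis = df[col].map(lambda d: d["$date"])
    # Interpret the integers as milliseconds since the Unix epoch (UTC).
    df[col] = pd.to_datetime(millis, unit="ms")

print(df)
```

If a display closer to the one in the question is wanted, `df[col].dt.strftime("%m/%d/%Y")` formats the converted values as strings.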