Parsing JSON with pandas

Time: 2018-01-05 19:38:15

Tags: python json pandas

I am trying to parse this JSON with pandas and I get an error. This is the code:

import pandas as pd
import json
from pandas.io.json import json_normalize
from IPython.display import HTML
from IPython.core.display import HTML

data = [{
    "Name": {
        "Name": "abc xyz",
        "email": "abc@gmail.com",
        "website": "www.abc.me",
        "github": "https://github.com/abc",
        "address": "abc"
    },
    "Test":
        "Name": "abc xyz",
        "email": "abc@gmail.com",
        "website": "www.abc.me",
        "github": "https://github.com/abc",
        "address": "abc"
    }
}]

Name = pd.io.json.json_normalize(data['Name'])
Name.set_index('Name', inplace=True)
Name

However, if I try it with only one element in the JSON, it does work.

2 Answers:

Answer 0: (score: 1)

Your JSON is invalid. The value of the "Test" key is missing its opening '{'. It should be:

data = [{
    "Name": {
        "Name": "abc xyz",
        "email": "abc@gmail.com",
        "website": "www.abc.me",
        "github": "https://github.com/abc",
        "address": "abc"
    },
    "Test": {
        "Name": "abc xyz",
        "email": "abc@gmail.com",
        "website": "www.abc.me",
        "github": "https://github.com/abc",
        "address": "abc"
    }
}]

It can then be loaded directly into pandas like this:

pd.DataFrame(data[0])
                           Name                    Test
Name                    abc xyz                 abc xyz
address                     abc                     abc
email             abc@gmail.com           abc@gmail.com
github   https://github.com/abc  https://github.com/abc
website              www.abc.me              www.abc.me
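As a rough sketch of the same idea, assuming pandas 1.0 or later (where `json_normalize` is available as `pd.json_normalize`), the corrected structure can be loaded with one column per top-level key, one row per top-level key, or flattened into a single wide row. The names `by_column`, `by_row` and `flat` are only illustrative.

```python
import pandas as pd

data = [{
    "Name": {
        "Name": "abc xyz",
        "email": "abc@gmail.com",
        "website": "www.abc.me",
        "github": "https://github.com/abc",
        "address": "abc",
    },
    "Test": {
        "Name": "abc xyz",
        "email": "abc@gmail.com",
        "website": "www.abc.me",
        "github": "https://github.com/abc",
        "address": "abc",
    },
}]

# data is a list, so take the first element before using string keys.
by_column = pd.DataFrame(data[0])

# Transposing gives one row per top-level key ("Name", "Test").
by_row = by_column.T

# json_normalize flattens the nested dict into a single row with
# dotted column names such as "Name.email".
flat = pd.json_normalize(data[0])

print(by_row)
print(flat.columns.tolist())
```

Note that `data['Name']` would fail even with valid JSON, because `data` is a list and has to be indexed with an integer first.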

Answer 1: (score: 1)

>>> df = pd.DataFrame([['a', 'b'], ['c', 'd']],
...                   index=['row 1', 'row 2'],
...                   columns=['col 1', 'col 2'])

Encoding/decoding a DataFrame using 'split' formatted JSON:

>>> df.to_json(orient='split')
'{"columns":["col 1","col 2"],
  "index":["row 1","row 2"],
  "data":[["a","b"],["c","d"]]}'
>>> pd.read_json(_, orient='split')
      col 1 col 2
row 1     a     b
row 2     c     d
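On recent pandas releases, passing a raw JSON string to `pd.read_json` is deprecated (since 2.1, as far as I know), so a script version of the round trip above would probably wrap the string in `io.StringIO`. A minimal sketch, with `payload` and `restored` as illustrative names:

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])

# Serialize with the 'split' layout: separate columns, index and data arrays.
payload = df.to_json(orient='split')

# Wrap the string in a file-like object before handing it to read_json.
restored = pd.read_json(StringIO(payload), orient='split')

print(restored)
```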

Encoding/decoding a DataFrame using 'index' formatted JSON:

>>> df.to_json(orient='index')
'{"row 1":{"col 1":"a","col 2":"b"},"row 2":{"col 1":"c","col 2":"d"}}'
>>> pd.read_json(_, orient='index')
      col 1 col 2
row 1     a     b
row 2     c     d

Encoding/decoding a DataFrame using 'records' formatted JSON. Note that index labels are not preserved with this encoding.

>>> df.to_json(orient='records')
'[{"col 1":"a","col 2":"b"},{"col 1":"c","col 2":"d"}]'
>>> pd.read_json(_, orient='records')
  col 1 col 2
0     a     b
1     c     d
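If the row labels matter, one possible workaround is to move the index into an ordinary column before encoding with 'records'. This is only a sketch; the column name `row` is an arbitrary choice for the example, not something from the pandas documentation.

```python
import json

import pandas as pd

df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])

# 'records' emits one object per row and drops the row labels.
plain = json.loads(df.to_json(orient='records'))

# Naming the index and resetting it turns the labels into a regular column,
# so they are kept in the 'records' output.
labelled = json.loads(
    df.rename_axis('row').reset_index().to_json(orient='records')
)

print(plain[0])
print(labelled[0])
```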

Encoding with Table Schema:

>>> df.to_json(orient='table')
'{"schema": {"fields": [{"name": "index", "type": "string"},
                        {"name": "col 1", "type": "string"},
                        {"name": "col 2", "type": "string"}],
                "primaryKey": "index",
                "pandas_version": "0.20.0"},
    "data": [{"index": "row 1", "col 1": "a", "col 2": "b"},
            {"index": "row 2", "col 1": "c", "col 2": "d"}]}'
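The 'table' output can also be read back. The sketch below assumes a pandas version in which `read_json` accepts `orient='table'` (0.23 or later, I believe) and again uses `io.StringIO` for the input; `field_names` and `restored` are illustrative names.

```python
import json
from io import StringIO

import pandas as pd

df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])

payload = df.to_json(orient='table')

# The schema lists the index first, followed by each column.
schema = json.loads(payload)['schema']
field_names = [field['name'] for field in schema['fields']]

# Reading with orient='table' uses the schema to restore the row labels
# as the index.
restored = pd.read_json(StringIO(payload), orient='table')

print(field_names)
print(restored)
```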