Question

I want to read a JSON file into a Jupyter notebook as a pandas DataFrame.
macOS 10.12, Python 3.7, pandas 0.24.2
My dataset: https://open.fda.gov/apis/drug/label/download/
A similar question with the same error message (I tried the solution from there, but it gave the same error): Read JSON to pandas dataframe - ValueError: Mixing dicts with non-Series may lead to ambiguous ordering

import json
import pandas as pd

data = json.load(open('drug-label-0001-of-0008.json'))
df = pd.DataFrame(data)

This answer suggests avoiding the two-step conversion: Pandas vs JSON library to read a JSON file in Python. Its code works as shown, but mine raises an error:

import pandas as pd
pd_example = pd.read_json('some_json_file.json')

My code is similar, but it raises the following error:

import pandas as pd
df = pd.read_json('drug-label-0008-of-0008.json')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-23-77b3c3e486fc> in <module>
----> 1 df = pd.read_json('drug-label-0008-of-0008.json')

~/anaconda3/lib/python3.7/site-packages/pandas/io/json/json.py in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit, encoding, lines, chunksize, compression)
    425         return json_reader
    426 
--> 427     result = json_reader.read()
    428     if should_close:
    429         try:

~/anaconda3/lib/python3.7/site-packages/pandas/io/json/json.py in read(self)
    535             )
    536         else:
--> 537             obj = self._get_object_parser(self.data)
    538         self.close()
    539         return obj

~/anaconda3/lib/python3.7/site-packages/pandas/io/json/json.py in _get_object_parser(self, json)
    554         obj = None
    555         if typ == 'frame':
--> 556             obj = FrameParser(json, **kwargs).parse()
    557 
    558         if typ == 'series' or obj is None:

~/anaconda3/lib/python3.7/site-packages/pandas/io/json/json.py in parse(self)
    650 
    651         else:
--> 652             self._parse_no_numpy()
    653 
    654         if self.obj is None:

~/anaconda3/lib/python3.7/site-packages/pandas/io/json/json.py in _parse_no_numpy(self)
    869         if orient == "columns":
    870             self.obj = DataFrame(
--> 871                 loads(json, precise_float=self.precise_float), dtype=None)
    872         elif orient == "split":
    873             decoded = {str(k): v for k, v in compat.iteritems(

~/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    390                                  dtype=dtype, copy=copy)
    391         elif isinstance(data, dict):
--> 392             mgr = init_dict(data, index, columns, dtype=dtype)
    393         elif isinstance(data, ma.MaskedArray):
    394             import numpy.ma.mrecords as mrecords

~/anaconda3/lib/python3.7/site-packages/pandas/core/internals/construction.py in init_dict(data, index, columns, dtype)
    210         arrays = [data[k] for k in keys]
    211 
--> 212     return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    213 
    214 

~/anaconda3/lib/python3.7/site-packages/pandas/core/internals/construction.py in arrays_to_mgr(arrays, arr_names, index, columns, dtype)
     49     # figure out the index, if necessary
     50     if index is None:
---> 51         index = extract_index(arrays)
     52     else:
     53         index = ensure_index(index)

~/anaconda3/lib/python3.7/site-packages/pandas/core/internals/construction.py in extract_index(data)
    318 
    319             if have_dicts:
--> 320                 raise ValueError('Mixing dicts with non-Series may lead to '
    321                                  'ambiguous ordering.')
    322 

ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.

Answer 1

You can simply use Python's built-in JSON handling:

import json

with open("drug-label-0008-of-0008.json", "r") as read_file:
    data = json.load(read_file)
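
Once the file is loaded this way, the remaining step is choosing which part of it becomes the table. The sketch below is not from the original answer. It assumes the openFDA download has a top-level "meta" object alongside a "results" list of records, which would explain the error: pandas is being asked to build columns from a dict and a list at the same time. The sample data and the records_to_frame helper are hypothetical stand-ins for the real file.

```python
import pandas as pd

# Hypothetical miniature of the assumed file layout:
# a "meta" dict sitting next to a "results" list of records.
sample = {
    "meta": {"last_updated": "2019-01-01", "results": {"total": 2}},
    "results": [
        {"id": "a1", "effective_time": "20190101"},
        {"id": "b2", "effective_time": "20190202"},
    ],
}


def records_to_frame(data, key="results"):
    """Build a DataFrame from the list of records stored under `key`."""
    return pd.DataFrame(data[key])


# Passing the whole structure mixes a dict value with a list value,
# which is the situation the ValueError in the question complains about.
try:
    pd.DataFrame(sample)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Passing only the list of records gives one row per record.
df = records_to_frame(sample)

print(error_message)
print(df)
```

With the real file, the same idea would be records_to_frame(data) applied to the object returned by json.load, provided the file really does keep its records under "results".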

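"Use read_json when there is only one JSON structure in the JSON file, because it loads the JSON directly into a DataFrame. With json.loads, you have to load it into a Python dictionary/list first and then into a DataFrame, an unnecessary two-step process." (Pandas vs JSON library to read a JSON file in Python)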
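
To make the quoted comparison concrete, here is a small sketch of both routes on a JSON document that really is a single flat list of records. The json_text string is made-up sample data, not taken from the openFDA file. It is wrapped in io.StringIO so it can be passed to read_json as a file-like object.

```python
import io
import json

import pandas as pd

# Made-up sample: one top-level list of flat records.
json_text = '[{"a": 1, "b": 2}, {"a": 3, "b": 4}]'

# One step: pandas parses the JSON and builds the DataFrame itself.
df_direct = pd.read_json(io.StringIO(json_text))

# Two steps: parse into Python objects first, then build the DataFrame.
records = json.loads(json_text)
df_two_step = pd.DataFrame(records)

print(df_direct)
print(df_two_step)
```

For a flat list like this, both routes produce the same table. The two-step route becomes useful when the file is nested and only part of it should become the table.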
