ValueError when creating a pandas DataFrame

Time: 2019-09-30 19:59:32

Tags: python scikit-learn

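I get the error below when trying to create a DataFrame from a JSON / dict object. I'm new to Python and working through a learning exercise, so any help is appreciated.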

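The real problem is that I want 1 row of values with 100 columns, but the error tells me that, the way I have set it up, pandas expects 100 values for each of the 100 columns.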

If I leave out columns=json_input.keys(), the call succeeds, but the result defaults to 100 rows and 1 column rather than 1 row and 100 columns.

import pandas as pd

json_input = {"x0": "9.521496806", "x1": "wed", "x2": "-5.087588682", "x3": "-17.21471427", "x4": "-2.486421073", "x5": "35.48653879", "x6": "-1.20495816", "x7": "-23.7174973", "x8": "105.7946327", "x9": "-5.951938559", "x10": "5.214871257", "x11": "0.303798139", "x12": "$296.43 ", "x13": "-0.194132881", "x14": "-2.188915191", "x15": "15.8504554", "x16": "-7.419140411", "x17": "6.931577729", "x18": "-33.76908811", "x19": "-1.932735617", "x20": "0.066503478", "x21": "0.014625357", "x22": "-2.826542568", "x23": "-9.51560375", "x24": "27.31797115", "x25": "-4.210150941", "x26": "-13.45071138", "x27": "17.51376958", "x28": "0.14235993", "x29": "6.49488499", "x30": "8.922856241", "x31": "1.264469019", "x32": "-14.22456453", "x33": "-22.51356894", "x34": "2.042085808", "x35": "7.996513763", "x36": "15.62250736", "x37": "-36.34086747", "x38": "2.665399772", "x39": "-1.354001761", "x40": "33.71068143", "x41": "11.74949803", "x42": "-2.793416547", "x43": "71.4392679", "x44": "-3.57085601", "x45": "-10.61019691", "x46": "63.36622572", "x47": "1.084953519", "x48": "0.965175942", "x49": "15.41097088", "x50": "38.02325393", "x51": "-4.601041878", "x52": "9.544564428", "x53": "5.171864325", "x54": "Aug", "x55": "3.238851899", "x56": "-1.444656373", "x57": "-24.85405723", "x58": "-0.127639937", "x59": "14.69515683", "x60": "-3.577237241", "x61": "12.67485", "x62": "-26.60833996", "x63": "22.3566647", "x64": "0.187033314", "x65": "-20.08925727", "x66": "0.3013055", "x67": "9.782791255", "x68": "-0.590871745", "x69": "-27.03617115", "x70": "0.178891203", "x71": "9.297257064", "x72": "-0.687360237", "x73": "23.1353161", "x74": "-1.692361883", "x75": "6.007302227", "x76": "-0.05636968", "x77": "20.23959571", "x78": "4.889493523", "x79": "0.02%", "x80": "20.11423271", "x81": "22.31711274", "x82": "asia", "x83": "-0.072090104", "x84": "volkswagon", "x85": "-0.14252212", "x86": "0.464293542", "x87": "-0.974325314", "x88": "7.131219017", "x89": "-2.506897555", "x90": "-0.069832619", 
"x91": "-11.84213839", "x92": "0.09761061", "x93": "15.27673142", "x94": "-1.927285625", "x95": "8.008175145", "x96": "0.659805361", "x97": "2.216918955", "x98": "-18.64465705", "x99\n": "-1.926577376"}

data_input = pd.DataFrame.from_dict(data=json_input, orient='index', columns=json_input.keys())

print(data_input)
C:\dev\anaconda3\lib\site-packages\sklearn\externals\joblib\__init__.py:15: DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
  warnings.warn(msg, category=DeprecationWarning)
Traceback (most recent call last):
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\managers.py", line 1651, in create_block_manager_from_blocks
    placement=slice(0, len(axes[0])))]
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 3095, in make_block
    return klass(values, ndim=ndim, placement=placement)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 2631, in __init__
    placement=placement)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 87, in __init__
    '{mgr}'.format(val=len(self.values), mgr=len(self.mgr_locs)))
ValueError: Wrong number of items passed 1, placement implies 100

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pandastest.py", line 18, in <module>
    data_input = pd.DataFrame.from_dict(data=json_input, orient='index', columns=json_input.keys())
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\frame.py", line 1138, in from_dict
    return cls(data, index=index, columns=columns, dtype=dtype)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\frame.py", line 451, in __init__
    copy=copy)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 167, in init_ndarray
    return create_block_manager_from_blocks([values], [columns, index])
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\managers.py", line 1660, in create_block_manager_from_blocks
    construction_error(tot_items, blocks[0].shape[1:], axes, e)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\managers.py", line 1691, in construction_error
    passed, implied))
ValueError: Shape of passed values is (100, 1), indices imply (100, 100)

2 Answers:

Answer 0: (score: 2)

Hi Mike, you can add a transpose at the end of the line:

pd.DataFrame.from_dict(data=json_input, orient='index').T

This will give you the shape you want.
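
To see what the transpose does, here is a minimal sketch using a shortened three-key stand-in for the question's dict (the shortened dict and the names tall and wide are only for illustration):

```python
import pandas as pd

# Shortened stand-in for the question's 100-key dict (illustrative only).
json_input = {"x0": "9.521496806", "x1": "wed", "x2": "-5.087588682"}

# orient='index' turns each key into a row label, so the result has
# one row per key and a single column.
tall = pd.DataFrame.from_dict(data=json_input, orient='index')
print(tall.shape)

# .T swaps rows and columns: one row, one column per key.
wide = tall.T
print(wide.shape)
print(list(wide.columns))
```

With the full dict from the question the same two steps should go from 100 rows by 1 column to 1 row by 100 columns.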

Answer 1: (score: 1)

It gives you 100 rows because you are passing orient='index'. If you control the data structure, you can use the simpler pd.DataFrame.from_dict(data=json_input), with the input formatted like this:

{
    "column1": [value],
    "column2": [value],
    ...
}

Or, in your case:

{
    "x0": ["9.521496806"],
    "x1": ["wed"],
    ...
}
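
If the dict arrives with plain scalar values, as in the question, one way to get it into that format is to wrap each value in a one-element list first. A rough sketch, again with a shortened stand-in dict; the names wrapped, row and row_alt are made up for this example, and the pd.DataFrame([json_input]) variant is an alternative not mentioned above:

```python
import pandas as pd

# Shortened stand-in for the question's 100-key dict (illustrative only).
json_input = {"x0": "9.521496806", "x1": "wed", "x2": "-5.087588682"}

# Wrap every scalar in a one-element list so each key becomes a column
# holding exactly one value.
wrapped = {key: [value] for key, value in json_input.items()}
row = pd.DataFrame.from_dict(data=wrapped)
print(row.shape)

# Alternative: a list containing one dict is read as a single record,
# which also yields one row with one column per key.
row_alt = pd.DataFrame([json_input])
print(row_alt.shape)
```

Either way the values stay as strings, exactly as they appear in the input dict.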