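I am trying to use the breast cancer dataset from sklearn. It imports fine and everything looks good, but when I try to create a DataFrame, the traceback ends with the error "Shape of passed values is (30, 569), indices imply (569, 569)".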
import pandas as pd
from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()
cancer.keys()
# The next line raises the ValueError shown in the traceback below
df_feat = pd.DataFrame(cancer['data'], columns=cancer['target'])
ValueError Traceback (most recent call last)
C:\Users\Bilal Pharmacist\Anaconda3\lib\site-
packages\pandas\core\internals.py in
create_block_manager_from_blocks(blocks, axes)
4293 blocks = [make_block(values=blocks[0],
4294 placement=slice(0, len(axes[0])))]
4295
C:\Users\Bilal Pharmacist\Anaconda3\lib\site-
packages\pandas\core\internals.py in
make_block(values, placement, klass, ndim, dtype, fastpath)
2718
2719 return klass(values, ndim=ndim, fastpath=fastpath,
placement=placement)
2720
C:\Users\Bilal Pharmacist\Anaconda3\lib\site-
packages\pandas\core\internals.py in
__init__(self, values, placement, ndim, fastpath)
114 'implies %d' % (len(self.values),
115 len(self.mgr_locs)))
116
ValueError: Wrong number of items passed 30, placement implies 569
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-16-24a03a5e14d7> in <module>()
1 df_feat = pd.DataFrame(cancer['data'],columns
2 =cancer['target'])
C:\Users\Bilal Pharmacist\Anaconda3\lib\site-packages\pandas\core\frame.py in
__init__(self, data, index, columns, dtype, copy)
304 else:
305 mgr = self._init_ndarray(data, index, columns,
dtype=dtype,
306 copy=copy)
307 elif isinstance(data, (list, types.GeneratorType)):
308 if isinstance(data, types.GeneratorType):
C:\Users\Bilal Pharmacist\Anaconda3\lib\site-packages\pandas\core\frame.py in
_init_ndarray(self, values, index, columns, dtype, copy)
481 values = maybe_infer_to_datetimelike(values)
482
483 return create_block_manager_from_blocks([values], [columns,
index])
484
485 @property
C:\Users\Bilal Pharmacist\Anaconda3\lib\site-
packages\pandas\core\internals.py in
create_block_manager_from_blocks(blocks, axes)
4301 blocks = [getattr(b, 'values', b) for b in blocks]
4302 tot_items = sum(b.shape[0] for b in blocks)
4303 construction_error(tot_items, blocks[0].shape[1:], axes, e)
4304
4305
C:\Users\Bilal Pharmacist\Anaconda3\lib\site-
packages\pandas\core\internals.py in
construction_error(tot_items, block_shape, axes, e)
4278 raise ValueError("Empty data passed with indices specified.")
4279 raise ValueError("Shape of passed values is {0}, indices imply
{1}".format(
4280 passed, implied))
4281
4282
ValueError: Shape of passed values is (30, 569), indices imply (569, 569)
Answer 0 (score: 1)
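The error occurs because cancer['data'] has shape (569, 30), so it accepts exactly 30 column names, while cancer['target'] has shape (569,), and you are trying to use those 569 values as the column names. Use cancer['feature_names'] as columns instead. I assume cancer['target'] is actually the target variable (y): it should not supply the column names, but should be one of the columns in the DataFrame.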
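The mismatch can be confirmed by inspecting the shapes before building the DataFrame. This is only a minimal sketch; the variable names data_shape, target_shape and names_shape are introduced here for illustration and do not appear in the question.

```python
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# Hypothetical helper names, used only for this check
data_shape = cancer['data'].shape            # (rows, feature columns)
target_shape = cancer['target'].shape        # one label per row
names_shape = cancer['feature_names'].shape  # one name per feature column

print(data_shape, target_shape, names_shape)

# columns= needs one entry per column of the data array
print(len(cancer['feature_names']) == data_shape[1])  # feature names fit
print(len(cancer['target']) == data_shape[1])         # target labels do not
```

Only the feature names have the same length as the number of columns in the data, which is why they are the right choice for columns.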
This should work:
import pandas as pd
from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()
df_feat = pd.DataFrame(cancer['data'], columns=cancer['feature_names'])
df_feat['target'] = cancer['target']
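As a possible alternative, newer scikit-learn releases (0.23 and later) can return the data as a DataFrame directly through the as_frame parameter. The question does not state which version is installed, so this is a sketch that assumes a recent enough scikit-learn; the name df is hypothetical.

```python
from sklearn.datasets import load_breast_cancer

# Assumes scikit-learn >= 0.23, where as_frame is available
cancer = load_breast_cancer(as_frame=True)

# .frame holds the 30 feature columns plus a 'target' column
df = cancer.frame

print(df.shape)
print(df.columns[-1])
```

This yields the same layout as the manual construction above: the features named by feature_names, with target as the last column.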