I am running Python 3.6.x with pandas 0.19.2. I am trying to build a list for each row of a DataFrame, as shown below. This example works.
import pandas as pd
df = pd.DataFrame({'names':['a', 'b', 'c'], 'year_min':[2001, 2010, 2005], 'year_max':[2018, 2019, 2017]})
start_year = 2017
df['years'] = df.apply(lambda x: list(range(max(x['year_min'],start_year), x['year_max']+1)), axis=1)
df
Out[37]:
  names  year_max  year_min               years
0     a      2018      2001        [2017, 2018]
1     b      2019      2010  [2017, 2018, 2019]
2     c      2017      2005              [2017]
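Unfortunately, when I run the same line of code on the dataframe in this pickle file, I get an error, even though the dtypes of both columns are still int64. No doubt I have messed something up in this dataframe, but I cannot figure out what the problem is (!). Any ideas?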
players = pd.read_pickle("players_2017_2019.p")
start_year = 2017
players['years']= players.apply(lambda x: list(range(max(x['year_min'],start_year), x['year_max']+1)), axis=1)
Traceback (most recent call last):
File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4262, in create_block_manager_from_arrays
blocks = form_blocks(arrays, names, axes)
File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4339, in form_blocks
int_blocks = _multi_blockify(int_items)
File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4408, in _multi_blockify
values, placement = _stack_arrays(list(tup_block), dtype)
File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4453, in _stack_arrays
stacked[i] = _asarray_compat(arr)
ValueError: could not broadcast input array from shape (2) into shape (3)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "...\python36\win64\431\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-7fc8712b01b0>", line 1, in <module>
players.apply(lambda x: list(range(max(x['year_min'],start_year), x['year_max']+1)), axis=1)
File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 4152, in apply
return self._apply_standard(f, axis, reduce=reduce)
File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 4265, in _apply_standard
result = self._constructor(data=results, index=index)
File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 266, in __init__
mgr = self._init_dict(data, index, columns, dtype=dtype)
File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 402, in _init_dict
return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 5408, in _arrays_to_mgr
return create_block_manager_from_arrays(arrays, arr_names, axes)
File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4267, in create_block_manager_from_arrays
construction_error(len(arrays), arrays[0].shape, axes, e)
File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4231, in construction_error
raise ValueError("Empty data passed with indices specified.")
ValueError: Empty data passed with indices specified.
Edit:
This problem was resolved once I updated pandas to 0.23.0.
The issue is also related to https://github.com/pandas-dev/pandas/issues/17892.
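
For anyone who cannot upgrade, here is a minimal sketch of one possible workaround. It builds the lists with a plain list comprehension over the two columns instead of calling DataFrame.apply with axis=1, so it should not go through the frame-construction code shown in the traceback. I have not verified it against 0.19.2 specifically, and it uses the small example frame from the question rather than the pickled players data.

```python
import pandas as pd

# Small example frame from the question (stand-in for the pickled data).
df = pd.DataFrame({'names': ['a', 'b', 'c'],
                   'year_min': [2001, 2010, 2005],
                   'year_max': [2018, 2019, 2017]})
start_year = 2017

# Build one list of years per row without DataFrame.apply(axis=1).
years = [list(range(max(lo, start_year), hi + 1))
         for lo, hi in zip(df['year_min'], df['year_max'])]

# Wrap in a Series aligned to df's index so each list is stored as a
# single object-dtype cell.
df['years'] = pd.Series(years, index=df.index)
print(df)
```

The same comprehension should work on the players frame by swapping df for players, assuming its year_min and year_max columns hold integers as described above.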