pandas DataFrame apply returning a list raises an error

Date: 2018-12-28 11:59:19

Tags: python pandas apply pickle

I'm running Python 3.6.x and pandas 0.19.2. I'm trying to create a list for each row of a DataFrame, as shown below. This example works.

import pandas as pd
df = pd.DataFrame({'names':['a', 'b', 'c'], 'year_min':[2001, 2010, 2005], 'year_max':[2018, 2019, 2017]})
start_year = 2017
df['years'] = df.apply(lambda x: list(range(max(x['year_min'],start_year), x['year_max']+1)), axis=1)

df
Out[37]: 
  names  year_max  year_min               years
0     a      2018      2001        [2017, 2018]
1     b      2019      2010  [2017, 2018, 2019]
2     c      2017      2005              [2017]

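Unfortunately, when I run the same line of code on the DataFrame in this pickle file, I get an error, even though the dtypes of both columns are still int64. No doubt I have messed something up in this DataFrame, but I can't work out what the problem is. Any ideas?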

players = pd.read_pickle("players_2017_2019.p")
start_year = 2017
players['years']= players.apply(lambda x: list(range(max(x['year_min'],start_year), x['year_max']+1)), axis=1)

Traceback (most recent call last):
  File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4262, in create_block_manager_from_arrays
    blocks = form_blocks(arrays, names, axes)
  File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4339, in form_blocks
    int_blocks = _multi_blockify(int_items)
  File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4408, in _multi_blockify
    values, placement = _stack_arrays(list(tup_block), dtype)
  File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4453, in _stack_arrays
    stacked[i] = _asarray_compat(arr)
ValueError: could not broadcast input array from shape (2) into shape (3)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "...\python36\win64\431\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-7fc8712b01b0>", line 1, in <module>
    players.apply(lambda x: list(range(max(x['year_min'],start_year), x['year_max']+1)), axis=1)
  File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 4152, in apply
    return self._apply_standard(f, axis, reduce=reduce)
  File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 4265, in _apply_standard
    result = self._constructor(data=results, index=index)
  File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 266, in __init__
    mgr = self._init_dict(data, index, columns, dtype=dtype)
  File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 402, in _init_dict
    return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "...\python36\win64\431\lib\site-packages\pandas\core\frame.py", line 5408, in _arrays_to_mgr
    return create_block_manager_from_arrays(arrays, arr_names, axes)
  File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4267, in create_block_manager_from_arrays
    construction_error(len(arrays), arrays[0].shape, axes, e)
  File "...\python36\win64\431\lib\site-packages\pandas\core\internals.py", line 4231, in construction_error
    raise ValueError("Empty data passed with indices specified.")
ValueError: Empty data passed with indices specified.

Edit: This problem went away when I updated pandas to 0.23.0. The issue is also related to https://github.com/pandas-dev/pandas/issues/17892.
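
For anyone who cannot upgrade, one possible workaround is to skip DataFrame.apply and build the lists with a plain comprehension over the two columns. The traceback shows the error being raised while apply tries to assemble the per-row results into a new DataFrame, and a comprehension never goes through that step. This is only a sketch: the small frame below mirrors the working example rather than the pickled data, and I have not verified it against the players frame or on pandas 0.19.2.

```python
import pandas as pd

# Toy frame mirroring the working example; this is NOT the pickled data.
df = pd.DataFrame({'names': ['a', 'b', 'c'],
                   'year_min': [2001, 2010, 2005],
                   'year_max': [2018, 2019, 2017]})
start_year = 2017

# Build one list per row directly from the two columns.
# No DataFrame.apply is involved, so pandas never tries to combine
# the per-row lists into a new DataFrame.
df['years'] = [list(range(max(lo, start_year), hi + 1))
               for lo, hi in zip(df['year_min'], df['year_max'])]

print(df['years'].tolist())
# [[2017, 2018], [2017, 2018, 2019], [2017]]
```

The same comprehension should carry over to the players frame by swapping df for players, assuming its year_min and year_max columns hold plain integers as the dtypes suggest.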

0 answers:

No answers