Pandas grouped DataFrame max returns a blank AssertionError

Time: 2020-05-06 20:27:09

Tags: python pandas

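I get the following error when trying to find the max of a grouped DataFrame. The DataFrame has dozens of columns, and I know that one or more of them is causing the problem, but I don't know which. Please save me from having to brute-force my way through them.

Which data types can cause this problem? And why is the AssertionError message blank?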

Original code:

preclin.groupby(['StudyLocation', 'StudyID',
                               'ProductLotNo', 'ProductLotNoDetails',
                               'Dose_CARCellperBody', 'Dose_CellperBody', 'SubjectID'],
                              as_index=False).min()['Time_Days']

Error message:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-64-c45e8b9b8ce4> in <module>
      2                                'ProductLotNo', 'ProductLotNoDetails',
      3                                'Dose_CARCellperBody', 'Dose_CellperBody', 'SubjectID'],
----> 4                               as_index=False).min()['Time_Days']
      5 # preclin[['StudyLocation', 'StudyID', 'ProductLotNo', 'ProductLotNoDetails', 'Time_Days']].groupby(
      6 #     ['StudyLocation', 'StudyID', 'ProductLotNo', 'ProductLotNoDetails']).max().describe()

/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/groupby.py in f(self, **kwargs)
   1369                 # try a cython aggregation if we can
   1370                 try:
-> 1371                     return self._cython_agg_general(alias, alt=npfunc, **kwargs)
   1372                 except DataError:
   1373                     pass

/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/generic.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
    992     ) -> DataFrame:
    993         agg_blocks, agg_items = self._cython_agg_blocks(
--> 994             how, alt=alt, numeric_only=numeric_only, min_count=min_count
    995         )
    996         return self._wrap_agged_blocks(agg_blocks, items=agg_items)

/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/generic.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
   1098             # Clean up the mess left over from split blocks.
   1099             for locs, result in zip(split_items, split_frames):
-> 1100                 assert len(locs) == result.shape[1]
   1101                 for i, loc in enumerate(locs):
   1102                     new_items.append(np.array([loc], dtype=locs.dtype))

AssertionError: 


1 answer:

Answer 0 (score: 0)

You are most likely getting this error because there are NaN values in your DataFrame.
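One way to check this without testing columns by hand is to list each column's dtype and missing-value count, then run the aggregation on one column at a time and record which ones raise. The sketch below is only an illustration: `column_report` and `find_failing_columns` are hypothetical helper names, and the small `preclin` frame is made-up data standing in for the real one.

```python
import numpy as np
import pandas as pd


def column_report(df):
    """Return one row per column with its dtype and number of missing values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
    })


def find_failing_columns(df, keys, how="min"):
    """Aggregate each non-key column on its own.

    Returns a dict mapping each column that raised to the exception name.
    """
    failures = {}
    for col in df.columns:
        if col in keys:
            continue
        try:
            getattr(df.groupby(keys)[col], how)()
        except Exception as exc:  # AssertionError is a subclass of Exception
            failures[col] = type(exc).__name__
    return failures


# Made-up sample data (assumption: the real frame has many more columns).
preclin = pd.DataFrame({
    "StudyID": ["S1", "S1", "S2", "S2"],
    "SubjectID": ["A", "A", "B", "B"],
    "Time_Days": [1.0, np.nan, 3.0, 7.0],
})

report = column_report(preclin)
failures = find_failing_columns(preclin, ["StudyID", "SubjectID"])
print(report)
print(failures)
```

Any column that shows up in `failures`, or that has an unexpected dtype or a high `n_missing` in the report, is a candidate for closer inspection.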

Both max() and min() seem to throw a blank AssertionError if you try to aggregate where there are null values. Try dropping them or filtering them out first?
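A minimal sketch of that suggestion, using made-up data in place of the real `preclin` frame. Selecting `Time_Days` before calling `min()` is an extra assumption of this sketch, not part of the suggestion above: it keeps the aggregation away from the other columns, but whether it avoids the error on the original data is untested.

```python
import numpy as np
import pandas as pd

# Made-up sample data standing in for the real frame.
preclin = pd.DataFrame({
    "StudyID": ["S1", "S1", "S2", "S2"],
    "SubjectID": ["A", "A", "B", "B"],
    "Time_Days": [1.0, np.nan, 3.0, 7.0],
})
keys = ["StudyID", "SubjectID"]

# Drop rows where the value being aggregated is missing.
clean = preclin.dropna(subset=["Time_Days"])

# Select the column first so only Time_Days is aggregated.
result = clean.groupby(keys, as_index=False)["Time_Days"].min()
print(result)
```

If the group keys themselves can contain NaN, passing them to `dropna(subset=...)` as well makes the filtering explicit, since `groupby` leaves such rows out by default.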