How do I merge multiple rows that share the same index, where each row holds only one real value, in pandas?

Asked: 2019-01-28 10:10:50

Tags: python pandas dataframe nan

I have a pandas DataFrame that looks like this:

                            OPEN_INT  PX_HIGH  PX_LAST  VOL
timestamp   ticker  source
2018-01-01  AAPL    NYSE           1      NaN      NaN  NaN
2018-01-01  AAPL    NYSE         NaN        2      NaN  NaN
2018-01-01  AAPL    NYSE         NaN      NaN        3  NaN
2018-01-01  AAPL    NYSE         NaN      NaN      NaN    4
2018-01-01  MSFT    NYSE           5      NaN      NaN  NaN
2018-01-01  MSFT    NYSE         NaN        6      NaN  NaN
2018-01-01  MSFT    NYSE         NaN      NaN        7  NaN
2018-01-01  MSFT    NYSE         NaN      NaN      NaN    8

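Within each group (timestamp, ticker, source), every column is guaranteed to contain exactly one value; all other values are NaN. Is there a way to combine each group into a single row, so that the result looks like this: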

                            OPEN_INT  PX_HIGH  PX_LAST  VOL
timestamp   ticker  source
2018-01-01  AAPL    NYSE           1        2        3    4
2018-01-01  MSFT    NYSE           5        6        7    8

I tried df.groupby(['timestamp', 'ticker', 'source']).agg(lambda x: x.dropna()), but I get the error "Function does not reduce".

1 Answer:

Answer 0: (score: 2)

Use GroupBy.first, which returns the first non-null value of each column within each group:

df.groupby(['timestamp', 'ticker', 'source']).first()
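
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the frame from the question and applies GroupBy.first. The way the frame is constructed (a MultiIndex built with from_tuples, float columns) is an assumption on my part, since the question only shows the printed output. The last line is a reducing variant of the lambda from the question: x.dropna() returns a Series rather than a scalar, which is most likely why agg complained, so taking .iloc[0] of it should work whenever each group has at least one non-NaN value per column.

```python
import numpy as np
import pandas as pd

nan = np.nan
keys = ["timestamp", "ticker", "source"]

# Assumed reconstruction of the question's frame: 4 rows per ticker,
# one real value per row, everything else NaN.
index = pd.MultiIndex.from_tuples(
    [("2018-01-01", "AAPL", "NYSE")] * 4 + [("2018-01-01", "MSFT", "NYSE")] * 4,
    names=keys,
)
df = pd.DataFrame(
    {
        "OPEN_INT": [1, nan, nan, nan, 5, nan, nan, nan],
        "PX_HIGH": [nan, 2, nan, nan, nan, 6, nan, nan],
        "PX_LAST": [nan, nan, 3, nan, nan, nan, 7, nan],
        "VOL": [nan, nan, nan, 4, nan, nan, nan, 8],
    },
    index=index,
)

# One row per (timestamp, ticker, source); NaNs are skipped per column.
result = df.groupby(keys).first()
print(result)

# Reducing version of the lambda from the question: return a scalar,
# not a Series. Raises IndexError if a group column is entirely NaN.
via_lambda = df.groupby(keys).agg(lambda x: x.dropna().iloc[0])
print(via_lambda)
```

Because the columns contain NaN they are stored as floats, so the values print as 1.0, 2.0, and so on; cast with astype(int) afterwards if integers are needed.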

Alternatively, if each group always has exactly one value per column, you can aggregate with max, min, sum, mean, ...:

df.groupby(['timestamp', 'ticker', 'source']).max()
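
As a quick check of that claim, the sketch below compares first, max, min, sum and mean on a single-group version of the question's data (the construction is again my assumption). It also shows one caveat that only matters if the "exactly one value" guarantee is ever broken: for a group that is entirely NaN, sum returns 0 by default, while first returns NaN; passing min_count=1 to sum restores NaN. The all_nan series is a hypothetical example, not data from the question.

```python
import numpy as np
import pandas as pd

nan = np.nan
keys = ["timestamp", "ticker", "source"]

# Assumed single-group slice of the question's data.
df = pd.DataFrame(
    {
        "timestamp": ["2018-01-01"] * 4,
        "ticker": ["AAPL"] * 4,
        "source": ["NYSE"] * 4,
        "OPEN_INT": [1, nan, nan, nan],
        "PX_HIGH": [nan, 2, nan, nan],
        "PX_LAST": [nan, nan, 3, nan],
        "VOL": [nan, nan, nan, 4],
    }
).set_index(keys)

grouped = df.groupby(keys)
expected = grouped.first()

# With exactly one non-NaN value per group and column, these all agree.
agreement = {
    name: getattr(grouped, name)().equals(expected)
    for name in ["max", "min", "sum", "mean"]
}
print(agreement)

# Hypothetical group with no value at all: sum and first now differ.
all_nan = pd.Series([nan, nan], index=["a", "a"])
sum_default = all_nan.groupby(level=0).sum()                # 0.0
sum_min_count = all_nan.groupby(level=0).sum(min_count=1)   # NaN
first_value = all_nan.groupby(level=0).first()              # NaN
```

If a missing value should stay visible as NaN rather than turn into 0, first (or sum with min_count=1) is the safer choice.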