Pandas groupby - custom function

Time: 2019-02-27 15:32:37

Tags: python pandas pandas-groupby

I have the following DataFrame, on which I used groupby and sum():

import numpy as np
import pandas as pd

d = {'col1': ["A", "A", "A", "B", "B", "B", "C", "C", "C"], 'col2': [1, 2, 3, 4, 5, 6, np.nan, np.nan, np.nan]}

df = pd.DataFrame(data=d)

df.groupby("col1").sum()

The result is:

      col2
col1
A      6.0
B     15.0
C      0.0

I want C to show NaN instead of 0, because all of C's values are NaN. How can I do that? apply() with a lambda function? Any help would be appreciated.

3 Answers:

Answer 0 (score: 3)

Use this:

df.groupby('col1').apply(pd.DataFrame.sum, skipna=False).reset_index(drop=True)
# Or: df.groupby('col1', as_index=False).apply(pd.DataFrame.sum, skipna=False)

Without apply(), thanks to @piRSquared:

df.set_index('col1').sum(level=0, skipna=False).reset_index()

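Thanks to @Alollz: if you want groups that merely contain a NaN to still return a sum, and only groups that are entirely NaN to return NaN: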

df.set_index('col1').sum(level=0, min_count=1).reset_index()

Output:

  col1  col2
0    A   6.0
1    B  15.0
2    C   NaN
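Note that the level argument of DataFrame.sum was deprecated in pandas 1.3 and removed in pandas 2.0, so the set_index('col1').sum(level=0, ...) form in this answer may raise a TypeError on newer releases. A minimal sketch of an equivalent, assuming pandas 2.x and the DataFrame from the question, is to group on the index level explicitly:

```python
import numpy as np
import pandas as pd

d = {"col1": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
     "col2": [1, 2, 3, 4, 5, 6, np.nan, np.nan, np.nan]}
df = pd.DataFrame(data=d)

# Group on index level 0 instead of passing level= to sum().
# min_count=1 makes a group with no non-NaN values sum to NaN rather than 0.
result = df.set_index("col1").groupby(level=0).sum(min_count=1).reset_index()
print(result)
```

This should give the same three rows as above, with NaN for group C.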

Answer 1 (score: 1)

Call sum with the argument skipna=False.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sum.html

That link should give you the documentation you need; I hope it solves your problem.
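To illustrate how skipna=False differs from min_count=1, here is a rough sketch on hypothetical data (group D is not in the question; it is added here to mix a real value with a NaN). It applies Series.sum per group through a lambda, which is one way to pass skipna=False, not necessarily the only one:

```python
import numpy as np
import pandas as pd

# Hypothetical data: group C is all NaN, group D has one value and one NaN.
df = pd.DataFrame({
    "col1": ["A", "A", "C", "C", "D", "D"],
    "col2": [1.0, 2.0, np.nan, np.nan, 7.0, np.nan],
})

grouped = df.groupby("col1")["col2"]

# skipna=False: any NaN in a group makes that group's sum NaN.
strict = grouped.agg(lambda s: s.sum(skipna=False))

# min_count=1: NaN only when a group has no non-NaN values at all.
lenient = grouped.sum(min_count=1)

print(strict)
print(lenient)
```

With skipna=False both C and D come out as NaN, while min_count=1 keeps D's sum and returns NaN only for C, which is the behaviour the question asks for.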

Answer 2 (score: 1)

Thanks to @piRSquared, @Alollz and @anky_91:

You can do this without setting and resetting the index:

import numpy as np
import pandas as pd

d = {'col1': ["A", "A", "A", "B", "B", "B", "C", "C", "C"], 'col2': [1, 2, 3, 4, 5, 6, np.nan, np.nan, np.nan]}

df = pd.DataFrame(data=d)

df.groupby("col1", as_index=False).sum(min_count=1)

Output:

  col1  col2
0    A   6.0
1    B  15.0
2    C   NaN