Question

Imagine you have this dplyr code in R:

test <- data %>%
        group_by(PrimaryAccountReference) %>% 
        mutate(Counter_PrimaryAccountReference = n()) %>% 
        ungroup()

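How do I translate this exactly into equivalent pandas code? In short, I need to group, add another column, and then ungroup the original query. My main concern is how to do the "ungroup" step with pandas.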

Answer 1

This is the pandas way, using the transform function:

data['Counter_PrimaryAccountReference'] = data.groupby('PrimaryAccountReference')['PrimaryAccountReference'].transform('size')
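To make the "ungroup" part concrete, here is a minimal sketch on made-up data (the Amount column and the sample values are assumptions for illustration only). It suggests why no separate ungroup step is needed: groupby(...).transform(...) returns a Series aligned to the original index, and the DataFrame itself never becomes a grouped object.

```python
import pandas as pd

# Hypothetical sample data; only the PrimaryAccountReference column name
# comes from the question, the rest is made up for illustration.
data = pd.DataFrame({
    "PrimaryAccountReference": ["A", "B", "A", "C", "A", "B"],
    "Amount": [10, 20, 30, 40, 50, 60],
})

# transform("size") computes the row count of each group and broadcasts it
# back to every row of that group, similar to mutate(... = n()) in dplyr.
counts = data.groupby("PrimaryAccountReference")[
    "PrimaryAccountReference"
].transform("size")

# assign() returns a new, ordinary DataFrame, so there is nothing to ungroup
# and the original `data` is left unchanged.
test = data.assign(Counter_PrimaryAccountReference=counts)

print(test)
```

Using "size" rather than "count" keeps the behaviour close to n(), since "size" counts rows while "count" skips missing values.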

Answer 2

You can now use datar:

from datar import f
from datar.dplyr import group_by, mutate, ungroup, n

test = data >> \
       group_by(f.PrimaryAccountReference) >> \
       mutate(Counter_PrimaryAccountReference = n()) >> \
       ungroup()

I am the author of the package. If you have any questions, feel free to submit an issue.
