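The pandas documentation says:
"NA groups in GroupBy are automatically excluded. This behavior is consistent with R, for example."
I understand what the documentation says, but how is this consistent with R? Here is an example using a data frame x with the tidyverse.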
> x
   c b  a
1 NA 1 NA
2 NA 2 NA
3 NA 3  1
4  3 4  2
> x %>% group_by(c, a) %>% summarise(x = mean(b))
Source: local data frame [3 x 3]
Groups: c [?]
      c     a     x
  <dbl> <dbl> <dbl>
1     3     2   4.0
2    NA     1   3.0
3    NA    NA   1.5
> x %>% group_by(c) %>% summarise(x = mean(b))
# A tibble: 2 × 2
      c     x
  <dbl> <dbl>
1     3     4
2    NA     2
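Answer 0 (score: 0)
R is not the tidyverse. When the pandas documentation says the behavior is consistent with R, it means base R, where grouping functions such as tapply() and aggregate() drop rows whose grouping key is NA. The tidyverse works under different assumptions and keeps NA as a group of its own.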
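To see both behaviors side by side in pandas, here is a minimal sketch using the data frame from the question. It assumes pandas 1.1 or later, where `groupby` accepts a `dropna` argument; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the question's data frame x.
x = pd.DataFrame({
    "c": [np.nan, np.nan, np.nan, 3],
    "b": [1, 2, 3, 4],
    "a": [np.nan, np.nan, 1, 2],
})

# Default (dropna=True): rows whose group key is NA are excluded,
# which is the behavior the pandas documentation describes.
default_result = x.groupby("c")["b"].mean()

# dropna=False (assumes pandas >= 1.1): NA is kept as its own group,
# similar to what dplyr's group_by() does.
keep_na_result = x.groupby("c", dropna=False)["b"].mean()

# The same option with two grouping keys.
keep_na_two_keys = x.groupby(["c", "a"], dropna=False)["b"].mean()

print(default_result)
print(keep_na_result)
print(keep_na_two_keys)
```

With `dropna=False` the group means should match the tidyverse output in the question; with the default, only the `c == 3` group remains.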
Answer 1 (score: 0)
datar tries to follow the tidyverse API design:
>>> from datar.all import f, c, tibble, group_by, summarise, mean, NA
>>> x = tibble(c=[NA,NA,NA,3], b=[1,2,3,4], a=[NA,NA,1,2])
>>> x
     c  b    a
0  NaN  1  NaN
1  NaN  2  NaN
2  NaN  3  1.0
3  3.0  4  2.0
>>> x >> group_by(f.c, f.a) >> summarise(x=mean(f.b))
[2021-06-08 12:55:47][datar][ INFO] `summarise()` has grouped output by ['c'] (override with `_groups` argument)
     c    a    x
0  3.0  2.0  4.0
1  NaN  1.0  3.0
2  NaN  NaN  1.5
[Groups: ['c'] (n=2)]
>>> x >> group_by(f.c) >> summarise(x=mean(f.b))
     c    x
0  3.0  4.0
1  NaN  2.0
I am the author of the package. If you have any questions, feel free to open an issue.