I use Tableau's FIXED LOD expressions in my reports and am looking for a way to mimic this functionality in R.
The dataset looks like this:
Soldto<-c("123456","122456","123456","122456","124560","125560")
Shipto<-c("123456","122555","122456","124560","122560","122456")
IssueDate<-as.Date(c("2017-01-01","2017-01-02","2017-01-01","2017-01-02","2017-01-01","2017-01-01"))
Method<-c("Ground","Ground","Ground","Air","Ground","Ground")
Delivery<-c("000123","000456","000123","000345","000456","000555")
df1<-data.frame(Soldto,Shipto,IssueDate,Method,Delivery)
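What I want to do is "for each Soldto/Shipto/Method, count the number of unique delivery IDs".
The goal is to find the number of unique deliveries that could potentially be consolidated.
In Tableau, the expression looks like this: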
{FIXED [Soldto],[Shipto],[IssueDate],[Method] : COUNTD([Delivery])}
This can be done with aggregate or summarise, as in the following example:
library(plyr)  # provides ddply and summarise
df.new <- ddply(df1, c("Soldto","Shipto","Method"), summarise,
                Deliveries = dplyr::n_distinct(Delivery))  # n_distinct comes from dplyr
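The question also mentions aggregate. A minimal base R sketch of the same count is shown here, assuming the data from the question (rebuilt without IssueDate so the block runs on its own); the output column name Deliveries is only an illustrative choice.

```r
# Rebuild the question's data so this block is self-contained.
df1 <- data.frame(
  Soldto   = c("123456","122456","123456","122456","124560","125560"),
  Shipto   = c("123456","122555","122456","124560","122560","122456"),
  Method   = c("Ground","Ground","Ground","Air","Ground","Ground"),
  Delivery = c("000123","000456","000123","000345","000456","000555"),
  stringsAsFactors = FALSE
)

# Count distinct Delivery values within each Soldto/Shipto/Method group.
agg <- aggregate(
  df1["Delivery"],
  by  = df1[c("Soldto", "Shipto", "Method")],
  FUN = function(x) length(unique(x))
)

# Rename the aggregated column (illustrative name, not from the question).
names(agg)[names(agg) == "Delivery"] <- "Deliveries"
print(agg)
```

This returns one row per group, like the ddply call, and needs no extra packages.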
Answer 0: (score: 1)
In dplyr this is fairly easy. You are looking for the number of unique delivery values for each combination of soldto, shipto and method, which is just group_by followed by summarise:
library(tidyverse)
tbl <- tibble(
soldto = c("123456","122456","123456","122456","124560","125560"),
shipto = c("123456","122555","122456","124560","122560","122456"),
issuedate = as.Date(c("2017-01-01","2017-01-02","2017-01-01","2017-01-02","2017-01-01","2017-01-01")),
method = c("Ground","Ground","Ground","Air","Ground","Ground"),
delivery = c("000123","000456","000123","000345","000456","000555")
)
tbl %>%
group_by(soldto, shipto, method) %>%
summarise(uniques = n_distinct(delivery))
#> # A tibble: 6 x 4
#> # Groups:   soldto, shipto [?]
#>   soldto shipto method uniques
#>   <chr>  <chr>  <chr>    <int>
#> 1 122456 122555 Ground       1
#> 2 122456 124560 Air          1
#> 3 123456 122456 Ground       1
#> 4 123456 123456 Ground       1
#> 5 124560 122560 Ground       1
#> 6 125560 122456 Ground       1
Created on 2018-03-02 by the reprex package (v0.2.0).
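A possible variation, as a sketch rather than a definitive equivalent: the Tableau expression in the question also fixes on IssueDate, and a FIXED value is typically used alongside the original rows. Assuming that is what is wanted, swapping summarise for mutate keeps every row and attaches the group count to it. The name tbl_lod is hypothetical.

```r
library(dplyr)

# Same data as in the answer above, rebuilt so the block runs on its own.
tbl <- tibble(
  soldto    = c("123456","122456","123456","122456","124560","125560"),
  shipto    = c("123456","122555","122456","124560","122560","122456"),
  issuedate = as.Date(c("2017-01-01","2017-01-02","2017-01-01",
                        "2017-01-02","2017-01-01","2017-01-01")),
  method    = c("Ground","Ground","Ground","Air","Ground","Ground"),
  delivery  = c("000123","000456","000123","000345","000456","000555")
)

# mutate() keeps all rows; each row gets the distinct-delivery count of its group.
# issuedate is included here to mirror the dimensions in the Tableau expression.
tbl_lod <- tbl %>%
  group_by(soldto, shipto, issuedate, method) %>%
  mutate(uniques = n_distinct(delivery)) %>%
  ungroup()

print(tbl_lod)
```

Dropping issuedate from group_by gives the row-level counterpart of the summarise result above.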