I have data like this:
trackingnumer = c(1,1,2,2,3)
date = c("2017-08-01", "2017-08-10", "2017-08-02", "2017-08-05", "2017-08-12")
scan = c("Pickup", "Delivered", "Pickup", "Delivered", "Delivered")
df = data.frame(trackingnumer, date, scan)
I want to transpose this data by trackingnumer:
library(dplyr)
library(data.table)
df2 <- df %>%
  group_by(trackingnumer) %>%
  mutate(n = row_number()) %>%
  {data.table::dcast(data = setDT(.), trackingnumer ~ n, value.var = c('date', 'scan'))}
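I tried this, but I can't get the result I want. I want date_1 to be the pickup date and date_2 to be the delivery date. As you can see, tracking number 3 has no pickup record, so I want its date_1 to be NA.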
Answer 0 (score: 3)
A base R attempt, using relevel to set the appropriate ordering of the scan column:
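# factor() keeps relevel() working when scan is a character column
# (the data.frame() default since R 4.0); it is a no-op if scan is already a factor
reshape(
  cbind(df, time=as.numeric(relevel(factor(df$scan), "Pickup"))),
  idvar="trackingnumer", direction="wide", sep="_"
)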
# trackingnumer date_1 scan_1 date_2 scan_2
#1 1 2017-08-01 Pickup 2017-08-10 Delivered
#3 2 2017-08-02 Pickup 2017-08-05 Delivered
#5 3 <NA> <NA> 2017-08-12 Delivered
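To see why relevel matters here, the following minimal base R sketch prints the integer codes of the scan factor before and after releveling. The variable names scan_ordered and codes are only illustrative and are not part of the answer above.

```r
# The scan column from the question, as a factor
scan <- factor(c("Pickup", "Delivered", "Pickup", "Delivered", "Delivered"))

# Factor levels are sorted alphabetically by default,
# so "Delivered" would get code 1 and "Pickup" code 2
print(levels(scan))
print(as.numeric(scan))

# relevel() moves "Pickup" to the front, so Pickup -> 1 and Delivered -> 2
scan_ordered <- relevel(scan, "Pickup")
codes <- as.numeric(scan_ordered)
print(levels(scan_ordered))
print(codes)
```

These codes are what reshape uses as the time variable, which is why pickup rows land in the _1 columns and delivery rows in the _2 columns.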
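Answer 1 (score: 2)
The problem is that the function in your mutate only counts rows within each group and pays no attention to what is in them. The case_when() function lets you assign a specific value to the "n" column based on the value of "scan":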
df2 <- df %>%
group_by(trackingnumer) %>%
mutate(n = case_when(scan == "Pickup" ~ 1,
scan == "Delivered" ~ 2)) %>%
{data.table::dcast(data = setDT(.), trackingnumer ~ n, value.var = c('date', 'scan'))}
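A small sketch of the difference, assuming dplyr is installed: it computes both the row_number() index and the case_when() index side by side on the question's data. The column names n_row and n_scan and the object cmp are made up for this illustration.

```r
library(dplyr)

# Data from the question
df <- data.frame(
  trackingnumer = c(1, 1, 2, 2, 3),
  date = c("2017-08-01", "2017-08-10", "2017-08-02", "2017-08-05", "2017-08-12"),
  scan = c("Pickup", "Delivered", "Pickup", "Delivered", "Delivered"),
  stringsAsFactors = FALSE
)

cmp <- df %>%
  group_by(trackingnumer) %>%
  mutate(
    n_row  = row_number(),                        # position within the group only
    n_scan = case_when(scan == "Pickup" ~ 1,      # index derived from the scan type
                       scan == "Delivered" ~ 2)
  ) %>%
  ungroup()

print(cmp)
```

For tracking number 3 the single "Delivered" row gets n_row = 1 but n_scan = 2, which is why only the case_when() version puts it in the _2 columns.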
Answer 2 (score: 1)
Or with tidyr:
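library(dplyr)   # group_by, mutate_at, rename_at, lst
library(tidyr)   # nest, spread, unnest
library(purrr)   # map_lgl
library(tibble)  # is_tibble, tibble
# Note: this uses the older tidyr interface (nest(date, scan), spread(), bare unnest);
# newer tidyr versions may emit warnings for these calls
df %>% group_by(trackingnumer,scan2 = scan) %>%
  nest(date,scan) %>%
  spread(scan2,data) %>%
  mutate_at(c("Delivered","Pickup"),~ifelse(map_lgl(.x,is_tibble),.x,lst(tibble(date=NA,scan=NA)))) %>%
  unnest %>%
  rename_at(c("date","scan"),paste0,2)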
# # A tibble: 3 x 5
# trackingnumer date2 scan2 date1 scan1
# <dbl> <fctr> <fctr> <fctr> <fctr>
# 1 1 2017-08-10 Delivered 2017-08-01 Pickup
# 2 2 2017-08-05 Delivered 2017-08-02 Pickup
# 3 3 2017-08-12 Delivered <NA> <NA>
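For readers on a recent tidyr, the same reshaping can be sketched with pivot_wider. This is not the answerer's code; it assumes tidyr 1.1.0 or later (for names_sort) and reuses the case_when() index idea, with wide as an illustrative object name.

```r
library(dplyr)
library(tidyr)

# Data from the question
df <- data.frame(
  trackingnumer = c(1, 1, 2, 2, 3),
  date = c("2017-08-01", "2017-08-10", "2017-08-02", "2017-08-05", "2017-08-12"),
  scan = c("Pickup", "Delivered", "Pickup", "Delivered", "Delivered"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  # 1 = pickup, 2 = delivery, independent of row order
  mutate(n = case_when(scan == "Pickup" ~ 1,
                       scan == "Delivered" ~ 2)) %>%
  pivot_wider(
    id_cols     = trackingnumer,
    names_from  = n,
    values_from = c(date, scan),
    names_sort  = TRUE   # keep the _1 columns before the _2 columns
  )

print(wide)
```

Combinations that do not occur in the data, such as a pickup for tracking number 3, are filled with NA by default.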