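How do I remove similar values from a transposed result in R and create new variables?

Asked: 2019-05-16 13:46:55

Tags: r tidyr

I have this data, in which every variable except Product_Code is repeated on each row. I would like to transpose Product_Code into new variables such as PROD_1, PROD_2, ... so that the duplicated rows collapse into a single row.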

       ID    DATE        DAYS MONTH Product_Code
1  00003600B 2018-06-30  854   6    83648
2  00003600B 2018-06-30  854   6    40984
3  00003600B 2018-06-30  854   6    14534
4  00003600B 2018-06-30  854   6    18708
5  00003600B 2018-06-30  854   6    18710

I tried the spread and transpose functions, but neither worked.

spread(data = Tickets, key = ID, value = Product_Code)

I also tried transposing, but that did not give the result I want:

Tickets.t = t(Tickets)

Any ideas on how to do this?

I expect something like this:

ID        DATE        DAYS MONTH PROD_1 PROD_2 PROD_3 PROD_4 PROD_5
00003600B 2018-06-30   854     6  83648  40984  14534  18708  18710
00003600B 2016-02-27   280     2 999195 999154 999339      0      0
00003600B 2015-05-23    77     5 999026 999339 999021  27640 999195

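2 Answers:

Answer 0: (score: 1)

Here we need a sequence column. Group by 'ID', 'DATE', 'DAYS' and 'MONTH', create a 'PROD' column by concatenating the string "PROD_" with row_number(), and then use that column as the key to spread the 'Product_Code' values.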

library(tidyverse)
Tickets %>%
  group_by(ID, DATE, DAYS, MONTH) %>% 
  mutate(PROD = str_c("PROD_", row_number())) %>% 
  spread(PROD, Product_Code)
# A tibble: 1 x 9
# Groups:   ID, DATE, DAYS, MONTH [1]
#  ID        DATE        DAYS MONTH PROD_1 PROD_2 PROD_3 PROD_4 PROD_5
#  <chr>     <chr>      <int> <int>  <int>  <int>  <int>  <int>  <int>
#1 00003600B 2018-06-30   854     6  83648  40984  14534  18708  18710

Data

Tickets <- structure(list(ID = c("00003600B", "00003600B", "00003600B", 
"00003600B", "00003600B"), DATE = c("2018-06-30", "2018-06-30", 
"2018-06-30", "2018-06-30", "2018-06-30"), DAYS = c(854L, 854L, 
854L, 854L, 854L), MONTH = c(6L, 6L, 6L, 6L, 6L), Product_Code = c(83648L, 
40984L, 14534L, 18708L, 18710L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5"))
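
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A minimal sketch of the same reshaping with pivot_wider() follows, assuming that newer tidyr is installed. The names wide and PROD are only illustrative, and the data frame is rebuilt from the sample rows in the question so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Sample rows from the question
Tickets <- data.frame(
  ID = rep("00003600B", 5),
  DATE = rep("2018-06-30", 5),
  DAYS = rep(854L, 5),
  MONTH = rep(6L, 5),
  Product_Code = c(83648L, 40984L, 14534L, 18708L, 18710L),
  stringsAsFactors = FALSE
)

wide <- Tickets %>%
  group_by(ID, DATE, DAYS, MONTH) %>%
  # Number the products within each group: 1, 2, 3, ...
  mutate(PROD = row_number()) %>%
  ungroup() %>%
  # One column per product number, named PROD_1, PROD_2, ...
  pivot_wider(
    names_from = PROD,
    values_from = Product_Code,
    names_prefix = "PROD_"
  )

print(wide)
```

The ungroup() call is optional here; it only keeps the result from carrying the grouping along.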

Answer 1: (score: 1)

Before using spread, you need to add a variable that numbers the products within each group.

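library(tidyverse)

Tickets %>%
   group_by(ID, DATE, DAYS, MONTH) %>%
   # number the products within each group: 1, 2, 3, ...
   mutate(PROD = 1:n()) %>%
   # one column per product number, filled with Product_Code
   spread(key = PROD, value = Product_Code)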
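
The expected output in the question shows 0 where a group has fewer than five products. Here is a rough sketch of one way to get that, assuming the missing cells should be filled through the fill argument of spread(). The data frame Tickets2 is hypothetical: it combines the sample rows with the three products shown for 2016-02-27 in the expected output.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: five products on one date, three on another
Tickets2 <- data.frame(
  ID = rep("00003600B", 8),
  DATE = c(rep("2018-06-30", 5), rep("2016-02-27", 3)),
  DAYS = c(rep(854L, 5), rep(280L, 3)),
  MONTH = c(rep(6L, 5), rep(2L, 3)),
  Product_Code = c(83648L, 40984L, 14534L, 18708L, 18710L,
                   999195L, 999154L, 999339L),
  stringsAsFactors = FALSE
)

wide <- Tickets2 %>%
  group_by(ID, DATE, DAYS, MONTH) %>%
  # Build the column names PROD_1, PROD_2, ... within each group
  mutate(PROD = paste0("PROD_", row_number())) %>%
  ungroup() %>%
  # fill = 0L puts 0 in place of NA where a group has no such product
  spread(key = PROD, value = Product_Code, fill = 0L)

print(wide)
```

Without fill, the cells for PROD_4 and PROD_5 on 2016-02-27 would be NA rather than 0.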