Concatenate a column by group in R

Time: 2013-12-11 19:00:27

Tags: string r oracle text aggregation

Suppose I have this list of employees:

 Dept Date      Name            
----- --------- --------------- 
   30 07-DEC-02 Raphaely        
   30 18-MAY-03 Khoo            
   40 07-JUN-02 Mavris          
   50 01-MAY-03 Kaufling        
   50 14-JUL-03 Ladwig          
   70 07-JUN-02 Baer            
   90 13-JAN-01 De Haan
   90 17-JUN-03 King  
  100 16-AUG-02 Faviet
  100 17-AUG-02 Greenberg 
  110 07-JUN-02 Gietz           
  110 07-JUN-02 Higgins         

In R, I would like to generate the last column by aggregating the names into one list per department (similar to Oracle PL/SQL's LISTAGG function):

 Dept Date      Name            Emp_list
----- --------- --------------- ---------------------------------------------
   30 07-DEC-02 Raphaely        Raphaely; Khoo
   30 18-MAY-03 Khoo            Raphaely; Khoo
   40 07-JUN-02 Mavris          Mavris
   50 01-MAY-03 Kaufling        Kaufling; Ladwig
   50 14-JUL-03 Ladwig          Kaufling; Ladwig
   70 07-JUN-02 Baer            Baer
   90 13-JAN-01 De Haan         De Haan; King
   90 17-JUN-03 King            De Haan; King
  100 16-AUG-02 Faviet          Faviet; Greenberg
  100 17-AUG-02 Greenberg       Faviet; Greenberg
  110 07-JUN-02 Gietz           Gietz; Higgins
  110 07-JUN-02 Higgins         Gietz; Higgins

Any suggestions?

2 answers:

Answer 0 (score: 7)

You can use ave and paste:

within(mydf, {
  Emp_list <- ave(Name, Dept, FUN = function(x) paste(x, collapse = "; "))
})
#   Dept      Date      Name          Emp_list
# 1    30 07-DEC-02  Raphaely    Raphaely; Khoo
# 2    30 18-MAY-03      Khoo    Raphaely; Khoo
# 3    40 07-JUN-02    Mavris            Mavris
# 4    50 01-MAY-03  Kaufling  Kaufling; Ladwig
# 5    50 14-JUL-03    Ladwig  Kaufling; Ladwig
# 6    70 07-JUN-02      Baer              Baer
# 7    90 13-JAN-01   De Haan     De Haan; King
# 8    90 17-JUN-03      King     De Haan; King
# 9   100 16-AUG-02    Faviet Faviet; Greenberg
# 10  100 17-AUG-02 Greenberg Faviet; Greenberg
# 11  110 07-JUN-02     Gietz    Gietz; Higgins
# 12  110 07-JUN-02   Higgins    Gietz; Higgins
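
The question does not show how mydf was built, so here is a minimal self-contained sketch of the same ave/paste idea on a few of the rows above. It assumes Name is stored as character rather than factor (hence stringsAsFactors = FALSE); with a factor column, ave would try to write the collapsed strings back into the factor and would most likely produce NA values with warnings.

```r
# Hypothetical reconstruction of part of the question's data
mydf <- data.frame(
  Dept = c(30, 30, 40, 90, 90),
  Date = c("07-DEC-02", "18-MAY-03", "07-JUN-02", "13-JAN-01", "17-JUN-03"),
  Name = c("Raphaely", "Khoo", "Mavris", "De Haan", "King"),
  stringsAsFactors = FALSE  # keep Name as character so ave() can return strings
)

# For each Dept, collapse all names into one string; ave() repeats that
# string on every row belonging to the department
mydf$Emp_list <- ave(mydf$Name, mydf$Dept,
                     FUN = function(x) paste(x, collapse = "; "))

print(mydf)
```

Because ave returns a vector of the same length and order as its input, no merge step is needed and the original row order is kept.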

Answer 1 (score: 1)

Or with plyr:

gr <- read.csv("gr.csv")
library(plyr)
# Collapse the names per department, then merge the result back onto every row
merge(gr,
      ddply(gr, .(Dept), summarise, Emp_List = paste0(Name, collapse = "; ")),
      by = "Dept")

   Dept      Date      Name          Emp_List
1    30 07-DEC-02  Raphaely    Raphaely; Khoo
2    30 18-MAY-03      Khoo    Raphaely; Khoo
3    40 07-JUN-02    Mavris            Mavris
4    50 01-MAY-03  Kaufling  Kaufling; Ladwig
5    50 14-JUL-03    Ladwig  Kaufling; Ladwig
6    70 07-JUN-02      Baer              Baer
7    90 13-JAN-01   De Haan     De Haan; King
8    90 17-JUN-03      King     De Haan; King
9   100 16-AUG-02    Faviet Faviet; Greenberg
10  100 17-AUG-02 Greenberg Faviet; Greenberg
11  110 07-JUN-02     Gietz    Gietz; Higgins
12  110 07-JUN-02   Higgins    Gietz; Higgins
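
If plyr is not available, the same summarise-then-merge pattern can be sketched in base R with aggregate. This is only an illustration under assumptions: the small gr data frame below is a hypothetical stand-in for the contents of gr.csv, and the intermediate name emp is made up for the example.

```r
# Hypothetical stand-in for read.csv("gr.csv")
gr <- data.frame(
  Dept = c(30, 30, 40, 50, 50),
  Name = c("Raphaely", "Khoo", "Mavris", "Kaufling", "Ladwig"),
  stringsAsFactors = FALSE
)

# One row per department holding the collapsed names
emp <- aggregate(Name ~ Dept, data = gr,
                 FUN = function(x) paste(x, collapse = "; "))
names(emp)[names(emp) == "Name"] <- "Emp_List"

# Join the per-department string back onto every employee row
res <- merge(gr, emp, by = "Dept")
print(res)
```

Note that merge sorts the result by the key column, so unlike the ave approach the output is not guaranteed to keep the input row order.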