How do I merge rows with similar factor values in an R data frame?

Time: 2015-10-28 16:08:51

Tags: r rstudio

Suppose I have a data frame that looks like this:

                                        CURRENT_USE  freq
1                                                        31
79                          2 Unit Converted Dwelling    31
83                                    2 Unit Dwelling    31
94                                            2 Units    31
118                                   3 Unit Dwelling    31
231                                         Apartment    31
236                                        Apartment     31
258                                    Apartment Bldg    31
264                                Apartment Building    32
327                                    Apartment Unit    32
354                               Appartment Building    32
363                                               Apt    32
366                                          Apt Bldg    33
369                                      Apt Building    34
377                                              Apt.    34
378                                         Apt. Bldg    35
381                                     Apt. Building    35
392                                             Arena    35
521                                              Bank    35
534                                      Banquet Hall    35
590                                     Bowling Alley    36
666                                          Bungalow    36
705                                    Car Dealership    36
795                                            Church    36
827                                              Club    37
852                                           College    37
879                                        Commercial    37
964                            Commercial/Residential    37
1001                                 Community Centre    38
1013                                   Community Hall    38
1040                                            Condo    38
1075                                      Condominium    38
1120                               Converted Dwelling    38
1138                                  Converted House    38
1150                        Converted House - 2 Units    39
1153                        Converted House - 3 Units    39
1171                        Converted House (2 Units)    39
1181                          Converted House 2 Units    39
1202                         Converted House, 2 Units    40
1204                         Converted House, 3 Units    40

How can I merge rows so that my data frame looks like this:

                                         CURRENT_USE  freq
1                                                        31
79                          2 Unit Converted Dwelling    93
118                                   3 Unit Dwelling    31
231                                         Apartment   392
392                                             Arena    35
521                                              Bank    35
534                                      Banquet Hall    35
590                                     Bowling Alley    36
666                                          Bungalow    36
705                                    Car Dealership    36
795                                            Church    36
827                                              Club    37
852                                           College    37
879                                        Commercial    74
1001                                 Community Centre    76
1040                                            Condo    76
1120                               Converted Dwelling   312

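Basically, I want to merge rows that have similar names. I assume I could rename each row to a common name and then merge them, but I was wondering whether there is code that would make this easier.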

1 Answer:

Answer 0: (score: 0)

You can use the aggregate function to produce the final summary:

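# Toy data in which each label appears more than once
X <- data.frame(CURRENT_USE = c("A", "B", "A", "B"), Freq = c(10, 12, 1, 3), stringsAsFactors = FALSE)
# Sum Freq within each distinct CURRENT_USE value
X_Out <- aggregate(Freq ~ CURRENT_USE, data = X, FUN = sum)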
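Note that aggregate groups on exact string equality, so labels that differ only by stray whitespace end up in separate groups. The following is a minimal sketch, assuming that rows 231 and 236 in the question ("Apartment" and "Apartment ") differ only by a trailing space; the toy data frame is made up for illustration.

```r
# Hypothetical toy data: "Apartment" and "Apartment " differ only by a trailing space
X <- data.frame(CURRENT_USE = c("Apartment", "Apartment ", "Arena"),
                Freq = c(31, 31, 35),
                stringsAsFactors = FALSE)

# Without cleaning, the two spellings stay in separate groups
raw_out <- aggregate(Freq ~ CURRENT_USE, data = X, FUN = sum)

# trimws() (base R) strips leading and trailing whitespace before grouping
X$CURRENT_USE <- trimws(X$CURRENT_USE)
clean_out <- aggregate(Freq ~ CURRENT_USE, data = X, FUN = sum)
print(clean_out)
```

After trimming, the two "Apartment" rows collapse into one group while "Arena" is left untouched.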

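Standardizing the CURRENT_USE column takes more drudgery. The grep family of functions is useful for this kind of task. For example, to change every value of CURRENT_USE that contains the text "Apt" (e.g. "Apt. Bldg") to "Apartment":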

# Run this before aggregate() so the relabelled rows are summed together
X$CURRENT_USE[grepl("Apt", X$CURRENT_USE)] <- "Apartment"
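The two steps can be combined: relabel first, then aggregate. The sketch below is one possible way to do it, not a definitive recipe. The sample rows are copied from the question's table, while the `patterns` lookup and its regular expressions are assumptions you would need to adapt to your own data. Note that the plain pattern "Apt" does not match "Apartment" or the misspelling "Appartment", so a slightly broader expression is used here.

```r
# Hypothetical sample built from a few rows of the question's table
X <- data.frame(
  CURRENT_USE = c("Apartment", "Apt Bldg", "Apt. Building",
                  "Appartment Building", "Condo", "Condominium", "Arena"),
  Freq = c(31, 33, 35, 32, 38, 38, 35),
  stringsAsFactors = FALSE
)

# Assumed mapping: regular expression -> standardized label (adjust to your data)
patterns <- c("^(Apt|App?artment)" = "Apartment",
              "^Condo"             = "Condo")

# Overwrite every value that matches a pattern with its standardized label
for (p in names(patterns)) {
  X$CURRENT_USE[grepl(p, X$CURRENT_USE)] <- patterns[[p]]
}

# Sum Freq within each standardized label
X_Out <- aggregate(Freq ~ CURRENT_USE, data = X, FUN = sum)
print(X_Out)
```

Keeping the patterns in one named vector makes it easy to review which spellings are being folded together. It is worth checking the relabelled column (for example with `table(X$CURRENT_USE)`) before trusting the sums, because an overly broad pattern will silently merge unrelated categories.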