Suppose I have a data frame like the following:
CURRENT_USE freq
1 31
79 2 Unit Converted Dwelling 31
83 2 Unit Dwelling 31
94 2 Units 31
118 3 Unit Dwelling 31
231 Apartment 31
236 Apartment 31
258 Apartment Bldg 31
264 Apartment Building 32
327 Apartment Unit 32
354 Appartment Building 32
363 Apt 32
366 Apt Bldg 33
369 Apt Building 34
377 Apt. 34
378 Apt. Bldg 35
381 Apt. Building 35
392 Arena 35
521 Bank 35
534 Banquet Hall 35
590 Bowling Alley 36
666 Bungalow 36
705 Car Dealership 36
795 Church 36
827 Club 37
852 College 37
879 Commercial 37
964 Commercial/Residential 37
1001 Community Centre 38
1013 Community Hall 38
1040 Condo 38
1075 Condominium 38
1120 Converted Dwelling 38
1138 Converted House 38
1150 Converted House - 2 Units 39
1153 Converted House - 3 Units 39
1171 Converted House (2 Units) 39
1181 Converted House 2 Units 39
1202 Converted House, 2 Units 40
1204 Converted House, 3 Units 40
How can I merge rows so that my data frame looks like this:
CURRENT_USE freq
1 31
79 2 Unit Converted Dwelling 93
118 3 Unit Dwelling 31
231 Apartment 392
392 Arena 35
521 Bank 35
534 Banquet Hall 35
590 Bowling Alley 36
666 Bungalow 36
705 Car Dealership 36
795 Church 36
827 Club 37
852 College 37
879 Commercial 74
1001 Community Centre 76
1040 Condo 76
1120 Converted Dwelling 312
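Basically, I want to merge rows that have similar names. I assume I could rename each row to a common name and then merge them, but I was wondering whether there is code that makes this easier.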
Answer 0 (score: 0)
You can use the aggregate function to produce the final summary:
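# Toy data: two labels, each appearing twice
X <- data.frame(CURRENT_USE = c("A", "B", "A", "B"), Freq = c(10, 12, 1, 3), stringsAsFactors = FALSE)
# Sum Freq within each distinct CURRENT_USE value
X_Out <- aggregate(Freq ~ CURRENT_USE, data = X, FUN = sum)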
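Standardizing the CURRENT_USE column takes more grunt work. The grep family of functions is useful for this kind of task. For example, to change any value in CURRENT_USE that contains the text "Apt" (such as "Apt Bldg") to "Apartment":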
X$CURRENT_USE[grepl("Apt", X$CURRENT_USE)] <- "Apartment"
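The two steps can be combined into one pass: standardize the labels first, then aggregate. The following is only a sketch on a small sample of rows copied from the question. The `patterns` lookup and the regular expressions in it are my own assumptions about which names should be grouped, so adjust them to your data. Note that a plain "Apt" pattern would miss spellings such as "Apartment Bldg" and "Appartment Building", which is why the pattern below lists the variants explicitly.

```r
# Small sample of the question's data (values copied from the rows shown there)
df <- data.frame(
  CURRENT_USE = c("Apartment", "Apartment Bldg", "Appartment Building",
                  "Apt", "Apt. Bldg", "Arena", "Condo", "Condominium"),
  freq = c(31, 31, 32, 32, 35, 35, 38, 38),
  stringsAsFactors = FALSE
)

# Hypothetical lookup (an assumption, not part of the original answer):
# regular expression -> standardized label
patterns <- c(
  "^(Apartment|Appartment|Apt)" = "Apartment",
  "^Condo"                      = "Condo"
)

# Replace every value that matches a pattern with its standardized label
for (p in names(patterns)) {
  df$CURRENT_USE[grepl(p, df$CURRENT_USE)] <- patterns[[p]]
}

# Sum freq within each standardized label
result <- aggregate(freq ~ CURRENT_USE, data = df, FUN = sum)
print(result)
```

On this sample the five apartment variants collapse into a single "Apartment" row, and "Condo" and "Condominium" combine into one "Condo" row with freq 76, which matches the desired output in the question. Keeping the patterns in one named vector means new groupings can be added without touching the loop.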