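Using dplyr in R, I am trying to add a new column based on the values of another column. For example, I have a data frame with several thousand rows of state codes (like table1). I now want to add a new column named Region that assigns each state code to its region (like table2). How can this be done in dplyr?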
table1 <- data.frame(State = c('NY','IL','CA','PA','FL','MI','AZ'))
table2 <- data.frame(State = c('NY','IL','CA','PA','FL','MI','AZ'),
Region = c('Northeast','Midwest','West','Northeast','Southeast','Midwest','West'))
Answer 0 (score: 1)
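This is a join problem. Just use left_join from the dplyr package. In the example below, I reorder the states in table1 to show that each state is mapped to the correct region regardless of row order: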
library(dplyr)
table1 <- data.frame(State = c('PA','FL','MI','AZ','NY','IL','CA'))
table2 <- data.frame(State = c('NY','IL','CA','PA','FL','MI','AZ'),
Region = c('Northeast','Midwest','West','Northeast','Southeast','Midwest','West'))
left_join(table1, table2, by = "State")
State Region
1 PA Northeast
2 FL Southeast
3 MI Midwest
4 AZ West
5 NY Northeast
6 IL Midwest
7 CA West
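One case the example above does not cover is a state code that has no entry in the lookup table. The following is a minimal sketch of that situation, not part of the original answer: the data frames states and lookup, the code 'TX', and the label "Unknown" are all made up for illustration. It relies on left_join keeping every row of the left table and filling unmatched rows with NA, and uses dplyr's coalesce to replace those NA values.

```r
library(dplyr)

# Hypothetical data: 'TX' is deliberately absent from the lookup table.
# stringsAsFactors = FALSE keeps the columns as character on older R versions.
states <- data.frame(State = c("PA", "TX", "NY"), stringsAsFactors = FALSE)
lookup <- data.frame(
  State  = c("NY", "PA"),
  Region = c("Northeast", "Northeast"),
  stringsAsFactors = FALSE
)

# left_join keeps every row of `states` in its original order;
# rows without a match in `lookup` get NA in the Region column.
result <- states %>%
  left_join(lookup, by = "State") %>%
  mutate(Region = coalesce(Region, "Unknown"))  # optional: label unmatched codes

print(result)
```

If unmatched codes should be dropped instead of labeled, inner_join could be used in place of left_join; which one is appropriate depends on whether every row of the original data must be preserved.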