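I have a data frame describing a large number of people. I want to assign each person to a group based on several variables. For example, suppose the variable "state" has 5 states, the variable "age group" has 4 groups, and the variable "income" has 5 groups. That gives 5 x 4 x 5 = 100 groups, which I want to label with the numbers 1 to 100. Until now I have always done this with a combination of ifelse statements, but with 100 possible outcomes I am wondering whether there is a faster way than specifying every combination by hand.
Here is a MWE with the desired result: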
mydata <- as.data.frame(cbind(c("FR","UK","UK","IT","DE","ES","FR","DE","IT","UK"),
c("20","80","20","40","60","20","60","80","40","60"),c(1,4,2,3,1,5,5,3,4,2)))
colnames(mydata) <- c("Country","Age","Income")
group_grid <- transform(expand.grid(state = c("IT","FR","UK","ES","DE"),
age = c("20","40","60","80"), income = 1:5), val = 1:100)
desired_result <- as.data.frame(cbind(c("FR","UK","UK","IT","DE","ES","FR","DE","IT","UK"),
c("20","80","20","40","60","20","60","80","40","60"),
c(1,4,2,3,1,5,5,3,4,2),
c(2,78,23,46,15,84,92,60,66,33)))
colnames(desired_result) <- c("Country","Age","Income","Group_code")
Answer 0 (score: 1)
mydata$Group_code <- with(mydata, as.integer(interaction(Country, Age, Income)))
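That should do it, as long as the codes only need to be unique per combination: the numbering follows the default (alphabetical) factor level order, not the order used in group_grid.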
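If the codes have to match desired_result exactly, one possible variation is to fix the factor levels explicitly before calling interaction. The sketch below assumes the level order from group_grid in the question (IT, FR, UK, ES, DE; then age; then income) and rebuilds mydata with data.frame() instead of cbind() so the block is self-contained. It relies on interaction() letting the first factor vary fastest by default, which is the same ordering expand.grid uses.

```r
# Rebuild the example data (assumption: data.frame() instead of the cbind() construction)
mydata <- data.frame(
  Country = c("FR","UK","UK","IT","DE","ES","FR","DE","IT","UK"),
  Age     = c("20","80","20","40","60","20","60","80","40","60"),
  Income  = c(1,4,2,3,1,5,5,3,4,2)
)

# Set the levels in the same order as group_grid so the integer codes line up
mydata$Group_code <- with(mydata, as.integer(interaction(
  factor(Country, levels = c("IT","FR","UK","ES","DE")),
  factor(Age,     levels = c("20","40","60","80")),
  factor(Income,  levels = 1:5)
)))

mydata$Group_code
# 2 78 23 46 15 84 92 60 66 33
```

Unused combinations keep their codes because interaction() does not drop empty levels unless drop = TRUE is set.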
Answer 1 (score: 0)
Here is an option using left_join from dplyr.
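A minimal sketch of what a left_join lookup against group_grid could look like, assuming dplyr is installed. This is an illustration rather than the answer's original code: the explicit by mapping, the as.integer() conversion of Income (the cbind() construction in the question turns it into character), and stringsAsFactors = FALSE are all assumptions made here so that the key columns have matching types.

```r
library(dplyr)

# Example data from the question, rebuilt with data.frame() (assumption)
mydata <- data.frame(
  Country = c("FR","UK","UK","IT","DE","ES","FR","DE","IT","UK"),
  Age     = c("20","80","20","40","60","20","60","80","40","60"),
  Income  = c(1,4,2,3,1,5,5,3,4,2),
  stringsAsFactors = FALSE
)

# Lookup table: one row per combination, numbered 1 to 100
group_grid <- transform(
  expand.grid(state  = c("IT","FR","UK","ES","DE"),
              age    = c("20","40","60","80"),
              income = 1:5,
              stringsAsFactors = FALSE),
  val = 1:100
)

# Join each person to the matching combination and keep its number
result <- mydata %>%
  mutate(Income = as.integer(Income)) %>%
  left_join(group_grid,
            by = c("Country" = "state", "Age" = "age", "Income" = "income")) %>%
  rename(Group_code = val)
```

Compared with interaction, the join keeps the numbering in a visible table, so the codes can be inspected or reordered without touching factor levels.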