I am trying to split a data frame column into 2 columns based on the percentage.
group percentage
0 hired 60%
0 hired next_month 65%
0 or 1 hired 68%
0 or 1 hired next_month 70%
1 hired 79%
1 or 2 employee 80%
2 retired 85%
2 or 3 fired 92%
3 not-retired 96%
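I want two columns, group and decision. The percentage and decision values should stay unchanged. If the percentage is between 60% and 69%, group should be 0 (row 3); if it is between 70% and 79%, group should be 1 (row 4); if it is between 80% and 89%, group should be 2; and if it is between 90% and 99%, group should be 3. The output should be: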
group decision percentage
0 hired 60%
0 hired next_month 65%
0 hired 68%
1 hired next_month 70%
1 hired 79%
2 employee 80%
2 retired 85%
3 fired 92%
3 not-retired 96%
My code:
library(tidyr)   # extract()
library(dplyr)   # %>%, mutate()
library(readr)   # parse_number()

df1 <- structure(list(
  group = c("0 hired", "0 hired next_month ", "0 or 1 hired",
            "0 or 1 hired next_month", "1 hired", "1 or 2 employee",
            "2 retired", "2 or 3 fired", "3 not-retired"),
  percentage = c("60%", "65%", "68%", "70%", "79%", "80%", "89%", "90%", "96%")),
  .Names = c("group", "percentage"), class = "data.frame", row.names = c(NA, -9L))

# Split `group` into the leading digit and the decision text,
# then overwrite `group` where the percentage is 100 or more.
df2 <- df1 %>%
  extract(group, into = c('group', 'decision'),
          "^(\\d+).*(hired|hired next_month|employee|retired|fired|not-retired)") %>%
  mutate(group = replace(group, parse_number(percentage) >= 100, 3))
Can anyone help? Thanks in advance.
Answer 0: (score: 1)
You can do this in base R like this:
df2 = data.frame(percentage = df1$percentage)
df2$decision = sub(".*\\d\\s*", "", df1$group)
df2$group = as.numeric(cut(as.numeric(sub("%", "", df1$percentage)),
breaks = c(59, 69, 79,89,100))) - 1
df2 = df2[,3:1]
df2
group decision percentage
1 0 hired 60%
2 0 hired next_month 65%
3 0 hired 68%
4 1 hired next_month 70%
5 1 hired 79%
6 2 employee 80%
7 2 retired 89%
8 3 fired 90%
9 3 not-retired 96%
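
As a variation on the same idea, here is a minimal base R sketch that derives the group with findInterval() instead of cut(). It assumes every percentage is at least 60 and that the decision text is whatever follows the leading "0 " or "0 or 1 " prefix; the names pct, decision, and group are only illustrative.

```r
# Rebuild the sample data from the question.
df1 <- data.frame(
  group = c("0 hired", "0 hired next_month ", "0 or 1 hired",
            "0 or 1 hired next_month", "1 hired", "1 or 2 employee",
            "2 retired", "2 or 3 fired", "3 not-retired"),
  percentage = c("60%", "65%", "68%", "70%", "79%", "80%", "89%", "90%", "96%"),
  stringsAsFactors = FALSE
)

# Numeric value of the percentage, e.g. "68%" -> 68.
pct <- as.numeric(sub("%", "", df1$percentage, fixed = TRUE))

# Drop the leading "0 " or "0 or 1 " prefix; trimws() removes stray spaces.
decision <- trimws(sub("^\\d+(\\s+or\\s+\\d+)?\\s+", "", df1$group))

# findInterval() returns 1 for [60, 70), 2 for [70, 80), and so on,
# so subtracting 1 gives groups 0 to 3.
group <- findInterval(pct, c(60, 70, 80, 90)) - 1

df2 <- data.frame(group, decision, percentage = df1$percentage,
                  stringsAsFactors = FALSE)
print(df2)
```

Because the intervals are closed on the left, this also handles non-integer values such as 69.5, which cut() with breaks at 69 and 79 would place in the next group up. Values below 60 would come out as group -1, so they would need separate handling if they can occur.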