In R, based on certain conditions

Time: 2017-07-01 06:09:24

Tags: r split multiple-columns

I am trying to split the group column of my data frame hire into 2 columns, based on the percentage.

| group | percentage |
| 0 hired | 60% |
| 0 hired next_month | 65% |
| 0 or 1 hired | 68% |
| 0 or 1 hired next_month | 70% |
| 1 hired | 79% |
| 1 or 2 employee | 80% |
| 2 retired | 85% |
| 2 or 3 fired | 92% |
| 3 not retired | 96% |

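  • I want 2 columns, group and decision. The percentage column and the decision text should stay unchanged. The group should be 0 if the percentage is between 60% and 69% (see row 3), 1 if it is between 70% and 79% (see row 4), 2 if it is between 80% and 89%, and 3 if it is between 90% and 99%. The output should be: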

| group | decision | percentage |
| 0 | hired | 60% |
| 0 | hired next_month | 65% |
| 0 | hired | 68% |
| 1 | hired next_month | 70% |
| 1 | hired | 79% |
| 2 | employee | 80% |
| 2 | retired | 85% |
| 3 | fired | 92% |
| 3 | not retired | 96% |

My code: foo <- str_split_fixed(hire$group, "or", 2)

Can anyone help? Thanks in advance.

1 answer:

Answer 0: (score: 0)

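We can use the tidyverse. extract (from tidyr) splits 'group' into 'group' and 'decision'; we then replace the 'group' value with 1 wherever 'percentage' (converted to a number with parse_number from readr) is greater than or equal to 90.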

library(tidyverse)
df1 %>%
    extract(group, into = c('group', 'decision'), "^(\\d+).*(hired.*)") %>% 
    mutate(group = replace(group, parse_number(percentage)>=90, 1))
#    group         decision percentage
#1     0            hired        80%
#2     0 hired next_month        85%
#3     0            hired        88%
#4     1 hired next_month        90%
#5     1            hired        99%
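
The pattern passed to extract can be checked on its own. The following is a small base R sketch, not part of the original answer, that shows what the two capture groups pick up. The labels vector is made up for illustration; "2 retired" is included because the pattern requires the literal text "hired" and so does not match it.

```r
# Same regular expression as in the extract() call above.
pattern <- "^(\\d+).*(hired.*)"

# Illustrative labels (assumed, not the answer's df1).
labels <- c("0 hired", "0 or 1 hired next_month", "2 retired")

# regexec() records the full match plus each capture group;
# regmatches() turns those positions into strings.
m <- regmatches(labels, regexec(pattern, labels))

m[[1]]  # full match, then group 1 ("0") and group 2 ("hired")
m[[2]]  # group 1 is the leading "0", group 2 is "hired next_month"
m[[3]]  # character(0): no "hired" in the label, so nothing matches
```

With tidyr's extract, a row whose label does not match the pattern gets NA in the new columns, so labels such as "2 retired" would need a broader pattern.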

Data

df1 <- structure(list(group = c("0 hired", "0 hired next_month", "0 or 1 hired", 
"0 or 1 hired next_month", "1 hired"), percentage = c("80%", 
"85%", "88%", "90%", "99%")), .Names = c("group", "percentage"
), class = "data.frame", row.names = c(NA, -5L))
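
The sample data above only exercises groups 0 and 1. For the four bands described in the question (60-69 gives 0, 70-79 gives 1, 80-89 gives 2, 90-99 gives 3), a minimal base R sketch could look like the following. It is an alternative to the extract approach, not the answerer's method; the data frame name hire and the English labels are assumptions reconstructed from the question, and the arithmetic assumes every percentage lies between 60 and 99.

```r
# Assumed reconstruction of the question's data frame.
hire <- data.frame(
  group = c("0 hired", "0 hired next_month", "0 or 1 hired",
            "0 or 1 hired next_month", "1 hired", "1 or 2 employee",
            "2 retired", "2 or 3 fired", "3 not retired"),
  percentage = c("60%", "65%", "68%", "70%", "79%",
                 "80%", "85%", "92%", "96%"),
  stringsAsFactors = FALSE
)

# Drop the leading "N " or "N or M " prefix; what remains is the decision.
decision <- sub("^\\d+( or \\d+)? ", "", hire$group)

# Turn "68%" into 68, then map each band of ten to its group number:
# integer division by 10 gives 6..9, and subtracting 6 gives 0..3.
pct <- as.numeric(sub("%", "", hire$percentage, fixed = TRUE))
group <- pct %/% 10 - 6

result <- data.frame(group = group, decision = decision,
                     percentage = hire$percentage,
                     stringsAsFactors = FALSE)
print(result)
```

Deriving the group from the percentage alone sidesteps the ambiguous "0 or 1" prefixes: the prefix is only stripped, never parsed. If percentages outside 60-99 can occur, cut() with explicit breaks would be a safer choice than the arithmetic shortcut.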