In R

Time: 2018-10-04 07:23:21

Tags: r if-statement grep str-replace grepl

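I recently started learning R and am trying to explore it further by automating some of my work. Below is sample data. I am trying to create a new column by finding and replacing specific text in the labels (the designation names).

I am now dealing with a large amount of new data, so I would like to automate this in R instead of using Excel formulas.

Data set: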

strings<-c("Zonal Manager","Department Manager","Network Manager","Head of Sales","Account Manager","Alliance Manager","Additional Manager","Senior Vice President","General manager","Senior Analyst", "Solution Architect","AGM")

The R code I used:

t<-data.frame(strings,stringsAsFactors = FALSE)
colnames(t)[1]<-"Designations"
y<-sub(".*Manager*","Manager",strings,ignore.case = TRUE)

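Challenge:
With this, every entry containing "Manager" is changed to Manager, but I also need to replace the other names with their main theme.

I tried ifelse statements, grep, grepl, str, sub, etc., but I did not get what I wanted.

Because the main theme appears in different positions, I cannot split on the first/second/last word (as a delimiter). For example: Chief Information Officer, Commercial Finance Manager, or AGM.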

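Excel work:
I have already coded 300 main themes as ...

Manager (for all General Manager, Assistant Manager, Sales Manager, etc.)
Architect (Solution Arch, Sr. Arch, etc.)
Director (Senior Director, Director, Assistant Director, etc.)
Senior Analyst
Analyst
Head (for Head of Sales)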

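What I am looking for: I need to create a new column in which the text is replaced with the relevant main theme, as I did in Excel, but using R.

It would also be fine if I could match the designations against the main themes I have already coded in Excel, using R (like VLOOKUP in Excel).

Expected result: (image not included in the source)

Thanks in advance for your help!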

Yes, that is exactly what I am working on. Thanks!! But when I upload a new data set (an Excel file) and use

df %>% 
   mutate(theme=gsub(".*(Manager|Lead|Director|Head|Administrator|Executive|Executive|VP|President|Consultant|CFO|CTO|CEO|CMO|CDO|CIO|COO|Cheif Executive Officer|Chief Technological Officer|Chief Digital Officer|Chief Financial Officer|Chief Marketing Officer|Chief Digital Officer|Chief Information Officer,Chief Operations Officer)).*","\\1",Designations,ignore.case = TRUE))

it did not work. Is there something else I need to correct?
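A likely cause, judging only from the pattern as posted: it has one closing parenthesis too many after the last alternative, the last two titles are separated by a comma instead of `|`, and "Cheif" is misspelled. The sketch below builds the same kind of pattern from a vector of keywords so these slips are harder to make. The data frame `df` here is a small hypothetical stand-in for the Excel data, and base R is used instead of dplyr to keep it self-contained.

```r
# Keywords taken from the pattern in the question, with duplicates removed
# and "Cheif" corrected to "Chief".
keywords <- c(
  "Manager", "Lead", "Director", "Head", "Administrator", "Executive",
  "VP", "President", "Consultant",
  "CFO", "CTO", "CEO", "CMO", "CDO", "CIO", "COO",
  "Chief Executive Officer", "Chief Technological Officer",
  "Chief Digital Officer", "Chief Financial Officer",
  "Chief Marketing Officer", "Chief Information Officer",
  "Chief Operations Officer"
)

# One capture group, alternatives joined with "|", balanced parentheses.
pattern <- paste0(".*(", paste(keywords, collapse = "|"), ").*")

# Hypothetical stand-in for the data read from Excel.
df <- data.frame(
  Designations = c("Zonal Manager", "Senior Vice President",
                   "Head of Sales", "General manager"),
  stringsAsFactors = FALSE
)

# Keep only the matched keyword; strings without a keyword stay unchanged.
df$theme <- gsub(pattern, "\\1", df$Designations, ignore.case = TRUE)
print(df)
```

Note that with `ignore.case = TRUE` a short keyword can also match inside a longer word ("Director" contains the letters "cto"), so it may be worth wrapping the group in word boundaries (`\\b`) if such titles occur in the data.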

2 Answers:

Answer 0: (score: 2)

Data:

strings<-c("Zonal Manager","Department Manager","Network Manager","Head of Sales","Account Manager",
           "Alliance Manager","Additional Manager","Senior Vice President","General manager","Senior Analyst", "Solution Architect","AGM")

You need to prepare a good lookup table (complete it and make it perfect):

lu_table <- data.frame(new = c("Manager", "Architect","Director"), old = c("Manager|GM","Architect|Arch","Director"), stringsAsFactors = F)

Then you can let mapply do the work:

mapply(function(new,old) {ans <- strings; ans[grepl(old,ans)]<-new; strings <<- ans; return(NULL)}, new = lu_table$new, old = lu_table$old)

Now look at strings:

> strings
 [1] "Manager"               "Manager"               "Manager"               "Head of Sales"         "Manager"               "Manager"              
 [7] "Manager"               "Senior Vice President" "General manager"       "Senior Analyst"        "Architect"             "Manager" 

Please note:

This solution uses <<-, so it may not be the best solution, but it works in this case.
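If modifying a global variable with `<<-` is a concern, the same lookup can be applied in a plain loop inside a function. This is only a sketch of one alternative, not the answer's method; the helper name `replace_themes` is made up for illustration. It keeps the case-sensitive matching of the code above, which is why "General manager" is still left unchanged.

```r
strings <- c("Zonal Manager", "Department Manager", "Network Manager",
             "Head of Sales", "Account Manager", "Alliance Manager",
             "Additional Manager", "Senior Vice President", "General manager",
             "Senior Analyst", "Solution Architect", "AGM")

lu_table <- data.frame(
  new = c("Manager", "Architect", "Director"),
  old = c("Manager|GM", "Architect|Arch", "Director"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: apply each lookup row in turn and return a new vector,
# leaving the input untouched.
replace_themes <- function(x, lookup) {
  for (i in seq_len(nrow(lookup))) {
    x[grepl(lookup$old[i], x)] <- lookup$new[i]
  }
  x
}

themes <- replace_themes(strings, lu_table)
print(themes)
```

Passing `ignore.case = TRUE` to `grepl` would also catch "General manager", at the cost of looser matching for short patterns such as "GM".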

Answer 1: (score: 1)

Do you mean something like this?

library(dplyr)
strings <-
  c(
    "Zonal Manager",
    "Department Manager",
    "Network Manager",
    "Head of Sales",
    "Account Manager",
    "Alliance Manager",
    "Additional Manager",
    "Senior Vice President",
    "General manager",
    "Senior Analyst",
    "Solution Architect",
    "AGM"
  )

df = data.frame(Designations = strings)


df %>%
  mutate(
    theme = gsub(
      ".*(manager|head|analyst|architect|agm|director|president).*",
      "\\1",
      Designations,
      ignore.case = TRUE
    )
  )
#>             Designations     theme
#> 1          Zonal Manager   Manager
#> 2     Department Manager   Manager
#> 3        Network Manager   Manager
#> 4          Head of Sales      Head
#> 5        Account Manager   Manager
#> 6       Alliance Manager   Manager
#> 7     Additional Manager   Manager
#> 8  Senior Vice President President
#> 9        General manager   manager
#> 10        Senior Analyst   Analyst
#> 11    Solution Architect Architect
#> 12                   AGM       AGM

Created on 2018-10-04 by the reprex package (v0.2.1)
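The output above keeps whatever spelling the input had (row 9 gives "manager", row 12 gives "AGM"). If each keyword should map to one fixed theme label, closer to the VLOOKUP idea in the question, the extracted keyword can be looked up in a named vector. The sketch below uses base R only; the `theme_lookup` table, including the choice to map "agm" to "Manager", is an assumption for illustration and would need to be replaced by the real coding.

```r
strings <- c("Zonal Manager", "Department Manager", "Network Manager",
             "Head of Sales", "Account Manager", "Alliance Manager",
             "Additional Manager", "Senior Vice President", "General manager",
             "Senior Analyst", "Solution Architect", "AGM")

# Hypothetical keyword -> theme table; names are lower-case keywords.
theme_lookup <- c(
  manager   = "Manager",
  head      = "Head",
  analyst   = "Analyst",
  architect = "Architect",
  agm       = "Manager",
  director  = "Director",
  president = "President"
)

# Build the same kind of pattern as above from the table's keywords.
pattern <- paste0(".*(", paste(names(theme_lookup), collapse = "|"), ").*")

# Extract the keyword, lower-case it, then look it up by name.
# Strings that contain no keyword end up as NA.
keyword <- tolower(sub(pattern, "\\1", strings, ignore.case = TRUE))
theme <- unname(theme_lookup[keyword])

df <- data.frame(Designations = strings, theme = theme,
                 stringsAsFactors = FALSE)
print(df)
```

Keeping the mapping in one table means a new title only needs a new row, and the table could equally be read from the existing Excel sheet.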