R dplyr left_join error

Asked: 2016-11-10 21:03:13

Tags: r join left-join dplyr

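I have been trying to use left_join to add the columns of a new dataset to my main dataset (called employee).

I have double-checked the column names and the cleaning I did, and nothing seems to work. Here is my code. Any help is greatly appreciated.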

job_codes <- read_csv("Quest_UMMS_JobCodes.csv")
job_codes <- job_codes %>%
  clean_names() %>%
  select(job_code, pos_desc = pos_des_desc)

job_codes$is_nurse <- str_detect(tolower(job_codes$pos_desc), "nurse")

employee <- employee %>%
  left_join(job_codes, by = "job_code")

The error I keep getting: Error in eval(substitute(expr), envir, enclos) : 'job_code' column not found in rhs, cannot join

Here is the result of names(job_codes):
> names(job_codes)
[1] "job_code" "pos_desc" "is_nurse"

And the result of names(employee):
> names(employee)
 [1] "REC_NUM"             "ZIP"                 "STATE"              
 [4] "SEX"                 "EEO_CLASS"           "BIRTH_YEAR"         
 [7] "EMP_STATUS"          "PROCESS_LEVEL"       "DEPARTMENT"         
 [10] "JOB_CODE"            "UNION_CODE"          "SUPERVISOR"         
 [13] "DATE_HIRED"          "R_SHIFT"             "SALARY_CLASS"       
 [16] "EXEMPT_EMP"          "PAY_RATE"            "ADJ_HIRE_DATE"      
 [19] "ANNIVERS_DATE"       "TERM_DATE"           "NBR_FTE"            
 [22] "PENSION_PLAN"        "PAY_GRADE"           "SCHEDULE"           
 [25] "OT_PLAN_CODE"        "DECEASED"            "POSITION"           
 [28] "WORK_SCHED"          "SUPERVISOR_IND"      "FTE_TOTAL"          
 [31] "PRO_RATE_TOTAL"      "PRO_RATE_A_SAL"      "NEW_HIRE_DATE"      
 [34] "COUNTY"              "FST_DAY_WORKED"      "date_hired"         
 [37] "date_hired_adj"      "term_date"           "employment_duration"
 [40] "current"             "age"                 "emp_duration_years" 
 [43] "DESCRIPTION.x"       "PAY_STATUS.x"        "DESCRIPTION.y"      
 [46] "PAY_STATUS.y"      

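1 answer:

Answer 0: (score: 10)

Now that the OP has added the column names of both tables to the question, it is clear that the join columns are spelled differently (upper case vs. lower case): employee has "JOB_CODE", while job_codes has "job_code". Column names in R are case-sensitive, so by = "job_code" finds no such column in employee.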
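A mismatch like this can be spotted before joining by comparing the two sets of names. The following is only a rough sketch in base R: the two vectors are a hand-copied subset of the names shown in the question, and the variable names are made up for illustration.

```r
# Hypothetical stand-ins: a subset of the column names listed in the question
employee_names  <- c("REC_NUM", "JOB_CODE", "UNION_CODE")
job_codes_names <- c("job_code", "pos_desc", "is_nurse")

# Exact (case-sensitive) comparison: no shared names,
# which is why by = "job_code" cannot work
exact_matches <- intersect(employee_names, job_codes_names)

# Case-insensitive comparison: picks out names that differ only in case
case_insensitive <- employee_names[tolower(employee_names) %in% tolower(job_codes_names)]

print(exact_matches)
print(case_insensitive)
```

With the real data, names(employee) and names(job_codes) would take the place of the two hand-written vectors.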

If the column names differ, help("left_join") suggests:

  To join by different variables on x and y use a named vector. For example, by = c("a" = "b") will match x.a to y.b.

So, in this case it should read

employee <- employee %>% left_join(job_codes, by = c("JOB_CODE" = "job_code"))
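To see the named-vector form in action without the original files, here is a small sketch on toy data. The rows below are invented for illustration and are not the OP's data; only the column names JOB_CODE, job_code, pos_desc and is_nurse come from the question. The second variant, which renames the key first, is an alternative that is not part of the original answer.

```r
library(dplyr)

# Toy data, invented for illustration only
employee <- data.frame(
  REC_NUM  = 1:3,
  JOB_CODE = c("A1", "B2", "C3"),
  stringsAsFactors = FALSE
)
job_codes <- data.frame(
  job_code = c("A1", "B2"),
  pos_desc = c("Staff Nurse", "Clerk"),
  stringsAsFactors = FALSE
)
job_codes$is_nurse <- grepl("nurse", tolower(job_codes$pos_desc))

# Named vector: left-hand name is the column in x, right-hand name the column in y
joined <- employee %>%
  left_join(job_codes, by = c("JOB_CODE" = "job_code"))

# Alternative: rename the key so both tables share one name
joined_alt <- employee %>%
  rename(job_code = JOB_CODE) %>%
  left_join(job_codes, by = "job_code")

print(joined)
```

In the first variant the result keeps the key name from the left table (JOB_CODE), and employees with no matching job code get NA in the added columns.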