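I have been trying to use left_join to add the columns of a new dataset to my main dataset (called employee).
I have double-checked the vector names and tried cleaning them, but nothing seems to work. Here is my code. Any help is much appreciated.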
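# Packages assumed from the functions used below
library(tidyverse)  # read_csv(), %>%, select(), left_join(), str_detect()
library(janitor)    # clean_names()

job_codes <- read_csv("Quest_UMMS_JobCodes.csv")
job_codes <- job_codes %>%
  clean_names() %>%
  select(job_code, pos_desc = pos_des_desc)
job_codes$is_nurse <- str_detect(tolower(job_codes$pos_desc), "nurse")
employee <- employee %>%
  left_join(job_codes, by = "job_code")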
The error I keep getting:
Error in eval(substitute(expr), envir, enclos): 'job_code' column not found in rhs, cannot join
Here is the output of names(job_codes):
> names(job_codes)
[1] "job_code" "pos_desc" "is_nurse"
And the output of names(employee):
> names(employee)
[1] "REC_NUM" "ZIP" "STATE"
[4] "SEX" "EEO_CLASS" "BIRTH_YEAR"
[7] "EMP_STATUS" "PROCESS_LEVEL" "DEPARTMENT"
[10] "JOB_CODE" "UNION_CODE" "SUPERVISOR"
[13] "DATE_HIRED" "R_SHIFT" "SALARY_CLASS"
[16] "EXEMPT_EMP" "PAY_RATE" "ADJ_HIRE_DATE"
[19] "ANNIVERS_DATE" "TERM_DATE" "NBR_FTE"
[22] "PENSION_PLAN" "PAY_GRADE" "SCHEDULE"
[25] "OT_PLAN_CODE" "DECEASED" "POSITION"
[28] "WORK_SCHED" "SUPERVISOR_IND" "FTE_TOTAL"
[31] "PRO_RATE_TOTAL" "PRO_RATE_A_SAL" "NEW_HIRE_DATE"
[34] "COUNTY" "FST_DAY_WORKED" "date_hired"
[37] "date_hired_adj" "term_date" "employment_duration"
[40] "current" "age" "emp_duration_years"
[43] "DESCRIPTION.x" "PAY_STATUS.x" "DESCRIPTION.y"
[46] "PAY_STATUS.y"
Answer 0 (score: 10)
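Now that the OP has added the column names of both tables to the question, it is clear that the join columns are spelled differently (upper case vs. lower case): employee has JOB_CODE, while job_codes has job_code.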
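A quick way to confirm this kind of mismatch is to compare the two sets of column names directly. The sketch below uses small toy data frames as stand-ins for the OP's tables (the values are made up for illustration) and only base R:

```r
# Toy stand-ins for the two tables (hypothetical data, not the OP's)
employee  <- data.frame(REC_NUM = 1:3, JOB_CODE = c("A1", "B2", "C3"))
job_codes <- data.frame(job_code = c("A1", "B2"),
                        pos_desc = c("Staff Nurse", "Clerk"))

# Exact comparison: the tables share no column name, so by = "job_code" cannot work
shared_exact <- intersect(names(employee), names(job_codes))

# Case-insensitive comparison reveals the intended key column
shared_ignoring_case <- intersect(tolower(names(employee)),
                                  tolower(names(job_codes)))

print(shared_exact)
print(shared_ignoring_case)
```

An empty exact intersection together with a non-empty case-insensitive one points to a pure spelling difference in the key.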
If the column names differ, help("left_join") suggests:
To join by different variables on x and y, use a named vector. For example, by = c("a" = "b") will match x.a to y.b.
So, in this case, the call should read:
employee <- employee %>% left_join(job_codes, by = c("JOB_CODE" = "job_code"))
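For anyone who wants to try this without the original files, here is a minimal sketch on toy data (the rows are invented for illustration, and grepl() stands in for str_detect() to keep the dependencies to dplyr alone). It also shows an alternative that is not part of the fix above: renaming the key in the lookup table so that a plain by = "JOB_CODE" works.

```r
library(dplyr)

# Toy stand-ins for the OP's tables (hypothetical values)
employee  <- tibble(REC_NUM = 1:3, JOB_CODE = c("A1", "B2", "C3"))
job_codes <- tibble(job_code = c("A1", "B2"),
                    pos_desc = c("Staff Nurse", "Clerk"))
job_codes$is_nurse <- grepl("nurse", tolower(job_codes$pos_desc))

# Option 1: named vector, left-hand name = right-hand name
joined <- employee %>%
  left_join(job_codes, by = c("JOB_CODE" = "job_code"))

# Option 2 (alternative): rename the lookup key first, then join by one name
joined_renamed <- employee %>%
  left_join(rename(job_codes, JOB_CODE = job_code), by = "JOB_CODE")

print(joined)
```

In both cases the result keeps the key under the left table's name (JOB_CODE), and employees without a matching job code get NA in the added columns.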