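How do I merge two data frames on multiple conditions?

Time: 2019-06-23 15:20:23

Tags: r join

I have two data frames: employee punch (clock-in/clock-out) data and employee name data: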

DF1

    punch_out punch_in date employee_number
 1  16:00:00  06:00:00 2018-01-01 00000001
 2  15:00:00  08:00:00 2018-08-01 00000001

DF2

employee_numb  job_title  start_date end_date
00000001        worker    2017-08-05 2018-07-01
00000001        manager   2018-07-01 3000-01-01

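I need to join them so that DF1 gets a new column, "Job Title", that correctly reflects the position the employee actually held on each date.

My struggle is with the date condition. In the example above, observation 1 should have the job title "worker", but observation 2 must have "manager".

If I do a traditional join, it duplicates records: every row of DF1 becomes two rows, and employee 00000001 on 2018-01-01 would be both a worker and a manager.

The result should look like this: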

    punch_out punch_in date employee_number Job Title
 1  16:00:00  06:00:00 2018-01-01 00000001  worker
 2  15:00:00  08:00:00 2018-08-01 00000001  manager

2 answers:

Answer 0: (score: 3)

The sqldf package is an option here; it lets us phrase the data frame join using SQL syntax:

sqldf
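The query itself is not shown above, so the following is only a minimal sketch of how such a join might be written, not the answerer's code. It assumes sqldf's default SQLite backend and dates stored as ISO-8601 strings (so plain text comparison orders them correctly), and it treats each job period as half-open (`start_date` inclusive, `end_date` exclusive), because the sample periods share the boundary 2018-07-01. The aliases `d1` and `d2` and the name `result` are illustrative.

```r
library(sqldf)

# Sample data from the question, with dates kept as ISO-8601 character strings
DF1 <- data.frame(
  punch_out = c("16:00:00", "15:00:00"),
  punch_in = c("06:00:00", "08:00:00"),
  date = c("2018-01-01", "2018-08-01"),
  employee_number = c("00000001", "00000001"),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  employee_numb = c("00000001", "00000001"),
  job_title = c("worker", "manager"),
  start_date = c("2017-08-05", "2018-07-01"),
  end_date = c("2018-07-01", "3000-01-01"),
  stringsAsFactors = FALSE
)

# Join on the employee number AND on the punch date falling inside the job period,
# so each punch matches at most one job title when periods do not overlap
result <- sqldf("
  SELECT d1.punch_out, d1.punch_in, d1.date, d1.employee_number, d2.job_title
  FROM DF1 AS d1
  LEFT JOIN DF2 AS d2
    ON d1.employee_number = d2.employee_numb
   AND d1.date >= d2.start_date
   AND d1.date <  d2.end_date
  ORDER BY d1.date
")
```

The LEFT JOIN keeps punches that fall outside every job period; their job_title is then NA.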

Answer 1: (score: 0)

This also works:

library(data.table)

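# Convert to data.table, coerce the date columns to Date, and key both tables
# by employee number and date so the join matches on those columns.
setkey(setDT(DF2)[, start_date := as.Date(start_date)], employee_numb, start_date)
setkey(setDT(DF1)[, date := as.Date(date)], employee_number, date)

# Rolling join: each punch takes the job_title whose start_date is the latest one on or before its date.
DF2[DF1, roll = TRUE, .(punch_out, punch_in, employee_number, job_title)]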

If your columns are already dates, you can do this instead:

setkey(setDT(DF2), employee_numb, start_date)
setkey(setDT(DF1), employee_number, date)

DF2[DF1, roll = TRUE, .(punch_out, punch_in, employee_number, job_title)]
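Note that the rolling join matches on start_date only and never looks at end_date. If there can be gaps between job periods, a non-equi join that tests both bounds may be a safer variant. The following is a minimal sketch, assuming a data.table version that supports non-equi joins in `on`, date columns already of class Date, and half-open periods (`end_date` exclusive). The names `punches`, `titles`, and `result` are illustrative stand-ins, not part of the answer.

```r
library(data.table)

# Illustrative copies of the question's data, with real Date columns
punches <- data.table(
  punch_out = c("16:00:00", "15:00:00"),
  punch_in = c("06:00:00", "08:00:00"),
  date = as.Date(c("2018-01-01", "2018-08-01")),
  employee_number = c("00000001", "00000001")
)
titles <- data.table(
  employee_numb = c("00000001", "00000001"),
  job_title = c("worker", "manager"),
  start_date = as.Date(c("2017-08-05", "2018-07-01")),
  end_date = as.Date(c("2018-07-01", "3000-01-01"))
)

# Non-equi join: match on employee number and require
# start_date <= date < end_date; one output row per punch when periods do not overlap
result <- titles[
  punches,
  on = .(employee_numb = employee_number, start_date <= date, end_date > date),
  .(punch_out, punch_in, date = i.date,
    employee_number = i.employee_number, job_title)
]
```

Unlike the rolling join, this leaves job_title as NA for a punch dated after a period has ended and before the next one starts.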

Data used:

DF2 <- structure(list(
  employee_numb = c("00000001", "00000001"),
  job_title = structure(2:1, .Label = c("manager", "worker"), class = "factor"),
  start_date = structure(c(17383, 17713), class = "Date"),
  end_date = structure(1:2, .Label = c("2018-07-01", "3000-01-01"), class = "factor")
), row.names = c(NA, -2L), class = "data.frame")

DF1 <- structure(list(
  punch_out = structure(2:1, .Label = c("15:00:00", "16:00:00"), class = "factor"),
  punch_in = structure(1:2, .Label = c("06:00:00", "08:00:00"), class = "factor"),
  date = structure(c(17532, 17744), class = "Date"),
  employee_number = c("00000001", "00000001")
), row.names = c(NA, -2L), class = "data.frame")