I am working on a small R project.
Given two data frames of different lengths:
df1 = data.frame(Plane.Id = c(19924519, 19924321, 19992436, 19924119, 19924208, 19924330),
                 Block.ID = c("090LC", "090LC", "001UG", "002LM", "001OI", "001UG"),
                 Hour1 = c(0.02222222, 0.02222222, 15.07222, 15.44444, 6.652778, 3.286111))
df2 = data.frame(Block.ID = c("090LC", "001UG", "001UG", "002LM", "001OI"),
                 Sector.ID = c("BIRDFIS", "UKOVS", "LLLLALL", "EBBUEHS", "LEBLDDN"),
                 Hour_In = c(0.000000, 0.000000, 13.000000, 0.000000, 0.000000),
                 Hour_Out = c(23.50000, 13.000000, 23.50000, 23.50000, 23.50000))
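Different Sector.ID values are assigned to the same Block.ID depending on the hour of the day.
Is it possible to merge the two into a single data frame based on that condition?
What I am looking for is a data frame with the same number of rows as df1, containing Plane.Id, Block.ID and Sector.ID. Something like this (I don't know how to build a table here, so I uploaded an image of the table):
df_final
I tried rbind, left_join, merge and cbind, but none of them gave the result I need. I even tried doing it in a loop, but that was not a good idea.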
Answer 0: (score: 0)
How about using dplyr to do an inner join on "Block.ID" and then filter by "Hour1"?
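library(dplyr)  # provides inner_join(), filter() and the %>% pipe

df1 =
  data.frame(
    Plane.Id = c(19924519, 19924321, 19992436, 19924119, 19924208, 19924330),
    Block.ID = c("090LC", "090LC", "001UG", "002LM", "001OI", "001UG"),
    Hour1 = c(0.02222222, 0.02222222, 15.07222, 15.44444, 6.652778, 3.286111)
  )
df2 = data.frame(
  Block.ID = c("090LC", "001UG", "001UG", "002LM", "001OI"),
  Sector.ID = c("BIRDFIS", "UKOVS", "LLLLALL", "EBBUEHS", "LEBLDDN"),
  Hour_In = c(0.000000, 0.000000, 13.000000, 0.000000, 0.000000),
  Hour_Out = c(23.50000, 13.000000, 23.50000, 23.50000, 23.50000)
)

# Pair every df1 row with every df2 row that shares its Block.ID,
# then keep only the pairs where Hour1 falls inside the sector's time window.
# Hour_In is treated as inclusive and Hour_Out as exclusive so that a boundary
# hour (for example 13 for block 001UG) matches exactly one sector.
dplyr::inner_join(df1, df2, by = "Block.ID") %>%
  dplyr::filter(Hour1 >= Hour_In & Hour1 < Hour_Out)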
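The same join-then-filter idea can be sketched in base R without any packages. This is only an illustration under stated assumptions: the variable names `merged`, `keep` and `result` are made up here, and the half-open window (Hour_In inclusive, Hour_Out exclusive) is an assumption about how boundary hours should be handled, not something the question specifies.

```r
# Sample data from the question (Block.ID values quoted so they parse as strings)
df1 <- data.frame(
  Plane.Id = c(19924519, 19924321, 19992436, 19924119, 19924208, 19924330),
  Block.ID = c("090LC", "090LC", "001UG", "002LM", "001OI", "001UG"),
  Hour1    = c(0.02222222, 0.02222222, 15.07222, 15.44444, 6.652778, 3.286111)
)
df2 <- data.frame(
  Block.ID  = c("090LC", "001UG", "001UG", "002LM", "001OI"),
  Sector.ID = c("BIRDFIS", "UKOVS", "LLLLALL", "EBBUEHS", "LEBLDDN"),
  Hour_In   = c(0, 0, 13, 0, 0),
  Hour_Out  = c(23.5, 13, 23.5, 23.5, 23.5)
)

# Step 1: merge() on Block.ID alone yields every plane/sector pairing per block
merged <- merge(df1, df2, by = "Block.ID")

# Step 2: keep only pairings whose Hour1 lies inside the sector's time window
# (assumed half-open: Hour_In <= Hour1 < Hour_Out)
keep <- merged$Hour1 >= merged$Hour_In & merged$Hour1 < merged$Hour_Out
result <- merged[keep, c("Plane.Id", "Block.ID", "Sector.ID")]
```

Note that merge() sorts by the key column by default, so the rows of `result` will not necessarily be in the original df1 order.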
Answer 1: (score: 0)
Here is an alternative solution using data.table:
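library(data.table)
# Convert both data frames to data.tables by reference
setDT(df1)
setDT(df2)
# Non-equi join: match on Block.ID and require Hour_In <= Hour1 <= Hour_Out
df1[df2, on = .(Block.ID, Hour1 >= Hour_In, Hour1 <= Hour_Out),
    .(Plane.Id, Block.ID, Sector.ID)]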
Output:
   Plane.Id Block.ID Sector.ID
1: 19924519    090LC   BIRDFIS
2: 19924321    090LC   BIRDFIS
3: 19924330    001UG     UKOVS
4: 19992436    001UG   LLLLALL
5: 19924119    002LM   EBBUEHS
6: 19924208    001OI   LEBLDDN
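The rows above follow the order of df2 because df2 is used as the lookup side of the join. If the result should instead follow the row order of df1, as the question asks, one possible variation is to swap the roles of the two tables. This is a sketch, not part of the original answer: the names `dt1`, `dt2` and `result` are introduced here for illustration, and both window bounds are kept inclusive as in the answer, so an Hour1 of exactly 13 for block 001UG would match two sectors.

```r
library(data.table)

# Sample data from the question, built directly as data.tables
dt1 <- data.table(
  Plane.Id = c(19924519, 19924321, 19992436, 19924119, 19924208, 19924330),
  Block.ID = c("090LC", "090LC", "001UG", "002LM", "001OI", "001UG"),
  Hour1    = c(0.02222222, 0.02222222, 15.07222, 15.44444, 6.652778, 3.286111)
)
dt2 <- data.table(
  Block.ID  = c("090LC", "001UG", "001UG", "002LM", "001OI"),
  Sector.ID = c("BIRDFIS", "UKOVS", "LLLLALL", "EBBUEHS", "LEBLDDN"),
  Hour_In   = c(0, 0, 13, 0, 0),
  Hour_Out  = c(23.5, 13, 23.5, 23.5, 23.5)
)

# For each row of dt1, look up the dt2 row with the same Block.ID whose
# [Hour_In, Hour_Out] window contains Hour1. In on = .(...), the left-hand
# names refer to dt2 columns and the right-hand names to dt1 columns.
result <- dt2[dt1,
              on = .(Block.ID, Hour_In <= Hour1, Hour_Out >= Hour1),
              .(Plane.Id, Block.ID, Sector.ID)]
```

With this direction, a df1 row that has no matching window would appear with an NA Sector.ID rather than being dropped, which keeps the result the same length as df1.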