Frequency of one data frame's rows in another data frame

Asked: 2019-07-23 06:38:08

Tags: r

Can someone help me count how often the rows of one data frame occur in another data frame?

df1 (out)

structure(list(Item = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), class = "factor", .Label = "0S1576"), LC = structure(c(1L, 
1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L), class = "factor", .Label = c("MW92", 
"OY01", "RM11")), Fiscal.Month = c("2019-M06", "2019-M07", "2019-M06", 
"2019-M07", "2019-M08", "2019-M09", "2019-M06", "2019-M07", "2019-M08"
)), row.names = c(NA, -9L), class = "data.frame")

df2 (tempdf)

structure(list(Item = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "0S1576", class = "factor"), 
    LC = structure(c(1L, 1L, 1L, 1L, 2L, 3L, 4L, 6L, 5L, 1L, 
    2L, 2L, 3L, 3L), .Label = c("MW92", "OY01", "RM11", "RS11", 
    "WK14", "WK15"), class = "factor"), Fiscal.Month = structure(c(1L, 
    2L, 3L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("2019-M06", 
    "2019-M07", "2019-M08", "2019-M09"), class = "factor"), fcst = c(22L, 
    21L, 20L, 19L, 12L, 10L, 10L, 12L, 10L, 12L, 10L, 10L, 10L, 
    10L)), row.names = c(NA, -14L), class = "data.frame")

I want to count how many times each Item, LC, Fiscal.Month combination from df1 occurs in df2.

1 Answer:

Answer 0: (score: 1)

You can use table to do the counting and factor to join df1 with df2. You also need interaction, because the join uses more than one column.

table(factor(interaction(df2[c("Item","LC","Fiscal.Month")]), levels=interaction(df1)))
#0S1576.MW92.2019-M06 0S1576.MW92.2019-M07 0S1576.OY01.2019-M06 
#                   2                    1                    3 
#0S1576.OY01.2019-M07 0S1576.OY01.2019-M08 0S1576.OY01.2019-M09 
#                   0                    0                    0 
#0S1576.RM11.2019-M06 0S1576.RM11.2019-M07 0S1576.RM11.2019-M08 
#                   3                    0                    0 
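To see why this works, here is a minimal sketch on toy data. The data frames keys_wanted and obs and their columns g and m are hypothetical and not part of the question. interaction pastes the columns into a single key such as "a.x", and restricting the factor levels to the wanted keys keeps zero counts while dropping combinations that were not asked for. It assumes the wanted combinations are unique, since factor levels must be.

```r
# Hypothetical toy data, not the question's data
keys_wanted <- data.frame(g = c("a", "a", "b"), m = c("x", "y", "x"))
obs <- data.frame(g = c("a", "a", "b", "c"), m = c("x", "x", "y", "x"))

# interaction() combines the columns into one factor, e.g. "a.x"
wanted_key <- interaction(keys_wanted)
obs_key <- interaction(obs)

# Only the wanted keys become levels: unmatched observations turn into NA
# (ignored by table), and wanted keys that never occur are counted as 0
counts <- table(factor(obs_key, levels = wanted_key))
print(counts)
```

Here "a.x" occurs twice in obs, while "a.y" and "b.x" never occur, so they are reported with a count of zero.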

Or a faster version using match and tabulate:

(df1$freq <-  tabulate(match(interaction(df2[c("Item","LC","Fiscal.Month")]), interaction(df1)), nrow(df1)))
#[1] 2 1 3 0 0 0 3 0 0
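The mechanics of the match and tabulate combination can be sketched on plain character keys. The vectors wanted and observed are hypothetical and stand in for the interaction keys of df1 and df2. match gives, for each observed key, its position in wanted (NA if absent), and tabulate counts how often each position appears while ignoring the NA entries.

```r
# Hypothetical keys standing in for interaction(df1) and interaction(df2[...])
wanted <- c("a.x", "a.y", "b.x")
observed <- c("a.x", "a.x", "b.y", "b.x")

# Position of each observed key in `wanted`; "b.y" is not wanted, so it is NA
pos <- match(observed, wanted)

# Count hits per position; nbins makes sure trailing zero counts are kept
freq <- tabulate(pos, nbins = length(wanted))
print(freq)
```

Passing nbins matters: without it, tabulate would stop at the largest matched position and the result could be shorter than the number of wanted keys.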

Sometimes fastmatch can be even faster:

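library(fastmatch)
# Name the key columns explicitly: once df1$freq exists, interaction(df1) would include it
keys <- c("Item","LC","Fiscal.Month")
df1$freq <-  tabulate(fmatch(interaction(df2[keys]), interaction(df1[keys])), nrow(df1))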
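If fastmatch may not be installed everywhere the code runs, one option is to fall back to base match. This is only a sketch: the name matcher and the toy vectors are hypothetical, and it relies on fmatch being usable as a drop-in replacement for match, which is how the fastmatch package describes it.

```r
# Use fastmatch::fmatch when the package is available, otherwise base match
matcher <- if (requireNamespace("fastmatch", quietly = TRUE)) {
  fastmatch::fmatch
} else {
  match
}

# Hypothetical keys, standing in for the interaction() keys of df1 and df2
wanted <- c("a.x", "a.y", "b.x")
observed <- c("a.x", "a.x", "b.y", "b.x")

freq <- tabulate(matcher(observed, wanted), nbins = length(wanted))
print(freq)
```

Either branch should give the same counts, so the choice only affects speed.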