Counting pairs across two columns

Time: 2019-04-12 14:14:03

Tags: r

I have a data frame like this:

df <- data.frame(var1 = c("google", "yahoo", "google", "yahoo", "google"), 
                 var2 = c("price1","price1","price1","price1","price2"))

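I want to count how often each pair of values from the two columns occurs, including pairs that never occur. Here is the expected output: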

df_output <- data.frame(var1 = c("google","google","yahoo","yahoo"), 
                        var2 = c("price1","price2","price1","price2"), count = c(2,1,2,0))
df_output
#      var1   var2 count
# 1 google price1     2
# 2 google price2     1
# 3  yahoo price1     2
# 4  yahoo price2     0

How can I do this?

3 answers:

Answer 0: (score: 4)

A base R solution:

as.data.frame(table(df$var1, df$var2))
#     Var1   Var2 Freq
# 1 google price1    2
# 2  yahoo price1    2
# 3 google price2    1
# 4  yahoo price2    0
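The output above uses the default names `Var1`, `Var2` and `Freq` and a different row order from the one requested in the question. As a minimal sketch (a variation on this answer, not part of it), passing the two columns as a data frame keeps the original column names, `responseName` renames the count column, and `order()` sorts the rows:

```r
df <- data.frame(
  var1 = c("google", "yahoo", "google", "yahoo", "google"),
  var2 = c("price1", "price1", "price1", "price1", "price2")
)

# table() on a two-column data frame uses the column names as dimension names,
# and responseName replaces the default "Freq" column name.
counts <- as.data.frame(table(df[c("var1", "var2")]), responseName = "count")

# Sort by var1, then var2, to match the row order asked for in the question.
counts <- counts[order(counts$var1, counts$var2), ]
rownames(counts) <- NULL

counts
# count column, in row order: 2, 1, 2, 0
```

Note that `var1` and `var2` come back as factors here; wrap them in `as.character()` if plain strings are needed.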

Answer 1: (score: 3)

One tidyverse possibility:

library(dplyr)  # count(), %>%
library(tidyr)  # complete(), nesting()

df %>%
 count(var1, var2) %>%
 complete(var1, nesting(var2), fill = list(n = 0))

  var1   var2       n
  <fct>  <fct>  <dbl>
1 google price1     2
2 google price2     1
3 yahoo  price1     2
4 yahoo  price2     0

Here, it counts by "var1" and "var2", then generates the missing combinations and fills them with 0.
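The same two steps (count the observed pairs, then add the missing combinations with a count of 0) can also be spelled out without any packages. This is only an illustrative base R sketch of the logic, not the code from this answer; the names `observed`, `all_pairs` and `filled` are made up for the example:

```r
df <- data.frame(
  var1 = c("google", "yahoo", "google", "yahoo", "google"),
  var2 = c("price1", "price1", "price1", "price1", "price2")
)

# Step 1: count the pairs that actually occur (each row contributes n = 1).
observed <- aggregate(n ~ var1 + var2, data = cbind(df, n = 1), FUN = sum)

# Step 2: build every var1 x var2 combination, then left-join the counts.
all_pairs <- expand.grid(
  var1 = unique(df$var1),
  var2 = unique(df$var2),
  stringsAsFactors = FALSE
)
filled <- merge(all_pairs, observed, by = c("var1", "var2"), all.x = TRUE)

# Combinations absent from the data come back as NA; replace them with 0.
filled$n[is.na(filled$n)] <- 0

# Sort explicitly rather than relying on the row order merge() returns.
filled <- filled[order(filled$var1, filled$var2), ]
rownames(filled) <- NULL

filled
```

The `yahoo`/`price2` pair never appears in `df`, so it only exists in `all_pairs` and ends up with `n = 0` after the fill.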

Answer 2: (score: 1)

Using dcast and melt:

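> library(reshape2)  # assumed source of dcast() and melt()
> as.data.frame(melt(dcast(df, var1 ~ var2, fun.aggregate = length, value.var = "var2"), id.vars = "var1"))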

OR

If you have many columns, pass the column names as a vector:

> var_select = c("var1", "var2")
> as.data.frame(table(subset(df, select = var_select)))

   var1   var2  Freq
1 google price1    2
2  yahoo price1    2
3 google price2    1
4  yahoo price2    0

Note: the second solution is based on the table function provided by @thothal.