How do I compare table entries in R?

Time: 2016-06-09 23:35:00

Tags: r

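I have two different tables, each produced with the table() command (on a matrix). The first table has about 200 words and their frequencies; the second has about 400 words and their frequencies. I want to know how many times each word occurs in the first table and how many times in the second table (not the total number of occurrences).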

1 answer:

Answer 0: (score: 1)

x <- c("a", "the", "cat", "dog", "money", "dog", "money", "dog", "money")
y <- c("a", "the", "cat", "cat", "cat", "dog", "money", "dog", "money", "women")

xx <- table(x)
yy <- table(y)


xx <- data.frame(xx) # Get it out of the table class
colnames(xx) <- c("word", "table1_freq") # name columns appropriately
yy <- data.frame(yy) # Get it out of the table class
colnames(yy) <- c("word", "table2_freq") # name columns appropriately

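# cbind.fill() from rowr binds columns of unequal length, padding the shorter one with NA.
# Note: rows are aligned by position, not by word, so this relies on both
# tables listing their shared words in the same order.
pacman::p_load(rowr)
result <- cbind.fill(xx, yy, fill = NA)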

# Now to replace the NA's with what you requested in the comment:
result$table1_freq <- as.numeric(result$table1_freq)
result$table2_freq <- as.numeric(result$table2_freq)

result$table1_freq[is.na(result$table1_freq)] <- 0  
result$table2_freq[is.na(result$table2_freq)] <- 0  

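# Convert both word columns to character so values can be copied between them
result[,1] <- as.character(result[,1])
result[,3] <- as.character(result[,3])
# Where a word is missing in one column, take it from the other
result[is.na(result[,1]),1] <- result[is.na(result[,1]),3]
result[is.na(result[,3]),3] <- result[is.na(result[,3]),1]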
result

   word table1_freq  word table2_freq
1     a           1     a           1
2   cat           1   cat           3
3   dog           3   dog           2
4 money           3 money           2
5   the           1   the           1
6 women           0 women           1

I used pacman here, but if you don't have it you can just as well install the package normally and load it with require or library.
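
As an alternative sketch (not part of the original answer), base R's merge() with all = TRUE joins the two frequency tables on the word itself rather than on row position, so it should still line up correctly when the tables hold different words in different orders. The column names word, table1_freq and table2_freq follow the answer above; the variable name merged is only illustrative.

```r
x <- c("a", "the", "cat", "dog", "money", "dog", "money", "dog", "money")
y <- c("a", "the", "cat", "cat", "cat", "dog", "money", "dog", "money", "women")

# Turn each table into a data frame with a shared key column "word"
xx <- as.data.frame(table(x), stringsAsFactors = FALSE)
colnames(xx) <- c("word", "table1_freq")
yy <- as.data.frame(table(y), stringsAsFactors = FALSE)
colnames(yy) <- c("word", "table2_freq")

# Full outer join on "word": a word missing from one table gets NA there
merged <- merge(xx, yy, by = "word", all = TRUE)

# Replace the NA counts with 0
merged$table1_freq[is.na(merged$table1_freq)] <- 0
merged$table2_freq[is.na(merged$table2_freq)] <- 0

merged
```

This yields one row per distinct word with a single word column, so the duplicated word column and the manual NA back-filling from the answer above are not needed.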