Suppose the data frame looks like this:
A <- c("John", "John", "James", "Brad")
B <- c("Deb", "Deb", "Henry", "Suzie")
C <- c("Barry", "Beth", "Deb", "Louise")
D <- c("Ben", "Dory", "John", "Simon")
df <- data.frame(A, B, C, D)
df
      A     B      C     D
1  John   Deb  Barry   Ben
2  John   Deb   Beth  Dory
3 James Henry    Deb  John
4  Brad Suzie Louise Simon
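How can I generate a frequency table showing the total number of times each combination of values from columns A and B appears together in the same row? The desired output looks like this: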
      A     B n
1  Brad Suzie 1
2 James Henry 1
3  John   Deb 3
I know how to build a simple frequency table with dplyr, but I can't get it to work in this case.
Answer 0 (score: 0)
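df <- data.frame(A = c("John", "John", "James", "Brad"),
                 B = c("Deb", "Deb", "Henry", "Suzie"),
                 C = c("Barry", "Beth", "Deb", "Louise"),
                 D = c("Ben", "Dory", "John", "Simon"),
                 stringsAsFactors = FALSE)
# Collapse each row into one comma-separated string so it can be searched
df$seq <- paste(df$A, df$B, df$C, df$D, sep = ",")
# Every unordered pair of distinct names found in columns A and B
# (named all_names to avoid shadowing base::names)
all_names <- unique(c(df$A, df$B))
pairs <- combn(all_names, 2)
finaldf <- data.frame()
for (i in seq_len(ncol(pairs))) {
  name1 <- pairs[1, i]
  name2 <- pairs[2, i]
  # Number of rows whose string contains both names.
  # Note: this is a substring match, so "John" would also match "Johnson".
  count <- sum(grepl(name1, df$seq, fixed = TRUE) & grepl(name2, df$seq, fixed = TRUE))
  finaldf <- rbind(finaldf, data.frame(name1 = name1, name2 = name2, count = count))
}
finaldf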
   name1 name2 count
1   John James     1
2   John  Brad     0
3   John   Deb     3
4   John Henry     1
5   John Suzie     0
6  James  Brad     0
7  James   Deb     1
8  James Henry     1
9  James Suzie     0
10  Brad   Deb     0
11  Brad Henry     0
12  Brad Suzie     1
13   Deb Henry     1
14   Deb Suzie     0
15 Henry Suzie     0
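
The table above covers every possible pair of names, including pairs that never occur together. If only the pairs that actually appear in columns A and B are wanted, as in the question's desired output, something along these lines might work. This is a minimal base R sketch, not a definitive implementation: `count_rows_with_both` and `ab_pairs` are hypothetical names introduced here, and it assumes a name should match a whole cell value exactly rather than as a substring. John and Deb count 3 because row 3 also contains both names, in columns C and D.

```r
df <- data.frame(A = c("John", "John", "James", "Brad"),
                 B = c("Deb", "Deb", "Henry", "Suzie"),
                 C = c("Barry", "Beth", "Deb", "Louise"),
                 D = c("Ben", "Dory", "John", "Simon"),
                 stringsAsFactors = FALSE)

# Distinct (A, B) pairs as they actually occur in the data
ab_pairs <- unique(df[, c("A", "B")])

# Hypothetical helper: number of rows in which both names appear in any
# column, comparing whole cell values (exact match, not substring)
count_rows_with_both <- function(name1, name2, data) {
  has_both <- apply(data, 1, function(row) name1 %in% row && name2 %in% row)
  sum(has_both)
}

# Count the matching rows for each (A, B) pair
ab_pairs$n <- mapply(count_rows_with_both, ab_pairs$A, ab_pairs$B,
                     MoreArgs = list(data = df), USE.NAMES = FALSE)

# Sort by name and reset the row names for display
ab_pairs <- ab_pairs[order(ab_pairs$A, ab_pairs$B), ]
rownames(ab_pairs) <- NULL
ab_pairs
```

On the sample data this should give Brad/Suzie 1, James/Henry 1 and John/Deb 3, which matches the table in the question. Because it scans every row once per pair, it is fine for small data frames but may be slow on large ones.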