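Background
I have two data.frames: one holds several companies and the other holds an index.
I want to count the rows in which both of the following conditions hold:
First condition: the two companies move in the same direction (only when both are A or both are C).
Second condition: the index shows the opposite direction, i.e. the companies show A = A and the index shows C, or the companies show C = C and the index shows A.
Example: row 1 - Comp1 (C), Comp3 (C) & row 1 - index1 (A) | COUNT = 1
The 6 pairs are Comp1 & Comp2, Comp1 & Comp3, Comp1 & Comp4, Comp2 & Comp3, Comp2 & Comp4 and Comp3 & Comp4, each compared against the index.
I don't know which function could help me with this...
Code for the data.frames: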
#Data.frame1 COMPANIES
comp1 <- c("C","A","B","B","A")
comp2 <- c("A","A","C","C","C")
comp3 <- c("C","B","B","A","A")
comp4 <- c("C","C","A","A","A")
dfcomp <- data.frame(comp1, comp2, comp3, comp4)
#Data.frame2 INDEX
index1 <- c("A","B","C","C","C")
dfindex <- data.frame(index1)
Final output: a single row with one count per pair, as taken from a 4x4 matrix (only the values of interest)
[12i] [13i] [14i] [23i] [24i] [34i]
[1] 0 2 2 0 0 3
Answer 0 (score: 1)
One possible approach is
library(dplyr)
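comp_func <- function(x, y, temp, index){
  # keep only the two company columns named in x and y, then append the index column
  temp <- bind_cols(temp[,!is.na(match(names(temp), c(x, y)))], index)
  # convert factors to character so columns with different levels can be compared
  temp[,] <- lapply(temp, function(i) as.character(i))
  # count rows where both companies agree on A or C and the index shows the opposite
  ret <- sum(temp[,1] == temp[,2] &
               temp[,1] %in% c('A', 'C') &
               ((temp[,1]=='A' & temp[,3]=='C') | (temp[,1]=='C' & temp[,3]=='A')))
  return(ret)
}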
df <- as.data.frame.matrix(t(combn(names(dfcomp),2)), stringsAsFactors = FALSE)
df %>%
  rowwise() %>%
  mutate(val = comp_func(V1, V2, dfcomp, dfindex))
The output is:
V1 V2 val
1 comp1 comp2 0
2 comp1 comp3 2
3 comp1 comp4 2
4 comp2 comp3 0
5 comp2 comp4 0
6 comp3 comp4 3
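For comparison, the same counts can probably be obtained in base R without dplyr. The following is only a sketch under the assumption that the data look like the question's example; count_opposite is a hypothetical helper name introduced here, not a function from any package, and the "&"-joined labels are just one way to name the result.

```r
# Data from the question, kept as character columns
dfcomp <- data.frame(
  comp1 = c("C", "A", "B", "B", "A"),
  comp2 = c("A", "A", "C", "C", "C"),
  comp3 = c("C", "B", "B", "A", "A"),
  comp4 = c("C", "C", "A", "A", "A"),
  stringsAsFactors = FALSE
)
dfindex <- data.frame(index1 = c("A", "B", "C", "C", "C"),
                      stringsAsFactors = FALSE)

# Hypothetical helper: count rows where both companies of a pair agree
# on A or C while the index shows the opposite letter
count_opposite <- function(pair, comp, idx) {
  a <- as.character(comp[[pair[1]]])
  b <- as.character(comp[[pair[2]]])
  i <- as.character(idx)
  sum(a == b & ((a == "A" & i == "C") | (a == "C" & i == "A")))
}

# One column per pair of company names
pairs <- combn(names(dfcomp), 2)

# Apply the helper to every pair and label each count with its pair
counts <- apply(pairs, 2, count_opposite, comp = dfcomp, idx = dfindex$index1)
names(counts) <- apply(pairs, 2, paste, collapse = "&")
counts
```

This returns a named vector with one element per pair, which is close to the single-row layout asked for in the question.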
Sample data:
dfcomp <- structure(list(comp1 = structure(c(3L, 1L, 2L, 2L, 1L), .Label = c("A",
"B", "C"), class = "factor"), comp2 = structure(c(1L, 1L, 2L,
2L, 2L), .Label = c("A", "C"), class = "factor"), comp3 = structure(c(3L,
2L, 2L, 1L, 1L), .Label = c("A", "B", "C"), class = "factor"), comp4 = structure(c(2L, 2L, 1L, 1L, 1L), .Label = c("A", "C"), class = "factor")), .Names = c("comp1", "comp2", "comp3",
"comp4"), row.names = c(NA, -5L), class = "data.frame")
dfindex <- structure(list(index1 = structure(c(1L, 2L, 3L, 3L, 3L), .Label = c("A",
"B", "C"), class = "factor")), .Names = "index1", row.names = c(NA,
-5L), class = "data.frame")
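Note that in this sample data the columns are factors whose level sets differ (comp1 has levels A, B, C while comp2 only has A, C). That is presumably why comp_func converts everything with as.character before comparing. A small sketch of the difference, using made-up vectors f1 and f2 rather than the real columns:

```r
# Two illustrative factors with different level sets
f1 <- factor(c("C", "A", "B"))   # levels A, B, C
f2 <- factor(c("A", "A", "C"))   # levels A, C

# Comparing factors with different level sets raises an error in base R
res <- tryCatch(f1 == f2, error = function(e) "error")

# After conversion to character the comparison is element-wise on the labels
same <- as.character(f1) == as.character(f2)
```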