Question

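I need to create groups of rows from my data frame, using a custom function as the grouping condition. The function compares a pair of rows and returns TRUE/FALSE depending on whether those rows should be grouped together.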

In a sample dataset:

id   field        code1  code2
1    textField1   055    066
2    textField2   100    120
3    textField3   300    350
4    textField4   800    450
5    textField5   460    900
6    textField6   490    700

                         ...

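The function takes the rows in pairs (function(row1, row2)), checks certain rules between their fields, and returns TRUE/FALSE depending on whether the rows belong together.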

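I need to apply the function to every possible pair of rows in the data frame and produce a list (or some other structure) in which all matching ids are grouped together.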

One way to apply the function to each pair is shown in this answer:

lapply(seq_len(nrow(df) - 1),
       function(i){
         customFunction( df[i,], df[i+1,] )
       })

But I can't think of a way to group together the rows for which the result is TRUE.

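Edit: rereading my question, it seems an example is needed:

If we built a matrix with the results for all possible combinations, it would look like this: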

      [,1]   [,2]   [,3]   [,4]   [,5]   [,6]
[1,]  TRUE   FALSE  FALSE  FALSE  FALSE  FALSE
[2,]  FALSE  TRUE   TRUE   TRUE   FALSE  FALSE
[3,]  FALSE  TRUE   TRUE   FALSE  FALSE  FALSE
[4,]  FALSE  TRUE   FALSE  TRUE   FALSE  FALSE
[5,]  FALSE  FALSE  FALSE  FALSE  TRUE   TRUE
[6,]  FALSE  FALSE  FALSE  FALSE  TRUE   TRUE

The resulting groups would be:

1
2,3,4
5,6
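
For reference, the matrix of all pairwise comparisons described above could be built roughly as follows. This is only a sketch: the lapply call shown earlier compares each row with the next one only, so a double loop over all pairs is used here instead. The body of customFunction is a hypothetical rule invented for illustration (the real rule is not given), the codes are stored as plain numbers, and the name pairMatrix is an assumption as well, so the result is not expected to reproduce the example matrix.

```r
# Sample data from the question (codes stored as numbers for simplicity)
df <- data.frame(
  id    = 1:6,
  field = paste0("textField", 1:6),
  code1 = c(55, 100, 300, 800, 460, 490),
  code2 = c(66, 120, 350, 450, 900, 700)
)

# Hypothetical rule, for illustration only: two rows belong together
# when their code1 values differ by less than 50
customFunction <- function(row1, row2) {
  abs(row1$code1 - row2$code1) < 50
}

n <- nrow(df)
# Logical identity matrix: every row is grouped with itself
pairMatrix <- diag(n) == 1
# Test each unordered pair once and mirror the result,
# which assumes the rule is symmetric
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    together <- customFunction(df[i, ], df[j, ])
    pairMatrix[i, j] <- together
    pairMatrix[j, i] <- together
  }
}
```

The loop calls the function n * (n - 1) / 2 times, which is fine for small data frames but may be slow for large ones.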

Answer 1

Here is a function that does what you specified:

mx <- matrix(c( TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,
FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,
FALSE,TRUE,TRUE,FALSE,FALSE,FALSE,
FALSE,TRUE,FALSE,TRUE,FALSE,FALSE,
FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,
FALSE,FALSE,FALSE,FALSE,TRUE,TRUE),6)


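groupings <- function(mx){

    out <- list()
    # keep track of the original row ids, because mx shrinks on every pass
    ids <- seq_len(nrow(mx))
    while(nrow(mx) > 0){
        # get the rows that match the first remaining row
        # (assumes the diagonal is TRUE, i.e. every row matches itself)
        g <- which(mx[,1])

        # expand the selection to any rows that match
        # a member of the current group
        expansion <- which(apply(mx[, g, drop = FALSE], 1, any))
        while(length(expansion) > length(g)){
            g <- expansion

            # repeat until the group stops growing
            expansion <- which(apply(mx[, g, drop = FALSE], 1, any))
        }

        # report the original ids, not the positions in the shrunken matrix
        out <- c(out, list(ids[g]))
        # drop = FALSE keeps mx a matrix even when a single row is left
        mx <- mx[-g, -g, drop = FALSE]
        ids <- ids[-g]
    }
    return(out)

}

groupings(mx)
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 2 3 4
#> 
#> [[3]]
#> [1] 5 6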
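
Another way to look at the problem: the groups are the connected components of the graph whose edges are the TRUE entries of the matrix. The following is a minimal base R sketch of that idea using label propagation, not part of the original answer. It assumes the matrix is symmetric, and the name componentLabels is made up for this example.

```r
mx <- matrix(c( TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,
FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,
FALSE,TRUE,TRUE,FALSE,FALSE,FALSE,
FALSE,TRUE,FALSE,TRUE,FALSE,FALSE,
FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,
FALSE,FALSE,FALSE,FALSE,TRUE,TRUE),6)

# componentLabels is a hypothetical helper: it gives every row the
# smallest row id that can be reached from it through TRUE entries
componentLabels <- function(mx) {
  n <- nrow(mx)
  labels <- seq_len(n)   # start with one label per row
  repeat {
    # every row adopts the smallest label among itself and its matches
    updated <- vapply(seq_len(n), function(i) {
      min(labels[i], labels[mx[i, ]])
    }, integer(1))
    # stop once a full pass changes nothing
    if (identical(updated, labels)) break
    labels <- updated
  }
  labels
}

labels <- componentLabels(mx)
# rows sharing a label form one group: 1 | 2 3 4 | 5 6
groups <- unname(split(seq_along(labels), labels))
```

Because the labels are the original row numbers, the groups come out with the ids from the question and no index bookkeeping is needed.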

R: group pairs of rows in a data frame based on a TRUE/FALSE function