Question

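I am currently working with a list, resultsList, whose contents are printed below.

I want to sum all the numbers in each column and keep only the column-1 value that is present in every row.

The list looks like this: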

> resultsList

$`1`
[1] "x" "0" "1" "1" "1" "5"
$`2`
[1] "x /// y" "0" "1" "1" "2" "3"
$`3`
[1] "x" "0" "1" "3" "2" "4"
$`4`
[1] "x /// z" "0" "1" "2" "2" "2"
$`5`
[1] "x" "0" "1" "3" "3" "4"
$`6`
[1] "x" "0" "0" "0" "1" "2"
$`7`
[1] "x" "0" "2" "2" "1" "4"
$`8`
[1] "x /// y" "0" "2" "2" "1" "2"

How would I go about doing this?

Answer 1

You can use this approach:

c(resultsList[[1]][1], 
  colSums("mode<-"(do.call(rbind, resultsList)[ , -1], "numeric")))
# "x"  "0"  "9"  "14" "13" "26"

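Here, the function "mode<-" changes the mode of the matrix do.call(rbind, resultsList)[ , -1], which holds the numbers as character strings, from character to numeric.

The character matrix: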

do.call(rbind, resultsList)[ , -1]
#     [,1] [,2] [,3] [,4] [,5]
# [1,] "0"  "1"  "1"  "1"  "5" 
# [2,] "0"  "1"  "1"  "2"  "3" 
# [3,] "0"  "1"  "3"  "2"  "4" 
# [4,] "0"  "1"  "2"  "2"  "2" 
# [5,] "0"  "1"  "3"  "3"  "4" 
# [6,] "0"  "0"  "0"  "1"  "2" 
# [7,] "0"  "2"  "2"  "1"  "4" 
# [8,] "0"  "2"  "2"  "1"  "2"

The numeric matrix:

"mode<-"(do.call(rbind, resultsList)[ , -1], "numeric")
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    0    1    1    1    5
# [2,]    0    1    1    2    3
# [3,]    0    1    3    2    4
# [4,]    0    1    2    2    2
# [5,]    0    1    3    3    4
# [6,]    0    0    0    1    2
# [7,]    0    2    2    1    4
# [8,]    0    2    2    1    2

The call "mode<-"(x, y) is similar to mode(x) <- y, but it does not modify x; it returns the converted result instead.
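The difference between the two forms can be sketched as follows. The matrix m and the name n are hypothetical and used only for illustration; they do not come from the question.

```r
# Hypothetical 2 x 2 character matrix, used only for illustration.
m <- matrix(c("1", "2", "3", "4"), nrow = 2)

# Functional form: returns a converted copy and leaves m untouched.
n <- `mode<-`(m, "numeric")
mode(m)  # still "character"
mode(n)  # "numeric"

# Assignment form: converts m itself.
mode(m) <- "numeric"
identical(m, n)  # TRUE
```

The functional form is what makes the one-liner possible: the converted matrix can be passed straight to colSums() without an intermediate variable.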

Answer 2

EDIT: Here is a solution, assuming all of your column-1 strings have the form "var /// var2 /// ...". We first recover all the unique variables:

resultsList <- list(c("x","0","1","1","1","5"),
                    c("x /// y","0","1","1","2","3"),
                    c("x","0","1","3","2","4"),
                    c("x /// z","0","1","2","2","2"),
                    c("x","0","1","3","3","4"),
                    c("x","0","0","0","1","2"),
                    c("x","0","2","2","1","4"),
                    c("x /// y","0","2","2","1","2"))
firstColumn <- sapply(resultsList,"[[",1)
listsOfVariables <- c(strsplit(firstColumn," /// "))
vector <- c()
for(i in 1:length(listsOfVariables))
{
  vector <- c(vector,listsOfVariables[[i]])
}
uniqueVariables <- unique(vector)
uniqueVariables
# [1] "x" "y" "z"

Next, we find out which variables are contained in every individual row:

matches <- sapply(1:length(uniqueVariables), function(x,y) grep(uniqueVariables[x],y), y=firstColumn)
variablesMatchingAllRows <- uniqueVariables[sapply(matches,"length")==length(resultsList)]
variablesMatchingAllRows
# [1] "x"
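One caveat with this step: grep() treats its pattern as a regular expression and matches substrings, so a variable named "x" would also match a row containing "xy". A minimal alternative sketch, assuming the separator is always exactly " /// ", compares whole variable names instead. The names tokens and commonVariables are introduced here for illustration only.

```r
# Column-1 strings from the question's data.
firstColumn <- c("x", "x /// y", "x", "x /// z", "x", "x", "x", "x /// y")

# Split each string into its variable names; fixed = TRUE treats the
# separator as a literal string rather than a regular expression.
tokens <- strsplit(firstColumn, " /// ", fixed = TRUE)

# Keep only the names that occur in every row.
commonVariables <- Reduce(intersect, tokens)
commonVariables
# [1] "x"
```

Reduce(intersect, ...) folds the set intersection across all rows, so the unique-variable loop and the grep step collapse into a single call.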

Then we paste the variables together (in case more than one variable matches every row):

variablesMatchingAllRowsTest <- c("x","y","z")
paste(variablesMatchingAllRowsTest,collapse=" /// ")
# [1] "x /// y /// z"

We obtain the final column-1 string and append the column sums:

> finalString <- paste(variablesMatchingAllRows,collapse=" /// ")
> c(finalString,colSums("mode<-"(do.call(rbind, resultsList)[ , -1], "numeric")))
[1] "x"  "0"  "9"  "14" "13" "26"
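For reuse, the steps of this answer could be wrapped in one function. This is only a sketch: sumResultsList is a hypothetical helper name, and it finds the shared variables by comparing whole names rather than with grep(), which is a substitution for the answer's matching step.

```r
# sumResultsList is a hypothetical helper, not part of the original answer.
sumResultsList <- function(lst, sep = " /// ") {
  rows <- do.call(rbind, lst)                       # character matrix, one row per element
  tokens <- strsplit(rows[, 1], sep, fixed = TRUE)  # variable names per row
  label <- paste(Reduce(intersect, tokens), collapse = sep)
  values <- rows[, -1, drop = FALSE]                # numeric columns, still as strings
  mode(values) <- "numeric"
  c(label, unname(colSums(values)))                 # sums are coerced back to character
}

resultsList <- list(c("x","0","1","1","1","5"), c("x /// y","0","1","1","2","3"),
                    c("x","0","1","3","2","4"), c("x /// z","0","1","2","2","2"),
                    c("x","0","1","3","3","4"), c("x","0","0","0","1","2"),
                    c("x","0","2","2","1","4"), c("x /// y","0","2","2","1","2"))
sumResultsList(resultsList)
# [1] "x"  "0"  "9"  "14" "13" "26"
```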

OLD ANSWER

In the example below, we first find the unique string in column 1 with the smallest number of characters, and then check whether that minimal string is contained in the other strings. We then compute the column sums over the matching rows. We use this data for the example:

> resultsList <- list(c("x","0","1","1","1","5"),
+ c("a b x /// y","0","1","1","2","3"),
+ c("x","0","1","3","2","4"),
+ c("a /// z","0","1","3","3","4"),
+ c("bd x","0","1","5","3","6"))
> resultsList
[[1]]
[1] "x" "0" "1" "1" "1" "5"

[[2]]
[1] "a b x /// y" "0" "1" "1" "2" "3"

[[3]]
[1] "x" "0" "1" "3" "2" "4"

[[4]]
[1] "a /// z" "0" "1" "3" "3" "4"

[[5]]
[1] "bd x" "0" "1" "5" "3" "6"

First, we find the minimalString and the indices of the rows that match it:

firstColumn <- sapply(resultsList,"[[",1)
minimalString <- unique(firstColumn[nchar(firstColumn)==min(nchar(firstColumn))])
indices <- grep(minimalString,firstColumn) # Grep on the first element in minimalString

We get:

> minimalString
[1] "x"
> indices
[1] 1 2 3 5
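Two details of this step are worth making explicit. If several distinct strings share the minimal length, minimalString has more than one element and grep() uses only the first (with a warning); and the pattern is interpreted as a regular expression unless fixed = TRUE is given. A slightly more defensive sketch of the same lookup, under those assumptions:

```r
# Column-1 strings from the old answer's example data.
firstColumn <- c("x", "a b x /// y", "x", "a /// z", "bd x")

# Shortest unique string(s) in column 1.
minimalString <- unique(firstColumn[nchar(firstColumn) == min(nchar(firstColumn))])

# Use the first candidate explicitly and match it literally.
indices <- grep(minimalString[1], firstColumn, fixed = TRUE)
indices
# [1] 1 2 3 5
```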

In other words, every row except row 4 matches your minimalString. Next, we compute the column sums over the matching rows, as follows:

> c(minimalString, as.character(apply(sapply(2:6,function(x,y,z) as.numeric(sapply(y,"[[",x)),y=resultsList)[indices,],2,sum)))
[1] "x"  "0"  "4"  "10" "8"  "18"

For clarity, we break this down further:

The inner sapply(y,"[[",x) takes the element at index x from every element of the list y and returns them as a vector. We do this for y = resultsList and x = 2:6. Note that we also have to convert the characters to numeric first:

> intermediateResult <- sapply(2:6,function(x,y,z) as.numeric(sapply(y,"[[",x)),y=resultsList)
> intermediateResult
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    1    1    1    5
[2,]    0    1    1    2    3
[3,]    0    1    3    2    4
[4,]    0    1    3    3    4
[5,]    0    1    5    3    6
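As a cross-check, the same numeric matrix can be built by binding the rows and dropping column 1, which is the route the other answers take. The names bySapply and byRbind below are illustrative only.

```r
resultsList <- list(c("x","0","1","1","1","5"),
                    c("a b x /// y","0","1","1","2","3"),
                    c("x","0","1","3","2","4"),
                    c("a /// z","0","1","3","3","4"),
                    c("bd x","0","1","5","3","6"))

# Column-wise extraction, as in this answer: one inner sapply call per column.
bySapply <- sapply(2:6, function(i) as.numeric(sapply(resultsList, "[[", i)))

# Row binding, then dropping column 1 and converting the rest to numeric.
byRbind <- do.call(rbind, resultsList)[, -1]
mode(byRbind) <- "numeric"

# Element-wise comparison of the two matrices.
all(bySapply == byRbind)
```

Both routes assume every list element has the same length; the rbind route is shorter, while the sapply route makes the per-column extraction explicit.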

Next, we compute the column sums over the rows matching indices:

> sums <- apply(intermediateResult[indices,],2,sum)
> sums
[1]  0  4 10  8 18
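A small variation on this step, offered as a sketch rather than a correction: colSums() does the same job as apply(..., 2, sum), and drop = FALSE keeps the subset a matrix even when only one row matches (otherwise the subset collapses to a vector and apply() fails). The matrix is rebuilt here so the snippet runs on its own.

```r
# Same values as intermediateResult above, filled column by column.
intermediateResult <- matrix(c(0, 0, 0, 0, 0,
                               1, 1, 1, 1, 1,
                               1, 1, 3, 3, 5,
                               1, 2, 2, 3, 3,
                               5, 3, 4, 4, 6), nrow = 5)
indices <- c(1, 2, 3, 5)

# drop = FALSE preserves the matrix shape for any number of matching rows.
sums <- colSums(intermediateResult[indices, , drop = FALSE])
sums
# [1]  0  4 10  8 18
```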

Finally, we still need to convert the sums back to character and prepend the unique column-1 identifier. We get:

> finalResult <- c(minimalString,as.character(sums))
> finalResult
[1] "x"  "0"  "4"  "10" "8"  "18"

For your example, we get the following result:

> resultsList <- list(c("x","0","1","1","1","5"),
+ c("x /// y","0","1","1","2","3"),
+ c("x","0","1","3","2","4"),
+ c("x /// z","0","1","2","2","2"),
+ c("x","0","1","3","3","4"),
+ c("x","0","0","0","1","2"),
+ c("x","0","2","2","1","4"),
+ c("x // y","0","2","2","1","2"))
> firstColumn <- sapply(resultsList,"[[",1)
> minimalString <- unique(firstColumn[nchar(firstColumn)==min(nchar(firstColumn))])
> indices <- grep(minimalString,firstColumn) # Grep on the first element in minimalString
> minimalString
[1] "x"
> indices
[1] 1 2 3 4 5 6 7 8
> c(minimalString, as.character(apply(sapply(2:6,function(x,y,z) as.numeric(sapply(y,"[[",x)),y=resultsList)[indices,],2,sum)))
[1] "x"  "0"  "9"  "14" "13" "26"

Adding values in a list
