Question

I have the following file:

File 1

structure(list(Total_Gene_Symbol = c("5S_rRNA", "7SK", "A1BG-AS1"
), Test = c("1.02, 1.12, 1.11, 1.18, 1.12, 1.19, 1.25, 1.24, 1.24, 1.02", 
"1.97, 2.27, 2.14, 1.15", "1.3, 1.01, 1.36, 1.42, 1.38, 1.01, 1.31, 1.34, 
1.29, 1.34, 2.02, 1.12, 1.01, 1.31, 1.22"
)), .Names = c("Total_Gene_Symbol", "Test"), row.names = c(NA, 
3L), class = "data.frame")

The Test column of File 1 contains numbers separated by ",".

I tried:

mat <- stri_split_fixed(Down_FC, ',', simplify=T)
mat <- `dim<-`(as.numeric(mat), dim(mat))  # convert to numeric and save dims
rowMeans(mat, na.rm=T)->M
View(M)

But the code above averages over the whole data.

I want output like File 2 below.

File 2

structure(list(Total_Gene_Symbol = c("5S_rRNA", "7SK", "A1BG-AS1"
), Test = c("1.02, 1.12, 1.11, 1.18, 1.12, 1.19, 1.25, 1.24, 1.24, 1.02", 
"1.97, 2.27, 2.14, 1.15", "1.3, 1.01, 1.36, 1.42, 1.38, 1.01, 1.31, 1.34, 
1.29, 1.34, 2.02, 1.12, 1.01, 1.31, 1.22"
), Average = c(11.49, 7.53, 19.44)), .Names = c("Total_Gene_Symbol", 
"Test", "Average"), row.names = c(NA, 3L), class = "data.frame")

Answer 1

Use apply:

d1$sum <- apply(d1,1,
                function(x)(sum(as.numeric(unlist(strsplit(x['Test'],','))),na.rm = TRUE)))

Answer 2

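What you want is the sum, not the average! "Average" refers to something like the mode, median, or mean.

library(magrittr)
library(stringr)  # provides str_split()
df1$total_sum <-
    df1$Test %>% str_split(",\\s+") %>% sapply(function(x) as.numeric(x) %>% sum(na.rm = TRUE))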
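To make the sum-versus-mean point concrete, here is a minimal base-R sketch on the second row of the question's data (gene 7SK). The variable names `x` and `vals` are illustrative and do not come from the original post.

```r
# Second row of the Test column from the question (gene 7SK)
x <- "1.97, 2.27, 2.14, 1.15"

# Split on commas; as.numeric() tolerates the leading spaces
vals <- as.numeric(strsplit(x, ",", fixed = TRUE)[[1]])

sum(vals)   # 7.53, the value the question labels "Average"
mean(vals)  # 1.8825, the arithmetic mean
```

The value in the expected "Average" column of File 2 matches `sum(vals)`, not `mean(vals)`.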

Answer 3

You can use scan:

df$sum     <- sapply(df$Test, function(x)  sum(scan(text = x, what=numeric(),sep=","), na.rm=TRUE))
df$average <- sapply(df$Test, function(x) mean(scan(text = x, what=numeric(),sep=","), na.rm=TRUE))

#   Total_Gene_Symbol                                                                                                  Test   sum average
# 1           5S_rRNA                                            1.02, 1.12, 1.11, 1.18, 1.12, 1.19, 1.25, 1.24, 1.24, 1.02 11.49  1.1490
# 2               7SK                                                                                1.97, 2.27, 2.14, 1.15  7.53  1.8825
# 3          A1BG-AS1 1.3, 1.01, 1.36, 1.42, 1.38, 1.01, 1.31, 1.34, \n            1.29, 1.34, 2.02, 1.12, 1.01, 1.31, 1.22 19.44  1.2960
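For anyone who wants to reproduce this end to end, the following is a self-contained sketch of the same `scan` approach. It rebuilds the question's data frame, parses each string once, and adds both columns. The helper name `parse_nums` is hypothetical, and `quiet = TRUE` is added only to suppress scan's "Read N items" messages.

```r
# Rebuild the question's data frame (the embedded line break in row 3 is kept)
df <- data.frame(
  Total_Gene_Symbol = c("5S_rRNA", "7SK", "A1BG-AS1"),
  Test = c(
    "1.02, 1.12, 1.11, 1.18, 1.12, 1.19, 1.25, 1.24, 1.24, 1.02",
    "1.97, 2.27, 2.14, 1.15",
    "1.3, 1.01, 1.36, 1.42, 1.38, 1.01, 1.31, 1.34, \n1.29, 1.34, 2.02, 1.12, 1.01, 1.31, 1.22"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse one comma-separated string into a numeric vector
parse_nums <- function(s) scan(text = s, what = numeric(), sep = ",", quiet = TRUE)

# Parse each row once, then reuse the parsed vectors for both statistics
nums <- lapply(df$Test, parse_nums)
df$sum     <- vapply(nums, sum,  numeric(1), na.rm = TRUE)
df$average <- vapply(nums, mean, numeric(1), na.rm = TRUE)

df[, c("Total_Gene_Symbol", "sum", "average")]
```

Using `vapply` with `numeric(1)` rather than `sapply` is a design choice in this sketch: it fails loudly if any row does not reduce to a single number, and it avoids carrying the long input strings along as element names.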

How to compute the average of a comma-separated string of numbers in R

3 answers: