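I have a data frame df with expression values, and a data frame Weights with weights. For each column in df, I want to multiply each row of df by the corresponding row of Weights, that is, the row with the same row name. For each column in df, this gives the weighted values of the rows. Please see my expected output below.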
df
Gene MMRF_1021 MMRF_1024 MMRF_1029 MMRF_1030 MMRF_1031
ENSG00000007062 0.05374547 0.01258559 0.0000000 1.2985088 0.37618693
ENSG00000012124 0.13436368 0.27688288 0.2780448 0.7158432 0.03271195
Weights
Gene Pre.BI Pre.BII Immature Naive Memory Plasmacell
ENSG00000007062 0.006368928 0.000000e+00 0.000000000 0.0000000000 0.000000000 0.000000000
ENSG00000012124 0.000000000 0.000000e+00 0.000000000 0.0000000000 0.000000000 -0.009728154
Expected output:
Sample Gene Pre.BI Pre.BII Immature Naive Memory Plasmacell
MMRF_1021 ENSG00000007062 0.000342301 0 0 0 0 0
MMRF_1021 ENSG00000012124 0 0 0 0 0 -0.001307111
MMRF_1024 ENSG00000007062 8.015672e-05 0 0 0 0 0
MMRF_1024 ENSG00000012124 0 0 0 0 0 -0.002693559
.....
dput of df:
structure(list(MMRF_1021 = c(0.0537454710193116, 0.134363677548279
), MMRF_1024 = c(0.0125855939107651, 0.276882875966623), MMRF_1029 = c(0,
0.278044754955015), MMRF_1030 = c(1.29850876031527, 0.715843203834688
), MMRF_1031 = c(0.37618693249153, 0.032711952160723)), row.names = c("ENSG00000007062",
"ENSG00000012124"), class = "data.frame")
dput of Weights:
structure(list(Pre.BI = c(0.006368928, 0), Pre.BII = c(0, 0),
Immature = c(0, 0), Naive = c(0, 0), Memory = c(0, 0), Plasmacell = c(0,
-0.009728154)), row.names = c("ENSG00000007062", "ENSG00000012124"
), class = "data.frame")
Answer 0 (score: 2)
I think you may be looking for:
library(tidyverse)
# Reshape df to long format (one row per gene and sample),
# then attach the weights of the matching gene
joinedDataframe <- df %>%
  rownames_to_column("gene") %>%
  gather("sample", "value", -gene) %>%
  left_join(weights %>%
              rownames_to_column("gene"),
            by = "gene")
# Multiply every weight column by the expression value, then drop the value
joinedDataframe %>%
  mutate(Pre.BI = Pre.BI * value,
         Pre.BII = Pre.BII * value,
         Immature = Immature * value,
         Naive = Naive * value,
         Memory = Memory * value,
         Plasmacell = Plasmacell * value) %>%
  select(-value)
gene sample Pre.BI Pre.BII Immature Naive Memory Plasmacell
1 ENSG00000007062 MMRF_1021 3.423010e-04 0 0 0 0 0.0000000000
2 ENSG00000012124 MMRF_1021 0.000000e+00 0 0 0 0 -0.0013071105
3 ENSG00000007062 MMRF_1024 8.015674e-05 0 0 0 0 0.0000000000
4 ENSG00000012124 MMRF_1024 0.000000e+00 0 0 0 0 -0.0026935593
5 ENSG00000007062 MMRF_1029 0.000000e+00 0 0 0 0 0.0000000000
6 ENSG00000012124 MMRF_1029 0.000000e+00 0 0 0 0 -0.0027048622
7 ENSG00000007062 MMRF_1030 8.270109e-03 0 0 0 0 0.0000000000
8 ENSG00000012124 MMRF_1030 0.000000e+00 0 0 0 0 -0.0069638329
9 ENSG00000007062 MMRF_1031 2.395907e-03 0 0 0 0 0.0000000000
10 ENSG00000012124 MMRF_1031 0.000000e+00 0 0 0 0 -0.0003182269
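If the weight columns are numerous, repeating each one inside mutate() becomes tedious. The following is a minimal sketch of the same idea using across(), assuming dplyr >= 1.0 and tidyr >= 1.0 are available; pivot_longer() stands in for gather(), and the name `weighted` is introduced here only for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Data taken from the dput output in the question
df <- data.frame(
  MMRF_1021 = c(0.0537454710193116, 0.134363677548279),
  MMRF_1024 = c(0.0125855939107651, 0.276882875966623),
  MMRF_1029 = c(0, 0.278044754955015),
  MMRF_1030 = c(1.29850876031527, 0.715843203834688),
  MMRF_1031 = c(0.37618693249153, 0.032711952160723),
  row.names = c("ENSG00000007062", "ENSG00000012124")
)
weights <- data.frame(
  Pre.BI = c(0.006368928, 0), Pre.BII = c(0, 0), Immature = c(0, 0),
  Naive = c(0, 0), Memory = c(0, 0), Plasmacell = c(0, -0.009728154),
  row.names = c("ENSG00000007062", "ENSG00000012124")
)

weighted <- df %>%
  rownames_to_column("gene") %>%
  # one row per gene and sample, expression stored in `value`
  pivot_longer(-gene, names_to = "sample", values_to = "value") %>%
  left_join(rownames_to_column(weights, "gene"), by = "gene") %>%
  # multiply every weight column by the expression value in one step
  mutate(across(all_of(names(weights)), ~ .x * value)) %>%
  select(-value) %>%
  # pivot_longer() orders by gene first; sort to match the layout above
  arrange(sample, gene)
```

Selecting the columns with all_of(names(weights)) keeps the code unchanged when cell types are added to or removed from the weights table.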
Answer 1 (score: 1)
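Judging by your expected result, I think this is what you are after. For example, the Plasmacell value for MMRF_1024 ENSG00000012124 is -0.002693559 (0.27688288 * -0.009728154). To get this number, I converted both data frames to long format and then joined them. At that point you have two columns to multiply (gene_value and value). After that, I converted the data back to a wide-format data frame.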
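The code for this answer did not survive in the text, so what follows is only a sketch of the steps it describes, not the answerer's original code. The column names gene_value and value come from the description; cell_type, weighted and result are names assumed here, and pivot_longer()/pivot_wider() (tidyr >= 1.0) are one possible choice for the reshaping.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Data taken from the dput output in the question
df <- data.frame(
  MMRF_1021 = c(0.0537454710193116, 0.134363677548279),
  MMRF_1024 = c(0.0125855939107651, 0.276882875966623),
  MMRF_1029 = c(0, 0.278044754955015),
  MMRF_1030 = c(1.29850876031527, 0.715843203834688),
  MMRF_1031 = c(0.37618693249153, 0.032711952160723),
  row.names = c("ENSG00000007062", "ENSG00000012124")
)
weights <- data.frame(
  Pre.BI = c(0.006368928, 0), Pre.BII = c(0, 0), Immature = c(0, 0),
  Naive = c(0, 0), Memory = c(0, 0), Plasmacell = c(0, -0.009728154),
  row.names = c("ENSG00000007062", "ENSG00000012124")
)

# Expression in long format: one row per (gene, sample)
df_long <- df %>%
  rownames_to_column("gene") %>%
  pivot_longer(-gene, names_to = "sample", values_to = "gene_value")

# Weights in long format: one row per (gene, cell_type)
weights_long <- weights %>%
  rownames_to_column("gene") %>%
  pivot_longer(-gene, names_to = "cell_type", values_to = "value")

# Join on gene, multiply the two value columns, and go back to wide format.
# Each gene occurs several times on both sides, so recent dplyr versions
# warn about a many-to-many join; that is expected here.
result <- df_long %>%
  inner_join(weights_long, by = "gene") %>%
  mutate(weighted = gene_value * value) %>%
  select(-gene_value, -value) %>%
  pivot_wider(names_from = cell_type, values_from = weighted) %>%
  arrange(sample, gene)
```

The result has one row per sample and gene and one column per cell type, matching the layout of the expected output in the question.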
Answer 2 (score: 0)
Here is a base R solution:
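# For each sample column k, multiply the weights (rows aligned to df) by that column,
# then stack the per-sample results
dfout <- do.call(rbind,
                 c(make.row.names = FALSE,
                   lapply(seq_len(ncol(df)),
                          function(k) cbind(Gene = rownames(df),
                                            Sample = names(df)[k],
                                            df[, k] * weights[match(rownames(df), rownames(weights)), ]))))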
which gives:
> dfout
Gene Sample Pre.BI Pre.BII Immature Naive Memory Plasmacell
1 ENSG00000007062 MMRF_1021 3.423010e-04 0 0 0 0 0.0000000000
2 ENSG00000012124 MMRF_1021 0.000000e+00 0 0 0 0 -0.0013071105
3 ENSG00000007062 MMRF_1024 8.015674e-05 0 0 0 0 0.0000000000
4 ENSG00000012124 MMRF_1024 0.000000e+00 0 0 0 0 -0.0026935593
5 ENSG00000007062 MMRF_1029 0.000000e+00 0 0 0 0 0.0000000000
6 ENSG00000012124 MMRF_1029 0.000000e+00 0 0 0 0 -0.0027048622
7 ENSG00000007062 MMRF_1030 8.270109e-03 0 0 0 0 0.0000000000
8 ENSG00000012124 MMRF_1030 0.000000e+00 0 0 0 0 -0.0069638329
9 ENSG00000007062 MMRF_1031 2.395907e-03 0 0 0 0 0.0000000000
10 ENSG00000012124 MMRF_1031 0.000000e+00 0 0 0 0 -0.0003182269
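The alignment step matters when the two data frames list the genes in a different order. In base R, match(x, table) returns, for each element of x, its position in table, so indexing the weights with match(rownames(expression), rownames(weights)) puts the weight rows into the order of the expression rows. A small sketch with hypothetical toy data (the names expr, w, g1 to g3, s1 and a are made up for illustration):

```r
# Hypothetical toy data: same genes, but stored in a different row order
expr <- data.frame(s1 = c(1, 2, 3), row.names = c("g1", "g2", "g3"))
w <- data.frame(a = c(20, 30, 10), row.names = c("g2", "g3", "g1"))

# Position of each expression row name within the weight row names
idx <- match(rownames(expr), rownames(w))   # 3 1 2

# Weight rows reordered to g1, g2, g3
aligned <- w[idx, , drop = FALSE]

# Row-wise products now pair each gene with its own weight
weighted <- expr$s1 * aligned$a             # 10 40 90
```

With only two genes the argument order of match() happens not to matter, because swapping two rows is its own inverse; with three or more genes in a different order it does.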