减去配对列

时间:2017-05-04 09:16:28

标签: r

我有一个包含192列的数据框。我想通过减去基于配对的匹配列来制作96列数据帧。配对信息位于 //add file uploaded handler uploader.addFileUploadedListener(new Plupload.FileUploadedListener() { @Override public void onFileUploaded(PluploadFile file) { File uploadedFile = file.getUploadedFile(); System.out.println("This file was just uploaded: " + uploadedFile.getAbsolutePath()); } }); 列的数据框Pairing中。列Match匹配我想要减去的数据框的列名作为子字符串。

如何使用Pos中的配对信息来确定要减去的列。

Pairing

2 个答案:

答案 0 :(得分:1)

我假设每个Case都有一个Control,反之亦然。将配对data.frame转换为对齐Case和Control似乎最简单。完成后,您可以构建所需的数据框。

## First, recreate your data
Pairing = read.table(text="Match              Pos
Control_70         001_A01
Control_56         001_A02
   Case_70         001_A03
   Case_56         001_A04
Control_21         001_A05
   Case_21         001_A06",
header=TRUE)

matures.cpm.spike.batch = read.table(text=" 001_A01_S1 001_A02_S2 001_A03_S3 001_A04_S4 001_A05_S5 001_A06_S6
hsa-let-7a-5p  16.566813  11.415796  12.400252  22.701457   8.864882  20.442599
hsa-let-7b-5p  15.574190  11.107133  12.196465  17.954547   8.527478  25.788286
hsa-let-7c-5p   5.976763   4.372978   5.984685   9.821348   6.341252   7.480211
hsa-let-7d-3p  16.508818  10.697730  11.001534  18.375286   7.583910  24.974774
hsa-let-7d-5p  13.273824   5.134547   9.456675  11.567230   7.096485  13.294108
hsa-let-7f-5p  13.900711   9.804384  11.481614  20.002110   7.878241  17.295909",
header=TRUE)

## Build Matches to replace your Pairing
Control = Pairing[grep("Control", Pairing$Match),]
Control = Control[order(Control$Match),]
Case = Pairing[grep("Case", Pairing$Match),]
Case = Case[order(Case$Match),]
Matches = cbind(Control, Case)

# Uses Matches to build desired data.frame
Diffs = data.frame(matures.cpm.spike.batch[, Matches[1,4]] - 
        matures.cpm.spike.batch[, Matches[1,2]])
colnames(Diffs)[1] = sub("Control", "Diff", Matches[1,1])
for(i in 2:nrow(Matches)) {
    Diffs[,i] = matures.cpm.spike.batch[, Matches[i,4]] - 
        matures.cpm.spike.batch[, Matches[i,2]]
    colnames(Diffs)[i] = sub("Control", "Diff", Matches[i,1])
}

## Result
    Diff_21   Diff_56   Diff_70
1 11.577717 11.285661 -4.166561
2 17.260808  6.847414 -3.377725
3  1.138959  5.448370  0.007922
4 17.390864  7.677556 -5.507284
5  6.197623  6.432683 -3.817149
6  9.417668 10.197726 -2.419097

答案 1 :(得分:0)

以防万一,采用不同的方法:

我们需要一个经过修改的匹配数据框,其中包含用于案例和控件的单独列:

library(tidyr)
library(reshape2)

P <- Pairing %>% 
    separate(Match, into = c("cc", "ind"), sep = "_") %>% 
    dcast(ind ~ cc, value.var = "Pos")

P:

  ind    Case Control
1  21 001_A06 001_A05
2  56 001_A04 001_A02
3  70 001_A03 001_A01

我们还希望colnames中的matures.cpm.spike.batchP中的姓名匹配:

df <- matures.cpm.spike.batch
colnames(df) <- gsub("^X|_S.*", "", colnames(df))

colnames(df):

[1] "001_A01" "001_A02" "001_A03" "001_A04" "001_A05" "001_A06"

现在我们可以完成以下任务:

case <- df[, P$Case]
control <- df[, P$Control]
res <- case - control

res:

                001_A06   001_A04   001_A03
hsa-let-7a-5p 11.577717 11.285661 -4.166561
hsa-let-7b-5p 17.260808  6.847414 -3.377725
hsa-let-7c-5p  1.138959  5.448370  0.007922
hsa-let-7d-3p 17.390864  7.677556 -5.507284
hsa-let-7d-5p  6.197623  6.432683 -3.817149
hsa-let-7f-5p  9.417668 10.197726 -2.419097