Suppose I have a genotype dataset, geno:
FID rs1 rs2 rs3
1 1 0 2
2 1 1 1
3 0 1 1
4 0 1 0
5 0 0 2
Another dataset is coed:
rs1 rs2 rs3
0.6 0.2 0.3
Running the following code:
geno$rs1 <- geno$rs1 * coed$rs1
geno$rs2 <- geno$rs2 * coed$rs2
geno$rs3 <- geno$rs3 * coed$rs3
sum3 <- rowSums(geno[,c(2:4)])
c <- cbind(geno,sum3)
I get the output I want:
FID rs1 rs2 rs3 sum3
1 0.6 0 0.6 1.2
2 0.6 0.2 0.3 1.1
3 0 0.2 0.3 0.5
4 0 0.2 0 0.2
5 0 0 0.6 0.6
But I have thousands of SNPs, so I tried to build the loop below:
snp <- names(geno)[2:4]
geno.new <- numeric(0)
for (i in snp){
geno.new[i] = geno1[i] * coed[i]
}
The result is not what I expected:
$rs1
[1] 0.6 0.6 0.0 0.0 0.0
$rs2
[1] 0.0 0.2 0.2 0.2 0.0
$rs3
[1] 0.6 0.3 0.3 0.0 0.6
Can someone help me improve this?
Thanks
Answer 0 (score: 0)
I did find a solution; see the code below:
## read the datasets
geno <- read.table("Genotype.csv", header = TRUE, sep = ",")
dim(geno)
coed <- read.table("beta.csv", header = TRUE, sep = ",")
## define the SNP names (columns 2 to 4 in this example)
snp <- names(geno)[2:4]
## multiply each SNP column by its weight; [[ ]] extracts plain vectors
for (i in snp) {
  geno[[i]] <- geno[[i]] * coed[[i]]
}
## calculate the row sums
sum3 <- rowSums(geno[, snp])
## combine the results
result <- cbind(geno, sum3)
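With thousands of SNPs, the loop can probably be avoided altogether. The following is only a sketch of one vectorized alternative, not the method used above: it rebuilds the two example tables inline instead of reading Genotype.csv and beta.csv, and the names w, weighted and result are made up for illustration. It assumes coed has exactly one row and that every column of geno other than FID is a SNP with a matching column in coed.

```r
# Example data copied from the question (inline instead of read from CSV)
geno <- data.frame(FID = 1:5,
                   rs1 = c(1, 1, 0, 0, 0),
                   rs2 = c(0, 1, 1, 1, 0),
                   rs3 = c(2, 1, 1, 0, 2))
coed <- data.frame(rs1 = 0.6, rs2 = 0.2, rs3 = 0.3)

# Assumption: every column except FID is a SNP
snp <- setdiff(names(geno), "FID")

# Turn the single row of weights into a named numeric vector,
# ordered to match the SNP columns of geno
w <- unlist(coed[1, snp])

# Multiply each column of the genotype matrix by its weight
weighted <- sweep(as.matrix(geno[snp]), 2, w, `*`)

# Put FID, the weighted genotypes and the row sums back together
result <- data.frame(FID = geno$FID, weighted, sum3 = rowSums(weighted))
print(result)
```

If only the per-individual sum is needed, as.matrix(geno[snp]) %*% w should give the same totals without storing the weighted table.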