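This is a two-part question. I have a dataset and am trying to add selected columns together, but I also want to recode the data first so that the addition is easier. Here is an example of my dataset, which is called ChrData: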
ChrData
Chr location sample1 sample2 sample3 sample4 sample5
1 1 34234 ./. 0/1 1/1 0/1 0/0
2 1 5677876 0/1 1/1 1/2 0/0 1/1
3 1 75424 ./. ./. 1/1 0/1 0/0
4 1 98654 1/1 0/1 1/1 0/0 0/0
5 1 4534 1/1 0/1 ./. 0/0 2/2
So I want to set:
./. = 0
0/0 = 0
0/1 = 1
1/2 = 1
1/1 = 2
2/2 = 2
and then add these columns:
ChrData$sample1 + ChrData$sample2 + ChrData$sample4
and also:
ChrData$sample3 + ChrData$sample5
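and then use the results to create two new columns. I'm just not sure how to make R recognize the new values and apply the recoding to every cell.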
Answer 0 (score: 1)
A basic function to start with, assuming every element in the sample columns is a character string:
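# Recode genotype strings to numeric codes; returns a numeric vector
replacement <- function(x) {
  x <- as.character(x)  # guard against factor columns
  x <- replace(x, which(x == "./."), 0)
  x <- replace(x, which(x == "0/0"), 0)
  x <- replace(x, which(x == "0/1"), 1)
  x <- replace(x, which(x == "1/2"), 1)
  x <- replace(x, which(x == "1/1"), 2)
  x <- replace(x, which(x == "2/2"), 2)
  as.numeric(x)
}
# Apply to the sample columns only, so ChrData stays a data frame
# (apply() over the whole table would return a character matrix, and $ would then fail)
sample_cols <- grep("^sample", names(ChrData))
ChrData[sample_cols] <- lapply(ChrData[sample_cols], replacement)
ChrData$Sum1 <- ChrData$sample1 + ChrData$sample2 + ChrData$sample4
ChrData$Sum2 <- ChrData$sample3 + ChrData$sample5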
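The same recoding can also be sketched in base R with a named lookup vector in place of the repeated replace() calls. The names `lookup` and `sample_cols`, and the two-row data frame below, are illustrative assumptions rather than part of the original answer.

```r
# Illustrative data: two rows in the same layout as ChrData (assumed for demonstration)
ChrData <- data.frame(
  Chr = c(1, 1),
  location = c(34234, 5677876),
  sample1 = c("./.", "0/1"),
  sample2 = c("0/1", "1/1"),
  sample3 = c("1/1", "1/2"),
  sample4 = c("0/1", "0/0"),
  sample5 = c("0/0", "1/1"),
  stringsAsFactors = FALSE
)

# Named vector: genotype string -> numeric code requested in the question
lookup <- c("./." = 0, "0/0" = 0, "0/1" = 1, "1/2" = 1, "1/1" = 2, "2/2" = 2)

# Recode each sample column by indexing the lookup with the genotype strings
sample_cols <- grep("^sample", names(ChrData), value = TRUE)
ChrData[sample_cols] <- lapply(
  ChrData[sample_cols],
  function(x) unname(lookup[as.character(x)])
)

# Row-wise sums for the two sample groups
ChrData$Sum1 <- rowSums(ChrData[c("sample1", "sample2", "sample4")])
ChrData$Sum2 <- rowSums(ChrData[c("sample3", "sample5")])
```

Indexing a named vector does the whole recode in one step per column, and rowSums() avoids spelling out each addition when a group has many samples.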
Answer 1 (score: 1)
Using dplyr:
# reproducible data
ChrData <- read.table(text = "
Chr location sample1 sample2 sample3 sample4 sample5
1 1 34234 ./. 0/1 1/1 0/1 0/0
2 1 5677876 0/1 1/1 1/2 0/0 1/1
3 1 75424 ./. ./. 1/1 0/1 0/0
4 1 98654 1/1 0/1 1/1 0/0 0/0
5 1 4534 1/1 0/1 ./. 0/0 2/2", stringsAsFactors = FALSE)
library(dplyr)
# make lookup map
MAP <- setNames(c(0,0,1,1,2,2), c("./.","0/0","0/1","1/2","1/1","2/2"))
# convert using MAP, then rowsums per sample groups
ChrData <- ChrData %>%
  # across() requires dplyr >= 1.0.0; funs() has been deprecated since dplyr 0.8.0
  mutate(across(starts_with("sample"), ~ unname(MAP[.x]))) %>%
  mutate(s124 = rowSums(across(c(sample1, sample2, sample4))),
         s35 = rowSums(across(c(sample3, sample5))))
ChrData
# Chr location sample1 sample2 sample3 sample4 sample5 s124 s35
# 1 1 34234 0 1 2 1 0 2 2
# 2 1 5677876 1 2 1 0 2 3 3
# 3 1 75424 0 0 2 1 0 1 2
# 4 1 98654 2 1 2 0 0 3 2
# 5 1 4534 2 1 0 0 2 3 2
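One caveat with lookup-based recoding: a genotype that is not a name in MAP comes back as NA, which then propagates into the row sums. A minimal base R sketch of a pre-check follows; the `genotypes` vector and its "0/2" entry are hypothetical inputs chosen to trigger the check, not values from the question's data.

```r
MAP <- setNames(c(0, 0, 1, 1, 2, 2), c("./.", "0/0", "0/1", "1/2", "1/1", "2/2"))

# Hypothetical input containing a genotype ("0/2") that MAP does not cover
genotypes <- c("0/1", "0/2", "./.", "2/2")

# Report any genotype that has no entry in MAP before recoding
unmapped <- setdiff(unique(genotypes), names(MAP))
if (length(unmapped) > 0) {
  message("Unmapped genotypes: ", paste(unmapped, collapse = ", "))
}

# Unmapped entries become NA in the recoded vector
recoded <- unname(MAP[genotypes])
```

Whether an unmapped genotype should count as 0, be dropped, or stop the script depends on the analysis, so the sketch only reports it.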