How do I perform arithmetic on the values within a single column of a data.table?

Asked: 2017-05-08 03:46:55

Tags: r

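My data.table has a factor column that I want to perform arithmetic on. Each entry holds three colon-separated pairs. I want to add up the three values to the left of the colons, add up the three values to the right of the colons, and return the two sums as a single ratio. It is tricky to explain, but suppose this is part of the data table: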

     FattyAcid
1    4:0/16:0/16:0
2    16:0/16:0/18:1
3    18:1/14:0/18:1

Then I would like to get back this data table:

     FattyAcid        Assignment
1    4:0/16:0/16:0    36:0
2    16:0/16:0/18:1   50:1
3    18:1/14:0/18:1   50:2

That is, for entry 1, (4 + 16 + 16):(0 + 0 + 0) = 36:0.
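The arithmetic above can be sketched in base R as follows. This is a minimal illustration on the three example values only; `sum_ratio` is a hypothetical helper name and is not part of any package.

```r
# Example values copied from the table above.
fatty_acid <- factor(c("4:0/16:0/16:0", "16:0/16:0/18:1", "18:1/14:0/18:1"))

# sum_ratio() is a hypothetical helper, named here only for illustration.
sum_ratio <- function(x) {
  # Split on both "/" and ":" so "4:0/16:0/16:0" becomes 4 0 16 0 16 0.
  parts <- as.numeric(strsplit(x, "[/:]")[[1]])
  # Odd positions are the left-hand values, even positions the right-hand ones.
  left <- sum(parts[c(TRUE, FALSE)])
  right <- sum(parts[c(FALSE, TRUE)])
  paste(left, right, sep = ":")
}

# strsplit() does not accept factors, so convert to character first.
assignment <- vapply(as.character(fatty_acid), sum_ratio, character(1),
                     USE.NAMES = FALSE)
print(assignment)
# [1] "36:0" "50:1" "50:2"
```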

When I call str() on the data set, it shows the relevant column as: Factor w/ 179 levels "(10:0/10:0/12:0)",..: 112 104 114 33 61 115 106 30 60 66 ...
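The integers in that str() output are the factor's internal level codes, not the text itself, so the column has to go through as.character() before any splitting. A small sketch on made-up values that carry the same surrounding parentheses as the level shown above:

```r
# Toy factor; the parentheses mimic the level printed by str().
f <- factor(c("(4:0/16:0/16:0)", "(16:0/16:0/18:1)", "(18:1/14:0/18:1)"))

codes <- as.integer(f)     # internal level codes, like the 112 104 114 ... above
labels <- as.character(f)  # the actual text of each entry

# Drop the opening and closing parentheses in a single pass.
cleaned <- gsub("[()]", "", labels)
print(cleaned)
```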

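Edit: I have found a solution, but it is not elegant. Basically, I had to use tstrsplit() to split the values and paste them into new columns, which eventually produces six columns. I then convert those from character to numeric, add up the relevant columns, and combine the two sums into one result. Finally, I just delete the old columns. I am sure there is a better way, but I guess it works :)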

### split up the fatty acid factors into three columns separated by "/", i.e. individual ID'd fatty acids.
### also remove the starting and trailing brackets
setDT(LipidDataShortest)[, paste0("FattyAcid", 1:3) := tstrsplit(FattyAcid, "/")]
LipidDataShortest <- as.data.table(sapply(LipidDataShortest, gsub, pattern="[(]", replacement = ""))
LipidDataShortest <- as.data.table(sapply(LipidDataShortest, gsub, pattern="[)]", replacement = ""))

### small issue - also removes bracket from "FattyAcid" column. Way to remove only from specific columns?

### split up the specific fatty acids into number of carbons and number of double bonds
setDT(LipidDataShortest)[, paste0("FattyAcidOne", 1:2) := tstrsplit(FattyAcid1, ":")]
setDT(LipidDataShortest)[, paste0("FattyAcidTwo", 1:2) := tstrsplit(FattyAcid2, ":")]
setDT(LipidDataShortest)[, paste0("FattyAcidThree", 1:2) := tstrsplit(FattyAcid3, ":")]

### convert from character to numeric
LipidDataShortest$FattyAcidOne1 <- as.numeric(LipidDataShortest$FattyAcidOne1)
LipidDataShortest$FattyAcidOne2 <- as.numeric(LipidDataShortest$FattyAcidOne2)
LipidDataShortest$FattyAcidTwo1 <- as.numeric(LipidDataShortest$FattyAcidTwo1)
LipidDataShortest$FattyAcidTwo2 <- as.numeric(LipidDataShortest$FattyAcidTwo2)
LipidDataShortest$FattyAcidThree1 <- as.numeric(LipidDataShortest$FattyAcidThree1)
LipidDataShortest$FattyAcidThree2 <- as.numeric(LipidDataShortest$FattyAcidThree2)

### combine the columns to get total carbons and create new column for that, then repeat for alkenes
setDT(LipidDataShortest)[, paste0("Carbons", 1) := LipidDataShortest$FattyAcidOne1 + LipidDataShortest$FattyAcidTwo1 + LipidDataShortest$FattyAcidThree1 ]
setDT(LipidDataShortest)[, paste0("DoubleBonds", 1) := LipidDataShortest$FattyAcidOne2 + LipidDataShortest$FattyAcidTwo2 + LipidDataShortest$FattyAcidThree2 ]

### combine final assignments into new column and delete the unnecessary columns used to get to this point
LipidDataShortest$Assignment <- paste(LipidDataShortest$Carbons1, LipidDataShortest$DoubleBonds1, sep = ":")
LipidDataShortest <- LipidDataShortest[, -c(10:20)]
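
A more compact variant of the steps above, offered as a sketch rather than a definitive answer. It assumes the data.table package is installed, uses a toy table `dt` as a stand-in for `LipidDataShortest`, and assumes every entry has exactly three colon-separated pairs. It also restricts the bracket removal to named columns through `.SDcols`, which addresses the "specific columns" question in the comments above.

```r
library(data.table)

# Toy stand-in for LipidDataShortest (assumption: values carry parentheses).
dt <- data.table(
  FattyAcid = factor(c("(4:0/16:0/16:0)", "(16:0/16:0/18:1)", "(18:1/14:0/18:1)"))
)

# Strip brackets from the listed columns only; all other columns stay untouched.
cols <- "FattyAcid"
dt[, (cols) := lapply(.SD, function(x) gsub("[()]", "", as.character(x))),
   .SDcols = cols]

# Split on "/" and ":" in one pass; type.convert = TRUE yields numeric columns.
# Columns 1, 3, 5 hold carbon counts and columns 2, 4, 6 hold double-bond counts.
parts <- dt[, tstrsplit(FattyAcid, "[/:]", type.convert = TRUE)]

carbons <- rowSums(parts[, c(1, 3, 5), with = FALSE])
double_bonds <- rowSums(parts[, c(2, 4, 6), with = FALSE])

# Only the final column is added, so no helper columns need deleting afterwards.
dt[, Assignment := paste(carbons, double_bonds, sep = ":")]
print(dt)
```

Because the intermediate values live in a separate object (`parts`), there is no need to drop columns by position afterwards, which avoids hard-coded indices such as `-c(10:20)`.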

0 answers:

No answers yet