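I want to create a third column that holds the sum of the element-wise differences between the components of the first two columns.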
x <- data.frame("Start" = c("21,22", "14,15", "2,4,6,8,10"),
                "End" = c("31,32", "19,20", "12,14,16,18,20"))
Row 1, column 3 should be (31-21)+(32-22) = 20.
Row 2, column 3 should be (19-14)+(20-15) = 10.
Row 3, column 3 should be (12-2)+(14-4)+(16-6)+(18-8)+(20-10) = 50.
Answer 0 (score: 0)
Try:
# install.packages("tidyverse") # if needed
library(tidyverse)
final <- x %>%
  # split each comma-separated string into a list-column
  mutate(startList = str_split(Start, ","),
         endList = str_split(End, ",")) %>%
  # one row per Start/End pair (tidyr >= 1.0 syntax)
  unnest(cols = c(startList, endList)) %>%
  mutate(subtraction = as.numeric(endList) - as.numeric(startList)) %>%
  group_by(Start, End) %>%
  mutate(calc = sum(subtraction)) %>%
  slice(1) %>%   # keep one row per group
  ungroup() %>%
  select(Start, End, calc)
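The same split, unnest, and sum idea can probably be written more compactly with summarise. The following is only a sketch, assuming tidyr >= 1.0 and dplyr >= 1.0; row_id and final2 are hypothetical names added here (not part of the answer above) so that two rows with identical Start and End values are not merged and the original row order is kept.

```r
library(dplyr)
library(tidyr)
library(stringr)

x <- data.frame(Start = c("21,22", "14,15", "2,4,6,8,10"),
                End = c("31,32", "19,20", "12,14,16,18,20"),
                stringsAsFactors = FALSE)

final2 <- x %>%
  mutate(row_id = row_number(),            # hypothetical helper column
         startList = str_split(Start, ","),
         endList = str_split(End, ",")) %>%
  unnest(cols = c(startList, endList)) %>% # one row per Start/End pair
  group_by(row_id, Start, End) %>%
  summarise(calc = sum(as.numeric(endList) - as.numeric(startList)),
            .groups = "drop") %>%          # one row per original row
  select(Start, End, calc)
```

Grouping by row_id first means the result comes back in the order of the input rows rather than sorted by Start.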
Answer 1 (score: 0)
The following uses the tidyverse: map from purrr and str_split from stringr:
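library(tidyverse)
# Sum the numeric values of one split string
get_sum <- function(z) {
  sum(as.numeric(z))
}
# sum(End - Start) equals sum(End) - sum(Start), so subtract the totals
x %>%
  mutate(col3 = unlist(map(str_split(End, ","), get_sum)) -
           unlist(map(str_split(Start, ","), get_sum)))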
       Start            End col3
1      21,22          31,32   20
2      14,15          19,20   10
3 2,4,6,8,10 12,14,16,18,20   50
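This answer subtracts the two totals instead of subtracting element by element. That gives the same result because the sum of the differences equals the difference of the sums, as long as Start and End contain the same number of values in each row. A small base R check of that identity, as a sketch (starts, ends, elementwise and by_sums are illustrative names, not from the answer):

```r
x <- data.frame(Start = c("21,22", "14,15", "2,4,6,8,10"),
                End = c("31,32", "19,20", "12,14,16,18,20"),
                stringsAsFactors = FALSE)

# Split each string on commas and convert the pieces to numbers
starts <- lapply(strsplit(x$Start, ","), as.numeric)
ends <- lapply(strsplit(x$End, ","), as.numeric)

# Element-wise: subtract pair by pair, then sum within each row
elementwise <- mapply(function(s, e) sum(e - s), starts, ends)

# By totals: sum each side first, then subtract
by_sums <- sapply(ends, sum) - sapply(starts, sum)

# Both approaches should agree on this data
stopifnot(isTRUE(all.equal(elementwise, by_sums)))
```

If a row had a different number of values in Start and End, the two approaches would no longer mean the same thing, so it may be worth checking lengths(starts) against lengths(ends) first.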
Answer 2 (score: 0)
The scan function can read each value as if it were a small CSV file; the results can then be processed with colSums and diff.
apply(x, 1, function(z) {
  diff(colSums(                  # the difference of the column sums
    sapply(z,                    # sapply returns a two-column matrix
           function(y) as.numeric(scan(text = y, sep = ",", what = "")))
  ))
})
Read 2 items
Read 2 items
Read 2 items
Read 2 items
Read 5 items
Read 5 items
[1] 20 10 50
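The "Read N items" lines are progress messages printed by scan. A possible variant, sketched here under the assumption that the values are always plain numbers, lets scan read them as numeric directly (its default) and passes quiet = TRUE to suppress those messages; res is an illustrative name, not from the answer.

```r
x <- data.frame(Start = c("21,22", "14,15", "2,4,6,8,10"),
                End = c("31,32", "19,20", "12,14,16,18,20"),
                stringsAsFactors = FALSE)

res <- apply(x, 1, function(z) {
  # Total of each cell: scan parses the comma-separated numbers silently
  sums <- sapply(z, function(y) sum(scan(text = y, sep = ",", quiet = TRUE)))
  diff(sums)  # End total minus Start total
})
res <- unname(res)  # drop any names carried over from apply/sapply
```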