I want to create the last column ('desired_result') from the first three columns ('group', 'animal', and 'full'). Code for a reproducible example is below.
library(data.table)
data = data.table(group = c(1,1,1,2,2,2), animal = c('cat', 'dog', 'pig', 'giraffe', 'lion', 'tiger'), desired_result = c('dog, pig', 'cat, pig', 'cat, dog', 'lion, tiger', 'giraffe, tiger', 'giraffe, lion'))
data[, full := list(list(animal)), by = 'group']
data = data[, .(group, animal, full, desired_result)]
data
group animal full desired_result
1: 1 cat cat,dog,pig dog, pig
2: 1 dog cat,dog,pig cat, pig
3: 1 pig cat,dog,pig cat, dog
4: 2 giraffe giraffe,lion,tiger lion, tiger
5: 2 lion giraffe,lion,tiger giraffe, tiger
6: 2 tiger giraffe,lion,tiger giraffe, lion
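Basically, I want to modify 'full' so that it does not include the corresponding 'animal'. I have tried various lapply commands on both the list and character versions of these columns, but could not solve it.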
Answer 0 (score: 3)
Here is one possible approach:
data[, desired_result := {
temp <- unique(unlist(full))
toString(temp[-match(animal, temp)])
}, by = .(group, animal)]
data
# group animal full desired_result
# 1: 1 cat cat,dog,pig dog, pig
# 2: 1 dog cat,dog,pig cat, pig
# 3: 1 pig cat,dog,pig cat, dog
# 4: 2 giraffe giraffe,lion,tiger lion, tiger
# 5: 2 lion giraffe,lion,tiger giraffe, tiger
# 6: 2 tiger giraffe,lion,tiger giraffe, lion
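The key step above is temp[-match(animal, temp)]: match() finds the position of the current animal in the group's vector, and the negative index drops it. Below is a minimal base R sketch of that step on a standalone vector. The names members and current are made up for illustration and do not appear in the original answer.

```r
# Hypothetical stand-ins for one group's 'full' entry and one row's 'animal'.
members <- c("cat", "dog", "pig")
current <- "dog"

# match() gives the position of 'current' inside 'members'.
pos <- match(current, members)

# A negative index drops that position; toString() joins the rest with ", ".
others <- toString(members[-pos])
print(others)

# Caveat: if the value is absent, match() returns NA and negative indexing
# no longer returns the full vector. setdiff() handles that case.
safe_others <- toString(setdiff(members, "zebra"))
print(safe_others)
```

If an animal could ever be missing from its group's vector, the setdiff() form is probably the safer choice.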
Answer 1 (score: 3)
Another option:
data[, desired := .(Map(setdiff, list(animal), as.list(animal))), by = group]
# or, if starting from the existing 'full' column
data[, desired := .(Map(setdiff, full, animal))]
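(Recycling is what makes the first version work: the length-one list(animal) is reused for every element of as.list(animal).)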
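To make the recycling concrete, here is a small base R sketch outside of data.table. It assumes a single group; the vector animal below is a hand-built stand-in for that group's values, and by_hand is a name introduced only for comparison.

```r
# Hypothetical stand-in for the 'animal' values of one group.
animal <- c("cat", "dog", "pig")

# list(animal) has length 1; as.list(animal) has length 3.
# Map() recycles the shorter argument, so setdiff() is called three times,
# each time with the whole vector as x and a single animal as y.
recycled <- Map(setdiff, list(animal), as.list(animal))

# The same three calls written out explicitly, for comparison.
by_hand <- lapply(animal, function(a) setdiff(animal, a))

print(unname(recycled))
```

Inside data[, ..., by = group], the same thing happens once per group, which is why the result lines up row by row.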
Answer 2 (score: 1)
I found a way too!
By turning 'animal' into a list, I can use mapply.
data$animal = strsplit(data$animal, ' ')
data$check = mapply(function(x, y) {list(x[x != y]) }, data$full, data$animal)
data
group animal full desired_result check
1: 1 cat cat,dog,pig dog, pig dog,pig
2: 1 dog cat,dog,pig cat, pig cat,pig
3: 1 pig cat,dog,pig cat, dog cat,dog
4: 2 giraffe giraffe,lion,tiger lion, tiger lion,tiger
5: 2 lion giraffe,lion,tiger giraffe, tiger giraffe,tiger
6: 2 tiger giraffe,lion,tiger giraffe, lion giraffe,lion
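Note that 'check' built this way is a list column of character vectors, whereas 'desired_result' in the question holds one comma-separated string per row. If the string form is needed, one option is to collapse each element with toString(). Below is a small base R sketch on a plain list; the object check is a hand-built stand-in for the first group's entries, not the actual column, and check_chr is a name introduced for illustration.

```r
# Hand-built stand-in for the first three elements of the 'check' list column.
check <- list(c("dog", "pig"), c("cat", "pig"), c("cat", "dog"))

# Collapse each character vector into a single ", "-separated string.
# vapply() guarantees a character vector with one string per element.
check_chr <- vapply(check, toString, character(1))
print(check_chr)
```

On the data.table itself, the equivalent would presumably be data[, check_chr := vapply(check, toString, character(1))].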