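I need to alphabetically sort the elements in a "list" column of a data.table and coerce them to a character vector stored in another, intermediate column of the same data.table. At the moment I cannot spot what goes wrong with the first row.
The following code generates the original data.table: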
library(data.table)
my_dt <- data.table(A = rep(1:5, 3), B = rnorm(15, mean = 10, sd = 2), C = list(c("mango", "pear", "apple")))
Here col. C is a list column in which the elements "mango", "pear" and "apple" are repeated across all 15 rows of my_dt.
For example, my_dt$C[1] yields:
[[1]]
[1] "mango" "pear" "apple"
Next, I want to sort the elements within each row and store the result in col. D of my_dt. I am using the following code to sort and fill the column:
for (lmn in 1:nrow(my_dt)) {
  word1 <- sapply(my_dt$C[lmn], '[[', 1)
  word2 <- sapply(my_dt$C[lmn], '[[', 2)
  word3 <- sapply(my_dt$C[lmn], '[[', 3)
  my_dt$D[lmn] <- list(sort(c(word1, word2, word3)))
}
However, when I print the output, i.e. my_dt, I see the following:
A B C D
1: 1 7.781597 mango,pear,apple apple
2: 2 10.267061 mango,pear,apple apple,mango,pear
3: 3 10.670469 mango,pear,apple apple,mango,pear
4: 4 10.252527 mango,pear,apple apple,mango,pear
5: 5 10.605396 mango,pear,apple apple,mango,pear
6: 1 13.054545 mango,pear,apple apple,mango,pear
7: 2 12.401846 mango,pear,apple apple,mango,pear
8: 3 11.094550 mango,pear,apple apple,mango,pear
9: 4 10.220841 mango,pear,apple apple,mango,pear
10: 5 11.452469 mango,pear,apple apple,mango,pear
11: 1 11.827297 mango,pear,apple apple,mango,pear
12: 2 6.918918 mango,pear,apple apple,mango,pear
13: 3 9.757636 mango,pear,apple apple,mango,pear
14: 4 13.432524 mango,pear,apple apple,mango,pear
15: 5 10.648629 mango,pear,apple apple,mango,pear
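I am not sure why row 1 of col. D shows only apple, while the remaining rows of the same column contain all 3 sorted elements (apple, mango and pear). Ideally, I would like the entries in col. D to be consistent rather than partially filled (as in row 1).
Thanks.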
Answer 0: (score: 2)
You could simplify your code and use unlist before you sort the list elements:
my_dt[, D := toString(sort(unlist(C))), by = 1:nrow(my_dt)][]
# A B C D
# 1: 1 9.245525 mango,pear,apple apple, mango, pear
# 2: 2 10.195239 mango,pear,apple apple, mango, pear
# 3: 3 13.277489 mango,pear,apple apple, mango, pear
# 4: 4 8.248815 mango,pear,apple apple, mango, pear
# 5: 5 10.243520 mango,pear,apple apple, mango, pear
# 6: 1 12.724261 mango,pear,apple apple, mango, pear
# 7: 2 9.530758 mango,pear,apple apple, mango, pear
# 8: 3 7.893234 mango,pear,apple apple, mango, pear
# 9: 4 8.260433 mango,pear,apple apple, mango, pear
#10: 5 9.219746 mango,pear,apple apple, mango, pear
#11: 1 8.305300 mango,pear,apple apple, mango, pear
#12: 2 9.478721 mango,pear,apple apple, mango, pear
#13: 3 9.171161 mango,pear,apple apple, mango, pear
#14: 4 9.633898 mango,pear,apple apple, mango, pear
#15: 5 10.814112 mango,pear,apple apple, mango, pear
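A possible variant of the same idea that does not group by row number is sketched below. It assumes that every element of C is a character vector. The column name D_chr is made up for illustration and does not appear in the question.

```r
library(data.table)

# Same structure as the data in the question
my_dt <- data.table(A = rep(1:5, 3),
                    B = rnorm(15, mean = 10, sd = 2),
                    C = list(c("mango", "pear", "apple")))

# vapply() walks over the list column one element at a time and must return
# exactly one string per row, so the new column is a plain character vector
my_dt[, D_chr := vapply(C, function(x) toString(sort(x)), character(1))]

my_dt$D_chr[1]
# [1] "apple, mango, pear"
```

Because vapply() checks that each result is a single string, a row whose C element is not what you expect should raise an error instead of silently producing a partially filled column.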
If column D should be a list column, use
my_dt[, D := list(list(sort(unlist(C)))), by = 1:nrow(my_dt)]
my_dt
See Arun's answer in the following post: Using lists inside data.table columns
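For the list-column case, a similar sketch without by = could use lapply(). The column name D_list is hypothetical. Wrapping the result in an outer list() is meant to make data.table treat the inner list as a single column, in the same spirit as the list(list(...)) call above.

```r
library(data.table)

# Same structure as the data in the question
my_dt <- data.table(A = rep(1:5, 3),
                    B = rnorm(15, mean = 10, sd = 2),
                    C = list(c("mango", "pear", "apple")))

# lapply() returns one sorted character vector per row; the outer list()
# marks that whole result as the value of a single (list) column
my_dt[, D_list := list(lapply(C, sort))]

# Every row should now hold the same number of sorted elements
lengths(my_dt$D_list)
my_dt$D_list[[1]]
# [1] "apple" "mango" "pear"
```

Checking lengths() on the new column is a quick way to confirm that no row was left partially filled.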