data.table in R

Time: 2019-04-30 11:12:37

Tags: r data.table

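I need to sort the elements of a list column in a data.table alphabetically and store the result as a character vector in another, intermediate column of the same data.table. So far I cannot figure out what goes wrong in the first row.

The following code generates the original data.table: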

library(data.table)
my_dt <- data.table(A = rep(1:5, 3), B = rnorm(15, mean = 10, sd = 2), C = list(c("mango", "pear", "apple")))

Here, column C is a list column that holds the elements "mango", "pear", and "apple" in each of the 15 rows of my_dt.

For example, my_dt$C[1] yields:

[[1]]
[1] "mango" "pear" "apple"

Next, I want to sort the elements in each row and store them in column D of my_dt. I am using the following code to sort the elements and fill the column:

for (lmn in 1:nrow(my_dt)){
  word1 <- sapply(my_dt$C[lmn], '[[', 1)
  word2 <- sapply(my_dt$C[lmn], '[[', 2)
  word3 <- sapply(my_dt$C[lmn], '[[', 3)
  my_dt$D[lmn] <- list(sort(c(word1, word2, word3)))
}

However, when I print my_dt, I see the following:

    A         B                C                D
 1: 1  7.781597 mango,pear,apple            apple
 2: 2 10.267061 mango,pear,apple apple,mango,pear
 3: 3 10.670469 mango,pear,apple apple,mango,pear
 4: 4 10.252527 mango,pear,apple apple,mango,pear
 5: 5 10.605396 mango,pear,apple apple,mango,pear
 6: 1 13.054545 mango,pear,apple apple,mango,pear
 7: 2 12.401846 mango,pear,apple apple,mango,pear
 8: 3 11.094550 mango,pear,apple apple,mango,pear
 9: 4 10.220841 mango,pear,apple apple,mango,pear
10: 5 11.452469 mango,pear,apple apple,mango,pear
11: 1 11.827297 mango,pear,apple apple,mango,pear
12: 2  6.918918 mango,pear,apple apple,mango,pear
13: 3  9.757636 mango,pear,apple apple,mango,pear
14: 4 13.432524 mango,pear,apple apple,mango,pear
15: 5 10.648629 mango,pear,apple apple,mango,pear

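I am not sure why the first entry of column D shows only apple, while the remaining rows of the same column hold all three sorted elements (apple, mango, and pear). Ideally, I would like the entries in column D to be consistent rather than partially filled (as in row 1).

Thanks.

1 Answer:

Answer 0 (score: 2)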

You can simplify the code by calling unlist on the list element before sort:

my_dt[, D := toString(sort(unlist(C))), by = 1:nrow(my_dt)][]
#    A         B                C                  D
# 1: 1  9.245525 mango,pear,apple apple, mango, pear
# 2: 2 10.195239 mango,pear,apple apple, mango, pear
# 3: 3 13.277489 mango,pear,apple apple, mango, pear
# 4: 4  8.248815 mango,pear,apple apple, mango, pear
# 5: 5 10.243520 mango,pear,apple apple, mango, pear
# 6: 1 12.724261 mango,pear,apple apple, mango, pear
# 7: 2  9.530758 mango,pear,apple apple, mango, pear
# 8: 3  7.893234 mango,pear,apple apple, mango, pear
# 9: 4  8.260433 mango,pear,apple apple, mango, pear
#10: 5  9.219746 mango,pear,apple apple, mango, pear
#11: 1  8.305300 mango,pear,apple apple, mango, pear
#12: 2  9.478721 mango,pear,apple apple, mango, pear
#13: 3  9.171161 mango,pear,apple apple, mango, pear
#14: 4  9.633898 mango,pear,apple apple, mango, pear
#15: 5 10.814112 mango,pear,apple apple, mango, pear
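To see what happens inside a single row, here is a minimal base R sketch of the unlist, sort, toString chain. The variable names x, sorted, and joined are made up for illustration and are not part of the original answer.

```r
# One cell of the list column: a list holding a single character vector
x <- list(c("mango", "pear", "apple"))

# unlist() flattens the list to a plain character vector,
# sort() then orders it alphabetically
sorted <- sort(unlist(x))

# toString() collapses the vector into one comma-separated string
joined <- toString(sorted)

print(sorted)  # "apple" "mango" "pear"
print(joined)  # "apple, mango, pear"
```

Because toString returns a single string, column D ends up as an ordinary character column in this variant, not a list column.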

If column D should be a list column, use

my_dt[, D := list(list(sort(unlist(C)))), by = 1:nrow(my_dt)]
my_dt
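As a quick sanity check, the sketch below rebuilds the table and confirms that the list-column variant fills every row of D consistently, including row 1. The set.seed call and the all_sorted helper variable are assumptions added only to make the check reproducible; they do not appear in the original post.

```r
library(data.table)

set.seed(1)  # assumption: seed added only for reproducibility of column B
my_dt <- data.table(A = rep(1:5, 3),
                    B = rnorm(15, mean = 10, sd = 2),
                    C = list(c("mango", "pear", "apple")))

# List-column variant from the answer: one sorted character vector per row
my_dt[, D := list(list(sort(unlist(C)))), by = 1:nrow(my_dt)]

# Check that every row of D holds all three elements in alphabetical order
all_sorted <- all(vapply(my_dt$D, identical, logical(1),
                         y = c("apple", "mango", "pear")))

print(is.list(my_dt$D))  # D is a list column
print(all_sorted)        # every row matches the sorted vector
```

Wrapping the result in list(list(...)) is what keeps each sorted vector together as one cell instead of spreading it across rows.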

See Arun's answer in the following post: Using lists inside data.table columns