I have a list column in a data.table. For each row, I want to unlist the elements of that list and keep only the unique ones.
library(data.table)
library(stringr)
Data<-data.table(
X=sample(1:10),
Y=list(c("between","between","before","pm"),c("am","in","at","am"),c("at","pm"),c("after","after","on"),c("on","am","on"),c("at","between","at"),c("at","between"),c("at","at","on"),c("pm","pm","am"),c("between","between","pm","between","pm","between","pm")))
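Now I want to get the unique elements of each list, as well as the number of elements in it.
For example, in the first row the list contains 4 elements, and "between", "before", "pm" are its unique elements.
So I tried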
Data[,unique_elements:=unique(Y),by=list(X)]
Data[,count:=length(Y),by=list(X)]
But neither of these gives the result I expected, and I don't know where I am going wrong. Any help is appreciated.
Answer 0 (score: 2)
We can use lapply to get the unique values of each element of Y, and lengths to get the length of each element of Y.
library(data.table)
Data[, c("unique_vals", "count") := list(lapply(Y, unique), lengths(Y))]
Data
#      X                                         Y       unique_vals count
#  1: 10                 between,between,before,pm between,before,pm     4
#  2:  4                               am,in,at,am          am,in,at     4
#  3:  3                                     at,pm             at,pm     2
#  4:  6                            after,after,on          after,on     3
#  5:  5                                  on,am,on             on,am     3
#  6:  1                             at,between,at        at,between     3
#  7:  8                                at,between        at,between     2
#  8:  7                                  at,at,on             at,on     3
#  9:  9                                  pm,pm,am             pm,am     3
# 10:  2 between,between,pm,between,pm,between,...        between,pm     7
However, this solution is not specific to data.table; the same approach works with dplyr or base R:
library(dplyr)
Data %>%
  mutate(unique_vals = purrr::map(Y, unique),
         count = lengths(Y))
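The answer mentions base R without showing it. A minimal sketch of what that might look like follows; the list `Y` here is a shortened stand-in for `Data$Y`, and the names `unique_vals`, `count` and `n_unique` are only illustrative.

```r
# Stand-in for the list column Data$Y (first three rows only).
Y <- list(c("between", "between", "before", "pm"),
          c("am", "in", "at", "am"),
          c("at", "pm"))

# unique() applied to each vector in the list; the result is again a list.
unique_vals <- lapply(Y, unique)

# lengths() returns the length of each list element as an integer vector.
count <- lengths(Y)

# Number of distinct values per row, if that is also of interest.
n_unique <- lengths(unique_vals)
```

With the real data, the same two calls could be assigned back with `Data$unique_vals <- lapply(Data$Y, unique)` and `Data$count <- lengths(Data$Y)`.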
Answer 1 (score: 1)
A non-data.table solution:
lapply(Data$Y, unique)
gets the unique strings, and
lapply(Data$Y, length)
gets the number of elements in each list.
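As for why the attempt in the question does not work, a likely explanation is that with by = list(X) every group contains a single row, so inside the group Y is a list of length one rather than a character vector. The sketch below reproduces that situation in plain R; `Y_group` is a hypothetical name for what one group would see.

```r
# What a single-row group sees: a list holding one character vector.
Y_group <- list(c("between", "between", "before", "pm"))

# length() counts list entries, not the strings inside the entry.
n_entries <- length(Y_group)        # 1

# unique() on a list compares whole entries, so the vector is left untouched.
same_list <- unique(Y_group)

# Working element-wise gives the intended results.
n_strings <- lengths(Y_group)       # 4
uniq <- lapply(Y_group, unique)
```

This is why the accepted approach applies lapply and lengths to the whole column at once instead of grouping by X.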