library(data.table)
library(lubridate)
x1 <- c(20090101, "2009-01-02", "2009 01 03", "2009-1-4",
"2009-1, 5", "Created on 2009 1 6", "200901 !!! 07")
dt2 <- data.table(id = c(1,1,1,2,2,2,2), date1 = ymd(x1), charval = c("aa","vv","ss","a","b","c","d"))
id date1 charval
1: 1 2009-01-01 aa
2: 1 2009-01-02 vv
3: 1 2009-01-03 ss
4: 2 2009-01-04 a
5: 2 2009-01-05 b
6: 2 2009-01-06 c
7: 2 2009-01-07 d
I use the following code to group by id:
dt3 <- dt2[, Map(function(x,y) ifelse(x != "paste", get(x)(y, na.rm = TRUE), paste(y, sep = ";")),
setNames(c("mean", "paste"), names(.SD)), .SD), by = id]
I expect to get something like this:
id date1 charval
1: 1 2009-01-02 aa;vv;ss
2: 2 2009-01-05 a;b;c;d
But what I actually see is this:
id date1 charval
1: 1 NA aa
2: 2 NA a
1) I don't understand why paste doesn't work. 2) I don't understand why mean(date1) doesn't work, because, for example, the following code works fine:
mean(dt2$date1)
[1] "2009-01-04"
Answer 0 (score: 1)
It is not clear why we need to go through Map and get here. After grouping by 'id', take the mean of 'date1' and paste the 'charval' values together:
dt2[, .(date1 = mean(date1), charval = toString(charval)), id]
# id date1 charval
#1: 1 2009-01-02 aa, vv, ss
#2: 2 2009-01-05 a, b, c, d
Note: toString is a wrapper for paste(..., collapse=', '). To use ';' as the separator instead, call paste directly:
dt2[, .(date1 = mean(date1), charval = paste(charval, collapse=";")), id]
# id date1 charval
#1: 1 2009-01-02 aa;vv;ss
#2: 2 2009-01-05 a;b;c;d
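As a quick check of the toString note, and of why the question's paste(y, sep = ";") did not join anything, here is a minimal sketch on a plain character vector. The variable names are illustrative only and are not part of the original question or answer.

```r
y <- c("aa", "vv", "ss")

# toString() and paste(collapse = ", ") produce the same single string
same <- identical(toString(y), paste(y, collapse = ", "))

# sep only separates *different* arguments, so a single vector comes back unchanged
with_sep <- paste(y, sep = ";")

# collapse joins the elements of the result into one string
with_collapse <- paste(y, collapse = ";")

print(same)
print(length(with_sep))
print(with_collapse)
```

So within each group, collapse is the argument that yields one string per id.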
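The OP's problem is that Map uses get to call mean. This appears to trigger the following branch of mean.default:
if(!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
    warning("argument is not numeric or logical: returning NA")
    return(NA_real_)
}
which returns NA because 'date1' is of class Date: although a Date is stored as a numeric value, is.numeric() returns FALSE for it. One option is to specify envir in get.
Another problem is the use of ifelse. It is better to use if/else here, since each call tests a single value and there are only two cases: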
dt2[, Map(function(x, y) if(x != "paste") get(x, envir = parent.frame())(y, na.rm = TRUE)
else paste(y, collapse=':'), setNames(c("mean", "paste"), names(.SD)), .SD), by = id]
# id date1 charval
#1: 1 2009-01-02 aa:vv:ss
#2: 2 2009-01-05 a:b:c:d
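The preference for if/else over ifelse can be illustrated outside data.table. The sketch below uses made-up variable names and only base R: ifelse returns a result shaped like its test, so a length-1 test keeps just the first element, and it also drops attributes such as the Date class.

```r
y <- c("aa", "vv", "ss")
d <- as.Date(c("2009-01-01", "2009-01-03"))

# A length-1 test means ifelse() returns a length-1 result: only the first element survives
first_only <- ifelse(TRUE, y, NA_character_)

# ifelse() builds its result from the test vector, so the Date class is lost
dropped <- ifelse(TRUE, mean(d), NA)

# if/else simply returns the value of the chosen branch, class included
kept <- if (TRUE) mean(d) else NA

print(first_only)
print(class(dropped))
print(class(kept))
```

This is consistent with the question's output, where only the first charval of each group ("aa" and "a") was returned.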
get is a bit tricky: it works as expected only when the correct environment is specified. At the top level, for example:
get("mean")(dt2$date1)
#[1] "2009-01-04"
Alternatively, instead of using if/else to test for the "paste" string, we can check the class of the column: if it is character, paste it, otherwise return the mean:
dt2[, Map(function(x, y) if(is.character(y)) get(x)(y, collapse=":")
else get(x, envir = parent.frame())(y, na.rm = TRUE),
setNames(c("mean", "paste"), names(.SD)), .SD), by = id]
# id date1 charval
#1: 1 2009-01-02 aa:vv:ss
#2: 2 2009-01-05 a:b:c:d
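If dispatching on the column class is all that is needed, the same idea can probably be written without get or Map at all. The following is a sketch under that assumption, not the answer's original code: summarise_col is a hypothetical helper name, and the table is rebuilt with as.Date so the example does not depend on lubridate.

```r
library(data.table)

# Same data as in the question, with the dates built in base R
dt <- data.table(
  id      = c(1, 1, 1, 2, 2, 2, 2),
  date1   = as.Date("2009-01-01") + 0:6,
  charval = c("aa", "vv", "ss", "a", "b", "c", "d")
)

# Hypothetical helper: collapse character columns, average everything else
summarise_col <- function(y) {
  if (is.character(y)) paste(y, collapse = ":") else mean(y, na.rm = TRUE)
}

# Apply the helper to every non-grouping column within each id
res <- dt[, lapply(.SD, summarise_col), by = id]
print(res)
```

Because mean is called directly rather than looked up by name, the Date method is used and the date1 column keeps its class.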
Note: it is better to use the first approach, as it is simpler.