lapply with scale on data.table columns

Time: 2016-10-12 20:45:40

Tags: r data.table lapply

My data looks like this:

visitor_id  timestamp   distance_value  guest_reviews   price   rating_value
--kkVxJWTRGDccUSZG9u2g  8/16/2016 14:03 7.441392    355 199     4.3
--kkVxJWTRGDccUSZG9u2g  8/16/2016 14:03 7.351424    359 110.67  4.4
--kkVxJWTRGDccUSZG9u2g  8/16/2016 14:03 17.556168   204 79.34   3.9
--kkVxJWTRGDccUSZG9u2g  8/16/2016 14:03 2.469943    429 159     4.2
-1IIpqtwRqeAV1P7yh0upw  8/10/2016 2:33  21.654525   142 58.79   4.1
-1IIpqtwRqeAV1P7yh0upw  8/10/2016 2:33  0.567264    436 83.29   4.4
-1IIpqtwRqeAV1P7yh0upw  8/10/2016 2:33  10.063784   195 56.95   4.2

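I am trying to standardize the last four columns by applying the scale function to them with lapply, grouped by visitor_id and timestamp: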
cols<-c("distance_value","min_avg_nightly_before_tax","rating_value","guest_reviews")
norm_cols<-c("norm_distance_value","norm_min_avg_nightly_before_tax","norm_rating_value","norm_guest_reviews")

myframe1[, (norm_cols):=lapply(.SD, scale), by= list(visitor_id, timestamp), .SDcols=cols]

However, this gives me the following error:

Error in `[.data.table`(myframe1, , `:=`((norm_cols), lapply(.SD, scale)),  : 
  All items in j=list(...) should be atomic vectors or lists. If you are trying something like j=list(.SD,newcol=mean(colA)) then use := by group instead (much quicker), or cbind or merge afterwards.
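
A possible workaround, offered as a sketch rather than a confirmed diagnosis: scale() returns a one-column matrix (with scaled:center and scaled:scale attributes) rather than a plain vector, so wrapping it in as.vector() gives := an ordinary numeric vector per group. The dput output further down also shows class = "data.frame", so the sketch converts the object with setDT() first. The toy data below is hypothetical: it reuses the sample rows from the question and assumes the price column is the one named min_avg_nightly_before_tax in the code.

```r
library(data.table)

# Hypothetical toy data built from the sample rows in the question.
# Assumption: the "price" column corresponds to min_avg_nightly_before_tax.
myframe1 <- data.frame(
  visitor_id = c(rep("--kkVxJWTRGDccUSZG9u2g", 4), rep("-1IIpqtwRqeAV1P7yh0upw", 3)),
  timestamp = c(rep("8/16/2016 14:03", 4), rep("8/10/2016 2:33", 3)),
  distance_value = c(7.441392, 7.351424, 17.556168, 2.469943,
                     21.654525, 0.567264, 10.063784),
  guest_reviews = c(355L, 359L, 204L, 429L, 142L, 436L, 195L),
  min_avg_nightly_before_tax = c(199, 110.67, 79.34, 159, 58.79, 83.29, 56.95),
  rating_value = c(4.3, 4.4, 3.9, 4.2, 4.1, 4.4, 4.2)
)

# Convert the data.frame to a data.table by reference so := is available.
setDT(myframe1)

cols <- c("distance_value", "min_avg_nightly_before_tax",
          "rating_value", "guest_reviews")
norm_cols <- paste0("norm_", cols)

# as.vector() drops the matrix dimensions and attributes that scale() adds,
# leaving a plain numeric vector for each new column within each group.
myframe1[, (norm_cols) := lapply(.SD, function(x) as.vector(scale(x))),
         by = list(visitor_id, timestamp), .SDcols = cols]
```

Note that a group with a single row, or a column that is constant within a group, has a standard deviation of NA or zero, so scale() would produce NA or NaN for it.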

Below is part of the output of dput(head(myframe2)) for the sample data:
  

"Zzy8RX7KSniWaNLyHwUXvQ", "zZYJITYrSHmYJv8MUDw6Cw", "zzZ6OXcvROe88wD4JjTtnA", "zzZBq1DsQe6PYM7AQNmbUQ", "zzzi__t-SRW9OZOKBdDfwg", "ZZZTTaS6RD6bQ2KzdaiSVA"), class = "factor"), timestamp = structure(c(1471381408.339, 1471381408.339, 1471381408.339, 1471381408.339, 1471381408.339, 1471381408.339), class = c("POSIXct", "POSIXt")), distance_value = c(7.4413922836545, 7.35142425353227, 17.5561677012408, 2.46994294033727, 24.8529546453572, 21.8463254946658), rating_value = c(4.3, 4.4, 3.9, 4.2, 3.1, 4.4), guest_reviews = c(355L, 359L, 204L, 429L, 305L, 633L), min_avg_nightly_before_tax = c(199, 110.67, 79.34, 159, 77.62, 101.37)), .Names = c("visitor_id", "timestamp", "distance_value", "rating_value", "guest_reviews", "min_avg_nightly_before_tax"), row.names = c(NA, 6L), class = "data.frame")

0 answers:

No answers yet