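For each year present in my data.table, I'm trying to find the maximum daily frequency of events and the date on which it occurred. I can get the maximum by year: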
dt[, .N, by = DATE][, .(max(N)), by=format(DATE, "%Y")]
But how can I also pull out the full DATE that matches this maximum (and not just the year)?
Here is what I tried:
dt[, .N, by=DATE][which(N==max(N)), .(max(N), d:=DATE),by=format(DATE, "%Y")]
Judging by the error message, this looks like it won't work, and indeed it doesn't:
Error in `[.data.table`(dt[, .N, by = DATE], which(N == max(N)), .(max(N), :
'by' appears to evaluate to column names but isn't c() or key(). Use by=list(...) if you can. Otherwise, by=eval(format(DATE, "%Y")) should work. This is for efficiency so data.table can detect which columns are needed.
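I know how to go back to dt and grab the rows that correspond to the maxima, but I'd like to do this more cleanly. Is it possible to achieve it through subsetting, as attempted above?
Apologies if I've missed an SO post about this; I couldn't find anything.
Here is a sample of dt: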
> dput(dt[sample(1:600000, size = 500), DATE])
structure(c(16091, 15909, 15987, 16509, 16294, 16610, 16297,
15898, 15928, 15949, 16351, 16203, 16215, 15799, 16506, 15931,
16091, 15825, 15860, 15814, 15975, 16233, 16108, 16590, 15700,
16019, 16178, 16287, 16730, 16366, 16678, 16010, 16157, 16116,
15794, 16157, 16010, 16171, 16721, 16640, 16302, 15939, 15928,
16325, 15837, 15848, 15730, 15828, 16414, 16431, 16389, 16003,
16444, 16255, 16268, 16226, 16205, 15765, 16060, 15938, 16376,
15934, 15871, 16163, 16568, 15899, 16597, 16160, 16538, 15703,
16002, 16371, 16019, 16138, 16091, 15874, 16298, 16086, 15753,
16310, 16209, 15843, 16307, 16472, 16319, 16519, 15743, 16480,
16323, 16674, 16147, 16013, 15986, 16616, 16480, 16494, 16030,
16614, 16447, 15991, 15977, 15884, 16707, 16614, 16470, 16193,
16453, 16342, 16109, 15731, 16321, 16421, 15974, 16578, 16718,
16183, 15721, 15854, 16470, 16368, 16399, 16433, 16721, 16624,
16514, 15918, 16370, 15910, 16308, 15973, 16579, 16606, 16192,
16445, 16671, 15927, 15958, 16140, 15957, 16623, 16416, 15852,
15913, 16190, 15930, 16420, 15808, 15862, 16507, 16447, 16109,
15732, 16700, 15911, 16183, 16215, 16584, 15840, 16628, 16138,
16500, 16477, 16184, 16510, 16374, 16668, 16278, 16642, 16713,
16324, 16200, 16255, 15960, 16395, 15869, 16282, 16736, 16164,
16416, 16496, 16565, 15741, 16308, 16441, 16607, 16190, 15938,
16045, 15758, 16219, 16165, 16357, 16353, 16731, 16063, 15740,
16220, 16522, 15864, 15922, 16223, 15806, 16660, 16471, 15954,
16369, 15750, 15957, 16156, 16367, 16654, 16165, 16109, 15863,
16204, 15929, 15812, 15987, 16275, 16552, 15741, 15906, 15929,
16295, 15974, 15749, 15830, 15892, 16266, 16208, 15793, 15768,
15721, 16707, 15903, 16624, 16552, 16695, 16116, 16573, 16344,
16452, 16539, 16195, 15851, 16140, 16152, 15736, 16179, 15846,
16363, 16404, 16522, 16723, 16021, 16232, 16081, 16206, 16183,
15920, 16543, 15989, 15974, 16212, 16396, 16473, 16502, 16532,
16326, 15882, 16607, 15848, 15954, 16419, 15752, 16030, 16429,
16222, 16213, 16626, 16049, 16738, 16256, 16198, 16599, 15727,
16707, 16433, 15863, 16145, 16188, 15862, 15707, 16475, 16130,
15887, 16647, 15974, 16221, 15773, 16059, 16662, 16250, 15689,
15753, 15833, 16365, 16646, 16366, 16130, 16712, 15859, 16480,
15983, 16377, 16091, 16121, 15821, 16505, 16018, 16254, 15937,
16322, 16490, 15899, 16377, 16319, 16262, 16215, 16005, 16318,
16488, 16350, 16275, 16723, 16616, 16593, 15918, 16264, 15897,
15931, 16204, 16603, 16192, 16377, 15837, 16737, 16466, 16271,
15804, 15987, 16622, 16634, 16227, 16297, 16597, 16232, 16393,
15842, 15999, 15716, 16092, 16080, 16553, 16068, 16129, 16012,
16383, 16150, 16611, 16602, 16254, 15728, 15958, 15827, 16111,
16097, 16112, 16648, 16510, 16417, 16021, 16660, 15793, 16016,
16188, 16034, 16415, 16270, 16728, 16153, 16028, 16286, 16731,
15905, 15710, 16208, 16300, 16522, 16062, 16310, 16535, 16111,
16682, 15957, 16051, 16597, 16063, 15828, 16658, 16213, 16262,
15814, 15912, 16115, 15716, 15976, 16665, 16723, 15766, 15825,
16682, 16547, 16402, 16486, 16085, 16231, 16126, 16398, 15762,
16563, 15796, 15993, 15943, 16020, 15727, 16671, 16044, 15921,
16511, 15787, 16128, 16376, 16502, 15751, 16317, 16444, 16032,
15839, 16588, 15780, 15926, 16722, 16225, 16523, 16450, 16661,
16702, 16223, 15977, 16586, 16221, 16252, 15853, 16309, 15838,
16505, 16143, 16526, 15980, 15970, 15718, 16713, 16021, 16546,
16469, 16452, 15729, 16309, 16543, 16386, 16554, 16349, 16595,
16499, 16359, 16322, 16547, 16415, 16112, 15898, 16008, 16275,
15975, 16197, 15740, 15959, 16346, 16364, 16522), class = "Date")
Answer 0 (score: 2)
Why not simply subset .SD with which.max(N) within each group?
require(data.table)
data.table(x)[, .N, by=x][, .SD[which.max(N)], by=year(x), .SDcols=1:2]
# year x N
# 1: 2014 2014-01-21 4
# 2: 2013 2013-09-26 4
# 3: 2015 2015-03-28 4
# 4: 2012 2012-12-26 1
Once you're familiar with .SD, most operations just use base R functions.
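One thing to keep in mind is that which.max() returns only the first position of the maximum, so dates tied for the yearly maximum are dropped. The sketch below uses made-up dates (not the question's data) to contrast that with an N == max(N) subset, which keeps every tied date. .SDcols is given explicitly so that x stays available inside .SD.

```r
library(data.table)

# Hypothetical toy data: two dates tie for the 2014 maximum.
x <- as.Date(c("2013-09-26", "2013-09-26", "2013-01-05",
               "2014-01-21", "2014-01-21", "2014-03-02", "2014-03-02"))

# Events per day, in order of first appearance.
counts <- data.table(x)[, .N, by = x]

# which.max() keeps only the first date that reaches the yearly maximum.
first_max <- counts[, .SD[which.max(N)], by = .(year = year(x)),
                    .SDcols = c("x", "N")]

# N == max(N) keeps every date tied for the yearly maximum.
all_max <- counts[, .SD[N == max(N)], by = .(year = year(x)),
                  .SDcols = c("x", "N")]
```

Which variant is appropriate depends on whether ties matter for the analysis.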
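About your attempt: the general form of data.table is to subset rows in i, then compute j grouped by by. So you can't supply a condition in i and expect it to be evaluated within the groups defined in by. And .(max(N), d:=DATE) is not valid syntax at all.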
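To make the order of evaluation concrete, here is a small sketch with hypothetical per-day counts standing in for dt[, .N, by = DATE]. The helper column yr is an assumption added purely for illustration. A condition placed in i is evaluated against the whole table before any grouping happens, whereas the same condition inside j is evaluated once per group.

```r
library(data.table)

# Hypothetical per-day counts, standing in for dt[, .N, by = DATE].
counts <- data.table(
  DATE = as.Date(c("2013-09-26", "2013-01-05", "2014-01-21", "2014-03-02")),
  N    = c(4L, 1L, 3L, 2L)
)
counts[, yr := year(DATE)]  # illustrative helper column

# Condition in i: max(N) is the global maximum (4), so only the 2013 row
# survives the subset and 2014 never reaches the grouping step.
in_i <- counts[N == max(N), .(DATE, N), by = yr]

# Condition in j: max(N) is computed separately within each year.
in_j <- counts[, .(DATE = DATE[N == max(N)], N = max(N)), by = yr]
```

Here in_i has a single row for 2013, while in_j has one row per year.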
Please read the vignettes. It's all in there.
Answer 1 (score: 1)
This is what I came up with:
DT[, Y := year(DATE)]
DT[,
copy(.SD)[, n := .N , by=DATE][which.max(n)]
, by=Y]
Y DATE n
1: 2014 2014-01-21 4
2: 2013 2013-09-26 4
3: 2015 2015-03-28 4
4: 2012 2012-12-26 1
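I'm hoping there's a better way. I created Y because data.table currently doesn't allow a column to be used within j if any transformation of it appears in by.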
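One possible alternative, offered only as a sketch on made-up data, is to group by the year and the date together in a single step. That avoids adding Y to DT and avoids copy(.SD), because the intermediate table already carries both columns. The names counts and res are illustrative.

```r
library(data.table)

# Hypothetical stand-in for the question's table: one row per event.
DT <- data.table(DATE = as.Date(c("2013-09-26", "2013-09-26", "2013-01-05",
                                  "2014-01-21", "2014-03-02", "2014-03-02")))

# Count events per date while carrying the year along as a grouping column.
counts <- DT[, .(n = .N), by = .(Y = year(DATE), DATE)]

# Within each year, keep the date with the highest count
# (the first one, if several dates tie).
res <- counts[, .SD[which.max(n)], by = Y]
```

Because Y is a plain column of counts by the second step, DATE is part of .SD there without any need for .SDcols.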