Subset a data.table by evaluating multiple columns

Time: 2014-09-02 15:59:56

Tags: r max data.table subset

How do I return 1 row for each unique Name and Type, keeping the row with the most recent Date?

A data.table with 6 rows:

library(data.table)

example <- data.table(c("Bob","May","Sue","Bob","Sue","Bob"), 
                      c("A","A","A","A","B","B"),
              as.Date(c("2010/01/01", "2010/01/01", "2010/01/01", 
                   "2012/01/01", "2012/01/11", "2014/01/01")))
setnames(example,c("Name","Type","Date"))
# Keying sorts the table by Name, then by Date
setkey(example,Name,Date)

Should return 5 rows:

# 1:  Bob    A 2012-01-01
# 2:  Bob    B 2014-01-01
# 3:  May    A 2010-01-01
# 4:  Sue    A 2010-01-01
# 5:  Sue    B 2012-01-11

2 answers:

Answer 0 (score: 3)

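Since you have already sorted by Name and Date, you can use the unique function (which calls unique.data.table) on the columns Name and Type, with fromLast = TRUE:

require(data.table) ## >= v1.9.3
unique(example, by=c("Name", "Type"), fromLast=TRUE)
#    Name Type       Date
# 1:  Bob    A 2012-01-01
# 2:  Bob    B 2014-01-01
# 3:  May    A 2010-01-01
# 4:  Sue    A 2010-01-01
# 5:  Sue    B 2012-01-11

This picks the last row of each Name,Type group. Because the rows are sorted by Date within each Name, that last row is the one with the most recent Date. Hope this helps.

PS: As @mso pointed out, this requires 1.9.3, because the fromLast argument is only implemented as of 1.9.3 (available from GitHub).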
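For comparison, here is a minimal sketch of a grouped alternative that does not depend on the table being sorted first. It is not part of the answer above; it uses the common `.SD[which.max(...)]` idiom and rebuilds the question's table so the snippet runs on its own. The variable name `latest` is only illustrative.

```r
library(data.table)

# Same data as in the question, built with named columns
example <- data.table(
  Name = c("Bob", "May", "Sue", "Bob", "Sue", "Bob"),
  Type = c("A", "A", "A", "A", "B", "B"),
  Date = as.Date(c("2010/01/01", "2010/01/01", "2010/01/01",
                   "2012/01/01", "2012/01/11", "2014/01/01"))
)

# For each (Name, Type) group, keep the row holding the latest Date.
# which.max() returns the position of the first maximum within the group.
latest <- example[, .SD[which.max(Date)], by = c("Name", "Type")]

# Sort the result by reference so it matches the layout in the question
setorder(latest, Name, Date)
latest
```

On large tables the sort-then-unique approach is often faster, since `.SD` subsetting is evaluated once per group; the grouped form is mainly useful when you cannot assume a sort order.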

Answer 1 (score: 1)

The following variants of @Arun's answer work:

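# Rows are in descending order here, so the first row of each group is the latest;
# fromLast=TRUE must not be set, or the earliest row would be kept instead
unique(example[rev(order(Name,Date))], by=c("Name", "Type"))[order(Name,Date)]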
   Name Type       Date
1:  Bob    A 2012-01-01
2:  Bob    B 2014-01-01
3:  May    A 2010-01-01
4:  Sue    A 2010-01-01
5:  Sue    B 2012-01-11

unique(example[order(Name, Date, decreasing=TRUE)], by=c("Name","Type"))[order(Name, Date)]
   Name Type       Date
1:  Bob    A 2012-01-01
2:  Bob    B 2014-01-01
3:  May    A 2010-01-01
4:  Sue    A 2010-01-01
5:  Sue    B 2012-01-11
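The same sort-descending-then-deduplicate idea can also be sketched with `setorder()`, which accepts a minus sign for descending columns. This is an illustrative variant rather than the answerer's code; the names `sorted` and `result` are assumptions, and the table is rebuilt so the snippet is self-contained.

```r
library(data.table)

# Same data as in the question, built with named columns
example <- data.table(
  Name = c("Bob", "May", "Sue", "Bob", "Sue", "Bob"),
  Type = c("A", "A", "A", "A", "B", "B"),
  Date = as.Date(c("2010/01/01", "2010/01/01", "2010/01/01",
                   "2012/01/01", "2012/01/11", "2014/01/01"))
)

# Work on a copy so the original table keeps its order.
# Within each Name, newer dates now come before older ones.
sorted <- setorder(copy(example), Name, -Date)

# unique() keeps the first row of each (Name, Type) group by default,
# which after the descending sort is the most recent one.
result <- unique(sorted, by = c("Name", "Type"))

# Restore ascending order for display
setorder(result, Name, Date)
result
```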