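How do I write the following Excel formula in R?
COUNTIF($A$4:A4,A4)
I have more than 100k rows of data, and I want the rows where COUNTIF($A$4:A4,A4) = 1.
I can do this in Excel, but I am stuck doing it in R.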
Date Worker ID
10/31/2017 3152
9/30/2017 3152
8/31/2017 3152
7/31/2017 3152
6/30/2017 3152
5/31/2017 3152
4/30/2017 3152
3/31/2017 3152
2/28/2017 3153
1/31/2017 3153
12/31/2016 3153
11/30/2016 3153
10/31/2017 3153
9/30/2017 3153
8/31/2017 3153
7/31/2017 3153
6/30/2017 3153
5/31/2017 3940
4/30/2017 3940
3/31/2017 3940
2/28/2017 3940
1/31/2017 3940
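The full dataset has 25 columns in the same layout. Each row holds different data, but the row with the latest date for a worker has the updated information. I want to select the latest-date row for each worker.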
Answer 0 (score: 0)
You can use data frame subsetting and the duplicated function to mimic Excel's COUNTIF. See the code below:
df <- structure(list(Date = structure(c(2L, 12L, 11L, 10L, 9L, 8L,
7L, 6L, 5L, 1L, 4L, 3L, 2L, 12L, 11L, 10L, 9L, 8L, 7L, 6L, 5L,
1L), .Label = c("1/31/2017", "10/31/2017", "11/30/2016", "12/31/2016",
"2/28/2017", "3/31/2017", "4/30/2017", "5/31/2017", "6/30/2017",
"7/31/2017", "8/31/2017", "9/30/2017"), class = "factor"), Worker_ID = c(3152L,
3152L, 3152L, 3152L, 3152L, 3152L, 3152L, 3152L, 3153L, 3153L,
3153L, 3153L, 3153L, 3153L, 3153L, 3153L, 3153L, 3940L, 3940L,
3940L, 3940L, 3940L)), class = "data.frame", row.names = c(NA,
-22L))
df[!duplicated(df$Worker_ID), ]
Output:
Date Worker_ID
1 10/31/2017 3152
9 2/28/2017 3153
18 5/31/2017 3940
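
Note that !duplicated() keeps the first row it meets for each Worker_ID, so it returns the latest date only when each worker's rows are already sorted newest first. In the sample data, worker 3153 starts at 2/28/2017 although a 10/31/2017 row appears further down. The following is a minimal sketch, not a definitive solution, of two related steps: a running count equivalent to COUNTIF($A$4:A4,A4) using ave(), and a date-aware selection that sorts before dropping duplicates. The small data frame and the column names running_count and Date_parsed are made up for illustration, and the date format is assumed to be month/day/year.

```r
# Hypothetical six-row subset of the question's data, for illustration only
df <- data.frame(
  Date = c("10/31/2017", "9/30/2017", "2/28/2017",
           "10/31/2017", "5/31/2017", "4/30/2017"),
  Worker_ID = c(3152L, 3152L, 3153L, 3153L, 3940L, 3940L),
  stringsAsFactors = FALSE
)

# Running count within each Worker_ID, analogous to COUNTIF($A$4:A4, A4)
df$running_count <- ave(seq_along(df$Worker_ID), df$Worker_ID, FUN = seq_along)

# Rows where the running count is 1 are the first occurrences,
# which is the same selection as df[!duplicated(df$Worker_ID), ]
first_rows <- df[df$running_count == 1, ]

# Date-aware variant: parse the dates (assumed month/day/year),
# sort each worker's rows newest first, then keep the first row per worker
df$Date_parsed <- as.Date(df$Date, format = "%m/%d/%Y")
ordered <- df[order(df$Worker_ID, -as.numeric(df$Date_parsed)), ]
latest_rows <- ordered[!duplicated(ordered$Worker_ID), ]
```

Here first_rows keeps the 2/28/2017 row for worker 3153, while latest_rows keeps the 10/31/2017 row. Sorting first makes the result independent of the original row order, at the cost of one sort over the data.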