如果列(YEAR,POL,CTY,ID,AMOUNT)在所有行的值中相等,我想删除行。请参阅下面的输出表。
有人可以帮助我实现这个吗?
提前致谢。
Table:
YEAR POL CTY ID AMOUNT RAN LEGAL
2017 30408 11 36 3500 RANGE1 L0015N20W23
2017 30408 11 36 3500 RANGE1 L00210N20W24
2017 30408 11 36 3500 RANGE1 L00310N20W25
2017 30409 11 36 3500 RANGE1 L0015N20W23
2017 30409 11 35 3500 RANGE2 NANANA
2017 30409 11 35 3500 RANGE3 NANANA
2017 30409 11 35 3500 RANGE3 NANANA
输出:
YEAR POL CTY ID AMOUNT RAN LEGAL
2017 30408 11 35 3500 RANGE1 L0015N20W23
答案 0 :(得分:0)
你可以试试这个:
no_duplicate_cols <- c("YEAR", "POL", "CTY", "ID", "AMOUNT")
new_df <- df[!duplicated(df[, no_duplicate_cols]), ]
数据框new_df
将保留df
中不重复的行。
答案 1 :(得分:0)
如果我正确理解了这个问题,那么我认为你可以试试这个
library(dplyr)
df %>%
group_by(YEAR, POL, CTY, ID, AMOUNT) %>%
filter(n() == 1)
输出(但似乎原始问题中提供的输出有点错字!):
# A tibble: 1 x 7
# Groups: YEAR, POL, CTY, ID, AMOUNT [1]
YEAR POL CTY ID AMOUNT RAN LEGAL
1 2017 30409 11 36 3500 RANGE1 L0015N20W23
#sample data
> dput(df)
structure(list(YEAR = c(2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L), POL = c(30408L, 30408L, 30408L, 30409L, 30409L, 30409L,
30409L), CTY = c(11L, 11L, 11L, 11L, 11L, 11L, 11L), ID = c(36L,
36L, 36L, 36L, 35L, 35L, 35L), AMOUNT = c(3500L, 3500L, 3500L,
3500L, 3500L, 3500L, 3500L), RAN = structure(c(1L, 1L, 1L, 1L,
2L, 3L, 3L), .Label = c("RANGE1", "RANGE2", "RANGE3"), class = "factor"),
LEGAL = structure(c(1L, 2L, 3L, 1L, 4L, 4L, 4L), .Label = c("L0015N20W23",
"L00210N20W24", "L00310N20W25", "NANANA"), class = "factor")), .Names = c("YEAR",
"POL", "CTY", "ID", "AMOUNT", "RAN", "LEGAL"), class = "data.frame", row.names = c(NA,
-7L))