Remove duplicate rows whose values match the column headers

Time: 2019-02-24 01:41:38

Tags: r duplicates rows columnheader

My data looks like this:

+--------+--------+--------+
| region |  name  | salary |
+--------+--------+--------+
| west   | raj    | 100    |
| north  | simran | 150    |
| region | name   | salary |
| east   | prem   | 250    |
| region | name   | salary |
| south  | preeti | 200    |
+--------+--------+--------+

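The column header names are repeated in rows 3 and 5. How can I use R to delete rows 3 and 5 while keeping the column headers intact, so that my output looks like this: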

+--------+--------+--------+
| region |  name  | salary |
+--------+--------+--------+
| west   | raj    |    100 |
| north  | simran |    150 |
| east   | prem   |    250 |
| south  | preeti |    200 |
+--------+--------+--------+

Assume my real data has far too many rows for me to pick out the row numbers by hand and delete them with Data[-c(3, 5), ].

2 Answers:

Answer 0: (score: 0)

Here is a simple solution:

x <- data.frame(x = c("a", "b", "c", "x"), z = c("a", "b", "c", "z"))
## identify rows whose values match the column names
matched <- apply(x, 1, function(i) i[1] %in% colnames(x) && i[2] %in% colnames(x))

## keep only the rows that did not match
x[!matched, ]
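
Applied to the data from the question, the same idea can be generalized to any number of columns by comparing each row against the full vector of column names. This is only a minimal sketch: the data frame name `df` and its construction are assumptions made for illustration, and it presumes every column was read as character because the stray header rows prevent numeric parsing.

```r
# Hypothetical reconstruction of the question's data (an assumption);
# all columns are character because of the repeated header rows.
df <- data.frame(
  region = c("west", "north", "region", "east", "region", "south"),
  name   = c("raj", "simran", "name", "prem", "name", "preeti"),
  salary = c("100", "150", "salary", "250", "salary", "200"),
  stringsAsFactors = FALSE
)

# TRUE for rows in which every cell equals the name of its own column
is_header <- apply(df, 1, function(row) all(row == colnames(df)))

# Keep only the real data rows and renumber them
clean_df <- df[!is_header, ]
rownames(clean_df) <- NULL
clean_df
```

Unlike the two-column check above, this version only flags a row when each value sits under its own header, so a data row that merely contains a column name somewhere is kept.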

Answer 1: (score: 0)

Assuming salary is meant to be a numeric field, you can simply do this:

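# assuming df is your data frame; as.character() keeps this correct when
# salary was read as a factor, and the repeated header cells become NA
clean_df <- df[!is.na(suppressWarnings(as.numeric(as.character(df$salary)))), ]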
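
After the header rows are removed, salary is still stored as text (or as a factor), so a follow-up conversion is usually wanted. The sketch below uses a small hypothetical `df` built with `stringsAsFactors = TRUE` to mimic the default of R versions before 4.0. Going through `as.character()` matters in that case, because `as.numeric()` on a factor returns its integer level codes rather than NA.

```r
# Hypothetical sample data (an assumption), read as factors on purpose
df <- data.frame(
  region = c("west", "region", "east"),
  name   = c("raj", "name", "prem"),
  salary = c("100", "salary", "250"),
  stringsAsFactors = TRUE
)

# Parse the labels, not the factor codes; non-numeric cells such as
# "salary" become NA (the coercion warning is suppressed on purpose)
salary_num <- suppressWarnings(as.numeric(as.character(df$salary)))

# Drop the header rows and store salary as a real numeric column
clean_df <- df[!is.na(salary_num), ]
clean_df$salary <- salary_num[!is.na(salary_num)]

# Remove factor levels left over from the deleted header rows
clean_df <- droplevels(clean_df)
str(clean_df)
```

Note that this approach also drops any genuine data row whose salary is missing or non-numeric, so it fits best when salary is known to be complete.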