My data looks like this:
+--------+--------+--------+
| region | name | salary |
+--------+--------+--------+
| west | raj | 100 |
| north | simran | 150 |
| region | name | salary |
| east | prem | 250 |
| region | name | salary |
| south | preeti | 200 |
+--------+--------+--------+
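The column headers are repeated in rows 3 and 5. How can I use R to remove rows 3 and 5 while keeping the column headers, so that my output looks like this: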
+--------+--------+--------+
| region | name | salary |
+--------+--------+--------+
| west | raj | 100 |
| north | simran | 150 |
| east | prem | 250 |
| south | preeti | 200 |
+--------+--------+--------+
Assume my original data has too many rows for me to simply pick out the row numbers and delete them with a command like Data[-c(3, 5), ].
Answer 0: (score: 0)
Here is a simple solution:
x <- data.frame(x = c("a", "b", "c", "x"), z = c("a", "b", "c", "z"))
## identify rows whose first two values both match a column name
matched <- apply(x, 1, function(i) i[1] %in% colnames(x) && i[2] %in% colnames(x))
## keep only the rows that did not match
x[!matched, ]
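The same idea can be applied to the data from the question. The following is only a sketch: the data frame name `df` and the helper vector `is_header` are made up for illustration, and it assumes every column was read in as character (which is what usually happens when header rows are mixed into the data). Instead of checking only the first two columns, it flags a row only when every cell equals the name of its own column.

```r
# Sample data mirroring the question; all columns are character because
# the repeated header rows were read in as ordinary data.
df <- data.frame(
  region = c("west", "north", "region", "east", "region", "south"),
  name   = c("raj", "simran", "name", "prem", "name", "preeti"),
  salary = c("100", "150", "salary", "250", "salary", "200"),
  stringsAsFactors = FALSE
)

# TRUE for rows in which every cell equals the name of its own column
is_header <- apply(df, 1, function(row) all(row == colnames(df)))

# Drop the flagged rows and renumber the remaining ones
clean_df <- df[!is_header, ]
rownames(clean_df) <- NULL
clean_df
```

Comparing whole rows against the header is stricter than testing one or two columns, so a legitimate record that happens to contain the word "name" in a single field would not be removed.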
Answer 1: (score: 0)
Assuming salary is a numeric field, you can simply do this:
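# assuming df is your data frame; as.character() guards against salary being a factor
is_num <- !is.na(suppressWarnings(as.numeric(as.character(df$salary))))
clean_df <- df[is_num, ]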
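One thing to keep in mind with this approach: filtering removes the header rows, but the salary column itself stays character until it is converted. The sketch below assumes the question's data in a data frame called `df`, with salary read in as text; `salary_num` is a hypothetical helper name. Note that rows with a genuinely missing salary would be dropped as well, since they also become NA.

```r
# Sample data mirroring the question, with salary stored as text
df <- data.frame(
  region = c("west", "north", "region", "east", "region", "south"),
  name   = c("raj", "simran", "name", "prem", "name", "preeti"),
  salary = c("100", "150", "salary", "250", "salary", "200"),
  stringsAsFactors = FALSE
)

# Non-numeric strings such as "salary" become NA; the coercion warning is expected
salary_num <- suppressWarnings(as.numeric(df$salary))

# Keep only rows whose salary parsed as a number
clean_df <- df[!is.na(salary_num), ]

# Store salary as a real numeric column so that sums, means, etc. work
clean_df$salary <- as.numeric(clean_df$salary)
str(clean_df)
```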