Question

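I have a data frame with a few hundred columns. I want to delete the rows in which selected columns have the value "Item skipped" or "".

See the example below. Ideally, I want to remove every row where the "animal" or "Insurance" column contains "Item skipped" or "", but I do not want this applied to the other columns.

In my actual data frame there are about 34 columns for which I want to remove rows containing these strings, and 128 columns for which I do not. Any suggestions would be appreciated.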
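
A minimal sketch of a data frame of this shape, for trying out the answers. The values are hypothetical; only the column names and the two rows that survive filtering are taken from the output shown in the answers.

```r
# Hypothetical example data: rows 3-5 contain "Item skipped" or ""
# in `animal` or `Insurance`, so they are the rows to be dropped.
dat <- data.frame(
  animal    = c("dog", "cat", "Item skipped", "bird", ""),
  Insurance = c("Y", "N", "Y", "Item skipped", "N"),
  condition = c("", "Asthma", "Item skipped", "", "Allergy"),
  age       = c(6, 6, 3, 2, 8),
  stringsAsFactors = FALSE
)
```

Note that `condition` also contains "" and "Item skipped"; those values should not cause a row to be dropped.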

Answer 1

You can use filter_at on selected columns or on a range of columns:

library(dplyr)

dat %>%
  filter_at(vars(animal,Insurance), all_vars(!. %in% c("Item skipped", "")))

#  animal Insurance condition age
#1    dog         Y             6
#2    cat         N    Asthma   6
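
In dplyr 1.0.4 and later, the scoped verbs such as filter_at are superseded by if_all() and if_any(). A minimal sketch of the same filter in that style, assuming a recent dplyr is installed; the sample data and the names `cols`, `bad`, and `kept` are hypothetical:

```r
library(dplyr)

# Hypothetical sample data, not the asker's real data frame.
dat <- data.frame(
  animal    = c("dog", "cat", "Item skipped", ""),
  Insurance = c("Y", "N", "Y", "Item skipped"),
  condition = c("", "Asthma", "", "Allergy"),
  age       = c(6, 6, 3, 2),
  stringsAsFactors = FALSE
)

cols <- c("animal", "Insurance")   # columns to check
bad  <- c("Item skipped", "")      # values that mark a row for removal

# Keep a row only if every checked column is free of the unwanted values.
kept <- dat %>%
  filter(if_all(all_of(cols), ~ !.x %in% bad))
```

Because the columns are passed as a character vector through all_of(), the same call should scale to 34 column names without listing them inside the filter.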

Or, in base R, you can use rowSums:

cols <- c('animal', 'Insurance')
dat[rowSums(dat[cols] == "Item skipped" | dat[cols] == "") == 0, ]
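
If the checked columns can contain NA, the `==` comparison yields NA for those cells, so the rowSums test also becomes NA for those rows. A sketch of an NA-tolerant variant using `%in%`, which returns FALSE rather than NA for missing values; the sample data and the names `bad`, `has_bad`, and `kept` are hypothetical:

```r
# Hypothetical sample data with a missing value in `animal`.
dat <- data.frame(
  animal    = c("dog", "cat", "Item skipped", "", NA),
  Insurance = c("Y", "N", "Y", "N", "Y"),
  stringsAsFactors = FALSE
)

cols <- c("animal", "Insurance")
bad  <- c("Item skipped", "")

# One logical vector per column, combined with OR:
# TRUE where any checked column holds an unwanted value.
has_bad <- Reduce(`|`, lapply(dat[cols], function(x) x %in% bad))

kept <- dat[!has_bad, ]
```

Here the row with NA in `animal` is kept, since NA is not one of the unwanted values.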

Answer 2

In base R, without a for loop:

# Logical indexing avoids comparing row names with positions, which only works for default row names.
dat[!(dat$animal %in% c("Item skipped", "") | dat$Insurance %in% c("Item skipped", "")), ]

Answer 3

You can always use a for loop to do this, especially since your dataset is small.

> remove_cols <- c('animal', 'Insurance') # vector of names of all columns you'll use to drop rows
> remove_vals <- c('', 'Item skipped') # values which indicate a row that should be dropped
> 
> for(col in remove_cols){
+   dat <- dat[!dat[[col]] %in% remove_vals, ]
+ }
> 
> head(dat)
  animal Insurance condition age
1    dog         Y             6
2    cat         N    Asthma   6

Answer 4

Using base R, with no additional packages:

# Flag rows where either column holds one of the unwanted values.
drop_row <- dat$animal %in% c("Item skipped", "") | dat$Insurance %in% c("Item skipped", "")

# Drop the flagged rows.
# Store the result in a new data frame [dat2].
# Keep the old data frame for comparison [dat].
dat2 <- dat[!drop_row, ]
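
One pitfall worth knowing when deleting rows by negative index: if which() finds no matching rows, it returns an empty vector, and `dat[-integer(0), ]` selects zero rows instead of all of them. A small sketch with hypothetical data in which nothing matches:

```r
# Hypothetical data with no unwanted values at all.
dat <- data.frame(
  animal    = c("dog", "cat"),
  Insurance = c("Y", "N"),
  stringsAsFactors = FALSE
)

idx <- which(dat$animal == "Item skipped")   # integer(0): nothing matches

by_negative_index <- dat[-idx, ]                          # 0 rows: everything is lost
by_logical_index  <- dat[dat$animal != "Item skipped", ]  # 2 rows: nothing is dropped
```

Indexing with a logical vector behaves the same whether or not any row matches, which is why it is usually the safer choice here.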

Deleting multiple rows with specific string values

4 answers: