I need to delete specific rows from my data frame, but I'm having trouble. The dataset looks like this:
> head(mergedmalefemale)
coupleid gender shop time amount
1 1 W 3 1 29.05
2 1 W 1 2 31.65
3 1 W 3 3 NA
4 1 W 2 4 17.75
5 1 W 3 5 -28.40
6 2 W 1 1 42.30
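What I want to do is delete all records of any couple that has at least one amount that is NA or negative. In the example above, all rows with coupleid 1 should be removed, because that couple has rows with a negative value and with NA.
I tried functions such as na.omit(mergedmalefemale), but that only removed the rows containing NA, not the other rows with the same coupleid. I'm a beginner, so I'd be grateful for any help.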
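Answer 0 (score: 2)
Since you don't want to drop only the NA or negative rows, but all data sharing the same ID, you first have to find the IDs to remove and then remove every row with those IDs.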
mergedmalefemale <- read.table(text="
coupleid gender shop time amount
1 1 W 3 1 29.05
2 1 W 1 2 31.65
3 1 W 3 3 NA
4 1 W 2 4 17.75
5 1 W 3 5 -28.40
6 2 W 1 1 42.30",
header=TRUE)
# Find NA and negative amounts
del <- is.na(mergedmalefemale[,"amount"]) | mergedmalefemale[,"amount"]<0
# Find coupleid with NA or negative amounts
ids <- unique(mergedmalefemale[del,"coupleid"])
# Remove data with coupleid such that amount is NA or negative
mergedmalefemale[!mergedmalefemale[,"coupleid"] %in% ids,]
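The same two steps can also be folded into a single per-row flag. The following is only a sketch in base R, assuming the column names from the question; the data frame name `df_example` and the helper variables are made up for illustration.

```r
# Toy data matching the question's example (df_example is a hypothetical name)
df_example <- data.frame(
  coupleid = c(1, 1, 1, 1, 1, 2),
  gender   = "W",
  shop     = c(3, 1, 3, 2, 3, 1),
  time     = c(1, 2, 3, 4, 5, 1),
  amount   = c(29.05, 31.65, NA, 17.75, -28.40, 42.30)
)

# 1 for rows whose amount is NA or negative, 0 otherwise
bad_row <- as.integer(is.na(df_example$amount) | df_example$amount < 0)

# ave() computes max() within each coupleid and returns one value per row,
# so every row of a couple gets 1 if any of its rows is bad
bad_couple <- ave(bad_row, df_example$coupleid, FUN = max) == 1

# Keep only the rows of couples without any bad amount
kept <- df_example[!bad_couple, ]
kept
```

This avoids building the intermediate vector of IDs, at the cost of being a little less explicit about which couples were dropped.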
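Answer 1 (score: 1)
Here is another alternative. Assume your data.frame is called df. Note that this drops only the rows that contain an NA or a negative number; the other rows of the same couple are kept, as the output shows.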
> na.omit(df[ rowSums(df[, sapply(df, is.numeric)]< 0, na.rm=TRUE) ==0, ])
coupleid gender shop time amount
1 1 W 3 1 29.05
2 1 W 1 2 31.65
4 1 W 2 4 17.75
6 2 W 1 1 42.30
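If whole couples should be dropped, the same numeric-column check can be used to collect the affected coupleid values first. This is a rough sketch, not part of the answer above; it assumes the data frame is called `df` as in that answer, and `num_cols`, `bad_row` and `kept` are illustrative names.

```r
# Toy data matching the question's example
df <- data.frame(
  coupleid = c(1, 1, 1, 1, 1, 2),
  gender   = "W",
  shop     = c(3, 1, 3, 2, 3, 1),
  time     = c(1, 2, 3, 4, 5, 1),
  amount   = c(29.05, 31.65, NA, 17.75, -28.40, 42.30)
)

# Logical vector marking the numeric columns
num_cols <- sapply(df, is.numeric)

# TRUE for rows with an NA or a negative value in any numeric column
bad_row <- rowSums(is.na(df[, num_cols]) | df[, num_cols] < 0, na.rm = TRUE) > 0

# Drop every row whose coupleid appears among the bad rows
kept <- df[!df$coupleid %in% df$coupleid[bad_row], ]
kept
```

Like the original one-liner, this looks at every numeric column, not only amount, so a negative value in shop or time would also remove the couple.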
Answer 2 (score: 1)
Another one, using data.table
require(data.table)
mergedmalefemale <- as.data.table(mergedmalefemale)
mergedmalefemale[, if(!any(is.na(amount) | amount < 0)) .SD, by=coupleid]
# coupleid gender shop time amount
#1: 2 W 1 1 42.3
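For readers without data.table, the same "keep a group only if none of its amounts is bad" idea can be approximated in base R with split() and Filter(). This is a sketch under the question's column names; `couples_df`, `groups` and `kept` are hypothetical names.

```r
# Toy data matching the question's example (couples_df is a hypothetical name)
couples_df <- data.frame(
  coupleid = c(1, 1, 1, 1, 1, 2),
  gender   = "W",
  shop     = c(3, 1, 3, 2, 3, 1),
  time     = c(1, 2, 3, 4, 5, 1),
  amount   = c(29.05, 31.65, NA, 17.75, -28.40, 42.30)
)

# One data frame per couple
groups <- split(couples_df, couples_df$coupleid)

# Keep only the groups in which no amount is NA or negative
good_groups <- Filter(function(g) !any(is.na(g$amount) | g$amount < 0), groups)

# Stack the surviving groups back together
# (this is NULL if no couple survives the filter)
kept <- do.call(rbind, good_groups)
kept
```

On large data this is likely slower than the data.table version, since it materialises one data frame per couple.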
Answer 3 (score: 0)
Here is a rather dirty way
# flag each coupleid: 1 = keep, 0 = remove (some amount is NA or negative)
# na.action=na.pass keeps the NA rows so that FUN actually sees them
agg <- aggregate(amount ~ coupleid, data=mergedmalefemale,
                 FUN=function(x) as.integer(all(!is.na(x) & x >= 0)),
                 na.action=na.pass)
# merge the flag back in; it shows up as "amount.y" next to the original "amount.x"
df.1 <- merge(mergedmalefemale, agg, by="coupleid")
# keep only the rows of couples flagged with 1
df.1 <- df.1[df.1$amount.y == 1, ]