> str(store)
'data.frame': 1115 obs. of 10 variables:
$ Store : int 1 2 3 4 5 6 7 8 9 10 ...
$ StoreType : Factor w/ 4 levels "a","b","c","d": 3 1 1 3 1 1 1 1 1 1 ...
$ Assortment : Factor w/ 3 levels "a","b","c": 1 1 1 3 1 1 3 1 3 1 ...
$ CompetitionDistance : int 1270 570 14130 620 29910 310 24000 7520 2030 3160 ...
$ CompetitionOpenSinceMonth: int 9 11 12 9 4 12 4 10 8 9 ...
$ CompetitionOpenSinceYear : int 2008 2007 2006 2009 2015 2013 2013 2014 2000 2009 ...
$ Promo2 : int 0 1 1 0 0 0 0 0 0 0 ...
$ Promo2SinceWeek : int NA 13 14 NA NA NA NA NA NA NA ...
$ Promo2SinceYear : int NA 2010 2011 NA NA NA NA NA NA NA ...
$ PromoInterval : Factor w/ 4 levels "","Feb,May,Aug,Nov",..: 1 3 3 1 1 1 1 1 1 1 ...
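I am trying to replace NA values based on the value of Promo2. If Promo2 == 0, the NA values in that row should become zero; if Promo2 == 1, the missing values should be replaced with the column mean.
I don't understand why my code doesn't modify the store data.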
for (i in 1:nrow(store)){
if(is.na(store[i,])== TRUE & store$Promo2[i] ==0){
store[i,] <- ifelse(is.na(store[i,]),0,store[i,])
}
else if (is.na(store[i,])== TRUE & store$Promo2[i] ==1){
for(j in 1:ncol(store)){
store[is.na(store[i,j]), j] <- mean(store[,j], na.rm = TRUE)
}
}
}
Answer 0 (score: 3)
For the Promo2SinceWeek column:
# Take the mean before filling zeros, so the zeros do not drag it down
week_mean <- mean(store$Promo2SinceWeek, na.rm=TRUE)
store$Promo2SinceWeek[store$Promo2==0 & is.na(store$Promo2SinceWeek)] <- 0
store$Promo2SinceWeek[store$Promo2==1 & is.na(store$Promo2SinceWeek)] <- week_mean
Use the same approach for the other columns. Vectorized operations are a very useful feature of R.
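As a rough sketch of how the same idea might extend to several columns at once: the small data frame below is made up for illustration (it only borrows the column names from the question), and the choice of which columns to fill is an assumption. Each column mean is computed before any zeros are written, so the filled-in zeros do not affect it.

```r
# Toy stand-in for the real data; the values are invented for illustration
store <- data.frame(
  Promo2          = c(0, 1, 1, 0, 1),
  Promo2SinceWeek = c(NA, 13, 14, NA, NA),
  Promo2SinceYear = c(NA, 2010, 2011, NA, NA)
)

# Columns to fill (an assumption; adjust to the columns that need it)
cols <- c("Promo2SinceWeek", "Promo2SinceYear")

for (col in cols) {
  col_mean <- mean(store[[col]], na.rm = TRUE)   # mean of the observed values
  missing  <- is.na(store[[col]])                # logical vector, one entry per row
  store[[col]][missing & store$Promo2 == 0] <- 0
  store[[col]][missing & store$Promo2 == 1] <- col_mean
}

store
```

The loop here runs over a handful of column names rather than over rows, so each assignment is still a single vectorized operation on the whole column.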
Answer 1 (score: 0)
Fixing the for loop:
for(i in 1:nrow(store)) {
col <- which(is.na(store[i,]))
store[i,][col] <- if(store$Promo2[i] == 1) colMeans(store[col], na.rm=TRUE) else 0
}
Or, if you'd rather avoid any if statements:
for (i in 1:nrow(store)) {
store[i,][is.na(store[i,]) & store$Promo2[i] ==0] <- 0
store[i,][is.na(store[i,]) & store$Promo2[i] ==1] <-
colMeans(store[,is.na(store[i,]) & store$Promo2[i] ==1], na.rm = TRUE)
}
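Your loop doesn't work because an if statement expects a single condition value from its test. Your loop passes it if(is.na(store[i,])== TRUE & store$Promo2[i] ==0), but that condition evaluates to many values, one per column: TRUE FALSE FALSE FALSE TRUE... What if needs is exactly one value, a single TRUE or a single FALSE, not a series of them. When given several values, older versions of R use only the first one and issue a warning; since R 4.2.0 this is an error.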
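A small sketch of the point above, using a made-up named vector in place of one row of store: the test produces one logical value per element, and wrapping it in any() is one way (not the only one) to reduce it to the single TRUE or FALSE that if expects.

```r
# Hypothetical single row, represented as a named numeric vector
row <- c(Promo2 = 0, Promo2SinceWeek = NA, Promo2SinceYear = NA)

# Element-wise test: one logical value per element, not a single value
cond <- is.na(row) & row[["Promo2"]] == 0
length(cond)

# any() collapses the vector to one TRUE/FALSE, which if() can use safely
if (any(cond)) {
  row[is.na(row)] <- 0
}
row
```

Whether any() or all() is the right reduction depends on the intent; here the question is "does this row contain at least one NA", so any() fits.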
Reproducible example
store
# Promo2 gear carb
#Mazda RX4 1 NA NA
#Mazda RX4 Wag 1 4 4
#Datsun 710 1 4 1
#Hornet 4 Drive 0 3 1
#Hornet Sportabout 0 3 NA
#Valiant 0 3 1
for(i in 1:nrow(store)) {
col <- which(is.na(store[i,]))
store[i,][col] <- if(store$Promo2[i] == 1) colMeans(store[col], na.rm=TRUE) else 0
}
store
# Promo2 gear carb
#Mazda RX4 1 3.4 1.75
#Mazda RX4 Wag 1 4.0 4.00
#Datsun 710 1 4.0 1.00
#Hornet 4 Drive 0 3.0 1.00
#Hornet Sportabout 0 3.0 0.00
#Valiant 0 3.0 1.00
Data
store <- head(mtcars)
store <- store[-(1:8)]
names(store)[1] <- "Promo2"
store[1,2] <- NA
store[5,3] <- NA
store[1,3] <- NA
store