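I have the code below. What I want to do is run it for every column in my data frame and save the output somewhere (perhaps another data frame with the associated column names). I tried writing a loop but keep getting this error:
Error in (finalset_row_number[i] - 1):(finalset_row_number[i] + 1) : NA/NaN argument
I understand why the error occurs, but I don't want it to stop the values for the other columns from being created. The code is as follows: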
df <- read.delim(file.choose(),header=TRUE)
#Take the control samples and average each row for three columns excluding the first two columns- add the per row means to the data frame
df$Means <- rowMeans(df[,30:32])
RowVar <- function(x) {rowSums((x - rowMeans(x))^2)/(dim(x)[2] - 1)}
df$sd=sqrt(RowVar(df[,c(30:32)]))
#Get a Z score by dividing the test sample count at each locus by the average for the control samples and divide everything by the st dev for controls at each locus.
{
df$ZScore <- (df[,35]-df$Means)/(df$sd)
######################################### QUARTILE FILTER ###########################################################
alpha=1.5
numberofControls = 3
UL = median(df$ZScore, na.rm = TRUE) + alpha*IQR(df$ZScore, na.rm = TRUE)
LL = median(df$ZScore, na.rm = TRUE) - alpha*IQR(df$ZScore, na.rm = TRUE)
#Copy the Z score if the score is > or < a certain number, i.e. LL or UL.
Zoutliers <- which(df$ZScore > UL | df$ZScore < LL)
df$Zoutliers <- ifelse(df$ZScore > UL |df$ZScore <LL ,1,-1)
tempout = ifelse(df$ZScore[Zoutliers] > UL,1,-1)
######################################### Three neighbour Isolation filter ##############################################################################
finalSeb=c()
for(i in 2:(length(Zoutliers)-1)){
j=Zoutliers[i]
if(sum(ifelse((j-1) == Zoutliers,1,0)) > 0 & tempout[i] == tempout[i-1] & sum(ifelse((j+1) == Zoutliers,1,0)) > 0 & tempout[i] == tempout[i+1]){
finalSeb = c(finalSeb,i)
}
}
finalset_row_number = Zoutliers[finalSeb]
#View(finalset_row_number)
p_seq = rep(0,nrow(df))
for(i in 1:length(finalset_row_number)){
p_seq[(finalset_row_number[i]-1):(finalset_row_number[i]+1)] = median(df$ZScore[(finalset_row_number[i]-1):(finalset_row_number[i]+1)])
}
sum(p_seq !=0)
CopyNumberCount
nrow(SU)
nrow(as.data.frame(finalset_row_number))
}
I tried using try() as follows:
for(i in 3:ncol(df)) try(
df$ZScore <- (df[,i]-df$Means)/(df$sd)
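and so on (with the parenthesis closed after the last line), but this just gave me some of the same error messages, although the columns without errors did work. Ideally, whenever I get this error message, the final output of nrow(as.data.frame(finalset_row_number)) would simply be set to zero. Can anyone help with this loop?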
Answer 0 (score: 0)
One way to keep the results of a loop is to add them to a data frame or an array.
For example, using a data frame:
# Set up result dataframe before the loop
results <- data.frame("column"=c(1:ncol(df)), "result"=rep(0,ncol(df)))
# do the loop
for (i in ...){
# calculation...
x <- result of your calculation
# store result
results$column[i] <- i
results$result[i] <- x
}
That way, at the end of the loop you have a clean data frame with the column names and the associated results.
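To get a zero for the columns that fail, as the question asks, the per-column calculation can be wrapped in tryCatch() so that an error yields 0 instead of stopping the loop. The following is only a minimal sketch under stated assumptions: analyse_column() is a hypothetical stand-in for the question's full calculation, the toy data frame and the limit of 1.5 are made up, and the helper deliberately uses the same 1:length(rows) pattern so that an empty selection raises the "NA/NaN argument" error.

```r
# Hypothetical stand-in for the per-column calculation in the question.
# It returns how many rows were selected for the column.
analyse_column <- function(z, limit = 1.5) {
  rows <- which(abs(z) > limit)
  # Same pattern as in the question: when rows is empty, 1:length(rows)
  # is c(1, 0), so rows[1] is NA and ':' raises "NA/NaN argument".
  for (i in 1:length(rows)) {
    idx <- (rows[i] - 1):(rows[i] + 1)  # the real code would use idx here
  }
  length(rows)
}

# Toy data, made up for illustration only
df <- data.frame(
  a = c(0.1, 2.0, -3.0, 0.2, 0.0),
  b = c(0.0, 0.1, 0.2, 0.3, 0.4),
  c = c(0.0, 5.0, 0.0, -4.0, 0.0)
)

# Set up the result data frame before the loop, one row per column
results <- data.frame(column = names(df), result = rep(0, ncol(df)))

for (i in seq_along(df)) {
  results$result[i] <- tryCatch(
    analyse_column(df[[i]]),
    error = function(e) 0  # record zero instead of stopping the loop
  )
}

print(results)
```

With this toy data, columns a and c each give 2 and column b, which has no selected rows, gives 0. The error could also be avoided at its source by looping over seq_along(rows) instead of 1:length(rows), since seq_along() produces an empty sequence for an empty vector.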