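I wrote a function in R that looks at several variables and then fills in a new column as follows:
- If any of the variables has a "1" entry, the new column should be "1".
- If all of the variables are NA, the new column should be NA.
This should be simple, but somehow it does not work. I think the problem is in the part of the code where I check that none of the values is NA: "if(!((is.na(variable))| ..."
Is there a better way to write this code? Please help!
Note: the real function contains more calculations, but I have included only this part to show the function's structure and my specific problem.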
# If they answered "1" (yes) to receiving any specific treatment,
# then say "1" (yes) in a new column called treated_psych
diag_treated <- function(x) {
  # Copy each column of the one-row data frame into a local variable
  for (v in 1:length(x)) assign(names(x)[v], x[[v]])
  if (!((is.na(CurrTx6.1_Group)) | (is.na(CurrTx6.1_Ind)) | (is.na(CurrTx6.1_Fam)) |
        (is.na(CurrTx6.1_Couples)) | (is.na(CurrTx7a_CBTAnx)) | (is.na(CurrTx7b_CBTDep)) |
        (is.na(CurrTx7c_CBTInsom)))) {
    if (CurrTx6.1_Group == 1 | CurrTx6.1_Ind == 1 | CurrTx6.1_Fam == 1 | CurrTx6.1_Couples == 1 |
        CurrTx7a_CBTAnx == 1 | CurrTx7b_CBTDep == 1 | CurrTx7c_CBTInsom == 1) {
      treated_psych <- 1
    } else {
      treated_psych <- 0
    }
  } else {
    treated_psych <- NA
  }
  treat <- data.frame(treated_psych)
  return(treat)
}
# Call the function on every row (adply comes from the plyr package)
library(plyr)
diagnoses_treated <- adply(dataset, 1, diag_treated)
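Answer 0 (score: 1)
I generated this sample data based on how you described your data. If it is not accurate, please provide a reproducible sample of your data.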
sample_data <- data.frame("CurrTx6.1_Group" = c(1, 1, 0, 0, NA),
                          "CurrTx6.1_Fam" = c(NA, NA, 0, 0, NA),
                          "CurrTx7b_CBTDep" = c(1, 1, 0, 1, NA))
sample_data
new_var <- rep("xxx", nrow(sample_data)) # Initialize the new column with a placeholder
for (i in 1:nrow(sample_data)) {
  if (all(is.na(sample_data[i, ]))) {
    new_var[i] <- NA # If all elements in the row are NA, mark the new variable NA
  }
}
not_na_index <- which(!is.na(new_var)) # Find the rows where the new value will be 0 or 1
# Sum the rows: a row of zeros stays 0, and a single 1 makes the sum at least 1
new_var[not_na_index] <- rowSums(sample_data, na.rm = TRUE)[not_na_index]
new_var <- as.numeric(new_var) # Convert to numeric (was initialized as character)
new_var[which(new_var > 1)] <- 1 # Cap any value higher than 1 at 1
sample_data$new_column <- new_var
sample_data
The resulting new variable is 1 1 0 1 NA.
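The same rule can probably be written without the loop and the placeholder string. The following is only a sketch on the sample data above; the names `any_yes` and `all_missing` are illustrative and do not come from the question.

```r
# Same sample data as above
sample_data <- data.frame(
  CurrTx6.1_Group = c(1, 1, 0, 0, NA),
  CurrTx6.1_Fam   = c(NA, NA, 0, 0, NA),
  CurrTx7b_CBTDep = c(1, 1, 0, 1, NA)
)

# TRUE for rows that contain at least one 1 (NA entries are ignored)
any_yes <- rowSums(sample_data == 1, na.rm = TRUE) > 0

# TRUE for rows where every entry is NA
all_missing <- rowSums(!is.na(sample_data)) == 0

# 1 if any entry is 1, 0 otherwise, then NA where the whole row is missing
new_column <- as.numeric(any_yes)
new_column[all_missing] <- NA
new_column
# 1 1 0 1 NA
```

Computing the two conditions over whole columns avoids calling `if` on values that may be NA, which is the situation in which a row-by-row `if` stops with an error.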
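Answer 1 (score: 0)
I ended up taking a subset of the columns, running two apply calls, and then using a for loop that goes through the two vectors created by the apply calls to build my new variable. It is not very elegant or efficient, but it works.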
# If they answered "1" (yes) to receiving any specific treatment,
# then say "1" (yes) in a new column called treated_psych
# Subset the data to just these columns
df_psych <- dat_with_pcl5[c("CurrTx6.1_Group", "CurrTx6.1_Ind", "CurrTx6.1_Fam",
                            "CurrTx6.1_Couples", "CurrTx7a_CBTAnx", "CurrTx7b_CBTDep", "CurrTx7c_CBTInsom")]
# Make one logical vector for rows where ANY value is 1, and another for rows where ALL values are NA
treated_psych1 <- apply(df_psych, 1, function(r) any(r %in% "1"))
treated_psych.na <- apply(df_psych, 1, function(r) all(r %in% NA))
# Initialize the result; the default of 0 (not treated) is an assumption taken from the question's code
treated_psych <- rep(0, length(treated_psych1))
# Loop through both vectors and create the new variable:
# 1 where treated_psych1 is TRUE, NA where treated_psych.na is TRUE
for (i in seq_along(treated_psych1)) {
  if (treated_psych1[i]) { treated_psych[i] <- 1 }
  if (treated_psych.na[i]) { treated_psych[i] <- NA }
}
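The final loop could likely be replaced by a single `ifelse()` call. This is a minimal sketch, not a drop-in replacement: the small `df_psych` below is made-up stand-in data with only three of the columns, and the fallback value of 0 for rows that are neither treated nor entirely missing is an assumption.

```r
# Hypothetical stand-in for df_psych (the real data is not shown in the post)
df_psych <- data.frame(
  CurrTx6.1_Group = c(1, 0, NA, NA),
  CurrTx6.1_Ind   = c(0, 0, NA, 1),
  CurrTx7a_CBTAnx = c(NA, 0, NA, NA)
)

# TRUE where any value in the row is 1
treated_psych1 <- apply(df_psych, 1, function(r) any(r %in% 1))

# TRUE where every value in the row is NA
treated_psych.na <- apply(df_psych, 1, function(r) all(is.na(r)))

# NA for all-missing rows, otherwise 1 if treated and 0 if not (assumed default)
treated_psych <- ifelse(treated_psych.na, NA_real_, as.numeric(treated_psych1))
treated_psych
```

Because `%in%` never returns NA, `treated_psych1` contains only TRUE or FALSE, so the `ifelse()` result is NA only for rows where all entries are missing.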