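Change multiple columns based on a condition and a column-name string

Time: 2019-03-16 01:05:19

Tags: r if-statement dplyr

I have a very sparse data set; a sample of its format is shown below. I want to change specific columns according to the logic explained below.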

# create dummy data set
pb=c('1','0','0','0','0','1','Not_ans','1','0','Not_ans')
qa=c('1','1','0','0','1','0','Not_ans','1','Not_ans','Not_ans')
#zy=c('1','Not_ans','0','1','Not_ans','0','1','1','1','Not_ans')

#sub questions for pb
pb.abr=c('1','0','0','0','0','1','0','1','0','0')
pb.ras=c('0','0','0','0','1','0','0','1','0','0')
pb.sfg=c('1','0','0','0','0','0','0','1','0','0')

#sub questions for qa
qa.fgs=c('1','0','0','0','0','0','0','1','0','0')
qa.sdf=c('0','1','0','0','0','0','0','0','0','0')
qa.tyu=c('0','0','0','0','1','0','0','1','0','0')

df=data.frame(pb,qa,pb.abr,pb.ras,pb.sfg,qa.fgs,qa.sdf,qa.tyu)
df

        pb      qa pb.abr pb.ras pb.sfg qa.fgs qa.sdf qa.tyu
1        1       1      1      0      1      1      0      0
2        0       1      0      0      0      0      1      0
3        0       0      0      0      0      0      0      0
4        0       0      0      0      0      0      0      0
5        0       1      0      1      0      0      0      1
6        1       0      1      0      0      0      0      0
7  Not_ans Not_ans      0      0      0      0      0      0
8        1       1      1      1      1      1      0      1
9        0 Not_ans      0      0      0      0      0      0
10 Not_ans Not_ans      0      0      0      0      0      0

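The two columns pb and qa are the base columns. Each has sub-columns named with the prefix pb. or qa., so there are three sub-columns for pb and three for qa. I want to change these sub-columns based on a condition on the corresponding base column (pb or qa).

The condition is: if pb == 'Not_ans', set all of its sub-columns (pb.abr, pb.ras and pb.sfg) to 'Not_applicable'.

How can I write a function that does this? I would pass in the base column name (e.g. pb) and the sub-column naming pattern (e.g. 'pb.'). My attempt looks like the following, but it does not give the desired result: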

data.frame(ifelse(df['base_q']=='Not_ans',
df[ , grepl( paste('base_q','.') , names(df) )]=='Not_applicable',df[,grepl( 
paste('base_q','.') , names(df)) ])

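How can I write a generic function that takes the base column numbers as input (e.g. 1, 2) and applies this logic to each in turn? That is, wherever pb is Not_ans it changes the sub-columns (pb.abr, pb.ras, pb.sfg) to Not_applicable, then moves on to column 2 (qa) and applies the same logic.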
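Two details in the attempt above help explain why it does not select any columns: 'base_q' is quoted, so it is the literal text rather than the value of a variable, and paste() inserts a space between its arguments. The attempt also uses == where a replacement value was intended. The following is a minimal sketch of the column-matching part only; the variable names base_q and cols are illustrative assumptions, not part of the original question.

```r
base_q <- "pb"
cols <- c("pb", "qa", "pb.abr", "pb.ras", "pb.sfg", "qa.fgs")

# paste() joins with a space by default; the quoted 'base_q' is literal text
literal_pattern <- paste("base_q", ".")   # "base_q ."

# paste0() joins without a separator and uses the variable's value
prefix <- paste0(base_q, ".")             # "pb."

# The literal pattern matches none of the column names
no_match <- grepl(literal_pattern, cols)

# startsWith() compares literally, so "." is not treated as a regex wildcard
is_sub_col <- startsWith(cols, prefix)
cols[is_sub_col]
```

Using startsWith() (available in base R since 3.3.0) sidesteps the regex meaning of ".", which would otherwise match any character.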

3 answers:

Answer 0 (score: 2)

You can use

yf=function(df,v){
   df[df[v]=='Not_ans',][,names(df)[substr(names(df),1,nchar(v)+1)==paste0(v,'.')]]='Not_applicable'
   return(df)
 }
yf(df,'pb')
        pb      qa         pb.abr         pb.ras         pb.sfg qa.fgs qa.sdf qa.tyu
1        1       1              1              0              1      1      0      0
2        0       1              0              0              0      0      1      0
3        0       0              0              0              0      0      0      0
4        0       0              0              0              0      0      0      0
5        0       1              0              1              0      0      0      1
6        1       0              1              0              0      0      0      0
7  Not_ans Not_ans Not_applicable Not_applicable Not_applicable      0      0      0
8        1       1              1              1              1      1      0      1
9        0 Not_ans              0              0              0      0      0      0
10 Not_ans Not_ans Not_applicable Not_applicable Not_applicable      0      0      0

Data input

df=data.frame(pb,qa,pb.abr,pb.ras,pb.sfg,qa.fgs,qa.sdf,qa.tyu,stringsAsFactors = F) 
# notice stringsAsFactors 
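The stringsAsFactors note matters because, in R versions before 4.0.0, data.frame() converted character vectors to factors by default, and a factor cannot take a value that is not one of its levels. A small sketch of the difference, using illustrative vectors f and s that are not part of the original answer:

```r
# Factor column: assigning a value outside the existing levels gives NA
f <- factor(c("0", "1", "0"))
suppressWarnings(f[2] <- "Not_applicable")  # would warn: invalid factor level
is.na(f[2])

# Character column: the assignment stores the new string as intended
s <- c("0", "1", "0")
s[2] <- "Not_applicable"
s
```

From R 4.0.0 onward the default is stringsAsFactors = FALSE, so setting it explicitly is mainly needed on older versions.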

Answer 1 (score: 0)

The following is one way. In mutate_at(), you can use vars() to specify the columns to which one or more functions should be applied. Here I used contains() to select the columns by name. Then I replaced the values in those columns with "Not_applicable" wherever pb == "Not_ans".

mutate_at(df, vars(contains("pb.")),
          .funs = funs(ifelse(pb == "Not_ans", "Not_applicable", .)))

#        pb      qa         pb.abr         pb.ras         pb.sfg qa.fgs qa.sdf qa.tyu
#1        1       1              2              1              2      1      0      0
#2        0       1              1              1              1      0      1      0
#3        0       0              1              1              1      0      0      0
#4        0       0              1              1              1      0      0      0
#5        0       1              1              2              1      0      0      1
#6        1       0              2              1              1      0      0      0
#7  Not_ans Not_ans Not_applicable Not_applicable Not_applicable      0      0      0
#8        1       1              2              2              2      1      0      1
#9        0 Not_ans              1              1              1      0      0      0
#10 Not_ans Not_ans Not_applicable Not_applicable Not_applicable      0      0      0

If you want to apply the same task to both pb and qa, you can use mutate_at() twice.
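As a hedged sketch of applying the same step to both base columns, the following uses across(), which superseded mutate_at() in dplyr 1.0.0 (funs() was deprecated in dplyr 0.8.0). It assumes dplyr 1.0.0 or later and a small illustrative data frame with character columns; it is not the answerer's original code.

```r
library(dplyr)

# Small illustrative data set (character columns, not factors)
df_small <- data.frame(
  pb     = c("1", "Not_ans", "0"),
  qa     = c("Not_ans", "1", "0"),
  pb.abr = c("1", "0", "0"),
  qa.fgs = c("0", "1", "0"),
  stringsAsFactors = FALSE
)

out <- df_small %>%
  # Sub-columns of pb: replace where pb is "Not_ans"
  mutate(across(starts_with("pb."),
                ~ ifelse(pb == "Not_ans", "Not_applicable", .x))) %>%
  # Sub-columns of qa: replace where qa is "Not_ans"
  mutate(across(starts_with("qa."),
                ~ ifelse(qa == "Not_ans", "Not_applicable", .x)))
out
```

Keeping the columns as character avoids the integer codes (1 and 2 instead of 0 and 1) that appear when ifelse() is applied to factor columns.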

Answer 2 (score: 0)

Based on the answer given by @Wen-Ben, the following code works:

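A minimal base R sketch of the generalisation the question asks for, taking base column numbers as input and processing each in turn. The function name mark_not_applicable and its arguments are hypothetical, and this is not the answerer's original code.

```r
mark_not_applicable <- function(df, base_cols, flag = "Not_ans",
                                value = "Not_applicable") {
  # Accept column positions (e.g. 1:2) and turn them into names
  if (is.numeric(base_cols)) base_cols <- names(df)[base_cols]
  for (base in base_cols) {
    # Sub-columns are those whose names start with "<base>."
    sub_cols <- names(df)[startsWith(names(df), paste0(base, "."))]
    # Rows where the base column holds the flag; which() drops NAs
    rows <- which(as.character(df[[base]]) == flag)
    for (col in sub_cols) {
      # Coerce to character so factor columns can take the new value
      df[[col]] <- as.character(df[[col]])
      df[[col]][rows] <- value
    }
  }
  df
}

# Illustrative data with the same naming convention as the question
demo <- data.frame(
  pb     = c("1", "Not_ans", "0"),
  qa     = c("Not_ans", "1", "Not_ans"),
  pb.abr = c("1", "0", "0"),
  qa.fgs = c("0", "1", "0"),
  stringsAsFactors = FALSE
)
res <- mark_not_applicable(demo, 1:2)
res
```

Assigning column by column keeps the base columns themselves untouched and leaves the data frame unchanged when a base column has no matching sub-columns or no flagged rows.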