Question

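I am trying to update a data frame column inside a function, based on a filter applied to a column passed in as an argument.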

#example dataframe
my.df = data.frame(A=1:10)

#define function to classify column passed as argument 2 based on argument 3
classify = function(df, col, threshold){
  df[df$col<threshold, 2] <- "low"
  df[df$col>=threshold, 2] <- "high"

  return(df)
}

#assign output to new.df
new.df = classify(my.df, A, 5)

I expected the new column to contain the character values 'low' or 'high', but instead every value is <NA>.

Answer 1

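We can do this with the devel version of dplyr (soon to be released as 0.6.0). enquo takes the input argument and converts it to a quosure, which is then evaluated inside mutate/group_by/filter etc. by unquoting it with UQ.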

library(dplyr)
classify <- function(df, col, threshold){
   col <- enquo(col)

   df %>%
       mutate(categ = ifelse(UQ(col) < threshold, "low", "high"))

}

classify(my.df, A, 5)
#    A categ
#1   1   low
#2   2   low
#3   3   low
#4   4   low
#5   5  high
#6   6  high
#7   7  high
#8   8  high
#9   9  high
#10 10  high
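
On later dplyr releases the same idea is usually written with the {{ }} (curly-curly) shorthand, or with !!, instead of UQ(). The following is only a minimal sketch, assuming dplyr 1.0 or later together with rlang 0.4.0 or later; the name classify_curly is hypothetical and not part of the original answer.

```r
library(dplyr)

my.df <- data.frame(A = 1:10)

# Hypothetical variant of classify(): {{ col }} captures the unquoted
# column name and unquotes it inside mutate() in a single step,
# replacing the enquo() + UQ() pair.
classify_curly <- function(df, col, threshold) {
  df %>%
    mutate(categ = ifelse({{ col }} < threshold, "low", "high"))
}

res <- classify_curly(my.df, A, 5)
```

The call site is unchanged: the column is still passed unquoted, as in classify(my.df, A, 5).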

Answer 2

Simply pass the column name as a string literal, "A", and inside the function index with the double-bracket [[...]] operator rather than $:

# example dataframe
my.df = data.frame(A=1:10)

# define function to classify column passed as argument 2 based on argument 3
classify = function(df, col, threshold){
  df[df[[col]] < threshold, 2] <- "low"
  df[df[[col]] >= threshold, 2] <- "high"

  return(df)
}

# assign output to new.df
new.df = classify(my.df, "A", 5)

new.df    
#     A   V2
# 1   1  low
# 2   2  low
# 3   3  low
# 4   4  low
# 5   5 high
# 6   6 high
# 7   7 high
# 8   8 high
# 9   9 high
# 10 10 high
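
To see why $ fails where [[...]] works, and to avoid the auto-generated column name V2, here is a small base R sketch. The names classify_named and new_col are hypothetical additions for illustration, not part of the original answer.

```r
my.df <- data.frame(A = 1:10)
col <- "A"

# `$` does not evaluate its right-hand side: it looks for a column
# literally named "col", which does not exist here, and returns NULL.
is.null(my.df$col)
# TRUE

# `[[` evaluates `col` first, so it extracts the column named "A".
identical(my.df[[col]], my.df$A)
# TRUE

# Hypothetical variant that assigns the result to a named column
# instead of relying on position 2.
classify_named <- function(df, col, threshold, new_col = "categ") {
  df[[new_col]] <- ifelse(df[[col]] < threshold, "low", "high")
  df
}

new.df <- classify_named(my.df, "A", 5)
```

Because df$col is NULL, the comparison in the question's version selects no rows, which is why the new column there ends up entirely <NA>.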

Filter dataframe inside function using column argument
