Using dplyr

Time: 2016-07-19 10:31:11

Tags: r dplyr

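I am trying to select columns whose names are stored in a string variable and perform some calculations on them.

Suppose I am analyzing iris and want to find all the ratios between length and width.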

# Manual mutation (ie: adding the column names explicitly in the mutate statement) 
iris %>% 
  mutate(Sepal.ratio = Sepal.Length/Sepal.Width, 
         Petal.ratio = Petal.Length/Petal.Width)

# Output: 
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.ratio Petal.ratio
# 1          5.1         3.5          1.4         0.2  setosa    1.457143        7.00
# 2          4.9         3.0          1.4         0.2  setosa    1.633333        7.00
# 3          4.7         3.2          1.3         0.2  setosa    1.468750        6.50
# 4          4.6         3.1          1.5         0.2  setosa    1.483871        7.50
# 5          5.0         3.6          1.4         0.2  setosa    1.388889        7.00
# 6          5.4         3.9          1.7         0.4  setosa    1.384615        4.25


Question: Is there a way to use a variable or data frame that specifies the column names (such as ratioSets, defined below)?

# Predefined or preprocessed column name set: 
ratioSets = rbind(c(value = 'Sepal.ratio', numerator = 'Sepal.Length', denominator = 'Sepal.Width'), 
                 c(value = 'Petal.ratio', numerator = 'Petal.Length', denominator = 'Petal.Width'))

# Automated mutation:
iris %>% 
  mutate(
    # How can I use the ratioSets here?
    # Something like : ratioSets$value = ratioSets$numerator / ratioSets$denominator
  )


# Expected Output: 
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.ratio Petal.ratio
# 1          5.1         3.5          1.4         0.2  setosa    1.457143        7.00
# 2          4.9         3.0          1.4         0.2  setosa    1.633333        7.00
# 3          4.7         3.2          1.3         0.2  setosa    1.468750        6.50
# 4          4.6         3.1          1.5         0.2  setosa    1.483871        7.50
# 5          5.0         3.6          1.4         0.2  setosa    1.388889        7.00
# 6          5.4         3.9          1.7         0.4  setosa    1.384615        4.25

1 answer:

Answer 0 (score: 1)

One approach, assuming the numerator always comes before the denominator (i.e. length before width):

# Measurement columns only (drop Species, the last column)
num <- iris[, -ncol(iris)]
# Prefix before the first dot: "Sepal" or "Petal"
prefix <- sub('\\..*', '', names(num))
# For each prefix, divide its columns left to right (Length / Width)
ratios <- sapply(unique(prefix), function(i) Reduce('/', num[, prefix == i]))

head(cbind(iris, ratios))

#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species    Sepal Petal
#1          5.1         3.5          1.4         0.2  setosa 1.457143  7.00
#2          4.9         3.0          1.4         0.2  setosa 1.633333  7.00
#3          4.7         3.2          1.3         0.2  setosa 1.468750  6.50
#4          4.6         3.1          1.5         0.2  setosa 1.483871  7.50
#5          5.0         3.6          1.4         0.2  setosa 1.388889  7.00
#6          5.4         3.9          1.7         0.4  setosa 1.384615  4.25
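
The approach above derives the column pairs from the name prefixes and does not use ratioSets. A minimal sketch of how the question's ratioSets could drive mutate() directly is given here. It assumes a dplyr version with tidy evaluation (0.7 or later), where `!!value :=` names a new column from a string and `.data[[...]]` looks up an existing column by string. The names `result`, `value`, `num` and `den` are illustrative only and not from the original post.

```r
library(dplyr)

# Column-name sets from the question: a character matrix, one row per ratio
ratioSets <- rbind(
  c(value = 'Sepal.ratio', numerator = 'Sepal.Length', denominator = 'Sepal.Width'),
  c(value = 'Petal.ratio', numerator = 'Petal.Length', denominator = 'Petal.Width'))

result <- iris
for (k in seq_len(nrow(ratioSets))) {
  value <- ratioSets[k, 'value']        # name of the new column
  num   <- ratioSets[k, 'numerator']    # name of the numerator column
  den   <- ratioSets[k, 'denominator']  # name of the denominator column
  # Add one ratio column per row of ratioSets
  result <- result %>% mutate(!!value := .data[[num]] / .data[[den]])
}

head(result)
```

Because ratioSets is built with rbind() it is a matrix, so its entries are read with `ratioSets[k, 'value']` rather than `ratioSets$value`. Unlike the prefix-based approach, this does not depend on the order of the columns in iris.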