How to avoid repeating code when using plyr

Time: 2012-12-13 15:03:51

Tags: r plyr

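I want to generate the same type of chart for several groupings of my data. Currently I use plyr to split the data and run some code for each grouping.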

For example, suppose the data frame contains company, department, region, and revenue. Here is my pseudocode:

    d_ply(dataframe, .(company), function(df) {
      d_ply(df, .(department), function(df) {
        d_ply(df, .(region), function(df) {
          bar_chart(df$region, df$revenue)
        })
        bar_chart(df$department, df$revenue)
      })
      bar_chart(df$company, df$revenue)
    })

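In my real case I need to do several things for each group, and the code is about 10 lines long. Is there a way to avoid repeating that code for every combination, other than creating a function and just passing in the appropriate arguments? I was hoping for some magical plyr trick.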

1 answer:

Answer 0: (score: 1)

Dummy data:

d <- data.frame(company=letters[1:26],
                department=sample(letters[1:10],26,replace=TRUE),
                region=sample(letters[1:3],26,replace=TRUE),
                revenue=round(runif(26)*10000))

Update

I think your code needs some explanation:

d_ply(dataframe, .(company), function(df) {      # by company
  d_ply(df, .(department), function(df) {        # by department
    d_ply(df, .(region), function(df) {          # by region
      bar_chart(df$region, df$revenue)
      # this part is essentially equivalent to
      # d_ply(dataframe, .(company, department, region), function(df) plot(df))
    })
    bar_chart(df$department, df$revenue)
    # this part is essentially equivalent to
    # d_ply(dataframe, .(company, department), function(df) fun(df))
  })
  bar_chart(df$company, df$revenue)
  # this part is essentially equivalent to
  # d_ply(dataframe, .(company), function(df) fun(df))
})
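
To check the equivalence claimed in the comments above, here is a rough sketch that counts how many groups the innermost function sees in the nested form versus a single flat `d_ply` call. The data frame `d2` and the counters `nested` and `flat` are made up for this illustration; only `d_ply` and `.()` come from plyr.

```r
library(plyr)

set.seed(42)  # arbitrary seed, only so the illustration is reproducible
# Hypothetical data with repeated companies, so the grouping levels really differ
d2 <- data.frame(company    = sample(c("a", "b", "c"), 30, replace = TRUE),
                 department = sample(c("x", "y"), 30, replace = TRUE),
                 region     = sample(c("n", "s"), 30, replace = TRUE),
                 revenue    = round(runif(30) * 10000))

# Nested form: count how often the innermost function is called
nested <- 0
d_ply(d2, .(company), function(df) {
  d_ply(df, .(department), function(df) {
    d_ply(df, .(region), function(df) {
      nested <<- nested + 1
    })
  })
})

# Flat form: one call, grouping by all three variables at once
flat <- 0
d_ply(d2, .(company, department, region), function(df) flat <<- flat + 1)

nested == flat  # both visit one group per observed combination
```

The counters are only there to make the comparison visible; in real code the body of the innermost function would be the plotting code.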

I find your code very hard to follow. It can be replaced with:

some.fun <- function(df, ...) {
# ...
}

d_ply(d, .(company), function(df) some.fun(df, ...))
d_ply(d, .(company,department), function(df) some.fun(df, ...)) 
d_ply(d, .(company,department,region), function(df) some.fun(df, ...))
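
If even three near-identical calls feel repetitive, one option is to keep the grouping variables in a list and loop over it. This is only a sketch under some assumptions: `summarise_group` is a hypothetical stand-in for the roughly 10 lines of per-group work, and it returns a small summary through `ddply` instead of drawing a chart so that the result can be inspected. It relies on plyr accepting a character vector of column names as the grouping argument.

```r
library(plyr)

set.seed(1)  # arbitrary seed, only so the dummy data is reproducible
d <- data.frame(company    = letters[1:26],
                department = sample(letters[1:10], 26, replace = TRUE),
                region     = sample(letters[1:3], 26, replace = TRUE),
                revenue    = round(runif(26) * 10000))

# Hypothetical per-group function; replace the body with the real plotting code
summarise_group <- function(df) {
  data.frame(n = nrow(df), total = sum(df$revenue))
}

# Each element names the columns that define one level of grouping
groupings <- list("company",
                  c("company", "department"),
                  c("company", "department", "region"))

# Apply the same function once per grouping level
results <- lapply(groupings, function(vars) ddply(d, vars, summarise_group))
names(results) <- sapply(groupings, paste, collapse = " x ")
```

For side effects such as plots, swapping `ddply` for `d_ply` in the `lapply` call should work the same way, since the per-group function no longer needs to return anything.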