How to add multiple columns to a data frame from a custom function in R

Time: 2016-08-23 06:20:00

Tags: r function dataframe

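I have written code that takes an input vector, builds a data frame from it, optimizes some values, and returns a few of them. I am now turning it into a function that is evaluated row by row over an input data frame. Below is a minimal working example of what I want to achieve (my actual function is far too long to share here!):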

# Randomly generated dataframe
df <-  data.frame(a = rnorm(10, 0, 1), x = rnorm(10, 1, 3), y = rnorm(10, 2, 3))

# Function that takes multiple arguments and returns multiple values in a list
zsummary <- function(x, y) { 
  if (y < 0) return(list(NA, NA))
  z = rnorm(10, x, abs(y))
  return(list(mean(z), sd(z)))
}

# Example of something that works using dplyr
#    However, this results in a lot of function calls...
#    especially if there were a lot of columns in the list...
library(dplyr)
df %>% rowwise() %>%
  mutate(mean = zsummary(x,y)[[1]], sd = zsummary(x,y)[[2]])

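As you can see, I cannot apply a separate function to each of the new df$mean and df$sd columns, because both depend on the z vector, which can only be generated once. I have searched SO but have not found an answer. I think the solution is to use one of the apply functions rather than a function from dplyr, but I have never fully understood the apply family. I would also like to avoid a solution that uses a for loop and rbind: I tried that approach in a previous project, and it became very slow for large data frames!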

1 answer:

Answer 0 (score: 2)

We can use mapply. Since zsummary takes two arguments, mapply is a natural option: it passes the corresponding elements of 'x' and 'y' to zsummary.

t(mapply(zsummary, df$x, df$y))
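To attach the result to the original data frame, one possible sketch is shown below. It assumes a variant of the function, here called zsummary_vec (a hypothetical name, not from the question), that returns a named numeric vector instead of a list, so that mapply simplifies the results into a numeric matrix. The seed is also an assumption, added only so the sketch is reproducible.

```r
set.seed(42)  # assumption: arbitrary seed, only to make the sketch reproducible
df <- data.frame(a = rnorm(10, 0, 1), x = rnorm(10, 1, 3), y = rnorm(10, 2, 3))

# Hypothetical variant of zsummary that returns a named numeric vector
zsummary_vec <- function(x, y) {
  if (y < 0) return(c(mean = NA_real_, sd = NA_real_))
  z <- rnorm(10, x, abs(y))
  c(mean = mean(z), sd = sd(z))
}

# mapply simplifies the per-row vectors into a 2 x n numeric matrix;
# t() turns it into n x 2 so that it lines up with the rows of df
res <- t(mapply(zsummary_vec, df$x, df$y))

# Bind the two new columns onto the original data frame
df_out <- cbind(df, res)
```

Each row calls the function exactly once, so the mean and sd columns come from the same simulated z.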

We can also change the function slightly and use dplyr to get the output:
zsummary <- function(x, y) { 
   if (y < 0) return(data.frame(mean = NA, sd = NA))
   z = rnorm(10, x, abs(y))
   data.frame(mean = mean(z), sd = sd(z))
}

 df %>%
     rowwise() %>% 
     do(data.frame(., zsummary(.$x, .$y)))
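If dplyr is not available, roughly the same result can be sketched in base R with Map and a single rbind call. This is an alternative technique, not part of the original answer; the seed and the names rows, stats and df_base are assumptions made for illustration, and NA_real_ is used in place of NA so the columns stay numeric.

```r
set.seed(1)  # assumption: arbitrary seed, only to make the sketch reproducible
df <- data.frame(a = rnorm(10, 0, 1), x = rnorm(10, 1, 3), y = rnorm(10, 2, 3))

# data.frame-returning zsummary, as above but with NA_real_ for the NA case
zsummary <- function(x, y) {
  if (y < 0) return(data.frame(mean = NA_real_, sd = NA_real_))
  z <- rnorm(10, x, abs(y))
  data.frame(mean = mean(z), sd = sd(z))
}

# Map calls zsummary once per (x, y) pair and returns a list of 1-row data frames
rows <- Map(zsummary, df$x, df$y)

# Bind all rows in one call instead of growing a data frame inside a loop
stats <- do.call(rbind, rows)
df_base <- cbind(df, stats)
```

Calling rbind once on the whole list avoids the repeated copying that makes rbind inside a for loop slow on large data frames.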

Or, as we discussed in the comments, instead of having the function take multiple arguments, we can use apply with MARGIN = 1 to apply zsummary2 to each row.

zsummary2 <- function(v1) {
  if (v1[2] < 0) return(c(mean = NA, sd = NA))
  z <- rnorm(10, v1[1], abs(v1[2]))
  c(mean = mean(z), sd = sd(z))
}

t(apply(df[-1], 1, zsummary2))
#            mean        sd
#  [1,]  1.403066 0.8757504
#  [2,]  5.058188 5.1401507
#  [3,]  4.288365 1.4194393
#  [4,]  1.932829 6.7587054
#  [5,] -1.864236 3.7587462
#  [6,]        NA        NA
#  [7,]  3.328629 1.3711950
#  [8,] -2.347699 5.0449958
#  [9,]  2.936615 1.7332283
# [10,]        NA        NA

Note: the values differ on each run because we did not set any seed for rnorm.
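As a small illustration of that note, the sketch below calls set.seed before generating the data and applying the row-wise function, so two runs with the same seed give identical results. The helper run_once and the seed value 123 are assumptions made for this example, and zsummary2 here summarizes the simulated vector z.

```r
# Row-wise summary function taking a single vector c(x, y)
zsummary2 <- function(v1) {
  if (v1[2] < 0) return(c(mean = NA, sd = NA))
  z <- rnorm(10, v1[1], abs(v1[2]))
  c(mean = mean(z), sd = sd(z))
}

# Hypothetical helper: fix the seed, build the data, apply zsummary2 to each row
run_once <- function(seed) {
  set.seed(seed)
  df <- data.frame(a = rnorm(10, 0, 1), x = rnorm(10, 1, 3), y = rnorm(10, 2, 3))
  t(apply(df[-1], 1, zsummary2))
}

first  <- run_once(123)
second <- run_once(123)

# Same seed, same sequence of random draws, same result
identical(first, second)
```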