Question

So initially I have the following object:

> head(gs)
  year disturbance lek_id  complex tot_male
1 2006           N     3T  Diamond        3
2 2007           N     3T  Diamond       17
3 1981           N   bare 3corners        4
4 1982           N   bare 3corners        7
5 1983           N   bare 3corners        2
6 1985           N   bare 3corners        5

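I calculated general statistics (n, min, max, mean, and sd of tot_male) for each year within each complex. I then combined them, by year within complex, into a single data set using the following: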

gsnew <- gs %>%
  group_by(year, complex) %>%
  summarise(n = length(tot_male),
            male_min = min(tot_male),
            male_max = max(tot_male),
            male_mean = mean(tot_male),
            male_sd = sd(tot_male))

Resulting in:

> gsnew
Source: local data frame [119 x 7]
Groups: year [?]

   year  complex     n male_min male_max male_mean   male_sd
  (int)   (fctr) (int)    (int)    (int)     (dbl)     (dbl)
1  1967  Diamond     2       33      101 67.000000 48.083261
2  1969  Diamond     2       29       69 49.000000 28.284271
3  1970 3corners     1       26       26 26.000000        NA
4  1970  Diamond     4        3       51 26.250000 21.093048
5  1971 3corners     3        6       22 12.333333  8.504901

How can I write a general function in the following format

FunctionName = function(Argument1, ..., ArgumentN) {Statement1, ..., StatementN}
• Argument1-N are any variable from object(s)
• Statement1-N are any valid R statements

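that would allow me to:
• import the data
• select from the data the specified year for which statistics are needed
• calculate the mean, 2SD, n, and 90% confidence interval for the specified year within each lek complex
• write the year-based output as a separate *.csv file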

year     complex     mean  st.dev2  n     lo90ci    hi90ci
2007    3corners 26.28571 52.04760  7 -393.50827 446.07970
2007        Blue 18.87500 20.15476  8  -40.00856  77.75856
2007 book_cliffs  4.50000 13.19091  6  -24.62443  33.62443
2007     Diamond 13.25000 48.83431 20 -205.38461 231.88461

Answer 1

Well, I think you're pretty close. It might look something like this:

library(dplyr)

# Read a CSV, keep the rows for one year, summarise tot_male,
# and write the summary to a new CSV.
read_write = function(file_name, this_year) {
  file_name %>%
  read.csv %>%
  filter(year == this_year) %>%
  summarise(n = length(tot_male), 
            male_min = min(tot_male), 
            male_max = max(tot_male), 
            male_mean = mean(tot_male), 
            male_sd = sd(tot_male),
            male_2sd = 2*male_sd,
            male_upper_bound = male_mean + 1.645*male_sd,
            male_lower_bound = male_mean - 1.645*male_sd) %>%
  # output file is the input name prefixed with "out_"
  write.csv("out_" %>% paste0(file_name), row.names = FALSE)
  }
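
A side note on the bounds: `male_mean ± 1.645*male_sd` describes the spread of the individual counts under a normal approximation. If what you want is a 90% confidence interval for the mean itself, one common construction uses the standard error and a t quantile instead. A minimal sketch in base R; `ci90_mean` is a hypothetical helper name, not something from the question, and the sample values are just the `tot_male` column from `head(gs)`:

```r
# Hypothetical helper: 90% confidence interval for the mean of x,
# based on the t distribution with n - 1 degrees of freedom.
ci90_mean <- function(x) {
  n <- length(x)
  se <- sd(x) / sqrt(n)            # standard error of the mean
  t_crit <- qt(0.95, df = n - 1)   # two-sided 90% interval => 0.95 quantile
  c(lower = mean(x) - t_crit * se,
    upper = mean(x) + t_crit * se)
}

# tot_male values shown in head(gs)
ci90_mean(c(3, 17, 4, 7, 2, 5))
```

Groups with a single lek (n = 1) have no defined sd, so the bounds come back as NA or NaN for those rows.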

Answer 2

Thanks, @bramtayl.

Here is the final code:

> library(dplyr)
> annualleksummary = function(x1) {
+   x1 %>%
+   read.csv %>% 
+   filter(year == 2007) %>% group_by(year, complex) %>%
+   summarise(n = length(tot_male), 
+             male_min = min(tot_male), 
+             male_max = max(tot_male), 
+             male_mean = mean(tot_male), 
+             male_sd = sd(tot_male),
+             male_2sd = 2*male_sd,
+             male_upper_bound = male_mean + 1.645*male_sd,
+             male_lower_bound = male_mean - 1.645*male_sd) %>%
+   write.csv("2007_" %>% paste0(x1), row.names = FALSE) 
+   }
> annualleksummary("gsg_leks.csv")
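
The year is hard-coded above. If separate files for several years are needed, the same pipeline could be wrapped in a loop. This is only a sketch under a few assumptions: `annual_lek_summary`, `in_file`, `years`, and `out_dir` are illustrative names, and the input is assumed to have the `year`, `complex`, and `tot_male` columns shown in the question.

```r
library(dplyr)

# Hypothetical generalisation: write one summary CSV per requested year.
annual_lek_summary <- function(in_file, years, out_dir = ".") {
  gs <- read.csv(in_file)              # import the data once
  for (yr in years) {
    out <- gs %>%
      filter(year == yr) %>%           # keep only the requested year
      group_by(year, complex) %>%      # one row per lek complex
      summarise(n = n(),
                male_mean = mean(tot_male),
                male_sd = sd(tot_male),
                male_2sd = 2 * male_sd,
                male_lower_bound = male_mean - 1.645 * male_sd,
                male_upper_bound = male_mean + 1.645 * male_sd)
    # output name is "<year>_<input file name>", e.g. 2007_gsg_leks.csv
    write.csv(out,
              file.path(out_dir, paste0(yr, "_", basename(in_file))),
              row.names = FALSE)
  }
  invisible(NULL)
}
```

It would be called as `annual_lek_summary("gsg_leks.csv", years = c(2006, 2007))`, which should leave `2006_gsg_leks.csv` and `2007_gsg_leks.csv` in the working directory.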

Write a function that imports data, calculates summary statistics by a variable condition, and writes an output file
