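I have some code that randomly samples 1 to 10 rows from a data frame, replicates the random sampling 5 times, and calculates a network metric (connectance) on each random sample. However, I would like to run this code separately for each level of "site" and "method" in my data frame.
How can I split the data frame (df) by site and method, run the code below on each subset, and then return all of the output in a single file with the columns "site", "method", "size" (the number of rows sampled) and "connectance"?
Here is what I have so far: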
df <- read.table(text = "bird_sp plant_sp value site method
1 species_a plant_a 1 a m
2 species_a plant_a 1 a m
3 species_b plant_b 1 a m
4 species_b plant_b 1 a m
5 species_c plant_c 1 a m
6 species_a plant_a 1 b m
7 species_a plant_a 1 b m
8 species_b plant_b 1 b m
9 species_b plant_b 1 b m
10 species_c plant_c 1 b m
11 species_a plant_a 1 a f
12 species_a plant_a 1 a f
13 species_b plant_b 1 a f
14 species_b plant_b 1 a f
15 species_c plant_c 1 a f
16 species_a plant_a 1 b f
17 species_a plant_a 1 b f
18 species_b plant_b 1 b f
19 species_b plant_b 1 b f
20 species_c plant_c 1 b f", header = TRUE)
# make sample function
sample_fun <- function(x, size) {
  rows <- sample(1:nrow(x), size, replace = FALSE)
  intlist <- x[rows, ]
  return(intlist)
}
# convert list to interaction matrix
make_mat <- function(x) {
  mat <- with(x, tapply(value, list(plant_sp, bird_sp), sum))
  mat[is.na(mat)] <- 0
  return(mat)
}
#create vector with required sample size and replication
size_vector <- rep(1:10,5)
#use vector to generate list of interactions
samples_Data <- lapply(size_vector, function(x) sample_fun(df,x))
output <- lapply(samples_Data, make_mat)
library(bipartite)
#calculate connectance on each element (matrix) in output list
#ignore warnings
metrics <- lapply(output, networklevel, index=c("connectance"))
met <- data.frame(unlist(metrics))
names(met) <- names(metrics[[1]])
#Add number of interactions sampled
met$size <- size_vector
Answer 0 (score: 0)
You can split the dataset by site and method with the following command.
df_split <- split(df, paste0(df$site, df$method))
Afterwards you can apply a function to each subset with lapply, e.g.:
lapply(df_split, FUN = nrow)
To assemble the output you can do, for example:
result <- unique(df[, c("site", "method")])
result <- result[order(result$site, result$method),] # !! SEE BELOW
result$rows <- sapply(df_split, FUN = nrow) # sapply returns a plain vector instead of a list column
result
   site method rows
11    a      f    5
1     a      m    5
16    b      f    5
6     b      m    5
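Be sure to run the order command! split() converts the grouping vector to a factor, so the subsets come back sorted by level (alphabetically here), and result has to be in the same order for the counts to line up with the right site and method.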
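If you would rather not depend on the ordering at all, one option is to read site and method from each subset itself. This is only a minimal sketch: the small df below is a stand-in for the question's data, and the helper name summarise_subset is hypothetical.

```r
# Small stand-in for the question's df (only the columns needed here)
df <- data.frame(
  site   = rep(c("a", "b", "a", "b"), each = 5),
  method = rep(c("m", "f"), each = 10),
  value  = 1
)

df_split <- split(df, paste0(df$site, df$method))

# Hypothetical helper: one summary row per subset, with the labels taken
# from the subset itself, so the order of df_split does not matter
summarise_subset <- function(d) {
  data.frame(site = d$site[1], method = d$method[1], rows = nrow(d))
}

# Stack the one-row summaries into a single data frame
result <- do.call(rbind, lapply(df_split, summarise_subset))
result
```

Because every row is built from its own subset, the labels cannot drift out of step with the counts, even if the grouping key is changed later.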
To generate your variable, put all of your code from above into a function and run it on each subset in the same way as the nrow function shown above.
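As a rough sketch of that last step, the whole pipeline could be wrapped like this. Several things here are my assumptions rather than part of the question: run_subset and connectance_stub are hypothetical names, the stub computes connectance as the proportion of non-zero cells so that the example runs without the bipartite package, and the sample size is capped at the number of rows in each subset, because sample(..., replace = FALSE) fails when asked for more rows than exist (each subset here has only 5).

```r
# Rebuild the example data from the question (20 rows, 4 site/method groups)
df <- data.frame(
  bird_sp  = rep(c("species_a", "species_a", "species_b", "species_b", "species_c"), 4),
  plant_sp = rep(c("plant_a", "plant_a", "plant_b", "plant_b", "plant_c"), 4),
  value    = 1,
  site     = rep(c("a", "b", "a", "b"), each = 5),
  method   = rep(c("m", "f"), each = 10)
)

# Helpers from the question
sample_fun <- function(x, size) x[sample(seq_len(nrow(x)), size, replace = FALSE), ]
make_mat <- function(x) {
  mat <- with(x, tapply(value, list(plant_sp, bird_sp), sum))
  mat[is.na(mat)] <- 0
  mat
}

# Stand-in (assumption): proportion of non-zero cells in the matrix.
# In real use, swap in networklevel(mat, index = "connectance") from bipartite.
connectance_stub <- function(mat) sum(mat > 0) / length(mat)

# Hypothetical wrapper: runs the whole pipeline on one site/method subset
run_subset <- function(d, max_size = 10, reps = 5) {
  # Cap the sample size at the subset's row count (see note above)
  sizes <- rep(seq_len(min(max_size, nrow(d))), reps)
  conn <- vapply(
    sizes,
    function(s) connectance_stub(make_mat(sample_fun(d, s))),
    numeric(1)
  )
  data.frame(site = d$site[1], method = d$method[1],
             size = sizes, connectance = conn)
}

set.seed(1)  # only to make the random samples repeatable
df_split <- split(df, paste0(df$site, df$method))
all_results <- do.call(rbind, lapply(df_split, run_subset))
rownames(all_results) <- NULL
head(all_results)
```

all_results then has one row per sample with the columns site, method, size and connectance, and can be written to a single file with write.csv(all_results, "results.csv", row.names = FALSE), where the file name is just a placeholder.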