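I wrote a for loop that takes groups of 5 rows from a data frame and passes each group to a function. The function performs some operations on those 5 rows and returns a single row. Here is the code: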
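# assumed initialization (not shown in the original post)
start <- 1
final_data <- NULL
for (i in 1:nrow(features_data1)) {
  # once five rows (start..i) have accumulated, process them as one group
  if (i - start == 4) {
    group <- as.data.frame(features_data1[start:i, ])
    start <- i + 1
    sub_data <- feature_calculation(group)
    # grow the result by one row per group
    final_data <- rbind(final_data, sub_data)
  }
}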
Can anyone suggest an alternative? The for loop takes a long time, and the function feature_calculation is large.
Answer 0 (score: 0)
Try this base R approach:
# convert features to data frame in advance so we only have to do this once
features_df <- as.data.frame(features_data1)
# assign each observation (row) to a group of 5 rows and split the data frame into a list of data frames
group_assignments <- as.factor(rep(1:ceiling(nrow(features_df) / 5), each = 5, length.out = nrow(features_df)))
groups <- split(features_df, group_assignments)
# apply your function to each group individually (i.e. to each element in the list)
sub_data <- lapply(X = groups, FUN = feature_calculation)
# bind your list of data frames into a single data frame
final_data <- do.call(rbind, sub_data)
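To see the split/lapply approach end to end, here is a minimal toy run. The data frame and the body of feature_calculation are hypothetical stand-ins (the real function is not shown in the question). One difference worth noting: the original loop only processes complete groups of 5, while split() also returns a shorter trailing group when the row count is not a multiple of 5, so this sketch filters that group out to match the loop.

```r
# Toy data: 12 rows, so the last group would contain only 2 rows
features_df <- data.frame(x = 1:12, y = (1:12) * 2)

# Hypothetical stand-in for feature_calculation: returns one summary row per group
feature_calculation <- function(group) {
  data.frame(mean_x = mean(group$x), mean_y = mean(group$y), n_rows = nrow(group))
}

# Assign rows to consecutive groups of 5 and split into a list of data frames
n <- nrow(features_df)
group_assignments <- rep(seq_len(ceiling(n / 5)), each = 5, length.out = n)
groups <- split(features_df, group_assignments)

# Keep only complete groups of 5, matching the behaviour of the original loop
groups <- Filter(function(g) nrow(g) == 5, groups)

# Apply the function to each group and bind the resulting rows together
final_data <- do.call(rbind, lapply(groups, feature_calculation))
print(final_data)
```

If the trailing partial group should be processed as well, drop the Filter() step.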
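You can speed this up further with the purrr and dplyr packages. The latter's bind_rows function is much faster than do.call(rbind, list_of_data_frames), which matters if the list is likely to be very large.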
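A rough sketch of the purrr/dplyr variant, assuming both packages are installed. The toy data and the body of feature_calculation are hypothetical placeholders for the real ones in the question.

```r
library(purrr)
library(dplyr)

# Toy data: 10 rows, i.e. two complete groups of 5
features_df <- data.frame(x = 1:10, y = (1:10) * 2)

# Hypothetical stand-in for feature_calculation: returns one summary row per group
feature_calculation <- function(group) {
  data.frame(mean_x = mean(group$x), mean_y = mean(group$y))
}

# Assign rows to consecutive groups of 5
n <- nrow(features_df)
group_assignments <- rep(seq_len(ceiling(n / 5)), each = 5, length.out = n)

final_data <- features_df %>%
  split(group_assignments) %>%   # list of 5-row data frames
  map(feature_calculation) %>%   # one result row per group
  bind_rows()                    # combine all rows in a single step

print(final_data)
```

If feature_calculation itself dominates the run time, changing how the groups are built and combined will only help so much. Profiling the function (for example with Rprof) would show where the time actually goes.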