Iterating over data frames to create and populate new data frames

Time: 2020-06-01 08:39:06

Tags: r list dataframe

I have the following large data frames:

Jan_Feb2019 
Mar_Apr2019 
May_Jun2019 
Jul_Aug2019 
Sep_Oct2019 
Nov_Dec2019 
Jan_Feb2020 
Mar_2020 

I then use the following code to generate another data frame for each one and fill its columns with the data I want.

#Jan_Feb2019
Jan_Feb2019_df <- as.data.frame(Jan_Feb2019$reactions$summary$total_count)
colnames(Jan_Feb2019_df)[1] <- "Reactions"
Jan_Feb2019_df$Shares <- Jan_Feb2019$shares$count
Jan_Feb2019_df$Comments <- Jan_Feb2019$comments$summary$total_count
Jan_Feb2019_df$Message <- Jan_Feb2019$message
Jan_Feb2019_df$Likes <- Jan_Feb2019$likes$summary$total_count
Jan_Feb2019_df$CreatedDate <- Jan_Feb2019$created_time
Jan_Feb2019_df$PostID <- Jan_Feb2019$id
Jan_Feb2019_df$Love <- Jan_Feb2019$reacts_love$summary$total_count
Jan_Feb2019_df$Angry <- Jan_Feb2019$reacts_angry$summary$total_count
Jan_Feb2019_df$Sad <- Jan_Feb2019$reacts_sad$summary$total_count
Jan_Feb2019_df$HAHA <- Jan_Feb2019$reacts_haha$summary$total_count
Jan_Feb2019_df$WOW <- Jan_Feb2019$reacts_wow$summary$total_count
Jan_Feb2019_df$CreatedDate <- anytime(Jan_Feb2019_df[,6])
Jan_Feb2019_df$insights.data <- Jan_Feb2019$insights$data

Jan_Feb2019_df <- Jan_Feb2019_df %>% 
  unnest(insights.data) %>% 
  unnest(values) %>% 
  select(Message,Shares,Comments,Reactions,Likes,CreatedDate,PostID,Love,Angry,Sad,HAHA,WOW,name,value) %>% 
  pivot_wider(names_from = name, values_from = value)

Is there a way to iterate over all of the data frames listed above, so that I don't have to repeat this process 8 times? Thanks.

1 answer:

Answer 0: (score: 1)

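The following code is untested. I tried to follow the code in the question while making it general. There are two functions.

  • fillNewDf takes the old object as its only argument, and creates and populates the new data frame.
  • makeNewDf takes the old object's name as its argument and calls fillNewDf, returning its value.

If the objects are in the global environment, use the default value of makeNewDf's envir argument.

library(anytime)  # anytime()
library(dplyr)    # %>%, select()
library(tidyr)    # unnest(), pivot_wider()

fillNewDf <- function(X){
  vec <- X[['reactions']][['summary']][['total_count']]
  Y <- data.frame(Reactions = vec)
  Y[['Shares']] <- X[['shares']][['count']]
  Y[['Comments']] <- X[['comments']][['summary']][['total_count']]
  Y[['Message']] <- X[['message']]
  Y[['Likes']] <- X[['likes']][['summary']][['total_count']]
  Y[['CreatedDate']] <- X[['created_time']]
  Y[['PostID']] <- X[['id']]
  Y[['Love']] <- X[['reacts_love']][['summary']][['total_count']]
  Y[['Angry']] <- X[['reacts_angry']][['summary']][['total_count']]
  Y[['Sad']] <- X[['reacts_sad']][['summary']][['total_count']]
  Y[['HAHA']] <- X[['reacts_haha']][['summary']][['total_count']]
  Y[['WOW']] <- X[['reacts_wow']][['summary']][['total_count']]
  # Column 6 is CreatedDate; convert it to a date-time
  Y[['CreatedDate']] <- anytime(Y[, 6])
  Y[['insights.data']] <- X[['insights']][['data']]
  Y %>%
    unnest(insights.data) %>%
    unnest(values) %>%
    select(Message, Shares, Comments, Reactions, Likes, CreatedDate, PostID, Love, Angry, Sad, HAHA, WOW, name, value) %>%
    pivot_wider(names_from = name, values_from = value)
}

makeNewDf <- function(X, envir = .GlobalEnv){
  # Look the object up by name, then build the new data frame from it
  DF <- get(X, envir = envir)
  fillNewDf(DF)
}

Now use ls() to get the names of the objects to be processed, and create a list holding the new data frames.

# Names ending in four digits, e.g. "Jan_Feb2019" or "Mar_2020"
old_names <- ls(pattern = '\\d{4}$')
new_list <- lapply(old_names, makeNewDf)
names(new_list) <- paste(old_names, "df", sep = "_")

If these new data frames are to become objects in the global environment, list2env(new_list, envir = .GlobalEnv) will create them with the same names as the names attribute of new_list. Note that list2env() with its default envir = NULL creates a new environment instead of writing to the global one.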
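The name-based pattern in this answer (collect names with ls(), fetch the objects, apply one function, then write the results back with list2env()) can be tried on toy data before running it on the real objects. The sketch below is only an illustration under assumptions: the two toy objects, the scratch environment env, and the simplified helper fill_toy_df are hypothetical stand-ins, not the question's data or the fillNewDf function above. It uses base R only and works inside a separate environment so the global workspace is left untouched.

```r
# Scratch environment holding toy stand-ins for the real objects
# (hypothetical data, much smaller than the objects in the question)
env <- new.env()
assign("Jan_Feb2019",
       list(id = c("a", "b"), shares = list(count = c(1, 2))),
       envir = env)
assign("Mar_2020",
       list(id = "c", shares = list(count = 5)),
       envir = env)

# Hypothetical, simplified counterpart of fillNewDf:
# builds a two-column data frame from one nested object
fill_toy_df <- function(X) {
  data.frame(PostID = X[["id"]], Shares = X[["shares"]][["count"]])
}

# Collect the names ending in four digits, fetch the objects as a named list,
# and apply the helper to each one
old_names <- ls(envir = env, pattern = "\\d{4}$")
old_list  <- mget(old_names, envir = env)
new_list  <- lapply(old_list, fill_toy_df)
names(new_list) <- paste(old_names, "df", sep = "_")

# Write the results back as separate objects in the same environment
list2env(new_list, envir = env)

ls(env)
```

Keeping the results in new_list is often enough; list2env() is only needed when each data frame has to exist as a separate named object.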