R: Apply a function to a list of data frames and save to the workspace

Asked: 2017-08-31 08:03:11

Tags: r dataframe apply workspace

I have many data frames (dfs) similar to this one:

dftest_tw <- structure(list(text = c("RT @BitMEXdotcom: A new high: US$500M turnover in the last 24 hours, over 80% of it on $XBTUSD. Congrats to the team and thank you to our u…", 
"RT @Crowd_indicator: Thank you for this nice video, @Nicholas_Merten", 
"RT @Crowd_indicator: Review of #Cindicator by DataDash: t.co/D0da3u5y3V"
), Tweet.id = c("896858423521837057", "896858275689398272", "896858135314538497"
), created.date = structure(c(17391, 17391, 17391), class = "Date"), 
    created.week = c(33, 33, 33)), .Names = c("text", "Tweet.id", 
"created.date", "created.week"), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"))

This is the function I want to apply to all of them:

Edit, following the comments below: I added x as the last line of my function.

MyCount <- function(x){
  x$retweet <- NA
  x$custom <- NA
  x$retweet <- grepl(retw, x$text) * 1
  x$custom <- (grepl(cust, x$text) & !grepl(retw, x$text)) * 1
  x
}

I collect the names of these objects like this:

myUser_tw <- ls(,pattern = "_tw")

since they are the only objects in my environment whose names end in _tw.

This is how I am trying to apply the function:

for (i in 1:length(myUserList_tw)){
  lapply(mget(myUserList_tw), MyCount)
}

But it doesn't actually change anything. Running the following on a single df modifies it the way I want, and the printed result looks fine.

lapply(mget(myUser_tw[x]), MyCount) 

What I can't figure out is how to assign the result back to the df in my workspace. I have tried many things like this:

myUser_tw[x] <- lapply(mget(myUser_tw[x]), MyCount) 

or adding x <<- x at the end of my function, but without success.

Can anyone help me save the modified dfs in my workspace? Thanks

1 answer:

Answer 0: (score: 1)

There are a number of problems in your example code.

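myUser_tw is never reused; you use myUserList_tw instead, which is probably a typo. I will use myUserList, because giving this variable a name ending in 'tw' is inconsistent when you assume that everything ending in 'tw' is a tibble.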

Your MyCount function did not return x (this was changed in the edit).

retw and cust are not defined, so I assume they are strings and you forgot the quotes.

Your loop doesn't really loop over anything (i is never used), and the result of lapply is not assigned to anything.
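The points above can be illustrated on a toy data frame. This is only a sketch: the patterns stored in retw and cust, the function name count_flags and the data in toy_tw are all assumptions made for the illustration, since the question never shows what retw and cust contain. It covers the other reading of the question, where retw and cust are variables holding patterns rather than quoted strings, and it shows that a function call whose result is discarded leaves the original object untouched.

```r
# Hypothetical patterns: the question does not say what retw and cust hold.
retw <- "^RT @"        # assumed: a retweet starts with "RT @"
cust <- "#Cindicator"  # assumed: some keyword of interest

# Hypothetical stand-in for MyCount; it returns the modified copy of x.
count_flags <- function(x) {
  x$retweet <- grepl(retw, x$text) * 1
  x$custom  <- (grepl(cust, x$text) & !grepl(retw, x$text)) * 1
  x
}

toy_tw <- data.frame(
  text = c("RT @Crowd_indicator: Review of #Cindicator by DataDash",
           "Reading about #Cindicator today",
           "Nothing relevant here"),
  stringsAsFactors = FALSE
)

# Same shape as the loop in the question: the result of lapply() is
# computed and thrown away, so toy_tw itself is never modified.
for (i in 1:2) {
  lapply(list(toy_tw), count_flags)
}
cols_before <- ncol(toy_tw)  # still only the text column

# Assigning the returned value is what actually updates the object.
toy_tw <- count_flags(toy_tw)
cols_after <- ncol(toy_tw)   # text, retweet, custom
```

R functions work on a copy of their argument, so the only way a change becomes visible outside the function is through the returned value being assigned somewhere.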

This should work:

dftest_tw <- structure(list(text = c("RT @BitMEXdotcom: A new high: US$500M turnover in the last 24 hours, over 80% of it on $XBTUSD. Congrats to the team and thank you to our u…", 
                                     "RT @Crowd_indicator: Thank you for this nice video, @Nicholas_Merten", 
                                     "RT @Crowd_indicator: Review of #Cindicator by DataDash: t.co/D0da3u5y3V"
), Tweet.id = c("896858423521837057", "896858275689398272", "896858135314538497"
), created.date = structure(c(17391, 17391, 17391), class = "Date"), 
created.week = c(33, 33, 33)), .Names = c("text", "Tweet.id", 
                                          "created.date", "created.week"), row.names = c(NA, -3L), class = c("tbl_df", 
                                                                                                             "tbl", "data.frame"))

dftest2_tw <- dftest_tw # so we have 2

MyCount <- function(x){
  x$retweet <- NA
  x$custom <- NA
  x$retweet <- grepl("retw", x$text) * 1
  x$custom <- (grepl("cust", x$text) & !grepl("retw", x$text)) * 1
  x
}

myUserList <- ls(pattern = "_tw$")  # names of all objects ending in _tw
for (var in myUserList) {
  # get() fetches the object named by the string `var`, MyCount() returns a
  # modified copy, and assign() stores it back under the same name.
  assign(var, MyCount(get(var)))
}