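R: extracting data frame values for printing in R Markdown

Time: 2016-10-03 16:18:19

Tags: r r-markdown

I have a data frame that holds threads of messages posted on a forum. After joining tables from the database, I get a structure like this: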

threadStarterName1    threadstarter1    comment1    commenterName1
threadStarterName1    threadstarter1    comment2    commenterName2
threadStarterName1    threadstarter1    comment3    commenterName3
threadStarterName1    threadstarter1    comment4    commenterName4
threadStarterName1    threadstarter1    comment5    commenterName5

Code to create this data frame:

df <- data.frame(
  "threadStarterName" = c("threadStarterName1", "threadStarterName1", "threadStarterName1", "threadStarterName1", "threadStarterName1"),
  "threadStarter" = c("threadStarter1", "threadStarter1", "threadStarter1", "threadStarter1", "threadStarter1"),
  "comment" = c("comment1", "comment2", "comment3", "comment4", "comment5"),
  "commenterName" = c("commenterName1", "commenterName2", "commenterName3", "commenterName4", "commenterName5")
)

I would like to reshape this data frame so that the values come out as shown below, which I can then print in an R Markdown report:

threadstarter1    threadStarterName1
   comment1       commenterName1
   comment2       commenterName2
   comment3       commenterName3
   comment4       commenterName4
   comment5       commenterName5

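Thanks in advance!

1 answer:

Answer 0 (score: 0)

If I understand correctly, the original post (and its author) is repeated on every row, and you want it to appear only once, in the same columns as the comment text and the comment author.

If so, this should do it: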

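onlyOnce <-
  data.frame(
    # as.character() guards against factor columns: in R versions before 4.1,
    # c() on factors returns their integer codes instead of the labels
    user = c(as.character(df$threadStarterName[1]),
             as.character(df$commenterName)),
    commentPosted = c(as.character(df$threadStarter[1]),
                      as.character(df$comment))
  )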

It takes the first thread author entry (and their post) and places it at the top, above the comment authors (and their comments).
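
If the indented layout from the question is wanted rather than a two-column data frame, the same idea can be sketched in base R as below. This is only an illustration, not part of the original answer: the helper name `format_thread`, the three-space indent, and the four-space column gap are assumptions made for the example. Splitting on `threadStarter` also covers the case where the data frame contains more than one thread.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps the columns as character
df <- data.frame(
  threadStarterName = rep("threadStarterName1", 5),
  threadStarter     = rep("threadStarter1", 5),
  comment           = paste0("comment", 1:5),
  commenterName     = paste0("commenterName", 1:5),
  stringsAsFactors  = FALSE
)

# Hypothetical helper: one header line for the thread, then one indented line per comment
format_thread <- function(d) {
  header   <- paste(d$threadStarter[1], d$threadStarterName[1], sep = "    ")
  comments <- paste0("   ", d$comment, "    ", d$commenterName)
  c(header, comments)
}

# Split by thread starter post so that several threads are each formatted separately
thread_lines <- unlist(lapply(split(df, df$threadStarter), format_thread),
                       use.names = FALSE)

# Print one line per element
cat(thread_lines, sep = "\n")
```

Inside an R Markdown chunk, the `cat()` output appears as ordinary chunk output; setting the knitr chunk option `comment = NA` drops the leading `##` prefix from each line.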