Removing whitespace: cleaning oddly formatted web-scraped data in R

Time: 2015-11-15 11:12:03

Tags: r whitespace removing-whitespace

So I have some web-scraped data, and when I write it out with write.csv I get huge blank areas in Excel. Here is a sample of 2 rows from my data frame:


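Would anyone happen to know how to tackle the problem of removing the whitespace?

2 Answers:

Answer 0: (score: 1)

There are two semi-complicated questions here. The first, "Would anyone happen to know how to tackle the problem of removing the whitespace?", is too vague and broad for me to really help with, beyond suggesting the functions in the stringr package. ¯\_(ツ)_/¯ idk if that helps?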
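As a rough illustration of the general idea, here is a minimal sketch in base R. The helper name `squish_ws` is made up for this example, and the sample string is a shortened stand-in for the scraped text, not the real data. It collapses every run of whitespace (spaces, tabs, `\r`, `\n`) into a single space and then trims both ends. stringr's `str_squish()` does much the same job if you would rather stay in that package.

```r
# Hypothetical helper: collapse runs of whitespace and trim both ends.
squish_ws <- function(x) {
  x <- gsub("[[:space:]]+", " ", x)  # any run of spaces, tabs, \r or \n becomes one space
  trimws(x)                          # drop the single leading/trailing space that remains
}

# Shortened stand-in for one scraped value (assumed shape, not the real data).
raw <- "\r\n        DOI: 10.5256/f1000research.6601.r7701\r\n      \r\n        The revision is approved\r\n    "

squish_ws(raw)
```

Because `gsub()` and `trimws()` are vectorized, the same helper can be applied to a whole character column at once.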

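The second, "Secondary: could anyone help me by showing me how to clean up my "referee.report" text? It is the column I am most interested in. I would particularly like to remove the "\r\n", among other things.", is more of a solvable problem.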

referee.report = structure(c("\r\n                                    \r\n                                        DOI: 10.5256/f1000research.6599.r7859\r\n                                    \r\n                                                                                                                                                                                                                        \r\n                                        I have read the revised article by Horrell and D'Orazio. They have responded appropriately to\r\n                                                                                    ... Continue reading\r\n                                                                            \r\n                                    \r\n                                        I have read the revised article by Horrell and D'Orazio. They have responded appropriately to the concerns/questions raised by all 3 reviewers. Accordingly, I recommend indexing the submitted revised article.\r\n                                        \r\n                                                                                            \r\n                                                                                                                I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.                                                                                                     
\r\n                                                                                    \r\n                                        Competing Interests:\r\n                                        No competing interests were disclosed.\r\n                                                                                Close\r\n                                    \r\n                                    \r\n                                        REPORT A CONCERN\r\n                                    \r\n                                ", 
                             "\r\n                                    \r\n                                        DOI: 10.5256/f1000research.6601.r7701\r\n                                    \r\n                                                                                                                                                                                                                        \r\n                                        The revision\r\n                                                                                    ... Continue reading\r\n                                                                            \r\n                                    \r\n                                        The revision is approved\r\n                                        \r\n                                                                                            \r\n                                                                                                                I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.                                                                                                     \r\n                                                                                    \r\n                                        Competing Interests:\r\n                                        No competing interests were disclosed.\r\n                                                                                Close\r\n                                    \r\n                                    \r\n                                        REPORT A CONCERN\r\n                                    \r\n                                "
), .Names = c("http://f1000research.com/articles/3-288/v2", "http://f1000research.com/articles/4-34/v2"
))

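library(stringr)  # provides str_split()

cleanOutput <- function(listObject){
  # Split each report on "\r\n"; str_split() returns one character vector per report
  listObject = str_split(listObject, "\\r\\n")
  # Strip leading and trailing whitespace (spaces, tabs, line breaks) from every piece
  listObject = lapply(listObject, trimws)
  # Drop empty strings and NAs within each report
  listObject = lapply(listObject, function(x) x[!is.na(x) & x != ""])
  return(listObject)
}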

cleanOutput(referee.report)

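Try this function?

Edit:

This version removes the "\t" from the start of each line.

Edit: It turns out str_trim already removes the "\t" at the start of a line. No edit needed.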

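Answer 1: (score: 0)

Update: Polka's code sort of runs, and lapply removes the \, but because the variable is in list form I need to convert it to a character, and when I do that it returns:

Update: Using paste() to concatenate all the strings and return a single value produces the same result.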
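One possible explanation, offered as a guess since the output is not shown: without a `collapse` argument, `paste()` returns one string per input element instead of joining them, so the result looks unchanged. Below is a minimal sketch of collapsing a list of cleaned lines into one string per report. It assumes the cleaning step returns a list of character vectors; `cleaned` and `collapse_reports` are illustrative names, not part of the original code, and the sample values are shortened stand-ins.

```r
# Assumed shape of the cleaned data: one character vector of lines per report.
cleaned <- list(
  c("DOI: 10.5256/f1000research.6599.r7859", "No competing interests were disclosed."),
  c("DOI: 10.5256/f1000research.6601.r7701", "The revision is approved")
)

# paste() without collapse keeps one string per element, so nothing is joined.
n_without_collapse <- length(paste(cleaned[[1]]))

# Hypothetical helper: join each report's lines into a single string.
collapse_reports <- function(x) {
  vapply(x, paste, character(1), collapse = " ")
}

flat <- collapse_reports(cleaned)
is.character(flat)  # a plain character vector, one element per report
```

A character vector like `flat` can be assigned back to the data frame column before calling write.csv, so that each cell holds a single line of text.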