Question

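When I print my data frame, two of the columns have a large amount of whitespace between them, while the neighboring columns are squeezed right up against the column to their right.

It basically looks like this: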

c1 c2  (whitespace) c3c4 c5

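How can I fix this? Is there a formatting option?

The column names are no longer than the data in the columns, so that is not the cause of the whitespace. Sorry if the picture isn't clear; I'm new here and not very familiar with formatting.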

Answer 1

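If the column contains excess whitespace, it is fairly simple to clean up. R has a variety of text-processing functions. Here is an example.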

> x <- c("this        is     ", "a        column with   ",
         "     a lot of ", "            whitespace")
> (d <- data.frame(w = 1:4*100, x = x, y = 1:4*10))
    w                       x  y
1 100     this        is      10
2 200 a        column with    20
3 300               a lot of  30
4 400              whitespace 40

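In the stringr package, str_trim() removes whitespace from either end (or both ends) of a string, and str_replace_all() can be used to collapse each remaining run of whitespace into a single space. (str_replace() would change only the first run in each string.)

> library(stringr)
> (dd <- data.frame(sapply(d, function(x){
      str_replace_all(str_trim(x), "\\s+", " ")
      })))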
    w             x  y
1 100       this is 10
2 200 a column with 20
3 300      a lot of 30
4 400    whitespace 40
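
If you would rather not depend on stringr, the same cleanup can be sketched in base R with trimws() and gsub(). This is only a minimal sketch under a few assumptions: squish() is a hypothetical helper name (not a base R function), trimws() requires R 3.2.0 or later, and only the character columns are touched so that w and y stay numeric.

```r
# Same example data as above; stringsAsFactors = FALSE keeps x as character
# on R versions before 4.0.
d <- data.frame(w = 1:4 * 100,
                x = c("this        is     ", "a        column with   ",
                      "     a lot of ", "            whitespace"),
                y = 1:4 * 10,
                stringsAsFactors = FALSE)

# squish() is a hypothetical helper: trim both ends, then collapse every
# internal run of whitespace into a single space.
squish <- function(s) gsub("\\s+", " ", trimws(s))

# Apply the helper to character columns only, leaving numeric columns alone.
is_chr <- vapply(d, is.character, logical(1))
d[is_chr] <- lapply(d[is_chr], squish)
d
```

Unlike the sapply() approach, this keeps the numeric columns numeric, because only the columns flagged in is_chr are rewritten.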

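ADDED: Regarding the question in your comment about centering the column names: yes, we can do that by padding the relevant element of the names vector with the str_pad() function.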

> num <- floor(max(nchar(as.character(dd$x)))/2)
> names(dd)[2] <- str_pad(names(dd)[2], num, "right")
> dd
    w        x       y
1 100       this is 10
2 200 a column with 20
3 300      a lot of 30
4 400    whitespace 40
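
As for whether there is a formatting option: one that may help is the right argument of print() for data frames, which controls whether strings are right-aligned (the default). The sketch below rebuilds a cleaned data frame like dd so that it runs on its own; whether left alignment is the layout you want is an assumption on my part.

```r
# A cleaned data frame like dd above, rebuilt here so the example is standalone.
dd <- data.frame(w = 1:4 * 100,
                 x = c("this is", "a column with", "a lot of", "whitespace"),
                 y = 1:4 * 10,
                 stringsAsFactors = FALSE)

# right = FALSE asks print.data.frame() to left-align the printed cells
# instead of using the default right alignment.
print(dd, right = FALSE)

# capture.output() collects the printed lines if you want to inspect them.
out <- capture.output(print(dd, right = FALSE))
```

This changes only how the data frame is displayed; the stored values are untouched.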

Answer 2

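This works for character and factor vectors (i.e., it removes leading and trailing whitespace). It is a simplified, combined version of the gdata:::trim.character and gdata:::trim.factor functions: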

trim <- function(s) {
    s <- as.character(s)
    s <- sub(pattern = "^[[:blank:]]+", replacement = "", x = s)
    s <- sub(pattern = "[[:blank:]]+$", replacement = "", x = s)
    s
}
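
A short usage sketch for the function above, with made-up data (the df and f objects are purely illustrative). The function is repeated so the example runs on its own. Note that it returns a character vector even for factor input, and that [[:blank:]] matches only spaces and tabs, not newlines.

```r
# Repeated from above so this example is self-contained.
trim <- function(s) {
    s <- as.character(s)
    s <- sub(pattern = "^[[:blank:]]+", replacement = "", x = s)
    s <- sub(pattern = "[[:blank:]]+$", replacement = "", x = s)
    s
}

# Illustrative data frame with padded strings in one column.
df <- data.frame(id = 1:3,
                 name = c("  alice", "bob   ", "  carol  "),
                 stringsAsFactors = FALSE)
df$name <- trim(df$name)

# Factor input is converted to character, so the result is no longer a factor.
f <- factor(c(" a", "b ", " a"))
trimmed_f <- trim(f)
```

If you need a factor back, wrap the result in factor() after trimming.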

Credit: the gdata package, written by Greg Warnes.
