Question

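I'm having some trouble transposing my data correctly. I'm trying to reshape the result so that the column names become the rows. I was able to create the means and standard deviations with the code below: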

data(iris)

mydata <- do.call(data.frame, aggregate(. ~ Species, iris, function(x) c(mean = mean(x), sd = sd(x))))

which creates the table:

<table><tbody><tr><th>Species</th><th>Sepal.Length.mean</th><th>Sepal.Length.sd</th><th>Sepal.Width.mean</th><th>Sepal.Width.sd</th><th>Petal.Length.mean</th><th>Petal.Length.sd</th><th>Petal.Width.mean</th><th>Petal.Width.sd</th></tr><tr><td>setosa</td><td>5.006</td><td>0.3524897</td><td>3.428</td><td>0.3790644</td><td>1.462</td><td>0.173664</td><td>0.246</td><td>0.1053856</td></tr><tr><td>versicolor</td><td>5.936</td><td>0.5161711</td><td>2.77</td><td>0.3137983</td><td>4.26</td><td>0.469911</td><td>1.326</td><td>0.1977527</td></tr><tr><td>virginica</td><td>6.588</td><td>0.6358796</td><td>2.974</td><td>0.3224966</td><td>5.552</td><td>0.5518947</td><td>2.026</td><td>0.2746501</td></tr></tbody></table>

I would like the table to look like this:

<table><tbody><tr><th> </th><th>Setosa</th><th> </th><th>Versicolor</th><th> </th><th>Virginica</th><th> </th></tr><tr><td> </td><td>Mean</td><td>SD</td><td>Mean</td><td>SD</td><td>Mean</td><td>SD</td></tr><tr><td>Sepal.Length</td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td></tr><tr><td>Sepal.Width</td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td></tr><tr><td>Petal.Length</td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td></tr><tr><td>Petal.Width</td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td></tr></tbody></table>

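I realize that getting the second header row will most likely require the add_header_above function with kable, but before I get there I'm having trouble building the data frame. I've been fiddling with the cast and melt functions without much luck.

Any advice is greatly appreciated!

~Jack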

Answer 1

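Here is a solution using the tidyverse and the tables package. First, we use gather() to create a tidy data set in narrow format. The narrow format lets us use Species and flowerAttribute as factor variables in the table, so the data never needs to be transposed.

Second, we use the tables::tabular() function to produce a table with the mean and standard deviation for each Species on the column dimension and the flower attributes on the row dimension.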

data(iris)
library(tables)
library(tidyverse)
tidyIris <- gather(iris,key=flowerAttribute,value=value,
                 Sepal.Length,Sepal.Width,Petal.Length,Petal.Width)
# factors required for tabular()
tidyIris$flowerAttribute <- as.factor(tidyIris$flowerAttribute)
tabular((flowerAttribute) ~ Format(digits=2)*(Species)*(value)*(mean + sd), 
       data=tidyIris )

...and the output:

> tabular((flowerAttribute) ~ Format(digits=2)*(Species)*(value)*(mean + sd), 
+         data=tidyIris )

                 Species                                    
                 setosa       versicolor      virginica     
                 value        value           value         
 flowerAttribute mean    sd   mean       sd   mean      sd  
 Petal.Length    1.46    0.17 4.26       0.47 5.55      0.55
 Petal.Width     0.25    0.11 1.33       0.20 2.03      0.27
 Sepal.Length    5.01    0.35 5.94       0.52 6.59      0.64
 Sepal.Width     3.43    0.38 2.77       0.31 2.97      0.32

For those who have used SAS before, the tables package provides functionality similar to SAS PROC TABULATE.
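As a side note, gather() is superseded in current tidyr releases by pivot_longer(). The following is a minimal sketch of the same reshaping step, assuming tidyr 1.0.0 or later is installed; it is not part of the original answer. The conversion back to a plain data frame is a precaution I added, since pivot_longer() returns a tibble.

```r
library(tidyr)

data(iris)

# Reshape to narrow format: one row per (flower, attribute) pair.
# cols = -Species selects the four numeric measurement columns.
tidyIris <- pivot_longer(iris,
                         cols = -Species,
                         names_to = "flowerAttribute",
                         values_to = "value")

# Precaution (an assumption, not from the original answer):
# hand tabular() a plain data frame rather than a tibble.
tidyIris <- as.data.frame(tidyIris)

# factors required for tabular()
tidyIris$flowerAttribute <- as.factor(tidyIris$flowerAttribute)

str(tidyIris)
```

The rows come out in a different order than with gather(), but the tabular() call groups by the factor levels, so the row order should not matter for the table.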

Enhancing the output

With a few tweaks to the code, we can exactly replicate the output format requested in the OP.

# key syntax elements
# 1. - renamed flowerAttribute to Attribute using = operator
# 2. - used Heading() to eliminate the printing of "value" and "Species" on columns
tabular((Attribute=flowerAttribute) ~ Format(digits=2)*(Heading()*Species)*Heading()*(value)*(mean + sd), 
        data=tidyIris )

...and the output:

              setosa       versicolor      virginica     
 Attribute    mean    sd   mean       sd   mean      sd  
 Petal.Length 1.46    0.17 4.26       0.47 5.55      0.55
 Petal.Width  0.25    0.11 1.33       0.20 2.03      0.27
 Sepal.Length 5.01    0.35 5.94       0.52 6.59      0.64
 Sepal.Width  3.43    0.38 2.77       0.31 2.97      0.32
 >

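Generating LaTeX

Finally, to obtain typeset-quality output, we can wrap the tabular() call in latex() to write LaTeX code that can be compiled into a PDF document using Sweave.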

latex(tabular((Attribute=flowerAttribute) ~ Format(digits=2)*(Heading()*Species)*Heading()*(value)*(mean + sd), 
        data=tidyIris ))

...which generates LaTeX that compiles into a typeset version of the same table.

Answer 2

I guess this is what you are looking for?

  `colnames<-`(do.call(rbind,by(t(mydata[-1]),rep(names(iris[-5]),each=2),unlist)),rep(c("Mean","Sd"),3))
              Mean        Sd  Mean        Sd  Mean        Sd
Petal.Length 1.462 0.1736640 4.260 0.4699110 5.552 0.5518947
Petal.Width  0.246 0.1053856 1.326 0.1977527 2.026 0.2746501
Sepal.Length 5.006 0.3524897 5.936 0.5161711 6.588 0.6358796
Sepal.Width  3.428 0.3790644 2.770 0.3137983 2.974 0.3224966

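To begin with, since I am only dealing with the numeric columns, I drop the Species column with iris[-5]. Also, since I do not need the first column of mydata, I get rid of it. Why do I repeat each name twice? There are two functions (mean and sd). Why do I repeat three times? There are 3 species...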
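The one-liner above packs several steps into a single expression. The following sketch unpacks it using only base R; the intermediate names stats, attribute, blocks and result are introduced here for illustration and do not appear in the original answer.

```r
data(iris)

# Same aggregate as in the question: one row per species,
# with a mean and an sd column for each attribute
mydata <- do.call(data.frame,
                  aggregate(. ~ Species, iris,
                            function(x) c(mean = mean(x), sd = sd(x))))

# Drop the Species column and transpose:
# 8 rows (4 attributes x 2 statistics), 3 columns (one per species)
stats <- t(mydata[-1])

# Label each row with its attribute; every name appears twice (mean, sd)
attribute <- rep(names(iris[-5]), each = 2)

# For each attribute, flatten its 2 x 3 block column by column,
# giving mean, sd for the first species, then the second, then the third
blocks <- by(stats, attribute, unlist)

# Stack the four attributes into a 4 x 6 matrix and label the columns
result <- do.call(rbind, blocks)
colnames(result) <- rep(c("Mean", "Sd"), 3)
result
```

The column pairs follow the species order in mydata (setosa, versicolor, virginica), and the rows are sorted alphabetically by attribute because by() groups on the sorted labels.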

Transposing a data frame after aggregating
