Question

I have a data frame that looks like this:

structure(list(ab = c(0, 1, 1, 1, 1, 0, 0, 0, 1, 1), bc = c(1, 
1, 1, 1, 0, 0, 0, 1, 0, 1), de = c(0, 0, 1, 1, 1, 0, 1, 1, 0, 
1), cl = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 2)), .Names = c("ab", "bc", 
"de", "cl"), row.names = c(NA, -10L), class = "data.frame")

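The column cl indicates cluster membership, and the variables ab, bc and de carry binary answers, where 1 means yes and 0 means no.

I am trying to create a table that cross-tabulates the cluster against all the other columns in the data frame, i.e. ab, bc and de, with the clusters as the column variable. The desired output has one row per variable and one column per cluster.

I tried the following code: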

with(newdf, tapply(newdf[,c(3)], cl, sum))

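This only gives me the cross-tabulation for one column at a time. My data frame has more than 1,600 columns plus one cluster column. Can anyone help?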

Answer 1

One way to do this with dplyr is:

library(dplyr)
df %>% 
  # group by the variable cl
  group_by(cl) %>%
  # sum every column (in dplyr >= 1.0: summarise(across(everything(), sum)))
  summarize_each(funs(sum)) %>%
  # select the three needed columns
  select(ab, bc, de) %>%
  # transpose the df
  t

Output:

   [,1] [,2] [,3]
ab    1    3    2
bc    2    3    1
de    2    3    1

Answer 2

Your data is in a half-long, half-wide format, and you want it in a fully wide format. This is easiest if we first convert it to a fully long format:

library(reshape2)
df_long = melt(df, id.vars = "cl")
head(df_long)
#   cl variable value
# 1  1       ab     0
# 2  2       ab     1
# 3  3       ab     1
# 4  1       ab     1
# 5  2       ab     1
# 6  3       ab     0

Then we can convert it to wide format, using sum as the aggregation function.
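The long-to-wide step described above is not shown in code, so here is a minimal sketch of it. It assumes reshape2's dcast is the intended function; the name df_wide is made up for illustration, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
library(reshape2)

# Data frame from the question, rebuilt so the example is self-contained
df <- data.frame(
  ab = c(0, 1, 1, 1, 1, 0, 0, 0, 1, 1),
  bc = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 1),
  de = c(0, 0, 1, 1, 1, 0, 1, 1, 0, 1),
  cl = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 2)
)

# Fully long format: one row per (cl, variable) observation
df_long <- melt(df, id.vars = "cl")

# Back to wide: one row per variable, one column per cluster,
# summing the 0/1 values that fall into each cell
df_wide <- dcast(df_long, variable ~ cl,
                 value.var = "value", fun.aggregate = sum)
df_wide
```

Because melt takes every column other than cl as a measure variable, the same two calls should work unchanged on a data frame with many more columns.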

Answer 3

In base R:

t(sapply(data[,1:3],function(x) tapply(x,data[,4],sum)))
#   1 2 3
#ab 1 3 2
#bc 2 3 1
#de 2 3 1
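With 1,600+ columns, hard-coding column positions such as 1:3 and 4 may be inconvenient. The sketch below uses the same sapply/tapply approach but selects columns by name. The names value_cols and tab are made up for illustration, and it assumes the cluster column is called cl, as in the question.

```r
# Data frame from the question, rebuilt so the example is self-contained
df <- data.frame(
  ab = c(0, 1, 1, 1, 1, 0, 0, 0, 1, 1),
  bc = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 1),
  de = c(0, 0, 1, 1, 1, 0, 1, 1, 0, 1),
  cl = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 2)
)

# Every column except the cluster column, however many there are
value_cols <- setdiff(names(df), "cl")

# For each value column, sum within each cluster, then transpose so that
# variables are rows and clusters are columns
tab <- t(sapply(df[value_cols], function(x) tapply(x, df$cl, sum)))
tab
```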

Answer 4

You can also combine tidyr::gather or reshape2::melt with xtabs to get your contingency table:

library(tidyr)
xtabs(value ~ key + cl, data = gather(df, key, value, -cl))
##     cl
## key  1 2 3
##   ab 1 3 2
##   bc 2 3 1
##   de 2 3 1

If you prefer to use the pipe, the same gather and xtabs steps can be chained with %>%.
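The piped form of this gather/xtabs call is not shown, so the following is only a sketch of what it might look like. It assumes the %>% operator comes from dplyr (magrittr would work as well); the name tab is made up for illustration, and the data frame is rebuilt from the question.

```r
library(dplyr)  # loaded here only for the %>% operator
library(tidyr)

# Data frame from the question, rebuilt so the example is self-contained
df <- data.frame(
  ab = c(0, 1, 1, 1, 1, 0, 0, 0, 1, 1),
  bc = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 1),
  de = c(0, 0, 1, 1, 1, 0, 1, 1, 0, 1),
  cl = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 2)
)

tab <- df %>%
  # long format: every column except cl becomes a key/value pair
  gather(key, value, -cl) %>%
  # sum value within each key-by-cluster cell
  xtabs(value ~ key + cl, data = .)
tab
```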

Answer 5

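Just following the code dickoa wrote, updated to use tidyr's pivot_longer (instead of gather):

library(dplyr)
library(tidyr)  # pivot_longer lives in tidyr, not dplyr

df %>% 
  # long format: columns ab through de become key/value pairs
  pivot_longer(cols = ab:de,
               names_to = "key",
               values_to = "value") %>% 
  xtabs(value ~ key + cl, data = .)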

Creating a contingency table using multiple columns in a data frame in R

5 answers: