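Convert an n-way contingency table to a data frame in R

Date: 2016-06-27 20:07:00

Tags: r

I am trying to create a table containing the number of items sold by product name, year, and region. I would like a table like the one shown below. Is there a way to do this in R instead of writing a SQL query with the sqldf function?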


The following code generates sample data. This dummy data does not correspond to the sample counts above.


4 answers:

Answer 0 (score: 3)

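# Sample data: one row for every Product x Region x Year combination.
Product_Name <- c("English Muffins","croissants","Kaiser rolls","Bagels","cinnamon puff","strawberry pastry")
Region_ID <- 1:6
Transaction_year <- 2011:2016

# expand.grid() builds all 216 combinations in one call. Its first argument
# varies fastest, so Year is listed first and the columns are reordered after.
# Product becomes a factor whose levels follow the order given above.
x <- expand.grid(Year = Transaction_year,
                 Region = Region_ID,
                 Product = Product_Name)[, c("Product", "Region", "Year")]

# Give every row a count of 1, then sum the counts within each group.
x$count <- 1

xx <- aggregate(x[, "count"], by = list(x$Product, x$Year, x$Region), sum)
colnames(xx) <- c("Product", "Year", "Region", "Count")
head(xx)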

            Product Year Region Count
1   English Muffins 2011      1     1
2        croissants 2011      1     1
3      Kaiser rolls 2011      1     1
4            Bagels 2011      1     1
5     cinnamon puff 2011      1     1
6 strawberry pastry 2011      1     1

Answer 1 (score: 3)

Yes, you can do this with the by argument of data.table, which works much like GROUP BY in SQL.
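
A minimal sketch of such a grouped count, assuming the data.table package is installed and that the data has Product, Region, and Year columns as in the other answers. The small data frame below is a hypothetical stand-in, not the question's data; .N is data.table's built-in per-group row count.

```r
library(data.table)

# Small stand-in for the question's data (hypothetical values).
x <- data.frame(
  Product = c("Bagels", "Bagels", "croissants", "Bagels"),
  Region  = c(1L, 1L, 2L, 1L),
  Year    = c(2015L, 2015L, 2015L, 2016L)
)

dt <- as.data.table(x)

# .N counts the rows in each group; by = plays the role of GROUP BY.
counts <- dt[, .(Count = .N), by = .(Product, Year, Region)]
print(counts)
```

Unlike table(), this only returns combinations that actually occur in the data.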

Answer 2 (score: 3)

No complicated code is needed here. A single line is enough:

> as.data.frame(table(x))
              Product Region Year Freq
1     English Muffins      1 2011    1
2          croissants      1 2011    1
3        Kaiser rolls      1 2011    1
4              Bagels      1 2011    1
5       cinnamon puff      1 2011    1
6   strawberry pastry      1 2011    1
...

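The table function produces the contingency table as a three-dimensional array, and as.data.frame converts that contingency table into a data frame in the format you want. If x contains other columns, make sure to subset it to only the columns you want to tabulate.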
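
The subsetting advice can be sketched as follows. The data and the extra Price column are hypothetical, chosen only to show the idea; responseName is an argument of as.data.frame for tables that renames the default Freq column.

```r
# Hypothetical data with an extra column (Price) that should not be tabulated.
x <- data.frame(
  Product = c("Bagels", "Bagels", "croissants"),
  Region  = c(1, 1, 2),
  Year    = c(2015, 2015, 2016),
  Price   = c(2.5, 2.5, 3.0)
)

# Tabulate only the grouping columns; responseName renames the Freq column.
counts <- as.data.frame(table(x[, c("Product", "Region", "Year")]),
                        responseName = "Count")

# table() also reports combinations that never occur (Count == 0).
# Keep only the observed combinations if that is what you want.
observed <- counts[counts$Count > 0, ]
print(observed)
```

Whether to keep the zero-count rows depends on the use case: they are useful for a complete grid, but a SQL-style GROUP BY would not return them.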

Answer 3 (score: 3)

The base function as.data.frame.table will do this. I assume you already have, or can build, an R contingency table along these lines:

mt <- with(x, table(Product,Region,Year))

You then get the desired "long format" object:

 str(as.data.frame(mt))

'data.frame':   216 obs. of  4 variables:
 $ Product: Factor w/ 6 levels "English Muffins",..: 1 2 3 4 5 6 1 2 3 4 ...
 $ Region : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 1 1 1 2 2 2 2 ...
 $ Year   : Factor w/ 6 levels "2011","2012",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Freq   : int  1 1 1 1 1 1 1 1 1 1 ...

Another useful table-flattening function is ftable. For a three-way table, it gives a more compact display than print.table would produce:

ftable(mt)
                         Year 2011 2012 2013 2014 2015 2016
Product           Region                                   
English Muffins   1              1    1    1    1    1    1
                  2              1    1    1    1    1    1
                  3              1    1    1    1    1    1
                  4              1    1    1    1    1    1
                  5              1    1    1    1    1    1
                  6              1    1    1    1    1    1
croissants        1              1    1    1    1    1    1
                  2              1    1    1    1    1    1
                  3              1    1    1    1    1    1
                  4              1    1    1    1    1    1
                  5              1    1    1    1    1    1
                  6              1    1    1    1    1    1
Kaiser rolls      1              1    1    1    1    1    1
                  2              1    1    1    1    1    1
                  3              1    1    1    1    1    1
#-----snipped output--------
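
The layout of the flat table can be adjusted with the row.vars and col.vars arguments of ftable. The sketch below uses a small hypothetical table in place of mt, and converts the flat table back to long format through as.table.

```r
# Hypothetical three-way table, standing in for `mt` above.
x <- expand.grid(Product = c("Bagels", "croissants"),
                 Region  = 1:2,
                 Year    = 2015:2016)
mt <- with(x, table(Product, Region, Year))

# Choose which variables go down the rows and which across the columns.
ft <- ftable(mt, row.vars = "Product", col.vars = c("Year", "Region"))
print(ft)

# A flat table converts back to long format via as.table().
long <- as.data.frame(as.table(ft))
```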

On the other hand, if the request is to replicate each row as many times as its Count variable indicates, this will do it:

#Makes something like your original dataframe:
orig <- structure(list(Product_Name = structure(c(2L, 1L), .Label = c("Bagel", 
"English_Muffins"), class = "factor"), Region = c(1L, 1L), Year = c(2015L, 
2015L), Count = c(5L, 4L)), .Names = c("Product_Name", "Region", 
"Year", "Count"), class = "data.frame", row.names = c(NA, -2L))

xlong <- orig[ rep(rownames(orig), orig$Count) , ]
    > xlong
       Product_Name Region Year Count
1   English_Muffins      1 2015     5
1.1 English_Muffins      1 2015     5
1.2 English_Muffins      1 2015     5
1.3 English_Muffins      1 2015     5
1.4 English_Muffins      1 2015     5
2             Bagel      1 2015     4
2.1           Bagel      1 2015     4
2.2           Bagel      1 2015     4
2.3           Bagel      1 2015     4
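
A variant of the same idea, sketched under the assumption that the Count column is no longer needed once the rows are expanded. It indexes by row position rather than row name, resets the row names, and re-tabulates as a check that the counts survive the round trip.

```r
# Aggregated data in the same shape as `orig` above.
orig <- data.frame(
  Product_Name = c("English_Muffins", "Bagel"),
  Region = c(1L, 1L),
  Year   = c(2015L, 2015L),
  Count  = c(5L, 4L)
)

# Repeat each row index Count times, then drop the now-redundant Count column.
idx   <- rep(seq_len(nrow(orig)), times = orig$Count)
xlong <- orig[idx, c("Product_Name", "Region", "Year")]
rownames(xlong) <- NULL  # replace 1, 1.1, 1.2, ... with 1..n

# Re-tabulating recovers the original counts.
check <- as.data.frame(table(xlong), responseName = "Count")
print(check)
```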