How to create a crosstab from two tables in R?

Time: 2018-01-27 16:47:21

Tags: r pivot-table

I have an Excel dataset as follows:

Weight Quantity Price
72       5      460
73       8      720
75       20     830
95       2      490
91       15     680
82       14     340
88       30     250
89       6      770
78       27     820
98       24     940
99       29     825

I want to get a pivot table of Weight versus Quantity, with the sum of Price in each category, as below:

         0-10   10-20   20-30
70-80    1180     830     820
80-90     770     340     250
90-100    490     680    1765

I created two tables for the individual categories, using the dplyr package to get the average and count.

Now, how can I use table1 and table2 to create a crosstab like the one shown above?

2 answers:

Answer 0 (score: 5)

Maybe the following is what you want. It uses cut, as you did, and then xtabs.

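# Bin each variable into right-closed intervals, e.g. (70,80]
Weight <- cut(dataset$Weight, breaks = c(70,80,90,100))
Quantity <- cut(dataset$Quantity, breaks = c(0,10,20,30))
# Combine the binned factors with the original prices
dt2 <- data.frame(Weight, Quantity, Price = dataset$Price)
# Sum Price within each Weight x Quantity cell
xtabs(Price ~ Weight + Quantity, dt2)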
#          Quantity
#Weight     (0,10] (10,20] (20,30]
#  (70,80]    1180     830     820
#  (80,90]     770     340     250
#  (90,100]    490     680    1765
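
If you want to run the cut plus xtabs approach on its own, here is a minimal sketch that builds the sample data from the question inline. It also passes a labels argument to cut so the headers read 70-80, 0-10 and so on, like the table in the question. The names binned and ct are just illustrative, and dataset is assumed to hold the asker's three columns.

```r
# Sample data copied from the question (Weight, Quantity, Price).
dataset <- data.frame(
  Weight   = c(72, 73, 75, 95, 91, 82, 88, 89, 78, 98, 99),
  Quantity = c(5, 8, 20, 2, 15, 14, 30, 6, 27, 24, 29),
  Price    = c(460, 720, 830, 490, 680, 340, 250, 770, 820, 940, 825)
)

# Bin both variables. Intervals are right-closed by default,
# so a Quantity of 20 falls in "10-20" and 30 falls in "20-30".
binned <- data.frame(
  Weight   = cut(dataset$Weight, breaks = c(70, 80, 90, 100),
                 labels = c("70-80", "80-90", "90-100")),
  Quantity = cut(dataset$Quantity, breaks = c(0, 10, 20, 30),
                 labels = c("0-10", "10-20", "20-30")),
  Price    = dataset$Price
)

# Sum Price within each Weight x Quantity cell.
ct <- xtabs(Price ~ Weight + Quantity, data = binned)
print(ct)
```

Individual cells can then be read by label, for example ct["70-80", "0-10"].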

Answer 1 (score: 2)

A dplyr + tidyr solution:

library(dplyr)
library(tidyr)

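df %>% 
  # Bin both variables into right-closed intervals
  mutate(Weight = cut(Weight, breaks = c(70,80,90,100)),
         Quantity = cut(Quantity, breaks = c(0,10,20,30))) %>% 
  group_by(Weight, Quantity) %>% 
  # Sum Price within each Weight x Quantity cell
  summarise(Price = sum(Price)) %>% 
  # Turn the Quantity bins into columns
  spread(Quantity, Price)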

# A tibble: 3 x 4
# Groups:   Weight [3]
  Weight   `(0,10]` `(10,20]` `(20,30]`
* <fct>       <int>     <int>     <int>
1 (70,80]      1180       830       820
2 (80,90]       770       340       250
3 (90,100]      490       680      1765

Data:

df <- structure(list(Weight = c(72L, 73L, 75L, 95L, 91L, 82L, 88L, 
89L, 78L, 98L, 99L), Quantity = c(5L, 8L, 20L, 2L, 15L, 14L, 
30L, 6L, 27L, 24L, 29L), Price = c(460L, 720L, 830L, 490L, 680L, 
340L, 250L, 770L, 820L, 940L, 825L)), .Names = c("Weight", "Quantity", 
"Price"), class = "data.frame", row.names = c(NA, -11L))
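
On newer tidyr versions spread() is superseded by pivot_wider(), so the same pipeline could be sketched as below. This assumes dplyr >= 1.0.0 (for the .groups argument) and tidyr >= 1.0.0 (for pivot_wider). The name wide is illustrative, and df is rebuilt here with the same values so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Same values as the sample data above.
df <- data.frame(
  Weight   = c(72L, 73L, 75L, 95L, 91L, 82L, 88L, 89L, 78L, 98L, 99L),
  Quantity = c(5L, 8L, 20L, 2L, 15L, 14L, 30L, 6L, 27L, 24L, 29L),
  Price    = c(460L, 720L, 830L, 490L, 680L, 340L, 250L, 770L, 820L, 940L, 825L)
)

wide <- df %>%
  # Bin both variables into right-closed intervals.
  mutate(Weight   = cut(Weight,   breaks = c(70, 80, 90, 100)),
         Quantity = cut(Quantity, breaks = c(0, 10, 20, 30))) %>%
  group_by(Weight, Quantity) %>%
  # Sum Price per cell and drop the grouping afterwards.
  summarise(Price = sum(Price), .groups = "drop") %>%
  # One column per Quantity bin, one row per Weight bin.
  pivot_wider(names_from = Quantity, values_from = Price)

print(wide)
```

Dropping the groups with .groups = "drop" returns a plain tibble, so the result does not carry the "Groups: Weight" attribute seen in the output above.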