My data looks like this:
'data.frame': 833233 obs. of 22 variables:
$ ProductId : num 105422 105422 143863 170645 397474 ...
$ Brand : num NA NA NA NA NA NA NA NA NA NA ...
$ Supplier : Factor w/ 788 levels "[00000] 武商量贩",..: 1 113 265 154 99 99 99 99 99 99 ...
$ Mode.of.operations : Factor w/ 3 levels "[1] Distribution",..: 1 1 1 3 2 2 2 2 2 2 ...
$ Category : Factor w/ 27 levels "[01] Fuits and Vegetables",..: 5 5 9 1 22 22 22 22 22 22 ...
$ Profit.margin : num 0 0 237.95 0 1.16 ...
$ Profit.margin.percentage : num 0 0 0.1 0 0.17 ...
I use xtabs as follows:
xtabs(Profit.margin~Category+Mode.of.operations,wushang)
This gives me the sum of the profit margin for each category under each mode of operations, like this:
Mode.of.operations
Category [1] Distribution [2] Reseller [4] Joint venture
[01] Fuits and Vegetables 95103.75 0.00 331445.89
[02] Livestocks 282948.03 10982.10 91013.51
[03] Fisheries 21632.49 0.00 114708.34
[04] Food category 14236.32 5289.90 286585.22
[05] Daily distribution category 1039396.38 53995.36 222966.99
[06] Grains 640183.46 150810.26 64068.74
[07] seasoning spices 251716.98 175242.57 156037.71
[08] canned vegetables 15938.47 51549.80 0.00
[09] cigarette, wine and tea 810113.98 550314.93 43743.06
[10] candy cookies 605020.64 92855.09 626064.09
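I am also interested in finding the mean and median rather than the sum. Is there a way to do this with xtabs, or is there another function that achieves the desired result?
My data has NA / #NA values, so I would like that other function to give me 0 instead of NA in the output, because I have to use rowPerc afterwards, and it simply skips any row that has an NA in the output.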
Edit 1
The tapply function can give the mean and median, but its output contains NA:
> with(wushang, tapply(Profit.margin,list(Category,Mode.of.operations), mean))
Output:
[1] Distribution [2] Reseller [4] Joint venture
[01] Fuits and Vegetables 29.5904636 NA 43.2753480
[02] Livestocks 47.9248018 9.076116 89.9342984
[03] Fisheries 33.5908230 NA 45.7552214
[04] Food category 13.9435064 13.324685 47.7403332
[05] Daily distribution category 27.8942724 58.563297 41.7854179
[06] Grains 35.7464660 14.332851 27.0446349
[07] seasoning spices 11.9870937 8.398877 34.4378084
[08] canned vegetables 5.0566212 8.977673 NA
[09] cigarette, wine and tea 79.4540977 31.158132 146.2978595
[10] candy cookies 18.8974463 9.113268 61.0555968
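For context, tapply leaves a cell as NA when that Category / Mode.of.operations combination has no rows, and mean() returns NA for any cell that contains an NA unless na.rm = TRUE is passed. A minimal sketch of that behaviour, and of replacing the remaining NA cells with 0 afterwards, is below. The data frame `toy` and its values are made up for illustration and merely stand in for `wushang`.

```r
# Hypothetical toy data standing in for `wushang` (values are invented)
toy <- data.frame(
  Category = factor(c("A", "A", "B", "B", "C")),
  Mode.of.operations = factor(c("x", "y", "x", "x", "y")),
  Profit.margin = c(10, 20, 30, NA, 50)
)

# Mean per cell; na.rm = TRUE is passed through tapply to mean()
cell_means <- with(toy, tapply(Profit.margin,
                               list(Category, Mode.of.operations),
                               mean, na.rm = TRUE))

# Combinations with no rows, here (B, y) and (C, x), are still NA.
# is.na() is also TRUE for NaN, which mean(na.rm = TRUE) returns
# when every value in a cell is NA.
cell_means_zero <- cell_means
cell_means_zero[is.na(cell_means_zero)] <- 0

print(cell_means)
print(cell_means_zero)
```

Swapping `mean` for `median` in the tapply call should give the per-cell median in the same way.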
After applying rowPerc to it, the entire row is skipped wherever there is an NA:
> rowPerc(with(wushang, tapply(Profit.margin,list(Category,Mode.of.operations), mean)))
[1] Distribution [2] Reseller [4] Joint venture Total
[01] Fuits and Vegetables 100.00
[02] Livestocks 32.62 6.18 61.21 100.00
[03] Fisheries 100.00
[04] Food category 18.59 17.76 63.65 100.00
[05] Daily distribution category 21.75 45.67 32.58 100.00
[06] Grains 46.35 18.58 35.07 100.00
[07] seasoning spices 21.86 15.32 62.82 100.00
[08] canned vegetables 100.00
[09] cigarette, wine and tea 30.93 12.13 56.95 100.00
[10] candy cookies 21.22 10.23 68.55 100.00
How can I make this work? Thanks.
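One possible approach, sketched under assumptions: fill the NA cells with 0 first, then compute row percentages. This sketch uses base R's prop.table rather than rowPerc, since the exact NA handling of rowPerc is not shown here beyond the output above. The data frame `toy` is again hypothetical and stands in for `wushang`.

```r
# Hypothetical toy data standing in for `wushang` (values are invented)
toy <- data.frame(
  Category = factor(c("A", "A", "B", "B", "C")),
  Mode.of.operations = factor(c("x", "y", "x", "x", "y")),
  Profit.margin = c(10, 20, 30, NA, 50)
)

# Median per cell, ignoring NA values inside each cell
m <- with(toy, tapply(Profit.margin,
                      list(Category, Mode.of.operations),
                      median, na.rm = TRUE))

# Replace NA (empty cells) and NaN (all-NA cells) with 0
m[is.na(m)] <- 0

# Row percentages: margin = 1 divides each cell by its row total
row_pct <- round(prop.table(m, margin = 1) * 100, 2)
print(row_pct)
```

Note that a row whose cells are all 0 would divide by zero and yield NaN, so such rows may still need separate handling.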