Question

How can I build a pseudo-table() of two variables, but fill it with values from a third column / a separate list?

Example:

library(ggplot2) # diamonds data
data(diamonds)
T.matrix <- with(diamonds, table(color, clarity))

Produces:

     clarity
color   I1  SI2  SI1  VS2  VS1 VVS2 VVS1   IF
    D   42 1370 2083 1697  705  553  252   73
    E  102 1713 2426 2470 1281  991  656  158
    F  143 1609 2131 2201 1364  975  734  385
    G  150 1548 1976 2347 2148 1443  999  681
    H  162 1563 2275 1643 1169  608  585  299
    I   92  912 1424 1169  962  365  355  143
    J   50  479  750  731  542  131   74   51

I want a similar table of color by clarity, except filled with reference$value rather than the table() counts:

reference <- expand.grid(clarity = c("I1", "SI2", "SI1", "VS2", "VS1","VVS2", "VVS1", "IF"),
                         color = c("D", "E", "F", "G", "H", "I", "J"))
reference$value <- 1:56

So: [D, I1] has the value 1, [D, SI2] = 2, [H, VS2] = 36, etc.

Answer 1

Try tapply:

tapply(diamonds$price, list(diamonds$color, diamonds$clarity), mean)

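tapply takes the variable you want, groups it by the list of grouping variables, and then applies the function given as the last argument. The table-shaped output may not be all that useful, depending on what you are doing with it.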
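To make the grouping behaviour concrete, here is a minimal sketch on a made-up data frame (the names df, g1, g2 and v are hypothetical, not from the question). It only illustrates that the first grouping variable becomes the rows, the second becomes the columns, and combinations with no rows come back as NA:

```r
# Toy data, invented purely for illustration
df <- data.frame(
  g1 = c("a", "a", "b", "b"),
  g2 = c("x", "y", "x", "x"),
  v  = c(1, 2, 3, 5)
)

# Rows come from the first grouping variable, columns from the second
m <- tapply(df$v, list(df$g1, df$g2), mean)
print(m)

m["b", "x"]  # mean of 3 and 5, i.e. 4
m["b", "y"]  # NA, because there is no (b, y) row in df
```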

If you want the data in a more useful (long) format, you may want to use dplyr:

library(dplyr)

diamonds %>% group_by(clarity, color) %>%
             summarise(mean(price))

Edit: it works the same way for your reference data!

tapply(reference$value, list(reference$color, reference$clarity), FUN = sum)

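You need to supply FUN, otherwise tapply collapses the output: with FUN = NULL it returns a vector of group indices rather than a table.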
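As a quick sanity check, the following sketch rebuilds the reference data from the question and confirms that the cells hold the expected values. The names filled and filled_xt are just illustrative. The xtabs() call at the end is an alternative I am swapping in, not part of the answer above; it sums the left-hand side of the formula within each cell, which gives the same numbers here because every color/clarity pair occurs exactly once:

```r
# Reference data as defined in the question
reference <- expand.grid(
  clarity = c("I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"),
  color   = c("D", "E", "F", "G", "H", "I", "J")
)
reference$value <- 1:56

# color in rows, clarity in columns, cells filled with reference$value
filled <- tapply(reference$value,
                 list(reference$color, reference$clarity),
                 FUN = sum)

filled["D", "I1"]   # 1
filled["D", "SI2"]  # 2
filled["H", "VS2"]  # 36

# Alternative (not from the original answer): xtabs() with a left-hand side
filled_xt <- xtabs(value ~ color + clarity, data = reference)
```

Note that sum is only safe as the fill function because each cell has a single row; with duplicated color/clarity pairs the values would be added together.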
