Question

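I want to create a list-column of matrices, where the entries of each matrix are elements of variables that already exist in the original dataset. My goal is to build a 2x2 contingency table for each row of the dataset and then pass each matrix as an argument to fisher.test.

I tried adding the new column with a combination of mutate and matrix, but that returns an error. I also tried do in place of mutate, which seems like a step in the right direction, but I know it is also incorrect: the dimensions of the element are wrong and the output has only one row.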

library(tidyverse)

mtcars %>% 
  mutate(mat = matrix(c(.$disp, .$hp, .$gear, .$carb)))
#> Error: Column `mat` must be length 32 (the number of rows) or one, not 128

mtcars %>% 
  do(mat = matrix(c(.$disp, .$hp, .$gear, .$carb)))
#> # A tibble: 1 x 1
#>   mat            
#>   <list>         
#> 1 <dbl [128 x 1]>

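Created on 2019-06-05 by the reprex package (v0.2.1)

I expect the output to have 32 rows, with the mat column containing 32 2x2 matrices made up of the elements of mtcars$disp, mtcars$hp, mtcars$gear, and mtcars$carb.

My intention is to use map to pass each entry in the mat column as an argument to fisher.test, and then extract the odds ratio estimate and the p-value. But of course, the main focus is creating the list of matrices.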

Answer 1

You can use the pmap function from the purrr package inside mutate:

library(tidyverse)
mtcars %>% as_tibble() %>% 
  mutate(mat = pmap(list(disp, hp, gear, carb), ~matrix(c(..1, ..2, ..3, ..4), 2, 2)))

# A tibble: 32 x 12
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb mat              
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list>           
 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4 <dbl[,2] [2 x 2]>
 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4 <dbl[,2] [2 x 2]>

Each mat entry is then a 2x2 matrix with the desired elements. Hope this helps.
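As a quick sanity check on the approach above, the sketch below rebuilds the column and confirms that every element is 2x2. It swaps the `..1`-style lambda for a function with named arguments, which is only a readability choice; the names `res`, `a`, `b`, `c`, and `d` are illustrative and not part of the original answer.

```r
library(dplyr)
library(purrr)

res <- mtcars %>%
  as_tibble() %>%
  mutate(mat = pmap(
    list(disp, hp, gear, carb),
    # pmap passes the i-th element of each vector positionally
    function(a, b, c, d) matrix(c(a, b, c, d), nrow = 2, ncol = 2)
  ))

# Every element of the list-column should be a 2 x 2 matrix
all(map_lgl(res$mat, ~ identical(dim(.x), c(2L, 2L))))

# matrix() fills column by column, so disp and hp form the first column
res$mat[[1]]
```

If the table should be filled row by row instead, `byrow = TRUE` can be passed to `matrix()`.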

Answer 2

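You have two problems:

To store a matrix in a data.frame (tibble), you just need to wrap it in a list.
To create 2 x 2 matrices (rather than repeating the same large matrix in every cell), you need to work row by row. Currently, matrix(c(disp, hp, gear, carb)) concatenates all four columns and builds a single 128 x 1 matrix (4 variables x 32 rows), as the do output in the question shows. You only want the 4 values of one row as input, reshaped to 2 x 2.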
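The difference between the two calls can be checked directly in base R. This is a minimal sketch; the names `all_at_once`, `first_row`, `one_table`, and `tables` are made up for illustration.

```r
# All four columns concatenated at once: one long column vector,
# not 32 small tables
all_at_once <- matrix(c(mtcars$disp, mtcars$hp, mtcars$gear, mtcars$carb))
dim(all_at_once)

# One row at a time: four values reshaped into a 2 x 2 matrix
first_row <- mtcars[1, ]
one_table <- matrix(
  c(first_row$disp, first_row$hp, first_row$gear, first_row$carb),
  nrow = 2, ncol = 2
)
one_table

# Collecting one matrix per row gives a plain list of 32 matrices,
# which is the shape a list-column needs
tables <- lapply(seq_len(nrow(mtcars)), function(i) {
  with(mtcars[i, ], matrix(c(disp, hp, gear, carb), nrow = 2, ncol = 2))
})
length(tables)
```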

pmap lets you process the rows one at a time, but you can also group by row with rowwise:

library(tidyverse)
df <- 
  mtcars %>% 
    as_tibble() %>%
    rowwise() %>%
    mutate(mat = list(matrix(c(disp, hp, gear, carb), 2, 2)))

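Edit: now, how do you actually use them? Let's take fisher.test as an example. Note that a test is a complex object with components (such as p.value) and attributes, so we have to store the tests in a list-column.

You can keep working with rowwise, in which case the list-column is automatically "unlisted", so mat refers to a single matrix: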

df %>%
  # keep in mind df is still grouped by row so 'mat' is only one matrix.
  # A test is a complex object so we need to store it in a list-column
  mutate(test = list(fisher.test(mat)), 
         # test is just one test so we can extract p-value directly 
         pval = test$p.value)

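Alternatively, if you stop working row by row (you just need to ungroup), then mat is a list of matrices onto which you can map a function. We use the map function from purrr.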

library("purrr")

df %>%
  ungroup() %>%
  # Apply the test to each mat using `map` from `purrr` 
  # `map` returns a list so `test` is a list-column
  mutate(test = map(mat, fisher.test), 
         # Now `test` is a list of tests... so you need to map operations onto it 
         # Extract the p-values from each test, into a numeric column rather than a list-column
         pval = map_dbl(test, pluck, "p.value"))

Which one you prefer is a matter of taste :)
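Since the question also asks for the odds ratio estimate, here is a hedged sketch that pulls both values out of each test in one pipeline. It assumes the `estimate` and `p.value` components that `fisher.test` returns for a 2x2 table; the column names `odds_ratio` and `pval` and the object name `results` are illustrative. The mtcars values are not real counts, so this only demonstrates the mechanics, and the `suppressWarnings` call is there because `fisher.test` may warn when it rounds non-integer entries.

```r
library(dplyr)
library(purrr)

results <- mtcars %>%
  as_tibble() %>%
  mutate(
    # one 2 x 2 matrix per row, stored in a list-column
    mat = pmap(list(disp, hp, gear, carb),
               ~ matrix(c(..1, ..2, ..3, ..4), 2, 2)),
    # one test object per matrix, also a list-column
    test = map(mat, ~ suppressWarnings(fisher.test(.x))),
    # plain numeric columns extracted from each test
    odds_ratio = map_dbl(test, ~ unname(.x$estimate)),
    pval = map_dbl(test, "p.value")
  )

results %>% select(disp, hp, gear, carb, odds_ratio, pval)
```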
