Question

I want to end up with a tidy data structure like this:

N    | r     | data     | stat
---------------------------------
10   | 0.2   | <tibble> | 0.5
20   | 0.3   | <tibble> | 0.86
...

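data is generated from the parameters in the preceding columns, and stat is computed from data. Given the first two columns, how do I add the column holding a tibble of data for each row?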

As a minimal example, here is a function that creates two correlated columns:

correlated_data = function(N, r) {
  MASS::mvrnorm(N, mu=c(0, 4), Sigma=matrix(c(1, r, r, 1), ncol=2))
}

I want to run this a number of times for every combination of N and r:

# Make parameter combinations
expand.grid(N=c(10,20,30), r=c(0, 0.1, 0.3)) %>%
  group_by(N, r) %>%
  expand(set=1:100) %>%  # create 100 of each combination

  # HERE! How to add a N x 2 tibble to each row?
  rowwise() %>%
  mutate(data=correlated_data(N, r)) %>%

  # Compute summary stats on each (for illustration only; not tested)
  mutate(   
     stats = map(data, ~cor.test(.x[, 1], .x[, 2])),  # Correlation on each
     tidy_stats = map(stats, tidy))  # using broom package

In practice I have more parameters (N, r, distribution) and will compute more summaries. If a different workflow is better, I welcome that too.

Answer 1

With two variables, use:

map2(N, r, correlated_data)

With more variables, use:

pmap(list(N, r), correlated_data)
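
As a rough sketch of how the pmap() form extends to a third parameter: the generator correlated_data3 and its mu2 argument below are hypothetical names introduced only for illustration, and crossing() from tidyr is used here in place of expand.grid(). The list passed to pmap() is unnamed, so its elements are matched to the function's arguments by position.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical three-parameter generator; `mu2` is not part of the original post
correlated_data3 <- function(N, r, mu2) {
  MASS::mvrnorm(N, mu = c(0, mu2), Sigma = matrix(c(1, r, r, 1), ncol = 2))
}

# One row per parameter combination (2 x 2 x 2 = 8 rows)
params <- crossing(N = c(10, 20), r = c(0, 0.3), mu2 = c(0, 4))

# pmap() walks the three columns in parallel and calls the generator once per row
sims <- params %>%
  mutate(data = pmap(list(N, r, mu2), correlated_data3))

# Each element of `data` is an N x 2 matrix; check the row counts
sims %>% mutate(rows = map_int(data, nrow))
```

Naming the list elements (for example list(N = N, r = r, mu2 = mu2)) would match by argument name instead, which may be safer once the parameter list grows.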

So the complete pipeline from the original question becomes:

library(tidyverse)  # dplyr, tidyr, purrr
library(broom)      # tidy()

expand.grid(N=c(10, 20, 30), r=c(0, 0.1, 0.3)) %>%
  group_by(N, r) %>%
  expand(set=1:100) %>%  # create 100 of each combination

  # Add an N x 2 matrix of simulated data to each row
  mutate(
    data = map2(N, r, correlated_data),
    stats = map(data, ~cor.test(.x[, 1], .x[, 2])),  # correlation test on each dataset
    tidy_stats = map(stats, tidy)  # tidy() from the broom package
  ) %>%

  unnest(tidy_stats)
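
Since the question attempted rowwise(), here is a hedged sketch of that route as an alternative to map2(). It assumes dplyr 1.0.0 or later, where wrapping the result in list() lets each row store a whole matrix in a list-column. The use of crossing(), the reduced replicate count of 5, and plain cor() for the stat column are choices made for this illustration, not part of the answer above.

```r
library(dplyr)
library(tidyr)

correlated_data <- function(N, r) {
  MASS::mvrnorm(N, mu = c(0, 4), Sigma = matrix(c(1, r, r, 1), ncol = 2))
}

results <- crossing(N = c(10, 20, 30), r = c(0, 0.1, 0.3), set = 1:5) %>%
  rowwise() %>%
  # list() wraps the N x 2 matrix so it fits into a single cell
  mutate(data = list(correlated_data(N, r))) %>%
  # in a rowwise data frame, `data` refers to the current row's matrix
  mutate(stat = cor(data[, 1], data[, 2])) %>%
  ungroup()

results
```

This yields the N | r | data | stat layout sketched in the question, with one simulated dataset per row.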

Simulating many datasets in tidyr
