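Create a new column that runs from 0 to the value of a variable in a table

Time: 2019-05-21 17:24:14

Tags: r tidyverse

Reproducible tibble: I have a dataset similar to the one shown below. The only difference is that the one I am working with is much larger.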

library(tibble)

general_tibble <- tibble(gender = c("female", "female", "male"),
                         age = c(18, 19, 18),
                         age_partner = c(22, 20, 17),
                         max_age = c(60, 60, 65),
                         nrs = c(42, 41, 47))

The result of general_tibble is:

  gender age age_partner max_age nrs
1 female  18          22      60  42
2 female  19          20      60  41
3   male  18          17      65  47

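Question: starting from the table above, how can I create a new table that takes the value of nrs and adds a column n that runs from 0 to the value in nrs?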

To clarify: in row 1 of general_tibble the nrs column equals 42, so n should run from 0 to 42; in row 2 nrs equals 41, so n should run from 0 to 41; and likewise for row 3.
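As a quick check on the expected size of the result: n runs from 0 to nrs inclusive, so each input row contributes nrs + 1 output rows. A minimal sketch of that arithmetic (the variable names rows_per_input and total_rows are only illustrative):

```r
# nrs values from the three rows of general_tibble
nrs <- c(42, 41, 47)

# n goes from 0 to nrs inclusive, so each row expands to nrs + 1 rows
rows_per_input <- nrs + 1

# Total number of rows in the expanded table
total_rows <- sum(rows_per_input)
total_rows
# → 133
```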

I am currently using the code below. It works, but when general_tibble is very large it runs extremely slowly.

general_list <- list()

for(i in 1:NROW(general_tibble)){
  general_list[[i]] <- data.frame(general_tibble[i, ], 
                             n = 0:general_tibble[[i, "nrs"]])
} 

I then bind_rows() general_list to obtain general_binded

general_binded <- bind_rows(general_list)

The result of general_binded[c(1:5, 38:42),] is:

   gender age age_partner max_age nrs  n
1  female  18          22      60  42  0
2  female  18          22      60  42  1
3  female  18          22      60  42  2
4  female  18          22      60  42  3
5  female  18          22      60  42  4
38 female  18          22      60  42 37
39 female  18          22      60  42 38
40 female  18          22      60  42 39
41 female  18          22      60  42 40
42 female  18          22      60  42 41

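PS: In the for loop I use data.frame() rather than tibble() because I want the row to be recycled. If you have a suggestion involving tibbles or data frames, please don't hesitate to share it.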

5 answers:

Answer 0: (score: 5)

The simplest approach is to expand general_tibble on the nrs column using the tidyr::expand() function

library(tidyverse)

general_tibble %>% 
        group_by_all()%>% 
        expand(n = 0:nrs)

#> # A tibble: 133 x 6
#> # Groups:   gender, age, age_partner, max_age, nrs [3]
#>    gender   age age_partner max_age   nrs     n
#>    <chr>  <dbl>       <dbl>   <dbl> <dbl> <int>
#>  1 female    18          22      60    42     0
#>  2 female    18          22      60    42     1
#>  3 female    18          22      60    42     2
#>  4 female    18          22      60    42     3
#>  5 female    18          22      60    42     4
#>  6 female    18          22      60    42     5
#>  7 female    18          22      60    42     6
#>  8 female    18          22      60    42     7
#>  9 female    18          22      60    42     8
#> 10 female    18          22      60    42     9
#> # ... with 123 more rows

Created on 2019-05-21 by the reprex package (v0.2.1)


Another idea that uses only base R functions:

expanded_vars <- do.call(rbind,lapply(general_tibble$nrs, 
                                              function(x) expand.grid(x, 0:x)))
names(expanded_vars) <- c("nrs", "n")

merge(y = expanded_vars, x = general_tibble, by = "nrs", all = TRUE)
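
Note that merging on nrs pairs every row with every sequence that has the same nrs value, so rows would be duplicated if two input rows shared an nrs. A minimal base R sketch that avoids the merge by indexing rows directly with rep() and sequence() (the names general_df, idx and expanded are only illustrative, and a plain data.frame is assumed in place of the tibble):

```r
# Same data as general_tibble, as a plain data.frame
general_df <- data.frame(
  gender      = c("female", "female", "male"),
  age         = c(18, 19, 18),
  age_partner = c(22, 20, 17),
  max_age     = c(60, 60, 65),
  nrs         = c(42, 41, 47)
)

# Repeat the index of row i exactly nrs[i] + 1 times
idx <- rep(seq_len(nrow(general_df)), times = general_df$nrs + 1)
expanded <- general_df[idx, ]

# sequence() concatenates 1:43, 1:42, 1:48; subtract 1 so each run starts at 0
expanded$n <- sequence(general_df$nrs + 1) - 1L
rownames(expanded) <- NULL

head(expanded)
```

Because everything is vectorised, there is no per-row data frame construction, which is usually where the loop in the question spends its time.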

Answer 1: (score: 3)

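One nice thing about using data.table rather than the tidyverse is that you don't need to think about whether the operation is a mutate, a summarize, or an expand. You put whatever you need in the j part, and however many rows it resolves to is what you get.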
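
A minimal sketch of the data.table idea described above, assuming the data.table package is installed and that no two input rows are identical across all grouping columns (the names general_dt and expanded_dt are only illustrative):

```r
library(data.table)

general_dt <- data.table(
  gender      = c("female", "female", "male"),
  age         = c(18, 19, 18),
  age_partner = c(22, 20, 17),
  max_age     = c(60, 60, 65),
  nrs         = c(42, 41, 47)
)

# Group by every original column; inside j each grouping column is a
# single value, so 0:nrs builds the sequence for that group
expanded_dt <- general_dt[, .(n = 0:nrs),
                          by = .(gender, age, age_partner, max_age, nrs)]

expanded_dt
```

The j expression returns 43, 42 and 48 rows for the three groups, and data.table repeats the grouping columns alongside them.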

Answer 2: (score: 2)

We can use uncount

library(tidyverse)
general_tibble %>%
   mutate(grp = row_number(), nrsN = nrs + 1) %>%
   uncount(nrsN) %>%
   group_by(grp) %>%
   mutate(n = row_number() - 1) %>%
   ungroup %>%
   select(-grp)
# A tibble: 133 x 6
#   gender   age age_partner max_age   nrs     n
#   <chr>  <dbl>       <dbl>   <dbl> <dbl> <dbl>
# 1 female    18          22      60    42     0
# 2 female    18          22      60    42     1
# 3 female    18          22      60    42     2
# 4 female    18          22      60    42     3
# 5 female    18          22      60    42     4
# 6 female    18          22      60    42     5
# 7 female    18          22      60    42     6
# 8 female    18          22      60    42     7
# 9 female    18          22      60    42     8
#10 female    18          22      60    42     9
# … with 123 more rows

Another option is unnest
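
A minimal sketch of the unnest option, assuming tibble, dplyr and tidyr are installed: build a list column holding 0:nrs for each row, then unnest it (the name expanded_tbl is only illustrative):

```r
library(tibble)
library(dplyr)
library(tidyr)

general_tibble <- tibble(
  gender      = c("female", "female", "male"),
  age         = c(18, 19, 18),
  age_partner = c(22, 20, 17),
  max_age     = c(60, 60, 65),
  nrs         = c(42, 41, 47)
)

expanded_tbl <- general_tibble %>%
  # One list element per row, each holding the integer sequence 0:nrs
  mutate(n = lapply(nrs, function(x) 0:x)) %>%
  # Expand the list column so every element of n gets its own row
  unnest(n)

expanded_tbl
```

Each row is handled independently here, so no grouping step is needed.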

Answer 3: (score: 1)

Using dplyr and tidyr, you can also do:

general_tibble %>%
 group_by(rowid = row_number()) %>%
 mutate(n = nrs) %>%
 complete(n = seq(0, n, 1)) %>%
 fill(everything(), .direction = "up") %>%
 ungroup() %>%
 select(-rowid)

       n gender   age age_partner max_age   nrs
   <dbl> <chr>  <dbl>       <dbl>   <dbl> <dbl>
 1     0 female    18          22      60    42
 2     1 female    18          22      60    42
 3     2 female    18          22      60    42
 4     3 female    18          22      60    42
 5     4 female    18          22      60    42
 6     5 female    18          22      60    42
 7     6 female    18          22      60    42
 8     7 female    18          22      60    42
 9     8 female    18          22      60    42
10     9 female    18          22      60    42

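Answer 4: (score: 0)

One way in base R (apart from the tibble package).

First, split the data by nrs group. Second, expand the rows of each data frame according to its nrs value. Third, create an id column that runs from 0 up to nrs, whatever the number of rows turns out to be. Fourth, bind the pieces back together into a tibble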

library(tibble)

df <- tibble(
  gender      = c("female", "female", "male"),
  age         = c(18, 19, 18),
  age_partner = c(22, 20, 17),
  max_age     = c(60, 60, 65), 
  nrs         = c(42, 41, 47)
  )

nrs_split <- split(df, df$nrs)
df_list <- lapply(nrs_split, function(i) i[rep(seq_len(nrow(i)), each=i$nrs + 1), ])
df_renum <- lapply(df_list, function(i) {i$id <- 0:rle(i$nrs)$values; return(i)})
df <- do.call("rbind", df_renum)
df
#> # A tibble: 133 x 6
#>    gender   age age_partner max_age   nrs    id
#>  * <chr>  <dbl>       <dbl>   <dbl> <dbl> <int>
#>  1 female    19          20      60    41     0
#>  2 female    19          20      60    41     1
#>  3 female    19          20      60    41     2
#>  4 female    19          20      60    41     3
#>  5 female    19          20      60    41     4
#>  6 female    19          20      60    41     5
#>  7 female    19          20      60    41     6
#>  8 female    19          20      60    41     7
#>  9 female    19          20      60    41     8
#> 10 female    19          20      60    41     9
#> # … with 123 more rows