Generate a sequence of new rows based on several columns

Date: 2019-07-20 07:11:13

Tags: r dplyr seq

I want to create new rows sequentially for a set of variables in a data frame. For example, I have this dummy data:

data1 <- data.frame(id = c('JUJ', 'SJD'), 
                    sex = c('male', 'female'),
                    year = c(2000, 2010),
                    age = c(48, 75), blood = c(6.85, 4.6))
data1

| id  | sex    | year | age | blood |
|-----|--------|------|-----|-------|
| JUJ | male   | 2000 | 48  | 6.85  |
| SJD | female | 2010 | 75  | 4.6   |

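I want to generate 4 observations (as rows) for each id. For year and age, each new row should be one unit larger than the previous row. Some variables (sex and blood in these data) should stay the same across all rows of an id.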

I am sure the seq() function in R can do this, but I could not find a way to use it correctly. I would like the solution to use tidyverse functions.

In the end, the data should look like this:

data2 <- data.frame(id = c('JUJ', 'JUJ', 'JUJ', 'JUJ', 'SJD', 'SJD', 
                   'SJD', 'SJD'), 
                    sex = c('male', 'male', 'male', 'male', 'female', 
                   'female', 'female', 'female'),
                    year = c(2000, 2001, 2002, 2003, 2010, 2011, 2012, 2013),
                    age = c(48, 49, 50, 51, 75, 76, 77, 78), 
                    blood = c(6.85, 6.85, 6.85, 6.85, 4.6, 4.6, 4.6, 4.6))
data2

| id  | sex    | year | age | blood |
|-----|--------|------|-----|-------|
| JUJ | male   | 2000 | 48  | 6.85  |
| JUJ | male   | 2001 | 49  | 6.85  |
| JUJ | male   | 2002 | 50  | 6.85  |
| JUJ | male   | 2003 | 51  | 6.85  |
| SJD | female | 2010 | 75  | 4.6   |
| SJD | female | 2011 | 76  | 4.6   |
| SJD | female | 2012 | 77  | 4.6   |
| SJD | female | 2013 | 78  | 4.6   |

3 answers:

Answer 0 (score: 3)

We can use slice to repeat each row n times, group_by id, and then increment the age and year columns sequentially within each group.

library(dplyr)

n <- 4
data1 %>%
  slice(rep(seq_len(n()), each = n)) %>%
  group_by(id) %>%
  mutate_at(vars(year, age), ~. + 0:(n - 1))

#  id    sex     year   age blood
#  <fct> <fct>  <dbl> <dbl> <dbl>
#1 JUJ   male    2000    48  6.85
#2 JUJ   male    2001    49  6.85
#3 JUJ   male    2002    50  6.85
#4 JUJ   male    2003    51  6.85
#5 SJD   female  2010    75  4.6 
#6 SJD   female  2011    76  4.6 
#7 SJD   female  2012    77  4.6 
#8 SJD   female  2013    78  4.6 
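
For readers on a newer dplyr, the same idea can be sketched with across(), which supersedes mutate_at(). This is a minimal sketch, assuming dplyr 1.0 or later and that each id appears exactly once in data1; the names n_rows and data2_across are made up for illustration and are not part of the answer above.

```r
library(dplyr)

# Dummy data from the question
data1 <- data.frame(
  id    = c("JUJ", "SJD"),
  sex   = c("male", "female"),
  year  = c(2000, 2010),
  age   = c(48, 75),
  blood = c(6.85, 4.6)
)

n_rows <- 4  # hypothetical name: number of rows to generate per id

data2_across <- data1 %>%
  # repeat every row index n_rows times: 1, 1, 1, 1, 2, 2, 2, 2
  slice(rep(seq_len(n()), each = n_rows)) %>%
  group_by(id) %>%
  # within each id, add the offsets 0, 1, ..., n_rows - 1 to year and age
  mutate(across(c(year, age), ~ .x + 0:(n_rows - 1))) %>%
  ungroup()

data2_across
```

The offsets are added per group, so sex and blood are simply carried along unchanged by the row repetition.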

Answer 1 (score: 3)

Another possibility uses dplyr together with tidyr.
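
One way dplyr and tidyr can be combined for this task is sketched below. It is not necessarily the code this answer originally contained: it assumes tidyr's uncount() for duplicating rows and dplyr 1.0 or later for across(), and the name data2_uncount is made up for illustration.

```r
library(dplyr)
library(tidyr)

# Dummy data from the question
data1 <- data.frame(
  id    = c("JUJ", "SJD"),
  sex   = c("male", "female"),
  year  = c(2000, 2010),
  age   = c(48, 75),
  blood = c(6.85, 4.6)
)

data2_uncount <- data1 %>%
  # duplicate every row 4 times
  uncount(4) %>%
  group_by(id) %>%
  # row_number() runs 1..4 within each id, so the offsets are 0..3
  mutate(across(c(year, age), ~ .x + row_number() - 1)) %>%
  ungroup()

data2_uncount
```

Using row_number() instead of a fixed offset vector means the increment follows the actual group size, so the sketch would still work if the repeat count passed to uncount() changed.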

Answer 2 (score: 0)

Another tidyverse solution:

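library(tidyverse)

data1 %>% 
  # replace each year/age value with a list element holding 4 consecutive values
  mutate_at(vars(year, age), list(~ map(., ~ seq(.x, .x + 4 - 1)))) %>% 
  # expand both list-columns in parallel (tidyr >= 1.0 expects explicit cols)
  unnest(cols = c(year, age)) %>% 
  select(-blood, blood)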
#>    id    sex year age blood
#> 1 JUJ   male 2000  48  6.85
#> 2 JUJ   male 2001  49  6.85
#> 3 JUJ   male 2002  50  6.85
#> 4 JUJ   male 2003  51  6.85
#> 5 SJD female 2010  75  4.60
#> 6 SJD female 2011  76  4.60
#> 7 SJD female 2012  77  4.60
#> 8 SJD female 2013  78  4.60