Generating a follow-up time variable from age in a longitudinal dataset?

Asked: 2019-06-11 11:20:44

Tags: r survival-analysis cox-regression

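I have a dataset with a mix of continuous and categorical variables. However, it has no follow-up time variable, which is what I want to create. The dataset currently records age at 8 waves and the event (dementia) at 8 waves. The dementia variable is binary, coded 1 and 0. How can I use these two sets of variables to generate a time-to-event variable?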

This is what the data looks like:

dementia_1

I would like a variable that gives the time until each person develops dementia.

1 Answer:

Answer 0: (score: 0)

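I don't know whether this will help you. It looks like you don't have age recorded for each person at each wave. That said, the code here assumes that you only have an age once someone has been diagnosed.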

library(tidyverse)

# Generate Sample Data

dat <- tibble(id = 1:50,
              age_1 = rnorm(50, 50, 2)) %>%
  mutate(
    age_2 =  age_1 + 5,
    age_3 =  age_2 + 5,
    age_4 =  age_3 + 5,
    age_5 =  age_4 + 5
  ) %>%
  add_column(dementia = rbinom(50, 1, .1))

# Now get data in long format to do the calculations
dat_2 <- dat %>% 
  gather(wave, age, contains("age"))


dat_2 %>% 
  group_by(id, dementia) %>% 
  filter(dementia==1) %>% # Diagnoses
  filter(age == min(age)) %>% 
  rename(age_at_diagnosis = age) # Age at which the diagnosis first appeared

This gives you the following:

# A tibble: 5 x 4
# Groups:   id, dementia [5]
     id dementia wave  age_at_diagnosis
  <int>    <int> <chr>            <dbl>
1     7        1 age_1             52.3
2    13        1 age_1             50.6
3    24        1 age_1             50.8
4    34        1 age_1             52.5
5    35        1 age_1             50.3

In theory, you could take this data frame and then merge it with the age at death or with the minimum age in your dataset.

first_diagnosis <- dat_2 %>% 
  group_by(id, dementia) %>% 
  filter(dementia==1) %>% # Diagnoses
  filter(age == min(age)) %>% 
  ungroup() %>% 
  rename(age_at_diagnosis = age) # Age at which the diagnosis first appeared

age_first_age <- dat_2 %>% 
  group_by(id, dementia) %>% 
  filter(age == min(age)) # Youngest recorded age per person (baseline)

age_first_age %>% 
  left_join(first_diagnosis %>% 
              select(id, age_at_diagnosis), by = "id") %>% 
  mutate(time_to_event = age_at_diagnosis - age)

Which gives you something like this:

# A tibble: 50 x 6
# Groups:   id, dementia [50]
      id dementia wave    age age_at_diagnosis time_to_event
   <int>    <int> <chr> <dbl>            <dbl>         <dbl>
 1     1        0 age_1  45.8             NA              NA
 2     2        0 age_1  46.7             NA              NA
 3     3        0 age_1  49.0             NA              NA
 4     4        0 age_1  53.4             NA              NA
 5     5        0 age_1  47.4             NA              NA
 6     6        0 age_1  49.1             NA              NA
 7     7        1 age_1  52.3             52.3             0
 8     8        0 age_1  49.6             NA              NA
 9     9        0 age_1  52.1             NA              NA
10    10        0 age_1  54.4             NA              NA
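
If the data really do hold both an age and a dementia indicator at every wave, as the question describes, the same idea can be extended so that people who are never diagnosed get a censoring time instead of NA. The following is only a sketch under stated assumptions: the column names `age_1`, `age_2`, ... and `dementia_1`, `dementia_2`, ... are assumed (only `dementia_1` appears in the question), the toy data uses 3 waves rather than 8, and there are no missing values. Follow-up time is measured on the age scale, from the first recorded age to the age at the first wave with dementia, or to the last recorded age for people without a diagnosis.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: 3 people, 3 waves (the real data has 8).
# The age_k / dementia_k column names are assumptions for illustration.
wide <- tibble(
  id         = 1:3,
  age_1      = c(50, 60, 70),
  age_2      = c(55, 65, 75),
  age_3      = c(60, 70, 80),
  dementia_1 = c(0, 0, 0),
  dementia_2 = c(0, 1, 0),
  dementia_3 = c(1, 1, 0)
)

# Reshape to one row per person per wave: id, wave, age, dementia
long <- wide %>%
  pivot_longer(
    cols      = -id,
    names_to  = c(".value", "wave"),
    names_sep = "_"
  ) %>%
  mutate(wave = as.integer(wave))

# Collapse to one row per person with an event flag and a follow-up time
surv_dat <- long %>%
  group_by(id) %>%
  summarise(
    age_baseline = min(age),
    # 1 if dementia was recorded at any wave, 0 otherwise (censored)
    event        = as.integer(any(dementia == 1)),
    # Age at the first wave with dementia, else age at the last wave
    age_end      = if (any(dementia == 1)) min(age[dementia == 1]) else max(age),
    .groups      = "drop"
  ) %>%
  mutate(time_to_event = age_end - age_baseline)

surv_dat
```

The `time_to_event` and `event` columns form the time/status pair that `survival::Surv()` takes, so a table like this could feed a Cox model. Whether the first positive wave is an acceptable stand-in for the true onset age depends on how far apart the waves are.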