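I have a dataset in which I want to define "episodes". An episode is defined as the temperature rising or falling for at least 15 minutes. Is there a way to construct this manually?
This is my data structure: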
Patient Minute temperature
1 0,00 35,65
1 1,00 35,65
1 2,00 35,66
1 3,00 35,67
1 4,00 35,70
1 5,00 35,72
1 6,00 35,71
1 7,00 35,68
1 8,00 35,66
1 9,00 35,67
1 10,00 35,69
1 11,00 35,72
Thank you.
Answer 0 (score: 0)
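One possible implementation uses dplyr:
library(dplyr)

df %>%
  # Flag rows whose temperature is higher than the previous row
  mutate(episode = temperature > lag(temperature, default = first(temperature))) %>%
  # Give every run of identical flags its own group ID
  group_by(rleid = with(rle(episode), rep(seq_along(lengths), lengths))) %>%
  # Keep the flag only for runs of at least 4 rows
  mutate(episode = (n() >= 4) * episode) %>%
  ungroup() %>%
  select(-rleid) %>%
  # Repeat the same steps for falling temperatures and join the two results
  left_join(df %>%
              mutate(episode = temperature < lag(temperature, default = first(temperature))) %>%
              group_by(rleid = with(rle(episode), rep(seq_along(lengths), lengths))) %>%
              mutate(episode = (n() >= 4) * episode) %>%
              ungroup() %>%
              select(-rleid), by = c("Patient" = "Patient",
                                     "Minute" = "Minute",
                                     "temperature" = "temperature")) %>%
  mutate(episode = pmax(episode.x, episode.y)) %>%
  select(-episode.x, -episode.y)
Patient Minute temperature episode
<int> <dbl> <dbl> <int>
1 1 0 35.6 0
2 1 1 35.6 0
3 1 2 35.7 1
4 1 3 35.7 1
5 1 4 35.7 1
6 1 5 35.7 1
7 1 6 35.7 0
8 1 7 35.7 0
9 1 8 35.7 0
10 1 9 35.7 0
11 1 10 35.7 0
12 1 11 35.7 0
Note that I reduced the threshold from 15 minutes to 4 (you can change it by modifying the number in n() >= 4), because your sample data does not have enough rows to illustrate a 15-minute window.
What it does: first, it checks whether each row has a higher or lower value than the previous row (separately for the two conditions). Second, it creates a run-length type group ID around this comparison. Third, if a run has at least n rows (4 in my code), it assigns 1 to a variable called "episode". Finally, it combines the results for the two conditions.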
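The steps above can be traced on the twelve readings from the question using base R only. This is a minimal sketch rather than the answer's dplyr pipeline: the names rising, run_id, run_len and min_len are chosen here for illustration, and only the "rising" condition is shown.

```r
# Temperatures from the question (minutes 0 to 11)
temperature <- c(35.65, 35.65, 35.66, 35.67, 35.70, 35.72,
                 35.71, 35.68, 35.66, 35.67, 35.69, 35.72)

# Step 1: TRUE where a reading is higher than the previous one.
# The first reading is compared with itself, so it is FALSE.
rising <- temperature > c(temperature[1], head(temperature, -1))

# Step 2: run-length encode the comparison and give every run an ID
runs <- rle(rising)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Step 3: keep only rising runs that last at least min_len rows
min_len <- 4  # illustrative name; 4 as in the answer, 15 in the question
run_len <- rep(runs$lengths, runs$lengths)
episode <- as.integer(rising & run_len >= min_len)

print(run_id)   # 1 1 2 2 2 2 3 3 3 4 4 4
print(episode)  # 0 0 1 1 1 1 0 0 0 0 0 0
```

Only the four-row rising run (minutes 2 to 5) reaches the threshold; the later three-row rising run does not.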
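Or, if you also want to distinguish between the two kinds of episode, you can label them differently. In the version below, which uses a window of 2, "episode" == 1 marks an increase and "episode" == 2 a decrease. I assume you will also want to group by "Patient"; as written, the code treats all rows as a single series: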
df %>%
  # Rising runs of at least 2 rows are labelled 1
  mutate(episode = temperature > lag(temperature, default = first(temperature))) %>%
  group_by(rleid = with(rle(episode), rep(seq_along(lengths), lengths))) %>%
  mutate(episode = (n() >= 2) * episode) %>%
  ungroup() %>%
  select(-rleid) %>%
  left_join(df %>%
              # Falling runs of at least 2 rows are labelled 2; shorter runs stay 0
              mutate(episode = temperature < lag(temperature, default = first(temperature))) %>%
              group_by(rleid = with(rle(episode), rep(seq_along(lengths), lengths))) %>%
              mutate(episode = 2 * (n() >= 2) * episode) %>%
              ungroup() %>%
              select(-rleid), by = c("Patient" = "Patient",
                                     "Minute" = "Minute",
                                     "temperature" = "temperature")) %>%
  # A row is never both rising and falling, so pmax keeps whichever label is set
  mutate(episode = pmax(episode.x, episode.y)) %>%
  select(-episode.x, -episode.y)
Patient Minute temperature episode
<int> <dbl> <dbl> <dbl>
1 1 0 35.6 0
2 1 1 35.6 0
3 1 2 35.7 1
4 1 3 35.7 1
5 1 4 35.7 1
6 1 5 35.7 1
7 1 6 35.7 2
8 1 7 35.7 2
9 1 8 35.7 2
10 1 9 35.7 1
11 1 10 35.7 1
12 1 11 35.7 1
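To handle several patients, one option is to wrap the run-length logic in a function and apply it within each patient, so that a run never continues across a patient boundary. The sketch below is an illustration in base R, not the answer's own code: label_episodes is a hypothetical helper, the readings for patient 2 are made up, and it assumes one row per minute, sorted by "Minute" within each patient.

```r
# Hypothetical helper: label the runs in one patient's temperature series.
# Returns 0 = no episode, 1 = rising episode, 2 = falling episode.
label_episodes <- function(temp, min_len = 15) {
  if (length(temp) == 0) return(integer(0))
  # +1 rising, -1 falling, 0 flat; the first reading has no predecessor
  change <- c(0, sign(diff(temp)))
  runs <- rle(change)
  run_len <- rep(runs$lengths, runs$lengths)
  long_enough <- run_len >= min_len
  ifelse(long_enough & change > 0, 1L,
         ifelse(long_enough & change < 0, 2L, 0L))
}

# Patient 1 uses the first six readings from the question;
# patient 2 is invented purely for illustration.
df <- data.frame(
  Patient = rep(c(1, 2), each = 6),
  Minute = rep(0:5, times = 2),
  temperature = c(35.65, 35.65, 35.66, 35.67, 35.70, 35.72,
                  36.10, 36.08, 36.05, 36.03, 36.04, 36.06)
)

# ave() applies the helper separately to each patient's readings
df$episode <- ave(df$temperature, df$Patient,
                  FUN = function(t) label_episodes(t, min_len = 3))

print(df$episode)  # 0 0 1 1 1 1 0 2 2 2 0 0
```

Without the grouping, the first reading of patient 2 would be compared with the last reading of patient 1 and counted as a rise. With dplyr, the same helper could be applied as df %>% group_by(Patient) %>% mutate(episode = label_episodes(temperature, min_len = 15)).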