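I'm trying to create a new variable called cpi2000 that holds the year-2000 CPI value for every observation in a series (I have four series, hence the group_by), so that I can calculate an inflation adjustment factor. However, the code below only fills in the value for the year 2000 and leaves all other years as NA. Basically, I want cpi2000 to contain four repeated numbers, one per series.
Here is the head of my data: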
Groups: series_id [1]
year series_id value seasonal_adj series_name cpi2000
<chr> <chr> <dbl> <chr> <chr> <dbl>
1 2000 CPIAUCSL 172. seasonally adjusted US city average, all items, seasonally adjusted 172.
2 2001 CPIAUCSL 177. seasonally adjusted US city average, all items, seasonally adjusted NA
3 2002 CPIAUCSL 180. seasonally adjusted US city average, all items, seasonally adjusted NA
4 2003 CPIAUCSL 184 seasonally adjusted US city average, all items, seasonally adjusted NA
5 2004 CPIAUCSL 189. seasonally adjusted US city average, all items, seasonally adjusted NA
6 2005 CPIAUCSL 195. seasonally adjusted US city average, all items, seasonally adjusted NA
library(dplyr)
library(tidyr)

cpi_values_tidy_clean <- cpi_values_tidy %>%
  separate(date,
           into = c("year"),
           sep = "-",
           extra = "drop") %>% # keep only the year part of date
  group_by(series_id) %>%
  mutate(cpi2000 = if_else(year == 2000, value, value[2000])) %>%
  glimpse()
The resulting cpi2000 column looks like this:
[1] 172.192 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 172.200 NA NA NA NA NA NA NA NA NA NA NA NA NA
[36] NA NA NA NA NA NA NA 165.717 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 165.725 NA NA NA NA NA NA
[71] NA NA NA NA NA NA NA NA NA NA NA NA NA NA
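I think the best approach is an if_else statement (case_when doesn't seem to work). This would work if I could get the second value argument of if_else (value[2000]) to also pick up the value where year == 2000, but I can't figure out how to express that condition inside the argument.
The end goal is to create two variables, cpi2000 and cpi2019, so that I can create a third variable, cpi_adj = cpi2019 / cpi2000, to use as an inflation factor.
Any help would be greatly appreciated.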
Answer 0 (score: 0)
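I realized that I can specify the year in the second argument as value[year == 2000], instead of indexing by position as I did with value[2000]. Subsetting with 2000 produced NAs because there is no 2000th row in any group; to index by position I would have had to use value[1], since the value I want is the first one in each series. Filtering by year is safer, because it lets me name the year I want regardless of row order. Below is the code that solved it, along with its output: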
library(dplyr)
library(tidyr)

cpi_values_tidy_clean <- cpi_values_tidy %>%
  separate(date,
           into = c("year"),
           sep = "-",
           extra = "drop") %>% # keep only the year part of date
  group_by(series_id) %>%
  # value[year == 2000] picks the year-2000 value within each series
  mutate(cpi2000 = if_else(year == 2000, value, value[year == 2000])) %>%
  mutate(cpi2019 = if_else(year == 2019, value, value[year == 2019])) %>%
  glimpse()
head(cpi_values_tidy_clean)
year series_id value seasonal_adj series_name cpi2000 cpi2019
<chr> <chr> <dbl> <chr> <chr> <dbl> <dbl>
1 2000 CPIAUCSL 172. seasonally adjusted US city average, all items, seasonally adjusted 172. 256.
2 2001 CPIAUCSL 177. seasonally adjusted US city average, all items, seasonally adjusted 172. 256.
3 2002 CPIAUCSL 180. seasonally adjusted US city average, all items, seasonally adjusted 172. 256.
4 2003 CPIAUCSL 184 seasonally adjusted US city average, all items, seasonally adjusted 172. 256.
5 2004 CPIAUCSL 189. seasonally adjusted US city average, all items, seasonally adjusted 172. 256.
6 2005 CPIAUCSL 195. seasonally adjusted US city average, all items, seasonally adjusted 172. 256.
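To illustrate why value[2000] returned NA while value[year == 2000] did not, here is a minimal base R sketch. The vectors and their numbers are made up for illustration and are not the real CPI data; year is stored as character, as in the tibble above.

```r
# Toy data (hypothetical values), years stored as character like in the question
year  <- c("2000", "2001", "2002")
value <- c(172.2, 177.1, 179.9)

# Positional indexing: asks for the 2000th element, which does not exist
by_position <- value[2000]

# Logical indexing: keeps the elements where the condition is TRUE
by_condition <- value[year == 2000]

print(by_position)   # NA
print(by_condition)  # 172.2
```

The comparison year == 2000 works even though year is character, because R coerces the number to the string "2000" before comparing.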
If anyone knows a more elegant way to do this, or a way to do it with case_when, I'd love to see it.
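One possibly tidier variant, offered as a sketch rather than a definitive answer: inside a grouped mutate, value[year == "2000"] already has length one per group and is recycled to every row of that group, so neither if_else nor case_when should be needed. The example below assumes each series has exactly one row per year (otherwise mutate raises a size error), and cpi_toy, cpi_adjusted, and all the numbers in them are hypothetical stand-ins for the real data.

```r
library(dplyr)

# Hypothetical toy data: two series, three years each (values are made up)
cpi_toy <- tibble(
  series_id = rep(c("A", "B"), each = 3),
  year      = rep(c("2000", "2010", "2019"), times = 2),
  value     = c(100, 120, 150, 200, 230, 260)
)

cpi_adjusted <- cpi_toy %>%
  group_by(series_id) %>%
  mutate(
    cpi2000 = value[year == "2000"],  # one value per series, recycled to all rows
    cpi2019 = value[year == "2019"],
    cpi_adj = cpi2019 / cpi2000       # inflation factor described in the question
  ) %>%
  ungroup()

print(cpi_adjusted)
```

Calling ungroup() at the end is a design choice: it keeps later steps from silently operating per series.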