I have some code like the example below. If you run it:
library(hurricaneexposure)
library(hurricaneexposuredata)
data("hurr_tracks")
storms <- unique(hurr_tracks$storm_id)
storms
then you will see that "storms" is a long character vector whose elements have a "stormname-year" structure.
[1] "Alberto-1988" "Beryl-1988" "Chris-1988" "Florence-1988" "Gilbert-1988" "Keith-1988" "Allison-1989" "Chantal-1989"
[9] "Hugo-1989" "Jerry-1989" "Bertha-1990" "Marco-1990" "Ana-1991" "Bob-1991" "Fabian-1991" "Notnamed-1991"
[17] "Andrew-1992" "Danielle-1992" "Earl-1992" "Arlene-1993" "Emily-1993" "Alberto-1994" "Beryl-1994" "Gordon-1994"
[25] "Allison-1995" "Dean-1995" "Erin-1995" "Gabrielle-1995" "Jerry-1995" "Opal-1995" "Arthur-1996" "Bertha-1996"
[33] "Edouard-1996" "Fran-1996" "Josephine-1996" "Subtrop-1997" "Ana-1997" "Danny-1997" "Bonnie-1998" "Charley-1998"
[41] "Earl-1998" "Frances-1998" "Georges-1998" "Hermine-1998" "Mitch-1998" "Bret-1999" "Dennis-1999" "Floyd-1999"
[49] "Harvey-1999" "Irene-1999" "Beryl-2000" "Gordon-2000" "Helene-2000" "Leslie-2000" "Allison-2001" "Barry-2001"
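My question is how to split these elements by year. For example, I want to create a new variable "y1988" that holds all the storms from 1988. If I run y1988, it should output: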
y1988
[1] "Alberto-1988" "Beryl-1988" "Chris-1988" "Florence-1988" "Gilbert-1988" "Keith-1988"
And likewise for y1989 through y2001. I guess this might use gsub() and a for loop, but I am new to R, so I would really appreciate some advice.
Answer 0 (score: 1)
We can use split with a grouping variable created with sub by removing the prefix substring, up to and including the -.
lst <- split(storms, sub(".*-", "", storms))
lst$`1988`
#[1] "Alberto-1988" "Beryl-1988" "Chris-1988" "Florence-1988"
#[5] "Gilbert-1988" "Keith-1988"
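To see what the grouping variable looks like, and to get from the list to the y1988-style variables the question asks for, here is a minimal sketch. It uses a shortened storms vector for brevity. The "y" prefix and the use of list2env() are my own choices for illustration and are not part of the answer above.

```r
# Shortened example vector (a subset of the storm IDs from the question)
storms <- c("Alberto-1988", "Beryl-1988", "Allison-1989", "Hugo-1989",
            "Bertha-1990")

# ".*-" is greedy, so it removes everything up to and including the last "-",
# leaving only the year
years <- sub(".*-", "", storms)
years
# [1] "1988" "1988" "1989" "1989" "1990"

# Group the storm IDs by year; the list names are the years
lst <- split(storms, years)

# Optional (assumption: you really want separate variables):
# rename the elements to y1988, y1989, ... and copy them into the
# current environment
names(lst) <- paste0("y", names(lst))
list2env(lst, envir = environment())

y1988
# [1] "Alberto-1988" "Beryl-1988"
```

Keeping the results in one list is usually easier to work with than creating one variable per year, since you can loop over it with lapply() or look up a year by name.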
# Data: the storm IDs shown in the question
storms <- c("Alberto-1988", "Beryl-1988", "Chris-1988", "Florence-1988",
"Gilbert-1988", "Keith-1988", "Allison-1989", "Chantal-1989",
"Hugo-1989", "Jerry-1989", "Bertha-1990", "Marco-1990", "Ana-1991",
"Bob-1991", "Fabian-1991", "Notnamed-1991", "Andrew-1992", "Danielle-1992",
"Earl-1992", "Arlene-1993", "Emily-1993", "Alberto-1994", "Beryl-1994",
"Gordon-1994", "Allison-1995", "Dean-1995", "Erin-1995", "Gabrielle-1995",
"Jerry-1995", "Opal-1995", "Arthur-1996", "Bertha-1996", "Edouard-1996",
"Fran-1996", "Josephine-1996", "Subtrop-1997", "Ana-1997", "Danny-1997",
"Bonnie-1998", "Charley-1998", "Earl-1998", "Frances-1998", "Georges-1998",
"Hermine-1998", "Mitch-1998", "Bret-1999", "Dennis-1999", "Floyd-1999",
"Harvey-1999", "Irene-1999", "Beryl-2000", "Gordon-2000", "Helene-2000",
"Leslie-2000", "Allison-2001", "Barry-2001")
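Answer 1 (score: 0)
Why not extract the year directly in the original data frame? The dplyr and tidyr packages are well suited to problems like this.
I suggest the following: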
library(dplyr)
library(tidyr)
hurr_tracks %>%
extract(storm_id, c("storm", "year"),"(.+)-(.+)")
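For illustration, here is a minimal sketch of what extract() does, run on a small made-up data frame instead of hurr_tracks. The toy_tracks name and its contents are hypothetical, and the sketch assumes the tidyr package is installed. The final split() step is an addition to show one way to get back to per-year groups.

```r
library(tidyr)

# Hypothetical stand-in for hurr_tracks, with only the storm_id column
toy_tracks <- data.frame(
  storm_id = c("Alberto-1988", "Beryl-1988", "Hugo-1989"),
  stringsAsFactors = FALSE
)

# Split storm_id into two new columns, one per capture group in the regex;
# remove = FALSE keeps the original storm_id column alongside them
toy_split <- tidyr::extract(toy_tracks, storm_id, c("storm", "year"),
                            regex = "(.+)-(.+)", remove = FALSE)

# Group the original IDs by the extracted year (a character column)
by_year <- split(toy_split$storm_id, toy_split$year)
by_year[["1988"]]
# [1] "Alberto-1988" "Beryl-1988"
```

Calling tidyr::extract() with the namespace prefix avoids a clash with other packages that also export a function named extract, such as magrittr.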
Answer 2 (score: 0)
An alternative using stringr
library(stringr)
split(storms, str_extract(storms, "[0-9]+"))
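As a small variation, the sketch below anchors the pattern to the end of the string, which would still pick out the year if a storm name ever contained digits. The "[0-9]{4}$" pattern and the shortened storms vector are my own assumptions for illustration, and the stringr package is assumed to be installed.

```r
library(stringr)

# Shortened example vector (a subset of the storm IDs from the question)
storms <- c("Alberto-1988", "Beryl-1988", "Hugo-1989")

# "[0-9]{4}$" matches exactly four digits at the end of each string
years <- str_extract(storms, "[0-9]{4}$")

# Group the storm IDs by the extracted year
lst <- split(storms, years)
names(lst)
# [1] "1988" "1989"
```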