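I want to remove the word " COUNTY" from the following character vector. I would like the solution to extend to different cases (upper and lower) and to any spacing variations that may occur. I have the following vector: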
COUNTY=c("LAWRENCE COUNTY", "SALT LAKE", "OCEAN COUNTY", "JASPER COUNTY",
"PIMA", "JACKSON COUNTY", "PORTAGE COUNTY", "SEBASTIAN COUNTY",
"ORANGE", "BERGEN COUNTY")
COUNTY
1 LAWRENCE COUNTY
2 SALT LAKE
3 OCEAN COUNTY
4 JASPER COUNTY
5 PIMA
6 JACKSON COUNTY
7 PORTAGE COUNTY
8 SEBASTIAN COUNTY
9 ORANGE
10 BERGEN COUNTY
I would like the vector to look like this:
COUNTY
1 LAWRENCE
2 SALT LAKE
3 OCEAN
4 JASPER
5 PIMA
6 JACKSON
7 PORTAGE
8 SEBASTIAN
9 ORANGE
10 BERGEN
Basically, I want to remove the part that says " COUNTY".
Answer 0 (score: 2)
Use gsub if both the case and the spacing are known:
> gsub(' COUNTY', '', COUNTY, fixed = TRUE)
## [1] "LAWRENCE" "SALT LAKE" "OCEAN" "JASPER" "PIMA" "JACKSON"
## [7] "PORTAGE" "SEBASTIAN" "ORANGE" "BERGEN"
If the case is unknown:
> gsub(' county', '', COUNTY, ignore.case = TRUE)
## [1] "LAWRENCE" "SALT LAKE" "OCEAN" "JASPER" "PIMA" "JACKSON"
## [7] "PORTAGE" "SEBASTIAN" "ORANGE" "BERGEN"
If both the spacing and the case are unknown:
> gsub('\\s+(county)', '', COUNTY, ignore.case = TRUE)
## [1] "LAWRENCE" "SALT LAKE" "OCEAN" "JASPER" "PIMA" "JACKSON"
## [7] "PORTAGE" "SEBASTIAN" "ORANGE" "BERGEN"
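The patterns above remove " county" wherever it appears in a string. If the word should only be dropped when it is the last word, one possible variation is to anchor the pattern to the end of the string. This is a minimal sketch, not part of the original answer; the helper name strip_county and the sample input are made up for illustration.

```r
# Hypothetical helper: drop a trailing "county" in any case, along with
# whatever whitespace surrounds it.
strip_county <- function(x) {
  # \\s+ needs at least one whitespace character before "county";
  # \\s*$ allows trailing whitespace and anchors the match to the end.
  out <- sub("\\s+county\\s*$", "", x, ignore.case = TRUE)
  # Remove any leading or trailing whitespace that is left over.
  trimws(out)
}

# Made-up input with mixed case and irregular spacing.
messy <- c("Lawrence   county", "SALT LAKE", "ocean County ", "COUNTY LINE")
strip_county(messy)
```

With this anchoring, a value such as "COUNTY LINE" is left alone because "COUNTY" is not the final word there.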
Alternatively, you can use strsplit:
> unlist(strsplit(COUNTY, ' COUNTY'))
## [1] "LAWRENCE" "SALT LAKE" "OCEAN" "JASPER" "PIMA" "JACKSON"
## [7] "PORTAGE" "SEBASTIAN" "ORANGE" "BERGEN"
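One thing to keep in mind with the strsplit approach: unlist() flattens every piece of every string, so the result only lines up with the input when each element yields exactly one piece, as it does for the vector in the question. The sketch below uses made-up input to show how the lengths can drift apart and one possible way to keep a single value per element. The variable names are illustrative only.

```r
# Made-up input where " COUNTY" appears in the middle of one element.
x <- c("LAKE COUNTY PARK", "PIMA", "OCEAN COUNTY")

# strsplit returns a list with one character vector per input element.
pieces <- strsplit(x, " COUNTY", fixed = TRUE)

# Flattening yields four values for three inputs, because the first
# element was split into "LAKE" and " PARK".
flat <- unlist(pieces)
length(flat)

# Keeping only the first piece of each element preserves the length.
first_piece <- vapply(pieces, `[`, character(1), 1)
first_piece
```

If the result has to stay aligned with the original vector, for example as a data frame column, sub or gsub is usually the safer choice because they always return one value per input element.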