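I have a column df$addr that I want to split into two columns, df$str.num and df$str.name. Some values in df$addr contain a dash, which makes it hard to extract the street number (df$str.num) accurately. I have tried many solutions but have not gotten it right.
Any suggestions?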
addr <- c("84-86 19th Ave",
"35 Halsey St",
"350 Broad St",
"997 S Orange Ave",
"274 Chestnut St",
"226 Lackawanna Ave",
"99 2nd Ave",
"261 S Orange Ave",
"357 Wilson Ave",
"402 Mount Prospect Ave # Lb2",
"380-2 Mount Prospect Ave",
"105 Lock St # 219",
"451 S 15th St")
df <- data.frame(addr)
Answer 0 (score: 2)
One option is to use tidyr::extract, capturing the leading run of digits and - characters as str.num and everything after the first whitespace as str.name:
library(tidyr)
extract(df, addr, c("str.num", "str.name"), regex = "([[:digit:]-]+)\\s(.*)" )
#    str.num                 str.name
# 1    84-86                 19th Ave
# 2       35                Halsey St
# 3      350                 Broad St
# 4      997             S Orange Ave
# 5      274              Chestnut St
# 6      226           Lackawanna Ave
# 7       99                  2nd Ave
# 8      261             S Orange Ave
# 9      357               Wilson Ave
# 10     402 Mount Prospect Ave # Lb2
# 11   380-2       Mount Prospect Ave
# 12     105            Lock St # 219
# 13     451                S 15th St
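As a hedged variation on the call above, the sketch below anchors the pattern with ^ and $ and passes remove = FALSE so the original addr column is kept next to the new ones. The third address ("Broad St", with no street number) is a made-up row added only to show that extract fills NA when the regex does not match; it is not part of the original data.

```r
library(tidyr)

# Small illustrative data frame; "Broad St" is a hypothetical row with no number.
df <- data.frame(addr = c("84-86 19th Ave", "35 Halsey St", "Broad St"),
                 stringsAsFactors = FALSE)

# Group 1: leading digits and dashes; group 2: the rest of the string.
# tidyr::extract is called with its namespace to avoid clashes with
# other packages that also export a function named extract.
res <- tidyr::extract(df, addr, c("str.num", "str.name"),
                      regex = "^([[:digit:]-]+)\\s+(.*)$",
                      remove = FALSE)
res
```

Rows that come back as NA can then be inspected separately instead of being split incorrectly.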
Answer 1 (score: 1)
Very similar to MKR's solution, but using stringr:
library(stringr)
pat <- "(^[0-9-]+)[:space:]+([A-Za-z0-9].+)"
str_match(addr, pat)
      [,1]                           [,2]    [,3]
 [1,] "84-86 19th Ave"               "84-86" "19th Ave"
 [2,] "35 Halsey St"                 "35"    "Halsey St"
 [3,] "350 Broad St"                 "350"   "Broad St"
 [4,] "997 S Orange Ave"             "997"   "S Orange Ave"
 [5,] "274 Chestnut St"              "274"   "Chestnut St"
 [6,] "226 Lackawanna Ave"           "226"   "Lackawanna Ave"
 [7,] "99 2nd Ave"                   "99"    "2nd Ave"
 [8,] "261 S Orange Ave"             "261"   "S Orange Ave"
 [9,] "357 Wilson Ave"               "357"   "Wilson Ave"
[10,] "402 Mount Prospect Ave # Lb2" "402"   "Mount Prospect Ave # Lb2"
[11,] "380-2 Mount Prospect Ave"     "380-2" "Mount Prospect Ave"
[12,] "105 Lock St # 219"            "105"   "Lock St # 219"
[13,] "451 S 15th St"                "451"   "S 15th St"
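I am not sure how familiar you are with regular expressions, but the important point is that the parentheses () mark the capture groups we want to extract. In the matrix returned by str_match, column 1 is the full match and columns 2 and 3 are the two groups.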
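To get from the capture groups to the two columns the question asks for, the matrix columns can be assigned back to the data frame. This is a minimal sketch on a subset of the addresses; the pattern is a slight variation of the one above (anchored at both ends, with \\s for whitespace) rather than the answer's exact regex.

```r
library(stringr)

addr <- c("84-86 19th Ave", "402 Mount Prospect Ave # Lb2", "380-2 Mount Prospect Ave")
df <- data.frame(addr, stringsAsFactors = FALSE)

# Group 1: leading digits and dashes; group 2: everything after the whitespace.
pat <- "^([0-9-]+)\\s+(.+)$"
m <- str_match(df$addr, pat)

# Column 1 of m is the full match; columns 2 and 3 hold the capture groups.
df$str.num  <- m[, 2]
df$str.name <- m[, 3]
df
```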
Answer 2 (score: 1)
With base R, we can do this using the sub function:
data.frame(str.num = sub(" .*", "", addr), str.name = sub("[0-9-]* ", "", addr))
#    str.num                 str.name
# 1    84-86                 19th Ave
# 2       35                Halsey St
# 3      350                 Broad St
# 4      997             S Orange Ave
# 5      274              Chestnut St
# 6      226           Lackawanna Ave
# 7       99                  2nd Ave
# 8      261             S Orange Ave
# 9      357               Wilson Ave
# 10     402 Mount Prospect Ave # Lb2
# 11   380-2       Mount Prospect Ave
# 12     105            Lock St # 219
# 13     451                S 15th St
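The two sub calls above work because every address in the sample starts with a number. As a hedged alternative, the sketch below uses one anchored pattern with backreferences and checks for a match first, so an address without a leading number is left intact. The "Main St" row and the helper name has_num are illustrative assumptions, not part of the original data or answer.

```r
# "Main St" is a hypothetical address with no street number.
addr <- c("84-86 19th Ave", "380-2 Mount Prospect Ave", "Main St")

# Group 1: leading digits and dashes; group 2: the remainder after the spaces.
pat <- "^([0-9-]+) +(.*)$"

# Only rewrite strings that actually match; sub() returns its input unchanged
# when there is no match, which would otherwise put the full address in str.num.
has_num <- grepl(pat, addr)

out <- data.frame(
  str.num  = ifelse(has_num, sub(pat, "\\1", addr), NA_character_),
  str.name = ifelse(has_num, sub(pat, "\\2", addr), addr),
  stringsAsFactors = FALSE
)
out
```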